Hypermedia APIs: Swagger is not user friendly

As a developer designing and implementing APIs for the past five years, integrating external services has always been a key component of developing the products.

As you sit in your design meeting, scrum, or stand-up the moment a new integration is mentioned, you can see the pause spread throughout the room and trepidation go through each colleague’s mind.  It is painfully obvious everyone is sharing variations on the same thoughts and questions.  How good is the documentation for this service? How long will it take to wade through the idiosyncrasies and bugs to a stable implementation?  Without knowing the specifics, everyone in the room is instantly aware of the landmines waiting for them.

These common concerns are entirely with cause, the quality range for services you may have to integrate provides a near limitless combination of difficulties.  The service being entirely undocumented isn’t even the worst case, as untrustworthy but thorough documentation can be much worse than discovery by trial and error.

Unfortunately, when implementing our own services, we often overlook or deprioritize the ease of use of our designs for the end user.  It’s an easy trap to fall into with deadlines and deliverables, that is precisely why it is so important we use designs and tools which make this simple.  Through the specification wars of the last 5 years, the CRUD-REST industry has settled on the Swagger specification (Open API Specification – OAS) as the standard for application design. While this represents real improvement over snowflake services, the use of a vocabulary driven hypermedia approach gives us all the beneficial properties of OAS as well as the long term benefits of flexibility, adaptability, and easing the burden on initial design perfection.

There are two primary problems with the solution provided by OAS namely, it tightly couples clients to the service through URLs, and requires orchestrating client changes in step with the service changes.

The first problem is easier to understand, by hardcoding the resource heirarchy to a URL and specific representation you now require tight and explicit versioning for clients to safely consume the service.  Any developer familiar with SOAP web services should be able to notice the similarities to OAS as the WSDL for a SOAP-like service without an envelope, using curly braces, and 3 extra HTTP methods.  The same arguments against the tight binding of the interface in SOAP services are becoming increasingly relevant when discussion the cons of OAS services.  The ramifications for this are immediately felt, but similar tooling has silenced detractors enough to satisfy the majority into adoption of this specification.

The second problem is much more nuanced, but far more frustrating to contend with as it is not immediately felt.  The design of SOAP and OAS lend themselves well to situations where the same group or company has control over both the service and the client.  If you distribute an SDK to wrap your service calls, or you distribute your own mobile applications, or support web applications under your control then the negative effects of the style aren’t felt until you need to perform the first major upgrade to these clients.  In this situation you can manage the negatives to a degree.  This difficulty is entirely unnecessary, but resisting the temptation to wait and deal with that problem when it comes up is hard to do.  You certainly are aware the process will be difficult while consuming resources and time, but the time and resources you are committing to the change management are in the future and your current deadlines are fast approaching.

The worst effects of this ill-advised tradeoff is felt when you are not in control of any portion of your APIs consumers.  This will be felt in cases as small as an internal microservices architecture or as large as your companies external APIs, and it will hit your bottom line directly.  If you deploy microservices which are tightly coupled to URLs and representations, you will need to manage the service dependency trees to fully deploy changes.  Assuming no changes made in one service results in a break in another, you have invited the complexity of a massive organization like Netflix to solve relatively small problem.  If this does result in a breaking change, you have lost a large portion of the benefits of a microservices architecture in tightly coupling two or more services which should be independent.  The benefits of the architectural style to the development team are obvious, but you may lose more time and resources managing the DevOps than you gain from development.  If your public facing APIs are forced to change, frequently requiring your consumers to modify their clients to meet your needs then you shouldn’t be surprised to see some of those clients explore or exit to your competitors.  When breaking changes are introduced this forces a slow release pattern, and requires your consumers as well as your team internally to manage multiple versions of your API.  As the difficulty of maintaining an integration with your service increases, the likelihood of your clients looking for alternative providers goes up from a real chance to a near certainty.

The obvious question you are probably asking is how is hypermedia any different?  If my clients bind to a domain vocabulary hasn’t this just moved the binding point with the same result?

The answer is no.  When transitioning from a CRUD API to a hypermedia API, you have moved from the realm of statically binding consumers to services to dynamic binding.  Hypermedia APIs by their nature should to be discovered at each use.  Hypermedia consumer clients should only ever have the root URL of the service statically bound.  The vocabularies can and should change over time to support changes to the understanding of the domain, or actual changes to the domain itself.  However, it is now possible to gracefully support clients as they migrate themselves at their own pace to newer portions of the vocabulary.  The client is no longer responsible for managing which version or effective version of your service they are interacting with on per call basis, the service handles this for the consumer.  Architecturally it may be necessary or easier for deployment to include multiple effective versions of a service to support this graceful transition, the key takeaway is the consumer is completely unaware of these URL changes.  The consumer is simply discovering, caching, and composing resource representations with metadata through links by their interaction with the service.  Any changes made would propagate to all clients by the end of the maximum caching period delay set by the service.  Any interactions with the service with now malformed or expired resource representations or moved resources can be managed by the ETag headers and HTTP 3xx response codes.  Clients are bound to the vocabulary, which means they are simply looking for resources and link rel-names they know, while caching information to reduce extra calls to the service for resource and service metadata.

This is a slightly more complex integration model, but the development of libraries to manage the increased complexity can release consumers from even more of their burdens, allowing them to focus on their true goal whether it is creating a UI or consuming the service for some useful purpose.

Leave a Reply