Setting your API reviewers up for success

The API Handyman, AKA Arnaud Lauret, recently posted an excellent writeup discussing the difficulties API designers and design reviewers encounter when they venture into the realm of preference and opinion.

This post is a response to and an extension of his. If you haven’t read it yet, go do so; it’s short and good, and I promise it’s worth your time.

Like Arnaud, I’ve performed this function for years in organizations large and small, and there’s a critical point he (likely purposefully) left out.

Errors and ambiguities in your design requirements, guidelines, and common practices prevent an objective review process, setting teams and reviewers up for opinion-oriented discussion.

Here’s the important thing to understand as an API designer: ambiguous design requirements and responsibilities are the primary genesis of these situations. The teams seeking your help aren’t at fault; they simply followed your requirements in ways you didn’t anticipate. Take these occurrences as the gifts they are: they have shown you where your guidelines, processes, and procedures need further refinement.

As the old joke goes, naming things is one of the hardest tasks in computer science. People have vastly divergent backgrounds, will interpret requirements differently, and will create wildly varying output.

Try to look at your guidelines through the fresh eyes of your newest users. Do your best to avoid looking at them through your own experienced and knowledgeable perspective.

Preferences and opinions have a place in API design practices; there are quite often many ‘right’ answers. Ultimately, it comes down to the API owner making a choice. When things work out that way, try out Arnaud’s advice. I think you’ll find it helpful.

Just make sure you don’t forget to look at each occurrence and ask whether the guidelines and requirements you’ve established led to an opinionated decision that could and should have been made based on facts instead.

Watch for patterns of behavior from different teams and people that run counter to your expectations. Always look for places to develop your coaching strategy to get more mature designs from your teams. While you’re helping teams through your process, don’t forget that you’ve also got a lot to learn, so you shouldn’t waste the precious feedback your process produces.

Incentives, Metrics, and Goodhart’s Law

I’ve been thinking about metrics a lot lately. Since I joined my new organization, I’ve been spending even more of my time on the psychological and sociological aspects of software development. One wall I find myself constantly running into is the comfort and familiarity senior leadership has with metrics, and the counterintuitively negative effect implementing a measure has on the desired outcome. To be crystal clear, the wall I keep running into has a name: Goodhart’s Law.

“When a measure becomes a target, it ceases to be a good measure.”

Marilyn Strathern

Jump to the tl;dr below if you want the short version.

Leadership loves measures; they just do. However, the moment you set a measure as a target (an OKR, to use a currently popular example), it ceases to be a good read on the pulse of the organization or product. The incentives are diametrically opposed: individuals are best served by taking whatever shortcuts are available to meet the numbers. The problem is that these values are only proxies for the real progress towards an outcome leadership desires to measure. Yet savvy individuals remain just as incentivized as ever to game the numbers for their own benefit.

Despite the best efforts of countless intelligent people, many highly valuable attributes of a system are still qualitative and thus immeasurable. With so much immeasurable, metrics can only be set for some range of unknown-known and known-unknown attributes and conditions. We’ve nothing in our toolbox that helps gauge or measure the impact of unknown unknowns on the organization, and those are the attributes which simultaneously represent the biggest opportunity and risk to any endeavor. Yet by their very nature we can’t plan for them, so it’s impossible to directly measure an attribute which can help.

So, why not measure things indirectly? What happens if we try to align the incentives?

Since I usually go on and on about APIs, as a change of pace let’s look at an API example.

A simplistic API OKR example

Leadership has decided we need to lower our error rates to increase the reliability of our services across the board for our internal and external consumers. They have set an OKR for each team: measure their error rates (non-auth-related 4xx codes) and reduce them by half within 2 months.

Sounds like a great goal.

Narrator: Everything went wrong.

Team’s Solution

GET /widget/this-widget-does-not-exist
200 OK
{"code": "404", "message": "Not Found"}

PUT /widget/this-widget-is-out-of-date
{"name": "foo", "status": "bar"}
200 OK
{"code": "409", "message": "Conflict"}

GET /widget/a-dependency-is-literally-on-fire
200 OK
{"code": "502", "message": "Bad Gateway"}

Met our objective? Exceeds expectations. Atrocious design? You bet!

The outcome leadership wanted was to make our services more flexible and resilient, to provide service in partially degraded states, and to cut the unrecoverable error rate for consumers in half. We’ve completely eliminated the error rate, but the measure is just a proxy for the real value. Despite our success, we’ve made the situation objectively worse. What this sets up is a cat-and-mouse game where leadership sets a metric and teams find their way around it, because the real cost of the desired outcome isn’t something leadership will support. So we gamed the system, and leadership got exactly the opposite of their desired outcome.
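
For contrast, here is a sketch of what the same interactions might look like when the status line reflects reality; the error body shape is illustrative, not a prescription:

GET /widget/this-widget-does-not-exist
404 Not Found
{"code": "404", "message": "Not Found"}

PUT /widget/this-widget-is-out-of-date
{"name": "foo", "status": "bar"}
409 Conflict
{"code": "409", "message": "Conflict"}

GET /widget/a-dependency-is-literally-on-fire
502 Bad Gateway
{"code": "502", "message": "Bad Gateway"}

The same failures occur, but now consumers, caches, and monitoring can all see them for what they are.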

Let’s look at this problem from another angle: can we phrase the goal in a way where the incentives for the team are aligned with leadership’s true objective? What’s the real thing we’re trying to improve? The real goal is to halve the rate of errors preventing our consumers from accomplishing their goals.

This poses a few questions:

  • How does this align incentives?
  • How do you account for the unknown unknowns?
  • How can you measure the success of your consumer for goals you don’t know or understand?

Aligning Incentives

We’re starting with the aligned incentives, as it will demonstrate the value of finding answers or approximations for the next two. Simply put, you can’t fake consumer success: if the consumer actually succeeds in their objective, then whatever you’ve done is objectively correct. If they don’t succeed, or the result isn’t accurate, there’s no hiding the fact that your implementation is in some way objectively wrong. Success is a discrete boolean value; the consumer is either successful or not. The only way to game this metric is to make the implementation better for the consumer.

On the Unknown Unknown

The risk the unknown unknown presents to our endeavor is not knowing our system is failing consumers despite all signs to the contrary. The unknown is uncomfortable; however, one thing we have accomplished is to contain and mitigate a sizable portion of the risk by measuring the only thing which can’t be gamed – consumer success. Clearly there are plenty of other things which could catch us unaware, but we’ve done all we can to prepare for them. The far more valuable outcome is in learning where your consumers are failing and making improvements or new offerings to remedy those failures. Every failure we see now has the potential to uncover an opportunity for improved consumer success and increased value delivery. Once our incentives are aligned and the basic resiliency goals are met, the remaining failures are valuable insight into consumer needs.

Measuring Consumer Success

So how do we measure consumer success for goals we don’t know or understand? Steps one and two are simply to ask and to learn. Take that feedback and create metrics which can identify these occurrences. In some cases this will yield little or no benefit, while in others this simple activity can drive huge improvements. Regardless, the insight gained is actionable and immediately useful. This due diligence lays the foundation for the next steps, where we create hypothesis metrics to estimate consumer success in different scenarios. Services don’t exist in a vacuum, and in modern microservice architectures most organizations will have records of plenty of consumers in their own environment. You might look for cases where a saga is consistently rolled back in certain circumstances despite multiple successful mutations to a resource, or where one service consistently fails or is unavailable in a certain context because your processing runs just longer than a downstream timeout.

If all we’re doing is creating metrics, how does this help? We can’t create true metrics for unknown conditions and states, just like we can’t prove a negative case in logic. However, these metrics are created to mine and expose hints of an undesirable outcome, not to obtain a value directly. What we’re measuring is the effect of any arbitrary metric on consumer success, so while the value of the metric could seem nonsensical on its own, the knowledge of its effect may be of significant value. There’s no silver bullet or one-size-fits-all set of metrics or circumstances to investigate, but by looking at consumer success we’ve constrained a wide range of qualitative attributes into something we can indirectly measure.

tl;dr

By establishing consumer success as our only measure we force organizational incentives to be in alignment with individual incentives. We also gain the ability to indirectly measure the effects of qualitative attributes and complex stateful conditions on consumer success. As we are only looking at impacts to consumer success, we also reduce our risk from, and increase the potential opportunity of, unknown unknowns.

The Relationship Maturity Model

I gave a talk at RESTFest Midwest 2018 about this concept of a maturity model for relationships, and this post is intended to be the formalized version of those points. Through a lot of recent discussions on various Slacks, and over the course of our conversation in Grand Rapids, it has become pretty clear there is a wide range of definitions for relationship tags on web links. I believe it’s crucial we develop a shared understanding of relationships in general so we can move forward in determining how to create and enable affordance-driven APIs.

At the highest level, a relationship is merely the meaning which connects two concepts or contexts. A graph is a simple example: the relationships are the edges between two nodes. In the context of web APIs, relationships are the ‘rel’ attribute of RFC 8288 web links. The role of `rel` is to convey the semantics which join two contexts. In most discussions of hypermedia APIs links take a prominent role, but their utility is often assumed or discussed in a narrow scope.
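
For a concrete picture, here is a minimal RFC 8288 link expressed as an HTTP Link header and, equivalently, as a link object in a JSON body; the URL is hypothetical:

Link: <https://api.example.org/widgets>; rel="collection"

{"links": [{"href": "https://api.example.org/widgets", "rel": "collection"}]}

In both cases the rel value "collection" is what tells the consumer how the current context relates to the target context.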

There is one concern which goes mostly unaddressed in both models, and it is the inevitable question a team will ask when considering implementing hypermedia-driven APIs: WHY?

Ok, I have links, now what?

We have two models which help us begin the story of the relationship: the Richardson (RMM) and Amundsen (AMM) maturity models. Zdenek Nemec wrote a [fantastic post](https://blog.goodapi.co/api-maturity-fb25560151a3) comparing the two models; if you haven’t read it yet, you may want to before continuing. However, for our purposes Amundsen himself provides a useful and succinct summary: the RMM focuses on response documents; the AMM focuses on API description documents. A link therefore provides the foundation which binds the description to the response. Without the link we are dependent on static interactions, but without the `rel` we are teetering on the edge of a cliff shrouded in fog; we know something is on the other side, but we have no idea how to get there or what we will find.

Why ‘do’ hypermedia, what does it matter?

The key to understanding the power of hypermedia is to understand the context of these links, regardless of the serialization or mediaType. This clarity comes from the `rel` answering the question “How is the current context related to the target context?” There are many ways to answer this question, each one building upon the last and adding quite a bit of power to the humble link. Why do we do hypermedia? So we can reveal to consumer agents the relationships between two contexts without sharing knowledge ahead of time, and, if we do share our vocabulary ahead of time, so we can enable consumers to rapidly build very rich interactions.

The Relationship Maturity Model

Level 0 – Anonymous Relationships

Level 1 – Generic Relationships

Level 2 – Named Relationships

Level 3 – Stateful Relationships

Anonymous Relationships

The least helpful relationship, and unfortunately the most commonly demonstrated by an overwhelming majority, is the anonymous relationship. This empty structure, string, or array element provides no context to the consumer. Most often this is the type of link shown when creating a demonstration to showcase hypermedia, and the frequent response is to question what exactly a link like this provides. Nearly nothing. A link at this level provides no real additional benefit; it adds to system chattiness and may even add risk to the use of the application.
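
A sketch of the kind of anonymous link I mean, with hypothetical URLs:

GET /classes/algebra-101
200 OK
{
  "name": "Algebra 101",
  "links": [
    {"href": "/classes/algebra-101/students"}
  ]
}

The consumer can see that a link exists, but has no idea why it is there or what lives on the other end.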

Generic Relationships

This is the foundation of all link contexts; these generic relationships can be found in the [IANA Link Relation Registry](https://www.iana.org/assignments/link-relations/link-relations.xhtml). The use of these simple and expressive relation types provides a wealth of general context to the consumer. Not only do they provide some understanding of the relationship, they begin to demonstrate the capabilities of an affordance-centric API. A generic agent now has the capacity to understand from `rel="collection"` that the URL points to a resource collection root. If you receive a link with `rel="item"`, you know it addresses a single item within a resource collection. This won’t enable the richest interactions, but generic clients like the HAL-Browser use this level of detail to create GUIs for services the consumer has never uniquely integrated.
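
A sketch of the same class resource’s student collection using only IANA-registered relations (a HAL-style serialization, hypothetical URLs):

GET /classes/algebra-101/students
200 OK
{
  "_links": {
    "self": {"href": "/classes/algebra-101/students"},
    "up": {"href": "/classes/algebra-101"},
    "item": [
      {"href": "/classes/algebra-101/students/1"},
      {"href": "/classes/algebra-101/students/2"}
    ]
  }
}

A generic client now knows it is looking at a collection and its members without any out-of-band agreement.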

Named Relationships

Building upon the foundation of generic relationships, a designer can use custom rel names to introduce new relationships which provide additional context to the links. User-created relationships are required to be URIs, which allows a number of strategies, from the tag scheme through referenceable URIs. Using referenceable URIs provides the opportunity to include and control unique application context in every response. This enables you to move from identifying a resource as an `item`, to identifying that _this_ resource is a person, and finally to stating that the relationship between the current context and the resource is `http://example.org/vocabulary/school/class/student`. By providing referenceable human- and machine-readable documentation as the `rel`, I have added a vast capacity for conveying meaning to a client. As a consumer I now have a very rich understanding of the application’s vocabulary, its resources, and how they might relate to one another. These named relationships can provide consumers with hints on the composition of complex resource representations and an out-of-band vocabulary to safely use in creating rich resource interactions.
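
Extending the same sketch with a named relationship (the vocabulary URI is hypothetical):

GET /classes/algebra-101
200 OK
{
  "_links": {
    "self": {"href": "/classes/algebra-101"},
    "http://example.org/vocabulary/school/class/student": [
      {"href": "/people/1234"},
      {"href": "/people/5678"}
    ]
  }
}

A consumer can dereference the rel URI to learn, in human- and machine-readable form, what a student is and how it relates to a class.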

Stateful Relationships

Each previous relationship level focused on conveying increasing detail about the current state of the resources; however, creating a fully functional affordance-centric API requires the ability to communicate resource affordances. The capability to understand the state of a system with a generic client is powerful, but the ability to alter the state of a system with a generic client is truly revolutionary. Revisiting the role of `rel` above, you’ll notice it doesn’t mention resources or affordances, because it simply adds the semantic context needed to determine how the two contexts are related. An affordance of `rel="http://example.org/vocabulary/school/class/addStudent"` can be discovered and bound just like a named relationship and interpreted as “the self context has the addStudent affordance, which is performed at the target context.” By adding affordances in a standard way outside of a specific mediaType, you have added the power of hypermedia with the flexibility and versatility of raw JSON or XML.
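
A final sketch at the affordance level, again with a hypothetical vocabulary URI:

GET /classes/algebra-101
200 OK
{
  "_links": {
    "self": {"href": "/classes/algebra-101"},
    "http://example.org/vocabulary/school/class/addStudent": {
      "href": "/classes/algebra-101/students"
    }
  }
}

The rel tells a generic client that the addStudent affordance is available here and is performed against the target URL; the referenced vocabulary document describes the message to send.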

Bringing it together

The relationship maturity model is about understanding the nature of relationships between contexts, and removing preconceived notions about how they can relate. It is easiest to accept a link between two contexts as simply a link between two resources, but the real power lies in the intent of the standards to create a bridge which communicates all types of relationships.

Additional reading

While you’re at it, check out Jason Desrosiers’ fantastic hypermedia maturity model to learn another way to understand your API designs.

How to (not) give your first conference talks.

Full disclosure and TL;DR

My talks at RESTFest were not the fairy tale ending of a Cinderella story. They were really, really bad. Skip to the bottom if you would like to watch them without the results of my retrospective process.

“Luck is what happens when preparation meets opportunity.” – Seneca

If we go by that definition, I was not lucky at RESTFest this year. My preparation game was seriously lacking. I had the opportunity, but through hubris, indecision, and a touch of nerves I spoke dispassionately and poorly about topics which deeply engage me.

What happened?

During my preparation for the conference I rode the fence, indecisive about which topics I wanted to present. The conference added a great workshop by Shelby Switzer on hypermedia APIs and clients, which made me feel like my hypermedia talk would be largely redundant. In short, I squandered the preparation time by talking myself out of presenting a topic I felt passionate about but was never fully committed to. When I arrived, the environment was far more welcoming and supportive than I had dared anticipate was possible, and I was convinced to develop and present a long-form talk despite conventional wisdom against this type of action. I wasn’t prepared for speaking in this environment, and my lack of preparation came from an entirely unexpected (by me) vector.

This is where the hubris comes in. For most of my school and professional life I’ve found success in even the most important speaking scenarios by knowing the material well, compiling a light list of touch points, and allowing myself to flow freely through the material. I was sure, despite all the evidence I had read to the contrary, that I would have no difficulty ‘shooting from the hip’ in this way. Ha ha ha… nope.

Speaking at a conference to an unknown audience is very different from any other form of public speaking I’ve encountered; it requires you as a presenter to have strong confidence in your delivery structure in the absence of a rich interpersonal feedback loop. I was unwittingly relying on an undeveloped and unknown muscle for reading the audience and adapting my tactics to it. I assumed my confidence in my mastery of the material would be enough to power my presentation with zest and inspire all to immediately pick up the hAPI banners and charge forth. Ha ha ha… nope.

This perfect vision was about as far away from reality as possible, while still delivering any talk at all.

What the audience got instead was a couple of tone-deaf lectures, presented with obvious discomfort and lacking any semblance of passion or inspirational energy.

To the attendees of RESTFest 2017: please accept my deepest apologies for putting you through such a difficult talk. Please also accept my sincerest gratitude for the benefit of the doubt you extended in allowing me to finish the experience, and for the tremendous support you all showed after my talks. I did my material a disservice and gave an uncomfortable talk, yet you still welcomed me with honest, constructive, and yet reserved feedback.

Seriously, you guys rock.

Sorry Seneca, I’m not buying.

I’m choosing to reject Seneca’s definition in this case, because it doesn’t afford me any obvious paths forward. I will obviously give future talks, but I had this opportunity and failed to truly capitalize on it. Instead I’m looking at this as a win, not for my pride, but because I can learn from it and use it as a base to move forward.

Being such an introspective person, I’ve been repeatedly beating myself up over this failure since I walked away from the podium. It’s only gotten worse since I’ve seen proof my initial analysis was accurate. As little as one year ago I lacked any motivation whatsoever to speak at conferences, yet in September I found myself at the very same conference which formed the foundation of my understanding of hAPI architecture. There, in front of my proxy teachers, colleagues at large, and perhaps someday even my peers, I stood and presented my ideas.

Thanks to Ronnie Mitra for offhandedly diagnosing my condition as “Imposter Syndrome” to explain the sudden nerves which nearly froze me in place. It was a very new experience for me to feel nervous about voicing my opinions or ideas. The monotone, lecturing way I spoke came as a stark contrast to the passion (bordering on fervor, to put a positive spin on it) with which I usually discuss anything I care deeply about. Yet with the support of my family, this community, and my commitment to these goals, I’m not running away. I have more talks in the near future, and it is my hope there are yet more still to come.

Why?

Despite my poor performance at RESTFest, these topics are things I’ve become very passionate about; they are all connected and worthy of my time. Working to help refocus the tech industry on providing value to people; making it easier for developers to help people; expanding the definition of developer to include more people; recruiting more people to this cause of helping people: the connecting thread is a deep-seated calling to help people I’ve uncovered in the last year and a half. I refuse to look at my poor performance as a failure, because it would be an anchor with a strong pull to stop or change course. I’m not standing at an inflection point; I’m standing at a fork in the road between the hard path towards my goals and the easy path towards some consolation destination. This failure of mine is actually an opportunity to prove my resolve and grow.

“If not now, when? If not you, who?” – Hillel the Elder

I’m choosing to view this as a win, because a lot more somebodies have to do this, and having seen the opportunities I can’t willfully abandon them. I was lucky at RESTFest; Seneca’s definition is not the only one. I may have struck out, but at least I got the chance to bat in the first place. Obviously this is a rocky start down this path, but I’m choosing to own it – it’s my rocky start.

I usually like to refrain from discussing things as intimate as this, since my thinking sometimes comes off as alternately grandiose or convoluted, but the recordings are available and I can’t change the past. I can only control how I respond to it and what I do next. I’m using this raw disclosure as a way to provide some excuse-free context for the videos and a guiding light to keep myself on course. I’m claiming responsibility for the lack of preparation and defining a path to grow into this speaking world I find myself in. Sure, I haven’t given myself easy or short-term goals, but I now have a way to objectively track my progress and observe any deviation on the long path to my goals.

Epilogue

If for some reason you have read this far and you still have the desire to view my talks, I’ve included the links below.

Last warning – as of the writing of this post, I’ve only been able to suffer through the first short talk and about 9 minutes of the second.

Stop burning your customers and users.

Human Conversation Services.

A pragmatic review of OAS 3

Disclaimer

Before I go any further I want to address the elephant in the room. I obviously consider myself a hypermedia evangelist, and I’m aware it is easy to make ivory tower arguments from this perspective. I am also an application architect, which requires frank pragmatism, where today’s OK solution is generally much preferred to next year’s better one. In most of my previous posts I’ve focused my discussions on the distance between where we are as an industry, where I think we should go, and why it’s important.

Getting started

As part of my process of preparing for my upcoming talks at APIStrat on API documentation and hypermedia clients, I’ve been reviewing the specification in depth for highlights and talking points.

On one of my first forays into the new world of Twitter, I rather [tongue-in-cheekily](https://twitter.com/hibaymj/status/865054487119089665) pointed out, as a hypermedia evangelist, my issue with the specification. Going back, I would probably express the thought differently, but the crux of the issue is that OAS does not support late binding.

I’ll get back to this point later, because first I want to talk about the highlights of the specification, to acknowledge and applaud the hard work put into such a large undertaking. Looking back on the state of the art of APIs only 10 years ago, it’s easy to see the vast improvements our current standards and tooling provide.

At this point I’m going to assume most readers have already googled the changes to the format in OAS 3. My aim with this post is not to focus on changes, but to evaluate OAS as it exists in the current version.

The Great Stuff

Servers Object

This is a very powerful element for the API designer, allowing design-time orchestration constraints to be placed on the operation of the services. It can greatly enhance the utility of OAS in many scenarios, including but not limited to API gateways, microservices orchestration, and enabling implicit support for CQRS designs on separate infrastructure without an intermediary.
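
A minimal sketch of what that can look like (JSON form, hypothetical URLs, required members like responses omitted for brevity): the root servers list declares the default infrastructure, while an individual operation overrides it to route queries elsewhere.

{
  "servers": [
    {"url": "https://api.example.com/v1", "description": "Default command infrastructure"}
  ],
  "paths": {
    "/widgets": {
      "get": {
        "servers": [
          {"url": "https://read.example.com/v1", "description": "Read-side infrastructure for queries"}
        ]
      }
    }
  }
}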

Components

My previous experience with OAS 1.2 led to a lot of redundancy, which the components structure of the current version very elegantly eliminates. The elegance stems from the design choice of composition over definition, allowing reuse without redundancy. It simplifies the definition of the body, header, request, and response components, as reuse becomes a matter of composition. The examples section is a developer experience multiplier, which is welcome and should be strongly encouraged.
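
A small sketch of that composition, with hypothetical names; every operation that returns a widget can reference the shared response rather than redefining it:

{
  "components": {
    "schemas": {
      "Widget": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "status": {"type": "string"}
        }
      }
    },
    "responses": {
      "WidgetResponse": {
        "description": "A single widget.",
        "content": {
          "application/json": {
            "schema": {"$ref": "#/components/schemas/Widget"},
            "examples": {
              "basic": {"value": {"name": "foo", "status": "bar"}}
            }
          }
        }
      }
    }
  }
}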

Linking

As a hypermedia evangelist, my approval of this section should not come as a surprise. It mirrors in concept many of the beneficial aspects of an external profile definition like ALPS, and it is a welcome addition to the spec.
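
A sketch of a link object on a creation response, reusing the hypothetical widget API (the operationId and field names are illustrative):

{
  "responses": {
    "201": {
      "description": "Widget created.",
      "links": {
        "GetCreatedWidget": {
          "operationId": "getWidget",
          "parameters": {"widgetId": "$response.body#/id"},
          "description": "The id returned in the response body can be used as the widgetId parameter of getWidget."
        }
      }
    }
  }
}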

Callbacks

The standardization of the discovery or submission of webhook endpoints within the application contract itself is a very good step toward increased interoperability, both internally and between organizations.
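
A sketch of a callback declaration, abbreviated and with hypothetical field names; the consumer supplies its webhook URL in the request body and the contract describes the call it will receive:

{
  "paths": {
    "/subscriptions": {
      "post": {
        "responses": {"201": {"description": "Subscription created."}},
        "callbacks": {
          "statusChanged": {
            "{$request.body#/callbackUrl}": {
              "post": {
                "responses": {"200": {"description": "Callback received."}}
              }
            }
          }
        }
      }
    }
  }
}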

Runtime Expressions

With the inclusion of this well-defined runtime expression format, OAS removes a large amount of ambiguity for consumers and tool developers. It allows the API designer to add a lot of value by enhancing ease of use for consumers and integrators.
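
A few of the expressions the format defines, with paraphrased descriptions (the callbackUrl pointer is from the hypothetical example above):

$method                       the HTTP method of the request
$statusCode                   the status code of the response
$request.body#/callbackUrl    a value extracted from the request body by JSON Pointer
$response.header.Location     a named header from the response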

A Mixed Bag

These items are included here simply because a tool’s utility isn’t determined when it is created. The optional nature of the definitions and use cases of the responses object and the discriminator opens them up to the potential for unnecessary ambiguity and misuse.

Responses Object

All of the benefits I mentioned in the components section also apply to the responses object. My concern centers on the enumeration of the different expected responses. The authors deserve credit for immediately pointing out that this shouldn’t be relied on as the full range of possible responses. My experience has shown that designers, tool developers, and end consumers are prone to missing this kind of fine print, and subsequently over-rely on these types of features.

Discriminator

For the purpose it serves, I think the discriminator as defined is a very elegant solution which helps differentiate OAS from standard CRUD. It allows for the use of hierarchical and non-hierarchical polymorphism alike, enabling more concise and reusable designs. However, it still fundamentally ties the API to data formats defined at design time.

Room for Improvement

The Extension Mechanism

With obvious resemblance to the now long-deprecated format of custom ‘X-’ HTTP headers, this section should follow the spec’s own well-designed components format. Such an upgrade could use the composition rules defined within the spec to allow much better support from tooling developers and more consistent interoperability.

It’s All Static

While the authors have done an excellent job moving a lot of the static portions out of the spec, it is still fundamentally static at its core. Fortunately, the static nature of the format is largely limited to a small section of the document, allowing designers and developers much more room to innovate after design time.

Intertwined Protocol and Application Design

In computer science it is always immensely difficult to know precisely where to create boundaries for improved separation of concerns. The OAS specification was not created in an ivory tower bubble; it was created to solve real problems in real time. Unfortunately, it still bears scars from this period by mixing protocol design concerns with application design concerns. Each application design component is also able to declare protocol properties, in a mix which doesn’t allow for protocol portability. If protocol concerns like HTTP headers and response codes were abstracted into external definitions or formats, the reuse of OAS could bridge nearly all relevant protocols. However, there would still be one thing preventing specification portability – the path.

Path Is The Base Abstraction

Getting back to the point raised in my cheeky tweet: by using the URL path as the primary abstraction, the specification creates the possibility of many future operational, developmental, and maintenance issues. Recently even the quickly growing GraphQL community has joined voices with hypermedia proponents to point out how this subtle design flaw can develop into severe issues.

Bringing It All Together

The purpose of this post isn’t to point out all the flaws in OAS, but to give a pragmatic review of the state of the specification. If you want a more in-depth analysis, take a look at Swagger isn’t user friendly.

In the end, if you’re going to opt for an alternative to hypermedia, then OAS is about as close as you can get at this point. The ecosystem fits extremely well in the wide berth between a single-user service and massive scale where every byte counts. If your service design hasn’t been updated in the last 10 years or is nonstandard, it’s very likely OAS 3 would be a massive improvement, and it represents today’s best ‘good enough’ solution.

Some of these necessary improvements are easy to handle; others will require more finesse to mitigate, if they are addressed at all. One thing is clear: if your project is still using custom API designs, or you spend too much time managing older service designs, and you don’t have time to contribute to a hypermedia alternative, then OAS is worth your serious consideration.

A RESTed thank you!

Last week, from Thursday through Saturday, I had the privilege of attending RESTFest to speak, listen, and learn. Much thanks has already gone out to the organizers Benjamin, Mike, Shelby, and Ryan; they deserve it and more for their efforts to organize, finance, and run the event as smoothly as they did.

About a year ago I started down this path towards a deeper understanding of REST services and API design, and while this is the first time I’ve attended RESTFest, the talks from past years provided a very strong educational foundation. That understanding allowed me to direct the next stages of my research through specifications and papers. I have a strong belief in the responsibility of the successful to strengthen the ladder they climb behind them to allow more to follow. Over the next few weeks I’m sure to glean even more knowledge from my time in Greenville, but it’s already obvious the organizers and veteran attendees of RESTFest have built a very strong ladder.

In hindsight it was probably a mistake to rewrite my talks to fit extra-session conversations on different API design approaches. While the material I wanted to present was the same or similar, I didn’t have the opportunity to rehearse, scrutinize the potential reactions to the slides, or practice delivery. I do believe there was room for improvement in the delivery, the organization of the slides, and the focus of the talks. However, despite these shortcomings, I faced nothing but helpful and considered feedback from others on improving my delivery and presentation.

The uniquely safe environment they created gave someone like me the opportunity to learn firsthand the differences between speaking at a lunch and learn and at a conference, among many other things. Lessons ranged from a touch of speaker wisdom to the limitations of AWS managed services. In my mind, all of these constructive critiques and the utter lack of negativity only further demonstrate the safe educational environment at RESTFest.

Thank you to those who financially, logistically, vocally, or in any other capacity have helped make RESTFest, past and present, happen. I can say unequivocally I wouldn’t have the knowledge I have now without your effort and support.

RESTFest was fun, it was educational, it was productive, it was nerdy, and I liked it.

See ya next year Greenville!

Don’t iterate the interaction design of your API.

Recently I have been encountering more and more cases where a misunderstood best practice is misapplied to justify a bad decision. I’ve seen a general uptick as I’ve gained experience, due both to my increased knowledge and to the increasing diversity of knowledge and experience of developers in the field. One practice in particular has stood out as particularly damaging to the API ecosystem, namely the agile tenet of simplicity. This tenet advises the practitioner to add only the functionality and complexity required for the current known requirements. If one were to step back and think about it, this would seem like an obvious and harmless practice to follow. How could creating a design with the lowest possible complexity, or cyclomatic complexity for the computer scientist, ever be a bad decision?

We will cross that bridge when we get to it!

I’ve been consistently hearing this argument in regard to adding functional or design complexity to new API development. The practice of simplicity until necessary is generally sound, but it fails utterly when applied to API design. The reason is quite simple: there is only one chance to design an API, ever. But wait, you cry, we can version the API! I’ve previously addressed the poor choice of versioning; nevertheless, if you pursue this option, the ill-advised use of versioning is a tacit admission of this fact. If there is only one opportunity to define the design of an API, you simply cannot make it any less complex than it will need to be to satisfy the eventual end goals of the API as it evolves.

When best practices go wrong!

The problem comes from a fundamental misunderstanding of best practices: they are rules of thumb, not hard and fast rules. Advocates and evangelists loudly tout the benefits of their process, but often fail to acknowledge the existence of any scenario where their best practice simply isn’t best. The argument consistently boils down to: this solution is too complex for now, we will go back and fix it later when we have time. But there are a few subtle built-in fallacies which become this approach’s Achilles heel.
The first is the belief that, with the introduction of this technical debt, the price to repay it will not grow over time, or at worst will grow linearly. There are certainly situations where this might be the case, but they would be the exception, not the rule. The term technical debt was coined because of the tendency for the debt to grow like compounding interest, or worse. Worse still, it is very common that the weight of the legacy system, once released, actually prevents you from ever returning to address the problem at all.
The second is the naive assumption that the future will be less busy, that the team will maintain the desire to fix the flaws, and that the fortitude to expend capital to meet the requirements will grow. Case study after case study has proven this is overly optimistic and simply not true. As the cost to fix an implementation or design flaw escalates, the cost-benefit tradeoff of leaving the code in place becomes ever more biased in favor of not touching ‘what isn’t broken’.
At the end of the day, these are simply the lies told by designers, developers, and stakeholders to themselves and others to justify an increasingly expensive, sub-optimal deliverable.
Even assuming your team is stellar and defies the odds by prioritizing the rework, following through is still completely dependent upon having the opportunity, and control of all dependencies, to seamlessly perform the work. If there is even a single client outside of your team’s immediate control, your ability to complete this work quickly is severely degraded.

Agile: The buzzwordy catalyst and amplifier

There is nothing earth-shattering here, but I haven’t even touched on the whole story. In the same paper that introduced ‘cyclomatic complexity’, Thomas McCabe also introduced the concept of essential complexity, or the complexity innately required for the program to do what it intends to accomplish. Under the guise of the tenet of simplicity, the essential complexity is often left unsatisfied, because the agile methodology places a burden of proof on additional complexity which is unforgiving and ultimately unsatisfiable. In order to reach the known essential complexity of a program, you first have to prove the added complexity is actually essential. It’s a classic ‘chicken or egg’ problem with no answer. Ultimately this will most often result in the process directing you to fail to meet essential requirements, through a failure to define, justify, or evaluate the essentiality of the added complexity.
The business decision, and business imperative, to do only the work required for now is deaf to technical concerns outside of the short term, regardless of the costs or savings. This isn’t to say developers should always be in control of these decisions, but it is very important to be aware of the increased importance of communicating technical pitfalls and their costs to a non-technical audience, because the process is heavily biased against technical concerns. The adoption of agile practices has actually increased the importance of a highly knowledgeable technical liaison who can push back when shortsighted goals will provide a quick positive payout saddled with negative longer-term value. This is where it all comes back to the misunderstanding of best practices.
These teams are more often being led by practitioners who don’t truly understand the business purpose behind the best practices. Rigid adherence to, and often weaponization of, ‘best practice’ in these design discussions only serves to hide the inevitable costs associated with poor design until a later date, with the debt relentlessly compounding unimpeded.

You can’t put design off, so don’t!

I started this off by saying you can’t iterate the interaction design, so I want to be very clear about which parts of an API design can and cannot be iterated. The design of an API is actually composed of two relatively straightforward and separate concerns, what I will call the interaction design and the semantic design. The interaction design is the complete package of the way a client will interact with your service. It includes security, protocol concerns, message responses, and required handling behavior which cuts across multiple resources, among many others. The semantic design encompasses everything else, and it can and should be created and enhanced over time as domain requirements change.
Knowing the interaction design of the API is permanent once completed, it’s important not only to get it right, but to ensure the design defines the capability for expansion of specific functionality which will need to change over time, for example the use of a new authentication scheme or filtering strategy.
It is impossible to list every requirement which will fall under the interaction design of your API, but here are some questions I’ve used which will help you go through the initial design period of your API and exclude the design and implementation of features which can wait.
  •  Does this feature change the way a consumer interacts with the API?
  •  Does this feature change the flow of an interaction with the API?
  •  Could later introduction of this feature break consumer clients?
  •  Could later introduction of this feature break cached resource resolution?
With a rigorous initial design session utilizing these questions, you should be able to determine the essential complexity of your API interaction design with much higher accuracy, and avoid cost increases and consumer adoption pain when adding new value to your services in the future.

Unleashing generic hypermedia API clients

A true RESTful API has been called many things: hypermedia web APIs, ‘the rest of REST’, HATEOAS (the world’s worst acronym), or perhaps the newest, hAPIs. Regardless of what you call it, this concept has long been proclaimed to solve nearly all of your most difficult design problems when building a web service interface. There is plenty of evidence to support the claims made by hypermedia evangelists over the years, but one glaring omission is likely the cause of the slow adoption of hypermedia on RESTful services: how do you consume such a service, and what do all of these link relations mean? Building an effective hypermedia client is a more complex task than consuming a CRUD API, and an extremely difficult question to answer has been when the benefits outweigh the cost of that complexity. And once past this hurdle, how does a consumer know how to interact with the service?

It is no wonder adoption of a superior design is so slow when a more complex design leads to more complex clients. The primary selling points of this style are longevity, scalability, and flexibility; however, the benefit from these traits is seen over a long period of time, making the complexity a difficult tradeoff to evaluate at the start.

We are all very familiar with good, seemingly simple hypermedia clients.  In fact, you are likely using your favorite one right now to read this.  If we know so much about building good hypermedia clients, why are hypermedia APIs still not the de facto standard?

The key to enabling adoption of hypermedia APIs is very simple: make them easier to consume. The Open API Initiative, through the Swagger specification, has demonstrated the power and appeal of standard formats to enable rapid adoption of best practices in accelerated development cycles. I often call out the shortcomings of these specifications, but it is critical to understand the cause of their successful proliferation across the web at large. The trick is to apply the lessons learned from this success to driving the adoption of semantic hypermedia. To make a hypermedia API easier to consume, you create generic clients which encapsulate the complexity by establishing and adhering to a strict HTTP behavior profile. Then you subscribe to or publish a semantic profile of the application, adding domain boundaries to the messages and actions. Finally, you allow clients to tailor their hypermedia through requested goals of supported interaction chains.
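
As a sketch of what a published semantic profile might look like, here is a small ALPS document for a hypothetical widget domain (the descriptor names are illustrative):

{
  "alps": {
    "version": "1.0",
    "descriptor": [
      {"id": "widget", "type": "semantic", "doc": {"value": "A widget tracked by the service."}},
      {"id": "name", "type": "semantic"},
      {"id": "status", "type": "semantic"},
      {"id": "addWidget", "type": "unsafe", "rt": "widget", "doc": {"value": "Create a new widget."}}
    ]
  }
}

A generic client bound to this profile knows which descriptors are data and which are actions, without knowing anything about URLs or serializations.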

Often hypermedia is used to augment CRUD services using binding formats like OAS. In this scenario it simply can’t be relied on to drive the interaction with the service, as it has no guaranteed, or an unbounded, range of responses. Establishing a range for the hypermedia domain semantics is critical to transitioning the role of hypermedia from augmentation to the vehicle for application state and resource capabilities.

The takeaway here is simple: if you want the robust flexibility offered by hypermedia APIs, then your focus should be on enabling strong generic hypermedia clients. To build strong generic hypermedia clients, you need to adhere to strict service behavioral profiles which isolate the domain from the underlying protocol behavior.

Hypermedia APIs: Use extensive content negotiation

In my last post I touched on how important it is to insulate consumers from the immediacy of a breaking change. Nothing you can do as a designer will allow you to create, on the first try, the perfect API which will never require change. What you can and should do is reduce the likelihood of a breaking change as much as is feasible, and then allow consumers to adopt the changes gradually on their own schedule. In this post I’ll discuss the need for extensive content negotiation.

It has been stated, in the comments on these very guidelines no less, that there is a striking similarity between the 9th guideline and this, the 11th, as both rely on or discuss content negotiation. Much like the first guideline, to embrace the HTTP protocol, the benefits, constraints, and reasons for content negotiation are sufficiently broad to merit multiple discussions to be properly addressed. It is imperative a designer avoids hypermedia formats which prescribe URL patterning, because this can draw attention away from resource representation and affordance design. The goal of this discussion is to address the rest of the content negotiation constraints, preparing your designs for interaction with real traffic volume and diverse consumer demands.

As the API designer, your job is to provide the simplest service you possibly can to your consumers. CRUD API formats like OAS (Swagger) often struggle with complex designs when domain functionality doesn’t map well to four methods. Other solutions like GraphQL provide excellent options for captive audiences and internal services, but for external consumers they often result in the same poor consumer experience. Quite simply, consuming the service correctly requires too much knowledge about how the service is built. So how do you avoid making these same mistakes with hypermedia APIs? You allow your consumers to interact with your service just about any way they want. The fact is you will never be able to guess all the particular ways a consumer will want to interact with your service or tailor their requests, so don’t try. The solution is to build your service as generically as possible and allow the consumers to choose the interaction mediums to be used.

What all should be negotiated? The short answer is everything you can reasonably support which adds to the consumer experience. A longer, non-exhaustive list of potential negotiation points (a sketch of such a request follows the list):

  • Hypermedia Format (Content-Type)
  • Filter Strategy
  • Query Strategy
  • Pagination Strategy
  • Cache Control Strategy
  • Goals
  • Vocabulary
  • Sparse Fieldsets
  • Representation or Document Shaping
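
Below is a sketch of what such a negotiation might look like on the wire; the profile media type parameter and the Prefer tokens are purely illustrative, not established standards:

GET /widgets
Accept: application/hal+json; profile="https://example.org/vocabulary/widgets"
Accept-Language: en
Prefer: max-page-size=50, fields="name,status"

200 OK
Content-Type: application/hal+json; profile="https://example.org/vocabulary/widgets"
Preference-Applied: max-page-size=50
Vary: Accept, Accept-Language, Prefer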

It’s a long list; does your service really need to support all of those negotiation points? It should aim to support all of these and more, if they are reasonable and feasible within your service domain. Yes, this adds a lot of complexity, but it’s crucial to focus on the consumer experience and the long-term payoff of creating a service which will happily satisfy consumer needs for years to come.

These negotiation points are all critical to supporting a wide breadth of consumers, but they are also central to providing service flexibility over time. A service designed from the beginning to be generic and to support a wide range of options for many different properties already has the capability to support one more option in any particular property. When a new hypermedia format comes out, or a new standard filter strategy, your service already provides multiple options for that property, and supporting the change is nothing more than plugging in the appropriate functionality. You can’t know what formats will be wanted in 5 years, but your service has been designed to account for change over time, and the required upkeep is vastly lower than any alternative presented to date.

Design your API to negotiate with your consumers as much as possible, and you will have an enduring service your consumers will love to use for years.

Hypermedia APIs: Use flexible non-breaking design

In my last post in this series of hypermedia API guidelines, I discussed the need to decouple the design and implementation details of your API from the constraints of any particular format. You likely aren’t designing your own format, but it is a good decision to avoid formats which require URL patterns, as they can cause confusion and increase the odds a consumer will make calls directly to URLs. In this post I’d like to go through the follow-up guideline to don’t version anything, which will fill in the remaining gaps in dealing with resource and representation change. To support long-term API flexibility, your design should leverage a strict non-breaking change policy, with a managed, long-lived deprecation process.

As time passes, an API’s design can lose relevance to the piece of reality it was built to model. Processes change, properties change, and priorities change, so it is crucial to maximize flexibility for change over time. When using hypermedia APIs, it is important to understand the three types of changes you can make to your profile, and the appropriate way to manage each kind. Optional changes modify the representations and their actions without any effect on current consumers and their bindings. Required changes make additions to the profile which can be gracefully handled by the generic client. Breaking changes, such as removing items from the profile, require a client update to maintain compatibility.

In traditional, statically bound API styles, handling even the optional changes would likely lead directly to consumer client changes, as the representations of resources are strongly coupled to the consumer. However, a generic hypermedia client is intentionally dumb when it comes to the properties of resources, so the addition of any unknown property is simply handled in the default manner.

The story of required changes is much the same as that of optional changes. A highly coupled service and consumer relationship requires constant maintenance and attention to continue to function. A hypermedia API consumer client will manage required changes through standard approaches: generic fields which are required can be flagged to the consumer as invalid without requiring any strong bindings in the consumer client.

In this way, the two kinds of changes which present any difficulty for hypermedia APIs are the required and breaking changes. In the case of a required change, a previously valid representation is no longer valid because a new property has been added, or a new action has been added to a representation which was not previously expected and has no client binding. A breaking change is a representation or action being removed from the profile which has previously been required or bound by consumers. With these definitions, it’s clear the real difficulty is in addressing the breaking changes. The solution to breaking changes can again be found in the very first guideline I discussed: use the HTTP protocol to advertise change.

Previously in these discussions I have noted how a hypermedia API will manage the range of bounded contexts available to consumers. Diving into this concept a little further, the primary benefit of supporting a range of bounded contexts is to allow transparent incremental versioning and consumer preference in the resource representations. Many leading tech organizations and methodologies stress the importance of versioning the API, unaware or uncaring that doing so has sown the seeds of future breaking changes. By tracking the changes to your representations in the supported vocabularies, your service is able to leverage the HTTP 3xx response code family to inform consumers that change is imminent while still respecting their interaction in the vocabulary they know. This allows consumers to upgrade gracefully on their own schedule, and greatly reduces the occurrence of high-stress deadlines caused by your service’s evolution. Through nuanced activity tracking and API orchestration, you will have an accurate view of exactly when particular representations or portions of the API are no longer in use, allowing you to confidently sunset old functionality knowing it is not likely to result in a rude awakening for one of your extremely valuable customers.
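
The specifics will vary, but here is one sketch of the idea: the old vocabulary keeps working while the protocol itself advertises the coming change. The Sunset header (RFC 8594) and the IANA-registered successor-version relation do the advertising here; the media types and date are hypothetical, and a 3xx redirect can play the same role once the old representation finally moves.

GET /widgets/42
Accept: application/vnd.example.widget.v1+json

200 OK
Content-Type: application/vnd.example.widget.v1+json
Sunset: Wed, 01 Jul 2020 00:00:00 GMT
Link: </widgets/42>; rel="successor-version"; type="application/vnd.example.widget.v2+json"

{"name": "foo", "status": "bar"}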

By leveraging the protocol in the standard way, we keep breaking changes from immediately impacting consumers and demanding their full attention. As I’ve mentioned elsewhere, creating a good consumer experience is critical to the success of your API, and a great way to keep consumers happy with your service is to not break their clients at 3am on a Saturday night.