API design and OOP
May 18, 2024
16 minutes read

The words “Object Oriented” and I met for the first time when I was twenty. In the faculty where I got my software engineering degree, Object Oriented Programming (OOP) was awarded an entire semester-long class. It was presented to us, students, as a miraculous panacea for the correct practice of writing software. I liked the ideas back then, and don’t dislike them today. OOP theory, today, is more controversial: A lot of people think it’s not that great, after all.

Whilst I haven’t made up my mind yet in that regard, I have worked for the past five years with web services (the stuff so often called API) on both ends: as the designer as well as the client. Every now and again I come across a web API that smells like OOP.

In fact, a common trend among modern web-services is to offer an interface that feels object-oriented. This interface, designed more often than not by product managers, provides a view of the entity types the service interacts with. For example, an API that serves data from a bookshop’s site can say that the response to a given GET request is a Book, and that means, in that service’s world, that the data elements contained in the response text match an arbitrary definition of what the data elements of a book are, such as Title and Author. This arbitrary definition is the interface. It is arbitrary in the sense that it is defined by the API owners, and they may choose, for instance, not to include the ISBN in the definition of a Book object.

The moment I write that down, it makes so much sense that, all of a sudden, I can’t think of an alternative equally sensible. That’s a novelty bias. There are alternatives, and they all make sense. The choice, as it often happens in real things, isn’t binary.

Hands on the matter

If you (dear reader) are anything like me, you are thinking that what really counts in the end is the ability to make solid design choices in a given situation, and quickly so, and, therefore, only one question matters: What are the guidelines when designing an API? That’s why I am going to begin this discussion with that question, and leave the motivations, further analysis, and examples to the second part of this article.

My recipe is based on a classic piece of advice given to framework designers: always restrict user choices, by making functions private and giving users only a narrow path to perform operations. “Framework”, in this context, refers to any module, package, or library that is built to be used by other pieces of software; in other words, the “server” in the classic client-server paradigm, as it was called in the golden days of software engineering.

Another way to formulate the same piece of advice would be: make your API narrow and tight. Users must walk a narrow path to get what they want when using an API. There are two reasons for restricting user choices and giving them a narrow path. First, it makes their life easier. Some users, and I bet all of the product managers and non-technical founders, will complain that the product overview seems too light. That’s a good thing! Most users will appreciate it and enjoy the beauty of navigating a complex system via a simple interface.

Second, and most importantly, when the number of externally accessible pieces of the system is small, it’s easier to make changes to the internals of the system. That is, in my opinion, one of the most important concepts in software engineering.

⋄ ⋄ ⋄

Back to the story about my faculty. Professors there often spoke about contracts. They didn’t mean job contracts. They meant the contract that implicitly exists between the owner of a server and every user of that server. That contract is the application programming interface, the API. Today, the term API has mingled with what is technically a web-service. An API is a contract between a server and its clients. The contract states that certain operations are available for the client to use, and gives details about the I/O of these operations.

Making amendments to a contract is, in life as in software, a big deal. If the API is not narrow and tight, every time one wants to make a change to an internal process, one will be forced to make a cascading change to the API. That means changing the contract! What would the user say?

⋄ ⋄ ⋄

I maintain a production system that has been running for the past five years, and that is a web service for financial data. People on the internet would say a fintech API. This timespan is relatively long for a startup, therefore the service has gone through several versions. They were all internal versions. Not once have I had to ask clients to make changes to their programs because of a change in our API. Not once. How is that possible? It’s because the API is very tight, with a few key operations available to clients, and data elements accessed individually. A word that I like to use in this context is atomic, though it’s, of course, unrelated to atomic transactions.

On the other hand, I see other companies’ services moving from version to version faster than I can run a marathon, and I mean public versions. At this point, one bit of information you (dear reader) need, is that practically all financial systems need to integrate with other financial systems. Unless you are running the kernel operations of a big bank, or Visa/Mastercard, then your system needs to rely on other financial service providers, no matter what it does. The simplest example is to take payments: unless your system is a payment processor, you will rely on Stripe or a similar one. And in fact, even if your system is a payment processor, it may still rely on another, bigger one!

So like most fintech APIs out there, the one I run also relies on several other services. Some of these have actually been in production for less time than mine. Yet, I have seen the service owners starting with /v1, then adding /v2, then /v3, to their public endpoint interface. That’s two contract violations. Each time they do that, all of their clients must make changes to their program—and I mean their code, that also runs in production!—to accommodate the new version. I am one of those clients. I have to make changes and extra work because of them. Even when there are no actual changes, clients must be careful and take the time to check compatibility of the new version. There is a school of thought which believes making frequent version changes is good, but, in my opinion, it often is a symptom of a flawed design.

A closer look

To give examples is a tricky business when the discussion is about generic concepts of software engineering, because no two systems are the same, and it’s challenging to make a point without looking at five thousand lines of code. But I will try with an over-simplified one. Please bear with me; I have a point!

Consider a bookstore and a web service for their catalog. Assume this web service is used by Amazon and two other sites, so the store can sell books on three websites.

The bookstore service developers decide that their web service will return book information in a Book object. That object, they decide, is made of three data elements: title (a string), author’s full name (a string), and ISBN code (also a string). The latter is also interpreted as Book ID. They decide that users have one way to check all of the data in the bookstore, and that is by getting all of the Book objects. So they make a public interface, an API, that allows one operation: GET /books, which returns a list of Book objects, in JSON format, each object containing the three data elements. The returned list will be very long, but for the sake of this example allow me to not discuss list pagination.

This is a perfectly sensible design. Now I will force the logic a little bit to show a potential problem.

After the service has been running for a few months, all three shopping sites have successfully integrated the catalog, running about five million API calls every day (in total). One of the three websites contacts the bookstore developers and makes a good point: a book can have multiple authors.

What to do? No problem!, they say. They jump in the database, type a fancy ALTER command and, voilà, the database column that was a string now is a JSONB type, containing a list of strings. Then they change the definition of Book object in the API, so that the data element author is now a list. They send an email to their three customers letting them know that now the author data element contained in a Book object is a list of strings.

The contact person of the website that asked for the change is amazed at their efficiency. The owners of the second site don’t care because they aren’t using the author element anyway. Amazon does, and their integration is now broken. Amazon is not going to change their code. Amazon removes the store from their site. Sales go down. End of the story.
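To make the breakage concrete, here is a small sketch of two clients reading the GET /books response before and after the change. The payloads, field names, and client functions are invented for illustration; the story above only says that a Book has a title, an author, and an ISBN.

```python
import json

# Hypothetical GET /books payloads, before and after the change (sample data).
BEFORE = '[{"isbn": "0000000001", "title": "Sample Book", "author": "Ann Author"}]'
AFTER = '[{"isbn": "0000000001", "title": "Sample Book", "author": ["Ann Author", "Bob Writer"]}]'


def author_labels(payload: str) -> list[str]:
    """A client written against the original contract: author is a string."""
    return [book["author"].upper() for book in json.loads(payload)]


def title_labels(payload: str) -> list[str]:
    """A client that never reads author, like the second site in the story."""
    return [book["title"] for book in json.loads(payload)]


print(author_labels(BEFORE))  # fine under the original contract
print(title_labels(AFTER))    # unaffected by the change

try:
    author_labels(AFTER)
except AttributeError as exc:  # a list has no .upper() method
    print("integration broken:", exc)
```

Nothing in the client changed, yet it now fails in production. That is what a silent contract change looks like from the other side.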

To be sure, there’s one obvious mistake in this story. The bookstore developers should have made a new column in the database, instead of altering the existing one, and then added a /v2 endpoint in the public API, and kept things in sync between old column, new column, v1, v2, and whatnot.
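A rough sketch of what that "keeping in sync" could involve. The column names, the two serializer functions, and the rule that v1 shows only the first author are assumptions made up for this example.

```python
# Hypothetical row after the fix: the new list lives next to the old string.
ROW = {
    "isbn": "0000000001",
    "title": "Sample Book",
    "author": "Ann Author",                   # old column, still served by v1
    "authors": ["Ann Author", "Bob Writer"],  # new column, served by v2
}


def book_v1(row: dict) -> dict:
    """Item served under v1: the original contract, author is a string."""
    return {"isbn": row["isbn"], "title": row["title"], "author": row["author"]}


def book_v2(row: dict) -> dict:
    """Item served under v2: author is a list of strings."""
    return {"isbn": row["isbn"], "title": row["title"], "author": row["authors"]}


def set_authors(row: dict, authors: list[str]) -> None:
    """Every write must touch both columns, or v1 and v2 drift apart."""
    row["authors"] = list(authors)
    row["author"] = authors[0]  # arbitrary choice: v1 shows the first author only
```

Even in this toy version, every write path has to know about both representations, and somebody has to decide what v1 clients see when a book has two authors.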

But even then, things become ugly very soon.

Preventing this kind of problem is often not possible. Thus, the question is: what design choices would have mitigated the business complexity arising from the need to change the way authors’ information is stored?

Probably many, and here’s one that makes my point. Books are always uniquely identified by their ISBN codes, and we know ISBN codes are always going to be strings. The basic, and most needed, public interface GET /books could return all of the ISBNs in the bookstore. That endpoint’s response would then be of type list of strings, and that type is very unlikely to change. Thus, the most used interface of the system, GET /books, is very tight and extremely unlikely to change. Clients of the system will appreciate its stability.

Clients still need the other pieces of information about books. To take that into account, the initial design of the system can include additional endpoints, one for each atomic piece of data. There would then be other public endpoints, such as GET /book//title and GET /book//author.

When the time comes to change the author format from a string to a list, clients that do not use that atomic data element will not be bothered. The impact of the change is clearly lower than in the original case.
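Here is a sketch of the atomic design, with plain functions standing in for the endpoints. The function names and the in-memory catalog are hypothetical, and the catalog is shown after the author change has already happened.

```python
# Hypothetical in-memory catalog keyed by ISBN (sample data).
CATALOG = {
    "0000000001": {"title": "Sample Book", "author": ["Ann Author", "Bob Writer"]},
    "0000000002": {"title": "Another Book", "author": ["Cy Scribe"]},
}


def get_books() -> list[str]:
    """Stands in for GET /books: only ISBNs, a list of strings."""
    return sorted(CATALOG)


def get_title(isbn: str) -> str:
    """Stands in for the title endpoint of one book."""
    return CATALOG[isbn]["title"]


def get_author(isbn: str) -> list[str]:
    """Stands in for the author endpoint: the only operation whose type changed."""
    return CATALOG[isbn]["author"]


# A client that only lists titles never calls get_author,
# so the string-to-list change does not reach it.
titles = [get_title(isbn) for isbn in get_books()]
print(titles)
```

The change is still a change, but it is confined to one operation instead of rippling through the response every client parses.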

I can hear the objection: all of the clients will use GET /author and will be impacted by the change. Yes, in this overly simplified example that is true. My claim is that in real-world examples, and when dealing with more complex data structures, that is not always true. Software is never still, it is always changing, so some things cannot be prevented. Again, the key word is mitigation (and, perhaps, resilience).

It’s almost time to look at the real world I have so often mentioned. I like that world, because it’s like a wild forest where unpredictable things happen. When it comes to that, I would always start with looking at well-known, robust systems that have been running for many years and have an enterprise’s reputation behind them. I have selected for this article examples from Stripe and Google.

Just before moving on to that, let me summarize what I think are the most important practical takeaways because, if you are anything like me, you are itching to build something.

Thoughts to bring into the forest

API means interface, not web-service. The web-service is the system, the API is the contract between the system’s owner and users. Think twice before changing a contract.

Updates are inevitable in software. In fact, they are frequent. In practice, all APIs will have to change every now and again, hence the impact of each change must be mitigated. Users have made an effort to learn how to utilize the system, and are probably paying for it. Be respectful of their time.

The impact of any change is minimal when the scope of the change is minimal. And the scope can be minimal if the publicly accessible algorithms and data structures are accessible nearly atomically. Make the API narrow and tight.

The real world

I find the complexity of financial data web services very peculiar. It’s not in the efficiency, or the SLA challenge, or the uptime. Quite the opposite, in fact, because things are painfully slow. The complexity, from where I stand, lies in the fact that a handful of institutions hold the data, and the data has been accumulating for decades, and so it happens that not many people know what actually is in the data. A couple of organizations out there exist with the sole purpose of getting those institutions to sit at the same table, but that’s a story for another time.

If you run a web service which requests data from a bank’s web service, chances are your system will get the expected response from the bank’s about 70% of the time. The remaining 30% accounts for—again, in my experience—planned maintenance, unplanned bugs, and more simply the data itself not being accessible because the owner of the account decided to shut the access off.

In the beginning of our service’s existence, and we already had paying customers, it responded with an error message whenever data was not available from the banks. That made it impossible to distinguish different types of problems within the above mentioned 30%. We understood quite soon that the system needed to have additional information, and make it available to clients to know whether the problem was technical (maintenance, bug), or privacy-related (access revoked by the user).

The problem was that adding that information could mean changing some interface. In fact, we didn’t even know which interface: where to add the new information, and in what form? Then a teammate had a brilliant idea (I wish I had had it), which will look familiar to you (dear reader) as you’ve read the bookstore example above. We added a flag, just a boolean flag, in the API without changing any of the definitions. The flag can be accessed atomically and will tell whether there’s a temporary problem, or the data for that account are permanently inaccessible.

We didn’t have to release a new version. We didn’t have to send clients a notification (even though we did, making clear no action was needed). The change was transparent to clients, and they did not need to make any change in their applications, unless they wanted to.
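A sketch of how such a flag could sit next to an existing operation. Everything here is hypothetical (the account IDs, the balance field, the function and exception names) and is not the actual service; it only illustrates adding one atomic boolean without touching what already exists.

```python
# Hypothetical state kept by the service for each linked bank account.
_ACCOUNTS = {
    "acct_a": {"balance": None, "access_revoked": False},  # bank under maintenance
    "acct_b": {"balance": None, "access_revoked": True},   # owner shut access off
}


class DataUnavailable(Exception):
    """The pre-existing error that clients already handle."""


def get_balance(account_id: str) -> float:
    """Existing operation: signature and error behavior are unchanged."""
    balance = _ACCOUNTS[account_id]["balance"]
    if balance is None:
        raise DataUnavailable(account_id)
    return balance


def is_permanently_unavailable(account_id: str) -> bool:
    """New atomic operation: True if access was revoked, False if the problem is temporary."""
    return _ACCOUNTS[account_id]["access_revoked"]


def should_retry(account_id: str) -> bool:
    """A client that opts in to the flag; older clients simply never call it."""
    return not is_permanently_unavailable(account_id)
```

Clients that ignore the flag keep catching the same error as before. Clients that read it can stop retrying accounts that will never come back.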

We still haven’t made a /v2.

The bigger real world

The only way to do this is to jump in head first. I look at how the Customers API is defined at Stripe, on this page. At the time when I am writing this, the first thing on the page is the list of all endpoints available to clients. The second is their arbitrary definition of a “Customer object”, presented as a JSON object with many elements. Then, it goes on to list all operations that can be requested in regard to Customer objects. That looks, smells, and talks like OOP. It is.

Once I see that page, it makes so much sense that it looks like the only plausible way to do it. What would an alternative be? Here is one: the famous Google Places API, on this page. As I write this, the Places API documentation shows and explains a lot of data fields, but there doesn’t seem to be a formal definition of a “Place” object.

I use the term formal, in the sentence above, and I need to stress its importance. To be sure, the definition of a Place is somewhere. If it weren’t, how could the developers of the service work with it? The key point is that the definition is not on the page, and that is by choice. They (the service owners) do not seem to think that it would be useful for clients. In fact, and I find this fascinating, they go as far as to provide clients with a technique to decide what fields to return. Thus, not only is there no unique, encompassing definition of a Place, but in fact clients can pick and choose what data elements they want, before looking at the entire object. In other words, there is no object at all.
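The general idea of letting clients choose fields can be sketched in a few lines. This is a generic illustration, not the Places API's actual mechanism or syntax; the record, the field names, and the `select_fields` helper are all made up.

```python
def select_fields(record: dict, fields: list[str]) -> dict:
    """Return only the fields the client asked for; unknown names are ignored."""
    return {name: record[name] for name in fields if name in record}


# Hypothetical internal record. The client never sees it whole.
PLACE = {"name": "Sample Cafe", "rating": 4.5, "address": "1 Example Street"}

print(select_fields(PLACE, ["name", "rating"]))
```

Seen from the client, the response is whatever was asked for, so there is no single shape to call "the object".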

When two giants like Google and Stripe make choices so conceptually different, one can bet things are not so clear cut. Using OOP in an API makes sense in light of the CRUD trend. CRUD is the acronym for Create, Read, Update, Delete. They are the four most common operations clients do with things within a system. CRUD design is everywhere these days, and I have nothing against it except that I don’t like to obsess over just one concept.

That said, OOP for a CRUD API is a sensible choice. But how many APIs are really, truly CRUD?

Not so many, I dare to say. An API that is 100% CRUD is like a subset of database operations. There will be those who say that APIs are exactly that: just interfaces on top of databases. That may be theoretically correct, but I believe these people haven’t run a production service in their profession. Things get messy when different services interact with each other, and clients are not supposed to know about it. And that is what happens all the time, for all production services.

As an example, let’s go back to Stripe and their PaymentIntent object definition. Scrolling down, a data element called customer can be seen. The description says that is the ID of a Customer, if one exists. It doesn’t say what happens if I put there a string that doesn’t match any actual Customer ID. In a CRUD system, that should raise an error, but this is just a little nuance. More importantly, the documentation says that the customer data element is “expandable”, which means it is generally a string (or null), but clients can send the request with a flag to get the entire Customer object, instead of just the ID string. That means that the response element customer from the very same endpoint may be null, or a string, or a complex object. One may like this feature or not, but don’t call it CRUD.
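From the client's side, a field that can take three shapes has to be normalized before use. A minimal sketch, assuming the expanded object carries its ID under an "id" key; the helper name and the sample ID are hypothetical.

```python
from typing import Optional, Union


def customer_id(customer: Union[None, str, dict]) -> Optional[str]:
    """Reduce the three possible shapes of the field to an ID string or None."""
    if customer is None:
        return None             # no customer attached
    if isinstance(customer, str):
        return customer         # the default: just the ID
    return customer["id"]       # the expanded form: a whole object


print(customer_id(None))
print(customer_id("cus_example"))
print(customer_id({"id": "cus_example", "name": "Sample Customer"}))
```

Every client that reads this field needs some version of that branching, which is a long way from a plain Read in CRUD.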

My point is that, in the real world, things cannot be just CRUD. And the Stripe API designers know it well, as one of the main operations available to clients is the search operation. I like it quite a bit, especially the fact that they defined a query language for it.

But it is a search, and there is no S in CRUD.

More thoughts

I hate to use the header Conclusions, but these are going to be the final thoughts for this article.

The real world is complicated. That is the takeaway for me here. If a service owner, and their developers, come up with an abstraction that represents the entire world, then OOP can work nicely for a web API. That’s why the Stripe service is architected like it is: they own the data, and therefore can shape it as they want, and then they can abstract it in code. After all, that’s what OOP is, an abstraction of the entities involved.

On the other hand, when the real world is very real, nobody can say they own the data, and it all comes down to providing a good service and navigating it to live another day. That’s why, I believe, the Google Places service is designed like it is.

There’s no OOP science then. It’s raw, messy data at its best.


