Analyzing APIs at Scale

Tim Burks edited this page Oct 4, 2022 · 3 revisions

Why do we need an API registry?

APIs are big

APIs are big these days, and by all accounts they’re getting bigger.

According to the Apigee State of API Economy 2021 Report, “API traffic for Apigee customers increased 46% year-over-year to 2.21 trillion calls between 2019 and 2020”. There’s growth in every sector, but the biggest we saw was in healthcare, where API traffic increased 400% in 2020.

A lot of people are working with APIs, too. The 2020 Postman State of the API Report surveyed over 13,500 of them, and 60% of the developers surveyed said that they spend at least ten hours a week working with APIs. Many of these developers are also making new APIs. The ProgrammableWeb online directory now lists over 24,000 APIs and in recent years has reported an accelerating growth rate. Google has hundreds of public APIs and many more internal ones, and it is not alone: many large companies have thousands of APIs that they use internally, share with partners, or publish for public use.

Unfortunately, working with large numbers of APIs brings some challenges. The Postman State of the API Report observed that “the number one obstacle to consuming APIs: lack of documentation (which also led other factors by a wide margin). In fact, less than 5% of individuals give the APIs they work with a 9 out of 10 or higher when rating how well documented these APIs are.” Much of this documentation is needed because of complexity, particularly complexity caused by variations in how different APIs work.

There’s power in consistency

Have you ever noticed that stop signs and traffic lights aren’t documented? After some basic drivers’ education, everyone knows how they work, and despite occasionally rolling through their stops, people generally do the right thing at intersections because the rules are the same everywhere.

Like stop signs, APIs are more than just hardware (and software!): they’re social inventions that require a shared understanding of their meaning and a shared participation in their operation. Consistency is how they work.

Even Supreme Court justices have noticed the value of API consistency. When the Court ruled that a reimplementation of an API was fair use, it noted the effect of consistency on developer productivity: “Without that copying, programmers would need to learn an entirely new system to call up the same tasks.” Earlier in this opinion, Justice Breyer wrote, “Congress and the courts have limited the scope of copyright protection to ensure that a copyright holder’s monopoly does not harm the public interest.” Consistent interfaces and developer productivity are in the public interest!

To build consistency, look for patterns

If we can make APIs easier to use by making them more consistent, we should spend time looking at existing APIs. But APIs are described in different ways, and some have hardly any descriptions at all. So if we’re going to look at APIs to improve their consistency, it would help to have a consistent way of looking at APIs.

One way to look at APIs consistently is to put their descriptions in a registry. We define an “API Registry” as a structured collection of API descriptions and metadata. This open source project supports API registries by providing a simple API that stores API descriptions in a hierarchy: a top-level collection of “APIs”, where each API contains a collection of “Versions” and each version contains a collection of “Specs”. This lets us describe APIs as they change over time (in new API versions) and represent each API version in a variety of description formats (stored as API specs). In our registry, specs can be in any format, and we identify the formats that we recognize with Media Types.

Our API Registry helps us collect API descriptions and the metadata that we derive from analyzing them. We can store this metadata in “Artifacts”, which can belong to resources at every level, including a top-level “Project” that owns everything.
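
The hierarchy described above can be sketched as plain data classes. This is an illustrative model only, not the registry's actual API: the class names follow the resource names in the text, but the field names, the `spec_paths` helper, and the media type string are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Artifact:
    """Derived metadata attached to a resource (hypothetical shape)."""
    name: str
    contents: bytes = b""


@dataclass
class Spec:
    """One description of an API version, in any format."""
    name: str
    media_type: str  # illustrative value only; real media types may differ
    contents: bytes = b""
    artifacts: List[Artifact] = field(default_factory=list)


@dataclass
class Version:
    """One version of an API, holding its specs."""
    name: str
    specs: List[Spec] = field(default_factory=list)
    artifacts: List[Artifact] = field(default_factory=list)


@dataclass
class Api:
    """One API, holding its versions."""
    name: str
    versions: List[Version] = field(default_factory=list)
    artifacts: List[Artifact] = field(default_factory=list)


@dataclass
class Project:
    """Top-level owner of all APIs and project-wide artifacts."""
    name: str
    apis: List[Api] = field(default_factory=list)
    artifacts: List[Artifact] = field(default_factory=list)


def spec_paths(project: Project) -> List[str]:
    """Return a slash-separated path for every spec in the project."""
    return [
        f"{project.name}/{api.name}/{version.name}/{spec.name}"
        for api in project.apis
        for version in api.versions
        for spec in version.specs
    ]


# Hypothetical sample data: one API with one version and one spec.
demo = Project(
    name="demo",
    apis=[
        Api(
            name="petstore",
            versions=[
                Version(
                    name="v1",
                    specs=[Spec(name="openapi.yaml", media_type="application/x.openapi")],
                )
            ],
        )
    ],
)
demo.artifacts.append(Artifact(name="summary"))
print(spec_paths(demo))
```

Every level carries its own `artifacts` list, which mirrors the idea that derived metadata can belong to a spec, a version, an API, or the project as a whole.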

With an API Registry, we have a structured way to look at APIs and answer questions about them.

  • How big are these APIs?
  • How are these APIs versioned?
  • What do the titles and descriptions of these APIs say about them?
  • How are names used in these APIs?
  • How are these APIs secured?
  • Are these API descriptions accurate?
  • Which of these APIs are duplicates or are solving the same problem?
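
As a rough illustration of how the first two questions might be approached once descriptions sit in one structured collection, here is a minimal sketch. The `SPECS` listing is made-up sample data, the tuple layout and both function names are assumptions for this example, and spec size in bytes is only one possible proxy for how big an API is.

```python
from collections import Counter

# Hypothetical registry listing: (api, version, spec file, spec size in bytes).
SPECS = [
    ("petstore", "v1", "openapi.yaml", 12_000),
    ("petstore", "v2", "openapi.yaml", 18_500),
    ("bookstore", "v1", "bookstore.proto", 4_200),
]


def total_spec_bytes(specs):
    """Sum spec sizes per API, a crude measure of API size."""
    totals = Counter()
    for api, _version, _name, size in specs:
        totals[api] += size
    return dict(totals)


def version_name_counts(specs):
    """Count how many APIs use each version name."""
    # Deduplicate first so an API version with several specs counts once.
    seen = {(api, version) for api, version, _name, _size in specs}
    return dict(Counter(version for _api, version in seen))


print(total_spec_bytes(SPECS))
print(version_name_counts(SPECS))
```

The same pattern (walk the collection, derive a small summary, store or report it) would apply to the other questions, although each needs its own analysis of spec contents.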