How Upwork Effectively Manages A Complex API Infrastructure
How has Upwork created and maintained a complex public API (application programming interface) infrastructure that provides data and services to a number of partners, search engines, third-party applications, and even organizations like Oxford and MIT?
In this article from Upwork Engineering, written by product architect and software engineer Maksym Novozhylov, hear firsthand how this endeavor has not only solidified Upwork’s reputation for top-notch engineering but also turned our data and resources into valuable assets we can safely and reliably share with the world.
Read on to learn about some common issues the Upwork Engineering team encounters while supporting the platform’s API infrastructure and some of the potential problems faced in light of the platform’s recent modernization.
From static to dynamic
Back in early 2008, when Upwork was still oDesk, the site consisted only of static pages, with none of the dynamic interactivity made possible by technologies like AJAX calls. At that time, the engineering team decided we were ready to experiment and start applying new technologies.
Our first API was developed as an experimental project designed to support AJAX calls for internal use. To build the back-end support for the new WorkDiary and Message Center, a couple of engineers embarked on a major rebuild, paying close attention to best practices for RESTful applications.
Common issues faced
Nowadays, that API is one of the core resources used by the Upwork site, Time Tracker, numerous third-party tools, and client applications. All of these require stable and robust API services with clear, complete, and up-to-date documentation, so we are continuously modernizing it. Some of the issues we deal with in this renewal process include:
- Synchronizing changes between teams
- Sharing the internal API with third-party tools for integration
- Monitoring security of the API
- Modernizing the API without breaking backwards compatibility
- Keeping documentation up to date for users despite having tons of changes
Let’s dig a little deeper into the details of each of these issues.
Syncing changes between teams
To organize which assets are shared with whom, we divided our API into three groups:
- Public API: The open resources and Upwork services described in the Upwork API Documentation that can be accessed by third-party tools and integrated into client processes. A note on this: Upwork also uses some of these assets internally, for historical reasons.
- Restricted API (a.k.a. the corporate API): These are resources limited to internal use by Upwork services.
- Enterprise API: A new set of public resources that provide extended functionality and integrations to Upwork’s enterprise clients, which at the time of this article is still under development.
Sharing the API and prepping third-party users for changes
Altogether, that can add up to hundreds of integrations, developed by different organizations, that access Upwork’s API. While the corporate API can be easily updated at any time according to the needs of the company, the process gets a bit more complicated when it comes to the public APIs.
Because you never want to break someone else’s API or cause any downtime for their services, here are some rules we follow to make changes to the public API happen as quickly and smoothly as possible.
- If we need to deprecate or modify any API, we analyze the traffic using access logs for N days/weeks using the powerful Kibana interface. This allows us to quickly find all active consumers of the API and contact them directly about the changes.
- We announce any changes before they happen. This gives API consumers some lead time to prepare their software for the upcoming changes.
- We prepare new versions of public libraries for seven different programming languages.
- We use as many integration tests as possible to cover API responses.
- We release a day later than announced. This is another insurance policy against breaking a client’s software. In other words, we try to stay on the safe side: we estimate all possible consequences that a change may cause, then give clients plenty of time to prepare.
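The first rule above, finding the active consumers of an API before changing it, is done in Kibana at Upwork. Outside of Kibana, the same idea can be illustrated with a minimal sketch. It assumes each access-log line is a JSON object with hypothetical `consumer_key` and `path` fields; the real log format and the endpoint paths shown here are not described in this article.

```python
import json
from collections import Counter


def active_consumers(log_lines, endpoint_prefix):
    """Count requests per consumer for endpoints under `endpoint_prefix`.

    Assumption: every log line is a JSON object with "consumer_key" and
    "path" fields. Both field names are hypothetical.
    """
    counts = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        key = entry.get("consumer_key")
        path = str(entry.get("path", ""))
        if key and path.startswith(endpoint_prefix):
            counts[key] += 1
    # Most active consumers first, so they can be contacted first.
    return counts.most_common()


# Hypothetical sample data for illustration only.
sample_log = [
    '{"consumer_key": "app-a", "path": "/api/v1/jobs"}',
    '{"consumer_key": "app-b", "path": "/api/v1/jobs/42"}',
    '{"consumer_key": "app-a", "path": "/api/v1/jobs"}',
    '{"consumer_key": "app-c", "path": "/api/v2/users"}',
    "not a json line",
]

print(active_consumers(sample_log, "/api/v1/jobs"))
# [('app-a', 2), ('app-b', 1)]
```

The resulting list gives a starting point for deciding who needs a direct heads-up before a deprecation.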
Monitoring API security
When you’re sharing something with the world, beware: There’s always going to be someone out there who will play with your tool, experiment with it, and/or try to hack it. With that in mind, we always apply the strongest security policies to our API infrastructure to protect our assets. We work with an external team that specializes in security, and any reports submitted regarding possible security issues are sent to us following the CWE Dictionary and given the highest priority in our JIRA task management software.
Looking forward: Modernizing with less risk
If you missed our two previous engineering articles, “Upwork Modernization: An Overview” and “Modernizing Upwork with Micro Frontends”, definitely take a look to get up to speed on what we’ve been up to with modernizing our platform. As far as the APIs go, we are in the active phase of the modernization, as well. You can see some of what we have achieved by visiting the new API Center, based on the modernized micro-frontend architecture.
To group the APIs for the next iteration of the platform, we built an Agora service from scratch, called “API Gateway”*, that lets us quickly handle the following tasks:
- Load balancing
- Logging and metrics
- User authentication and authorization
- Resource access control
- Rate limiting
- Service discovery and routing
- Availability and scalability
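As an illustration of one item on this list, rate limiting, here is a minimal token-bucket sketch. This article does not describe how the API Gateway actually enforces limits; the token-bucket technique, the class names, and the per-consumer keying are all assumptions chosen for illustration.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def allow(self):
        now = self.clock()
        elapsed = now - self.updated_at
        self.updated_at = now
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keep one bucket per consumer key (a hypothetical identifier)."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._buckets = {}

    def allow(self, consumer_key):
        bucket = self._buckets.get(consumer_key)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._clock)
            self._buckets[consumer_key] = bucket
        return bucket.allow()


# Usage with a fake clock so the behavior is deterministic.
fake_now = [0.0]
limiter = RateLimiter(capacity=2, refill_per_second=1.0,
                      clock=lambda: fake_now[0])
results = [limiter.allow("app-a") for _ in range(3)]
print(results)  # [True, True, False]
```

Injecting the clock keeps the sketch testable; a production gateway would also need shared state across instances, which is out of scope here.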
In the following diagram, you can really see the benefits of the flexibility we’ve achieved after migrating from the legacy software to the new, modernized infrastructure.
Our “API Gateway” Agora service is a large and complex topic on its own that probably deserves a separate article. Stay tuned!
*“API Gateway” is an Agora service that is deployed like any other Agora service. It has different addresses that can be found in Eureka. This is why we need a separate “proxy” layer with a static DNS name, Prana.
Sanitization helps make complex documentation easier
When you have so many API resources like Upwork does, many of which are shared with other organizations who don’t have visibility into all of the technical details of your infrastructure, you have to be sure you’re giving them clear and precise documentation. That documentation should describe the many parameters of your API and the meaning of each returned field.
Collecting the changes and keeping the documentation up to date is a real challenge unless you have a good solution for it.
After struggling with this problem for a while, we decided that we didn’t need a person or even a team to update the documentation manually: it could be automated. Any changes developers make to the API could be automatically published in the documentation. To automate this process we did the following:
- We created a skeleton using Sphinx, which contains the static API information.
- We added pre-build output filters that are encapsulated in the documentation within the build step.
- We created an endpoint that communicates with Sphinx using JSON, bringing in the latest changes to the documentation. We actually took the Swagger format and adapted JSON to our specific needs.
Below is an excerpt from a JSON file for a public Authenticated User API.
"summary": "Get authenticated user info.",
"notes": "This API call returns detailed information about the currently authenticated user.",
"scope": "Access your basic info",
"description": "Response format.",
"enum": ["json", "xml"],
While JSON files describe the API and its parameters, output filters do a couple of other jobs:
- They control what we expose externally, limited only to whitelisted fields. These filters are big arrays and it would be very stressful on the system to apply them on the fly. Thus, specific software translates them into the native programming language and encapsulates them into the code using a closure technique.
- Other software analyzes the filters and collects the list of unique fields, while respecting their place in the structure (we use a dot symbol to mark levels, e.g. “milestones.id” means that “milestones” contains “id”). Then it transforms the filters into the list of “Possible output fields” (a section with the same name is available in the documentation for an API resource).
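Both jobs of the output filters can be sketched in a few lines. This is a simplified illustration, not Upwork's actual tooling: it assumes a filter is a nested dictionary whose values are either `True` (keep the field) or another dictionary (keep only the listed sub-fields), and the filter contents and field names other than “milestones.id” are hypothetical.

```python
def compile_filter(allowed):
    """Turn a nested whitelist into a closure that strips other fields.

    Child filters are compiled once up front, so applying the filter to a
    response does not re-interpret the whitelist on every call.
    """
    children = {name: compile_filter(rule)
                for name, rule in allowed.items()
                if isinstance(rule, dict)}

    def apply(payload):
        if isinstance(payload, list):
            return [apply(item) for item in payload]
        if not isinstance(payload, dict):
            return payload
        result = {}
        for name in allowed:
            if name in payload:
                value = payload[name]
                result[name] = (children[name](value)
                                if name in children else value)
        return result

    return apply


def possible_output_fields(allowed, prefix=""):
    """List unique leaf fields, using dots to mark nesting levels."""
    fields = []
    for name, rule in allowed.items():
        path = prefix + name
        if isinstance(rule, dict):
            fields.extend(possible_output_fields(rule, path + "."))
        else:
            fields.append(path)
    return sorted(set(fields))


# Hypothetical filter and raw response for illustration only.
EXAMPLE_FILTER = {
    "id": True,
    "title": True,
    "milestones": {"id": True, "status": True},
}
raw = {
    "id": 7,
    "title": "Logo design",
    "internal_margin": 0.2,
    "milestones": [{"id": 1, "status": "paid", "audit_note": "x"}],
}

apply_filter = compile_filter(EXAMPLE_FILTER)
print(apply_filter(raw))
# {'id': 7, 'title': 'Logo design', 'milestones': [{'id': 1, 'status': 'paid'}]}
print(possible_output_fields(EXAMPLE_FILTER))
# ['id', 'milestones.id', 'milestones.status', 'title']
```

Because the whitelist drives both the response filtering and the generated field list, the documentation cannot drift from what the API actually returns.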
Needless to say, Upwork has a huge infrastructure and any poorly planned change can cause major stress to the system. On the other hand, to succeed we need to be constantly and persistently moving toward new and better goals. To accomplish this while minimizing risk, we’ve worked to provide a tool for integrating with the system that doubles the responsibility. At Upwork, we do our best to provide robust and modern software to consumers, and we are excited to change the way the world works!
Maksym Novozhylov started as a developer with oDesk, continued as a senior engineer with Elance-oDesk, and continues to work with Upwork as a product architect.