The sheer proliferation of APIs makes it easy to build API-driven apps. But deployment is where the rubber meets the road. From then on, a lot can go wrong.
If your app sees intermittent traffic spikes or steadily goes viral, scalability becomes an important consideration. It is non-deterministic because it depends on how users perceive and interact with your app, and you have no control over that.
It is difficult to fully predict the issues at scale unless you are already operating at scale. While wisdom and experience surely help, anticipating scalability requirements is challenging. Therefore, it is vital to take measures to manage scale from the outset.
This blog post addresses some universal strategies for managing scale in API-driven apps. These are primarily architectural suggestions based on foreseeable patterns observed in a system under stress due to scale.
Rakuten RapidAPI Enterprise Hub is a complete API management suite offered by Rakuten RapidAPI. Whether you are using internal APIs or accessing third-party APIs, the Enterprise Hub lets you govern and control the access to multiple APIs for your application.
What is scalability?
Scalability challenges are always a result of a resource crunch. In APIs, a resource crunch can manifest as a lack of computing, memory, or network resources. But instead of getting into the specifics, let's look at it at a high level.
For any application, the general pattern of API-driven development follows two types of interactions:
- The interaction between the frontend and the API interface
- The interaction between the API interface and the backend
The API interface acts as the common middleware that manages these interactions. Scalability issues arise when many users start interacting with the common API backend.
In this case, the frontend application is hosted on the user's device, while the API interface and backend are deployed in a centralized location. Unless the API deployment is expanded to serve more frontend instances, it cannot handle scale.
How to Tackle Application Scalability Challenges?
Scalability challenges present themselves as architectural anomalies. They are structural issues that go beyond business logic and software modules, the sign of a system whose subsystem arrangement is rigid.
Therefore, the hallmark of a truly scalable system is its ability to expand or shrink its internal subsystem architecture as per user demand. The API interface and backend must transform themselves based on the number of hits an API receives.
To make this possible, there are a few crucial considerations while defining the internal subsystem blueprint of the application. These are in the form of three architectural strategies that enable system design to scale from the ground up.
1. Vertical Partitioning (Divide and Conquer)
This is the most commonly used technique to achieve scale. It entails splitting the system into multiple identical instances that work in parallel; the work is divided among the instances to conquer the surge in traffic.
In the case of an API-driven app, the API backend infrastructure is scaled this way to serve more users.
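The divide-and-conquer idea can be sketched with a simple round-robin dispatcher that spreads requests across a pool of identical backend instances. This is a minimal illustration, not a production load balancer; the instance names (`api-1` through `api-3`) are hypothetical.

```python
import itertools

class RoundRobinBalancer:
    """Distributes incoming requests across a pool of identical
    backend instances, one request per instance in turn."""

    def __init__(self, instances):
        self._pool = itertools.cycle(instances)

    def route(self, request):
        # Each call hands the next request to the next instance in the cycle.
        instance = next(self._pool)
        return instance, request

balancer = RoundRobinBalancer(["api-1", "api-2", "api-3"])
targets = [balancer.route(f"req-{i}")[0] for i in range(6)]
# Six requests alternate evenly across the three instances.
```

Real deployments delegate this to a load balancer or service mesh, but the principle is the same: the surge is conquered by dividing it among interchangeable workers.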
2. Horizontal Partitioning (Split and Decouple)
This idea comes from the single responsibility principle in software design: every software module should be responsible for a single part of the functionality, and these parts collectively make up the whole system. When the entire system is split this way, horizontal partitioning is achieved.
For API-driven apps, this is achieved by partitioning the system into multiple functional modules across the frontend and backend.
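At the API layer, functional partitioning often shows up as routing each request to the module that owns its functional area. The sketch below assumes two hypothetical modules, `users` and `orders`; the handler names and route prefixes are illustrative only.

```python
# Hypothetical handlers, each owning one functional area of the system.
def handle_users(path):
    return f"users module served {path}"

def handle_orders(path):
    return f"orders module served {path}"

# Each route prefix maps to exactly one module (single responsibility).
ROUTES = {
    "/users": handle_users,
    "/orders": handle_orders,
}

def dispatch(path):
    """Route a request to the module responsible for its prefix."""
    for prefix, handler in ROUTES.items():
        if path.startswith(prefix):
            return handler(path)
    raise LookupError(f"no module owns {path}")
```

Because each module stands alone, the ones under the heaviest load can be scaled independently of the rest.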
3. Segregation of Information Dissemination (Distribute and Decentralize)
Like any typical information system, APIs exist to serve information. Every API request gets a response back, which traverses a series of subsystems before being consumed by the frontend.
When the source of information is centrally located, it adds to the burden of the subsystem responsible for retrieving the data. This happens at the API backend, where the primary database, the primary source of all information, resides. A delay at the source has a domino effect on the entire downstream chain of information traversal across the system.
By establishing intermediate information dissemination points, it is possible to manage the scale of information retrieval.
Apart from easing the burden on the primary information source, this approach also speeds up the retrieval process by moving the information source closer to the information sink.
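One common form of intermediate dissemination point is a time-bounded cache that answers repeat reads locally instead of hitting the primary source every time. This is a minimal sketch; the `primary_db` function stands in for whatever the real primary data source is.

```python
import time

class TTLCache:
    """An intermediate dissemination point: serves repeat reads
    from a local copy instead of the primary source, until the
    entry's time-to-live expires."""

    def __init__(self, source, ttl_seconds=60):
        self._source = source      # callable that fetches from the primary store
        self._ttl = ttl_seconds
        self._entries = {}         # key -> (value, fetched_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry and time.monotonic() - entry[1] < self._ttl:
            return entry[0]                      # served from the cache
        value = self._source(key)                # fall through to the source
        self._entries[key] = (value, time.monotonic())
        return value

calls = []
def primary_db(key):
    calls.append(key)              # records each trip to the primary source
    return key.upper()

cache = TTLCache(primary_db, ttl_seconds=60)
cache.get("user:1")
cache.get("user:1")                # second read never reaches the primary
```

Only the first read burdens the primary source; subsequent reads are served closer to the sink.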
The Enterprise Hub supports integration with Kubernetes for scalable deployment. Additionally, it offers flexible deployment configurations for a mix of public and private cloud, along with on-premise options.
Strategies for Application-wide Scalability
The architectural suggestions described above are universal. Let’s get a bit more specific now.
Depending upon whether you are using internally developed APIs or external, third-party APIs, you may or may not have control over the entire system behind your application. Consequently, your ability to scale is limited to either the whole or a part of the system.
Considering a typical web-based, API-driven application, the architectural suggestions can be expanded to specific techniques and acceptable norms for managing scale.
You can apply these strategies by splitting the application into three logical components:
- Frontend: The user-facing component of the application, usually in the form of web apps or mobile applications.
- Middleware: The intermediary components that handle common functionality across the application and orchestrate the interaction between the frontend and backend.
- Backend: The components that reside at the server end and are responsible for business logic execution and data handling.
Scaling the Frontend
Frontend components do not offer much scope for improving scalability because they are the information sink. However, with some smart strategies, the application architecture can be optimized to reduce the burden on the API backend.
- Micro-frontend: Micro-frontend is increasingly being adopted as a preferred architectural pattern to achieve horizontal partitioning in a frontend application. As the name suggests, micro-frontend is about splitting the frontend application into a set of logically separated micro-apps. Apart from achieving modularity in the frontend design, this approach fosters a better aggregation of related frontend and backend components into isolated subsystems.
- Caching at the UI layer: Caching is the preferred way of achieving the segregation of information dissemination. By enabling a caching mechanism under the hood, the UI layer is no longer dependent on API calls for every action triggered by the user. A smart caching strategy for mobile apps also can enable offline mode in case of disruption in Internet connectivity.
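A UI-layer cache can also double as the offline fallback the caching bullet mentions: serve fresh entries without an API call, and fall back to the last known copy when the network is down. The sketch below is illustrative; `fetch_profile` stands in for a real API call, and the `online` flag simulates connectivity.

```python
import time

class UICache:
    """Client-side cache: serves fresh entries without an API call,
    and falls back to stale data when the network is unavailable."""

    def __init__(self, fetch, ttl_seconds=30):
        self._fetch = fetch        # callable that performs the real API call
        self._ttl = ttl_seconds
        self._store = {}           # key -> (value, fetched_at)

    def get(self, key, online=True):
        entry = self._store.get(key)
        fresh = entry and time.monotonic() - entry[1] < self._ttl
        if fresh or (entry and not online):
            return entry[0]        # cached (possibly stale, if offline) copy
        value = self._fetch(key)
        self._store[key] = (value, time.monotonic())
        return value

fetched = []
def fetch_profile(key):
    fetched.append(key)            # records each real "network" call
    return f"data:{key}"

cache = UICache(fetch_profile, ttl_seconds=30)
first = cache.get("profile")                   # misses cache, hits the API
offline = cache.get("profile", online=False)   # served locally, no API call
```

In a real mobile or web app the same idea is usually delegated to a library or a service worker, but the effect is identical: fewer API round trips and a usable app during connectivity disruptions.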
Scaling in Middleware
One of the critical considerations for scalability is response time. How fast the system responds and adjusts to changes is a measure of how scalable it is. Middleware components are the most susceptible to these changes because they handle the interconnections between the frontend and the backend.
Components such as API gateways, message brokers, and web servers fall into this category. They act as the conduits for channeling requests and responses.
Here are some ways in which you can scale the middleware.
- Microservices: Microservices offer a greater degree of modularization. Combined with micro-frontends, they aid in achieving end-to-end horizontal partitioning of the system. Additionally, when packaged as containers, microservices can be replicated quickly to handle surges in traffic, and containers are faster to deploy than virtual machines.
- Intelligent Message Routing: When building a geographically distributed application, the middleware must be capable of serving users from across the world. This calls for a more intelligent information dissemination mechanism that serves users from a location close to their reported geolocation. Edge-optimized routing achieves that with the help of regional points of presence (POPs): a POP located on the US west coast can serve users in that region instead of routing them to a distant POP.
- Communication Patterns: Handling scale involves the dynamic allocation of resources. In a static system, the interactions between subsystems are fixed; in a dynamically changing system, managing subsystem interactions is an additional chore. Therefore, the patterns for subsystem communication must be flexible and able to evolve. The publish/subscribe pattern is one of the most widely adopted architectural patterns: it decouples subsystems so that they can communicate without knowing about each other. Applications that are geographically distributed and meant to operate at internet scale cannot do without it.
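The decoupling that publish/subscribe provides can be seen in a few lines: publishers and subscribers share only a topic name, never a reference to each other. This is a minimal in-process sketch; real middleware (a message broker) adds persistence, delivery guarantees, and network transport. The topic name `orders.created` is illustrative.

```python
from collections import defaultdict

class Broker:
    """Minimal publish/subscribe broker: publishers and subscribers
    only know topic names, never each other."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        # Register interest in a topic; the broker holds the only reference.
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Fan the message out to every subscriber of the topic.
        for callback in self._subscribers[topic]:
            callback(message)

broker = Broker()
received = []
broker.subscribe("orders.created", received.append)
broker.publish("orders.created", {"id": 1})
```

Because neither side holds a direct reference to the other, subscribers can be added, removed, or relocated at runtime, exactly the flexibility a dynamically scaling system needs.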
Scaling the Backend
If you are building internal APIs to support your application, then scaling the backend is your responsibility.
Most scalability challenges result in a bottleneck at the backend. Think of databases, I/O operations, and network sockets. Your API backend is always dependent on one of these, and the chances are that you will face challenges as your API scales.
Irrespective of these bottlenecks, the same architectural strategies can very well be applied to tackle scalability problems.
- Replication: For the backend, replication involves keeping multiple database servers synchronized to contain the same copy of the data. Databases, being the single source of data, pose the biggest scalability threat: they must handle both the scale of incoming database operations and the volume of stored data, so a multi-pronged divide-and-conquer approach is needed. One conventional way of handling scale in databases is bifurcating read and write operations into separate database replicas, dividing the load by operation type. Configuring additional servers on top of this operational partitioning spreads the load across multiple replicas. Beyond replication, for handling massive data volumes, a more granular division is achieved by sharding the databases. Additionally, replication is essential for building a fault-tolerant system.
- Caching: As at the frontend, caching saves precious time when frequently accessing data from the database. It follows the same strategy of segregating the information dissemination points to avoid costly I/O operations on the database. Combined with replication, it offers excellent performance for database read operations, which constitute the bulk of API operations for a typical SaaS application.
- Hybrid Cloud: A hybrid cloud deployment consists of a private cloud as well as a public cloud. The private cloud is owned and managed by the organization hosting the API, while a third-party cloud service provider manages the public cloud. An application hosted over a hybrid cloud has the two cloud deployments linked to each other, yet each remains independent and unique. This offers an opportunity to employ a horizontal separation strategy: spread the application deployment across the hybrid cloud entities based on a logical grouping. For example, a set of APIs deployed for internal purposes, accessible from the intranet, can be hosted on the private cloud, while APIs for external consumption can be deployed on the public cloud.
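The read/write bifurcation described under Replication can be sketched as a store that sends writes to the primary and round-robins reads across replicas. This is a simplified illustration using plain dictionaries as stand-ins for database servers; it syncs replicas synchronously and ignores replication lag, which real systems must handle.

```python
class ReplicatedStore:
    """Routes writes to the primary and reads to replicas,
    dividing the load by operation type (a simplified sketch
    that ignores replication lag)."""

    def __init__(self, primary, replicas):
        self._primary = primary
        self._replicas = replicas
        self._next = 0             # round-robin cursor over the replicas

    def write(self, key, value):
        self._primary[key] = value
        for replica in self._replicas:
            replica[key] = value   # synchronous sync, for illustration only

    def read(self, key):
        # Each read goes to the next replica, spreading the read load.
        replica = self._replicas[self._next % len(self._replicas)]
        self._next += 1
        return replica[key]

store = ReplicatedStore(primary={}, replicas=[{}, {}])
store.write("user:1", {"name": "Ada"})
value = store.read("user:1")
```

Since read operations dominate a typical SaaS workload, adding replicas multiplies read capacity without touching the write path.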
Planning for Scale
No matter which strategies and techniques you follow, everything boils down to the specifics.
Scalability performance will ultimately depend on the individual tools you select, the precise configuration settings you provision, and the underlying application hosting infrastructure you choose.
There are plenty of tools and platforms for implementing each of these architectural strategies. You must account for the planning time, which includes evaluating the tools, deploying them, and tuning them for scalability. This is a separate topic in itself, diverging into multiple tool-specific subtopics.
As an aside, if you are planning to use Kubernetes to scale your application backend, you can check out how to apply these architectural strategies through a Kubernetes cluster to deploy microservices. Along similar lines, if you are going to use Redis as the caching layer at the backend, head over to the official Redis documentation to explore more.
So, are you ready to think at scale?
We hope this post inspires you to incorporate these strategies along with design thinking. The next time you are architecting an API-driven app, keep them in mind, and your app will surely scale the heights of success.
Rakuten RapidAPI Enterprise is a one-stop solution for CXOs to plan and administer API-led strategies for application deployment. It is also integrated with the Rakuten RapidAPI’s API marketplace which offers thousands of proven third-party APIs that can be leveraged to expedite application development.