A few rules of thumb for speeding up onload API response times, and some caveats.


Performance tuning is one of my favorite software engineering activities, but it is filled with wrong turns and potholes for the unwary. Also, in my opinion, benchmarking a system the right way doesn’t seem to be the average engineer’s strong suit.

Anyway, below are a few rules of thumb for improving onload API response times. Since software engineering is all about tradeoffs, please take these suggestions with a grain of salt and always validate them in your own environment. In short, YMMV¹.

Enough rambling! Here we go:

  1. If two or more APIs are independent of each other, initiate those calls asynchronously from the browser.


    Modern browsers already load static assets in parallel, and async has become the de facto mode for making Ajax calls. But this also means the web API tier is going to be hammered with more calls per second than before. For directly exposed APIs, one can address this with rate limits. For internal APIs, we rate limit where it makes sense and also run performance tests to make sure the system can handle the increased load plus a buffer. But even if the web layer can handle all that load, its downstream systems could still buckle under the heavy load and cascade that failure back to the web layer. In this case, the recommended strategy is to use a Circuit Breaker² at the web layer so that it can protect itself from downstream failures.
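
To make the Circuit Breaker idea concrete, here is a minimal sketch in Java. The class name `SimpleCircuitBreaker`, the failure threshold, and the cool-down period are all assumptions made for illustration; a production system would more likely rely on an established resilience library than on hand-rolled code like this.

```java
import java.util.function.Supplier;

// Hypothetical minimal circuit breaker; names and thresholds are illustrative only.
// The synchronized methods keep the sketch simple but serialize all calls,
// which a real implementation would avoid.
final class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long calls are rejected once it is open
    private int consecutiveFailures = 0;
    private long openedAt = 0L;
    private boolean open = false;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Runs the downstream call unless the circuit is open; uses the fallback on failure.
    synchronized <T> T call(Supplier<T> downstream, Supplier<T> fallback) {
        if (open && System.currentTimeMillis() - openedAt < openMillis) {
            return fallback.get(); // fail fast without touching the downstream system
        }
        try {
            T result = downstream.get(); // circuit closed, or a trial call after the cool-down
            consecutiveFailures = 0;
            open = false;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    synchronized boolean isOpen() {
        return open;
    }
}
```

The web layer would wrap each downstream call in `call(...)` and pass a fallback such as a cached or default response, so that a struggling downstream system is given time to recover instead of being hit with every incoming request.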

  2. If two or more APIs depend on each other, so that one call’s response becomes another call’s request, make a single call from the UI to a facade (a simplified API interface) and let the web tier orchestrate the dependent calls before returning the response.


    Now the UI layer doesn’t have to worry about the messy details of orchestrating multiple API calls, at the expense of complicating the web API layer. The web API layer would have to spawn a thread for each API to be called and wait for all the responses before sending its own response back to the UI layer. Using threads implies using thread pools to make sure we are not overusing OS resources. The web layer must also protect itself from cascading downstream failures using a Circuit Breaker, but this time any of the multiple threads could fail. So now we also have to handle extra exception types such as InterruptedException.
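
A rough Java sketch of such a facade might look like the following. Everything named here (`PageFacade`, the three downstream functions, the pool size of 4, the `|`-joined response) is hypothetical and exists only to keep the example self-contained. It assumes one call must finish first because the other two need its result.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical facade; the downstream calls are injected as functions for illustration.
final class PageFacade {
    private final ExecutorService pool = Executors.newFixedThreadPool(4); // bounded pool
    private final Function<String, String> fetchUserId; // session token -> user id
    private final Function<String, String> fetchOrders; // user id -> orders
    private final Function<String, String> fetchOffers; // user id -> offers

    PageFacade(Function<String, String> fetchUserId,
               Function<String, String> fetchOrders,
               Function<String, String> fetchOffers) {
        this.fetchUserId = fetchUserId;
        this.fetchOrders = fetchOrders;
        this.fetchOffers = fetchOffers;
    }

    String loadPage(String sessionToken, long timeoutMillis) {
        // This call must finish first because the other two need its response.
        String userId = fetchUserId.apply(sessionToken);
        Future<String> orders = pool.submit(() -> fetchOrders.apply(userId));
        Future<String> offers = pool.submit(() -> fetchOffers.apply(userId));
        try {
            return userId
                    + "|" + orders.get(timeoutMillis, TimeUnit.MILLISECONDS)
                    + "|" + offers.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag for callers
            orders.cancel(true);
            offers.cancel(true);
            throw new IllegalStateException("interrupted while loading page", e);
        } catch (ExecutionException | TimeoutException e) {
            orders.cancel(true);
            offers.cancel(true);
            throw new IllegalStateException("downstream call failed or timed out", e);
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

The UI makes one request, which the web tier serves with a single `loadPage(...)` call. The fixed-size pool and the per-call timeout are the two knobs that keep a slow downstream system from tying up every thread.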

  3. Preload the page’s data in a cache before the user lands on the page.


    The downside is that the web API layer now depends on an extra caching layer, which introduces its own complexity. The web API layer calls the caching layer, which in turn calls the downstream system when the data is not in the cache and populates the cache before returning the data to the web API layer. Even though this is a common pattern, we have to recognize and handle the new failure scenarios that this technique exposes.
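
The flow described above could be sketched in Java roughly as follows. This is an in-process stand-in with made-up names (`ReadThroughCache`, `preload`, a time-to-live in milliseconds); a real setup would more likely put an external cache behind the same shape. Serving stale data when the downstream call fails is one possible policy, assumed here only to show where a new failure scenario has to be decided.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical in-process read-through cache with a time-to-live.
final class ReadThroughCache {
    private static final class Entry {
        final String value;
        final long loadedAt;

        Entry(String value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, String> downstream; // key -> fresh value
    private final long ttlMillis;

    ReadThroughCache(Function<String, String> downstream, long ttlMillis) {
        this.downstream = downstream;
        this.ttlMillis = ttlMillis;
    }

    // Warm the cache before the user lands on the page.
    void preload(String key) {
        entries.put(key, new Entry(downstream.apply(key), System.currentTimeMillis()));
    }

    // Serve from the cache while the entry is fresh; otherwise load and populate.
    String get(String key) {
        long now = System.currentTimeMillis();
        Entry cached = entries.get(key);
        if (cached != null && now - cached.loadedAt < ttlMillis) {
            return cached.value;
        }
        try {
            String fresh = downstream.apply(key);
            entries.put(key, new Entry(fresh, now));
            return fresh;
        } catch (RuntimeException e) {
            if (cached != null) {
                return cached.value; // assumed policy: stale data beats an error page
            }
            throw e; // nothing cached, so the failure reaches the caller
        }
    }
}
```

Calling `preload("home")` ahead of time means the later `get("home")` never reaches the downstream system as long as the entry is within its time-to-live.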


Performance tuning is rewarding work if you know what you are doing. Otherwise, it is software engineering’s equivalent of shooting yourself in the foot.


¹ Your Mileage May Vary

² Circuit Breaker