As a rule of thumb, a cache maintained and running close to the client/caller side scales much better for read-heavy services, while a cache maintained and running close to the server/callee side is easier to keep consistent for frequently updated data. In many scenarios you may need both types of cache.
Broadly speaking, there are three cache invalidation strategies: expiry, token check (e.g. ETag), and push invalidation. Push invalidation invalidates the fastest, but it is also the most resource intensive, as both sides have to keep a connection open at all times. Expiry is simple and scales very well, but quick invalidation is tricky. Token checks sit in between: the client never serves stale data without asking, but every validation still costs a round trip to the server.
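To make the expiry strategy concrete, here is a minimal sketch of a TTL (time-to-live) cache in Python. The class name and the injectable `clock` parameter are my own illustration, not from any particular library; the point is that an expired entry is simply treated as a miss, which is why invalidation "for real" can only happen as fast as the TTL allows:

```python
import time

class TTLCache:
    """Minimal expiry-based cache: entries become invalid after ttl seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so tests can control time
        self._store = {}     # key -> (value, stored_at)

    def put(self, key, value):
        self._store[key] = (value, self.clock())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired: treat as a miss, caller refetches
            return None
        return value
```

Note there is no way to push an invalidation into this cache from the origin; if the data changes one second after `put`, clients may keep serving the stale value for up to the full TTL. That is the scaling-vs-freshness tradeoff in a nutshell.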
In most cases, the server should be in control of caching, even when using a client-side cache. It can do this by explicitly advising the client, on each request and response, what can be cached, for how long, and which invalidation mechanism to use. In HTTP, this is usually done with the Expires and Cache-Control headers (plus a couple of other headers that refine the exact caching semantics), with the ETag/If-Match/If-None-Match headers, or by inviting the client to upgrade to a WebSocket for push invalidation. HTTP provides a very rich vocabulary for the server to describe the caching requirements of a resource to clients and any intermediary servers. As the other answer mentioned, if you use HTTP, you'll benefit from the widespread support of off-the-shelf caching components.
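A sketch of the server side of this negotiation, framework-free for clarity (the function and header values are illustrative assumptions, not a real library's API). The server attaches an ETag and a Cache-Control max-age to every response; when the client revalidates with If-None-Match and the tag still matches, the server answers 304 Not Modified and skips the body:

```python
import hashlib

def make_etag(body):
    # A strong ETag derived from the representation bytes (one common choice).
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]

def respond(body, max_age, if_none_match=None):
    """Return (status, headers, body) for a conditional GET.

    The client may reuse the response for max_age seconds without asking;
    after that it revalidates by sending the ETag back in If-None-Match.
    """
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Cache-Control": "max-age=%d" % max_age,
    }
    if if_none_match == etag:
        return 304, headers, b""   # client's copy is still valid; no body sent
    return 200, headers, body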
Another option is for the server to document how long clients are allowed to cache and how to invalidate, with all clients simply assuming that. This may be OK if you control all endpoints and can coordinate a cache invalidation, but in very large networks with a diverse set of clients and intermediaries it can be tricky to do correctly, so you may want to consider negotiating caching explicitly anyway. The benefit of implicit caching is a cleaner communication protocol, since no bytes are wasted negotiating caches on every exchange; if your microservice network exchanges a large number of small packets, where that negotiation overhead is problematic, implicitly agreed caching may be the only sensible way to go.