As a rule of thumb, a cache maintained and running close to the client/caller side scales much better for read-heavy services, while a cache maintained and running close to the server/callee side is easier to keep consistent for frequently updated data. In many scenarios you may need both types of cache.
Broadly speaking, there are three cache invalidation strategies: expiry, token check (e.g. ETag), and push invalidation. Push invalidation invalidates the fastest, but it is also the most resource intensive, as both sides have to keep a connection open at all times. Expiry is simple and scales very well, but quick invalidation is tricky. Token checks sit in between: the client never serves stale data without asking, but every validation still costs a round trip to the server.
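To make the expiry strategy concrete, here is a minimal sketch of a TTL (time-to-live) cache in Python. The class name and the injectable `clock` parameter are my own illustration, not from any particular library; the point is that an expired entry is simply treated as a miss, which is why invalidation "for real" can only happen as fast as the TTL allows:

```python
import time

class TTLCache:
    """Minimal expiry-based cache: entries become invalid after ttl seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so tests can control time
        self._store = {}     # key -> (value, stored_at)

    def put(self, key, value):
        self._store[key] = (value, self.clock())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired: treat as a miss, caller refetches
            return None
        return value
```

Note there is no way to push an invalidation into this cache from the origin; if the data changes one second after `put`, clients may keep serving the stale value for up to the full TTL. That is the scaling-vs-freshness tradeoff in a nutshell.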
In most cases, the server should be in control of caching, even when using a client-side cache. It can do this by explicitly advising the client, on each request and response, what can be cached, for how long, and which invalidation mechanism to use. In HTTP, this is usually done with the Expires and Cache-Control headers (plus a couple of other headers that refine the exact caching semantics), with the ETag/If-Match/If-None-Match headers, or by inviting the client to upgrade to a WebSocket for push invalidation. HTTP provides a very rich vocabulary for the server to describe the caching requirements of a resource to clients and any intermediary servers. As the other answer mentioned, if you use HTTP, you'll benefit from the widespread support of off-the-shelf caching components.
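A sketch of the server side of this negotiation, framework-free for clarity (the function and header values are illustrative assumptions, not a real library's API). The server attaches an ETag and a Cache-Control max-age to every response; when the client revalidates with If-None-Match and the tag still matches, the server answers 304 Not Modified and skips the body:

```python
import hashlib

def make_etag(body):
    # A strong ETag derived from the representation bytes (one common choice).
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]

def respond(body, max_age, if_none_match=None):
    """Return (status, headers, body) for a conditional GET.

    The client may reuse the response for max_age seconds without asking;
    after that it revalidates by sending the ETag back in If-None-Match.
    """
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Cache-Control": "max-age=%d" % max_age,
    }
    if if_none_match == etag:
        return 304, headers, b""   # client's copy is still valid; no body sent
    return 200, headers, body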
Another option is for the server to document how long clients are allowed to cache and how to invalidate, with all clients simply assuming that. This may be OK if you control all endpoints and can coordinate a cache invalidation, but in very large networks with a diverse set of clients and intermediaries it can be tricky to do correctly, so you may want to consider negotiating caching explicitly anyway. The benefit of implicit caching is a cleaner communication protocol, since no bytes are wasted negotiating caches on every exchange; if your microservice network exchanges a large number of small packets, where that negotiation overhead is problematic, implicitly agreed caching may be the only sensible way to go.