
I have 6 web servers that are giving me problems due to cache inconsistency. I am thinking of building a cache invalidation service: there would be a topic on which all the servers can publish a message to invalidate an object. I am considering using Amazon SNS to create the topic.
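For context, the publishing side would look roughly like this; the topic ARN, region, and message format below are placeholders, not my real values:

```python
# Sketch of the publishing side: any of the 6 servers calls this when an
# object changes. Topic ARN, region, and message shape are placeholders.
import json
import boto3

sns = boto3.client("sns", region_name="us-east-1")

def publish_invalidation(object_key):
    """Ask every subscriber to drop `object_key` from its local cache."""
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:cache-invalidation",
        Message=json.dumps({"action": "invalidate", "key": object_key}),
    )
```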

Now, for the servers to receive the invalidation messages, I am weighing the following options:

  1. Should I use SQS queues for the servers to receive the messages? (A rough sketch of this option follows the list.)
  2. Should I use HTTP endpoints, and build an API on that route that invalidates the cache?
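If I went with option 1, I imagine each server would run a polling loop roughly like this; the queue URL is a placeholder, and each server would have its own queue subscribed to the topic:

```python
# Sketch of option 1: each server long-polls its own SQS queue (subscribed
# to the SNS topic) and drops invalidated objects from its local cache.
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/server-1-invalidations"

def poll_invalidations(local_cache):
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for msg in resp.get("Messages", []):
        envelope = json.loads(msg["Body"])         # SNS wraps the payload in an envelope
        payload = json.loads(envelope["Message"])  # the message published to the topic
        local_cache.pop(payload["key"], None)      # drop the stale object
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```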

Could you please highlight the pros and cons of both of these approaches, or suggest any other approach that might benefit me?

----------UPDATE---------

I use NGINX to route incoming requests to one of the 6 servers. If I use the HTTP endpoint approach, the topic will end up hitting only one of the servers per message. I am also not sure which port my application will be running on. Could you suggest a way to bypass the NGINX server, or to find out the port on the fly and hit the servers directly?
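One idea I had: if each server listened on a fixed, known secondary port just for invalidations, and each instance subscribed its own address to the topic, NGINX would be out of the loop entirely. A rough sketch of what that endpoint might look like (Flask and port 8081 are just placeholders):

```python
# Sketch of option 2 with NGINX bypassed: each server runs a small HTTP
# listener on a dedicated port and subscribes its own instance address to
# the topic. Flask and the port number are placeholders.
import json
import urllib.request
from flask import Flask, request

app = Flask(__name__)
local_cache = {}  # stands in for the real in-memory cache

@app.route("/invalidate", methods=["POST"])
def invalidate():
    body = json.loads(request.data)
    if body.get("Type") == "SubscriptionConfirmation":
        # SNS sends this once per subscription; fetching SubscribeURL confirms it.
        urllib.request.urlopen(body["SubscribeURL"])
    elif body.get("Type") == "Notification":
        payload = json.loads(body["Message"])
        local_cache.pop(payload["key"], None)
    return "", 204

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8081)  # fixed port, subscribed directly to SNS
```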

tanvi
  • You have 6 servers each using their own cache, but the cached data is actually common across all the servers? Is that the situation? – Andy Jun 17 '16 at 07:53
  • The cached data is not common. I need an invalidation service to invalidate the objects that are different across the servers. – tanvi Jun 17 '16 at 08:09
  • Wouldn't Amazon ElastiCache help? I mean, why would you have your own caching mechanism that relies on Amazon anyway (if you use SNS), when you can get the benefits of centralized caching directly from Amazon with no additional effort? – Arseni Mourzenko Jun 17 '16 at 08:40
  • 1
    If the data is not common, shouldn't then each server be responsible for invalidating its own data, considering the data does not belong to any other server? For example if an update is requested, current cache is invalidated and the new (updated) value is cached. – Andy Jun 17 '16 at 08:49
  • @DavidPacker I misunderstood what you meant by common. The data across the servers is common and I want it to be consistent. – tanvi Jun 17 '16 at 09:11
  • @MainMa I am using the redis implementation of elasticache. But I still face problems because I also use the objects cached in the RAM of the server. – tanvi Jun 17 '16 at 09:13
  • @tanvi: why would you do that? Instead of having two caching mechanisms—Redis and in-memory cache—use only one, i.e. Redis only. Unless you have identified, through profiling, that Redis is actually a bottleneck, you should assume that Redis is fast enough. – Arseni Mourzenko Jun 17 '16 at 09:18
  • If you really want a shared cache among all nodes (application servers) and to keep it consistent, you have no choice but to build a cache server: run a caching mechanism like Redis (which you mentioned) on it and have all your nodes point to it as their cache source. If you find that the caching server is taking a heavy load from all 6 nodes, you will need another balancer and a network of caching servers behind it. Anything outside this cache server (i.e. in memory) cannot then be considered common cache and cannot be treated that way. – Andy Jun 17 '16 at 09:18

1 Answer


According to the comments, you have a caching system which relies on:

  • A common Redis cache service used by all the servers,

  • An in-memory cache on every server that duplicates some of the Redis data. The information stored in this cache is subject to inconsistency, and you are looking for a way to prevent that inconsistency.

First and foremost, you probably need only the Redis cache. If configured correctly, it is fast enough to serve all six servers. If you are only guessing that an additional in-memory cache makes things faster, you are doing premature optimization. Instead, get rid of the in-memory cache and reconsider it only if and when a profiler identifies Redis as a bottleneck.
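To make this concrete, here is a minimal sketch of a read-through pattern against Redis alone, using redis-py; the host name, key scheme, and TTL are assumptions:

```python
# Minimal read-through cache against Redis only: every server goes through
# Redis, so there is no per-server copy left to invalidate. Host, key scheme,
# and TTL are assumptions.
import json
import redis

r = redis.Redis(host="cache.example.internal", port=6379)

def get_profile(user_id, loader):
    key = "user:%d:profile" % user_id
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    value = loader(user_id)                # fall back to the database
    r.set(key, json.dumps(value), ex=300)  # cache for five minutes
    return value
```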

If you have already profiled your app and identified Redis as the bottleneck, you have several options:

  • You may be able to change the Redis configuration, modify the network, or replace the hardware, depending on the actual issue you are encountering.

  • You may run Redis on several servers and let existing products deal with concerns such as failover, distributed cache invalidation, etc.

  • You may let others, such as Amazon, handle caching for you. This gives you huge flexibility in terms of the scale you need at any given moment.

  • You may finally use memcached or a similar distributed caching solution installed on each of the six servers (a sketch follows this list). Make sure you use an existing product rather than reinventing your own, since building one is very tricky; cache invalidation is one of the hard parts, as you have noticed.
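For that last option, a minimal sketch using pymemcache's consistent-hashing client; the library choice and the addresses of the six servers are assumptions:

```python
# Sketch of memcached running on each of the six servers, with client-side
# consistent hashing spreading the keys across them. Addresses are placeholders.
from pymemcache.client.hash import HashClient

client = HashClient([
    ("10.0.0.1", 11211), ("10.0.0.2", 11211), ("10.0.0.3", 11211),
    ("10.0.0.4", 11211), ("10.0.0.5", 11211), ("10.0.0.6", 11211),
])

client.set("user:42:profile", b"serialized profile", expire=300)
value = client.get("user:42:profile")
```

Since each key lives on exactly one node, there is no per-node copy to keep consistent, which is exactly the problem you are currently trying to solve by hand.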

Arseni Mourzenko
  • Thanks a lot @MainMa. You understood the problem I am facing even though I probably wasn't able to express it correctly. I will do some more research and get back. – tanvi Jun 20 '16 at 08:06