Shared Cache - Invalidation Best Practice

Question

I'd like to know what would be a better approach to invalidate/update cache objects.

Prerequisites

A remote memcached server (serving as a cache for multiple applications)
All servers are hosted on Azure (affinity regions, same data centers)
Cache object sizes range from 200 bytes up to 50 kilobytes

Approach 1 (store in cache asap)

Object A is created -> store in database and store in cache
Object A requested by client -> check cache for existence, otherwise fetch from database and store in cache
Object A gets updated -> store in database, store in cache

Approach 1 seems to be more straightforward: if something is created, put it in the cache as soon as possible, regardless of whether anyone will need it.
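To make the three steps of Approach 1 concrete, here is a minimal sketch. The `database` and `cache` dictionaries are hypothetical in-memory stand-ins for the real database and the remote memcached server, and the function names are made up for illustration.

```python
from typing import Dict, Optional

# Hypothetical stand-ins: a real setup would use a database client and a
# memcached client instead of plain dictionaries.
database: Dict[str, str] = {}
cache: Dict[str, str] = {}


def create(key: str, value: str) -> None:
    """Object is created: store in database and store in cache."""
    database[key] = value
    cache[key] = value


def read(key: str) -> Optional[str]:
    """Object is requested: check the cache, otherwise fetch from the database."""
    value = cache.get(key)
    if value is None:
        value = database.get(key)
        if value is not None:
            cache[key] = value  # populate the cache after a miss
    return value


def update(key: str, value: str) -> None:
    """Object is updated: store in database, then overwrite the cache entry."""
    database[key] = value
    cache[key] = value


create("a", "v1")
update("a", "v2")
latest = read("a")  # served from the cache without touching the database
```

With this shape, every create and update costs one extra cache write, whether or not the object is ever read.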

Approach 2 (lazy cache store)

Object A is created -> store in database
Object A requested by client -> check cache for existence, otherwise fetch from database and store in cache
Object A gets updated -> store in database, delete key in cache

Approach 2 seems to be more memory-aware. In this approach only requested items go to cache.
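Approach 2 could be sketched along the same lines. Again, the dictionaries are assumed stand-ins for the database and memcached, and the class name `LazyCachedStore` is hypothetical.

```python
from typing import Dict, Optional


class LazyCachedStore:
    """Approach 2, with dictionaries standing in for the database and memcached."""

    def __init__(self) -> None:
        self.database: Dict[str, str] = {}
        self.cache: Dict[str, str] = {}

    def create(self, key: str, value: str) -> None:
        # Object is created: store in database only, the cache is untouched.
        self.database[key] = value

    def read(self, key: str) -> Optional[str]:
        # Object is requested: check the cache, otherwise fetch and store.
        value = self.cache.get(key)
        if value is None:
            value = self.database.get(key)
            if value is not None:
                self.cache[key] = value
        return value

    def update(self, key: str, value: str) -> None:
        # Object is updated: store in database, delete the key in the cache.
        self.database[key] = value
        self.cache.pop(key, None)


store = LazyCachedStore()
store.create("a", "v1")
cached_after_create = "a" in store.cache   # nothing cached until a read
first_read = store.read("a")               # miss, then cached
store.update("a", "v2")
cached_after_update = "a" in store.cache   # entry was invalidated
```

The next read after an update always pays for a database round trip, which is the price for never caching an object nobody asked for.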

Question 1: In terms of performance, which approach would be better? Memory and CPU usage do not matter yet.

Question 2: Are my thoughts a kind of premature optimization?

Question 3: Any other thoughts? Other approaches?

score 12 · Answer 1 · answered Jan 14 '13 at 11:48

Question 1 is unanswerable, except to say that it depends. A lot of factors determine which approach is best in your case, e.g.: Is it normal for created objects to be retrieved shortly after they are created? What's the ratio of updates to accesses?
On question 2, regarding the decision that you need a cache: if you're optimising without data then yes, it's technically premature optimisation. I say technically because experience or conventional wisdom may tell you that you're going to need a cache of some sort. Regarding the decision of how the cache will best work: yes, it's definitely premature optimisation.
- Optimisation often isn't about finding the best/most optimal solution. It should go as follows:
  1. Find the bottlenecks in the system.
  2. Find where you can make the biggest difference with the least amount of work.
  3. Do the least amount of work!
  4. Is it fast enough yet? If not, go to #1.
  5. Done!
- Honestly, neither of the approaches you describe sounds complicated. Why not implement both and see which works best?
- Step 3 in approach #2 could be changed to "Object A gets updated -> store in database, update entry in cache".
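One cheap way to "implement both and see" is to replay the same sequence of operations against each strategy and count cache hits, misses and writes. The sketch below does that with in-memory dictionaries; the `simulate` function and the tiny workload are hypothetical, and it counts operations only, so it says nothing about real network latency to memcached.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str]  # (action, key), action is "create", "update" or "read"


def simulate(ops: List[Op], eager: bool) -> Dict[str, int]:
    """Replay ops; eager=True is Approach 1, eager=False is Approach 2."""
    database: Dict[str, int] = {}
    cache: Dict[str, int] = {}
    stats = {"hits": 0, "misses": 0, "cache_writes": 0}
    version = 0
    for action, key in ops:
        if action in ("create", "update"):
            version += 1
            database[key] = version
            if eager:
                cache[key] = version          # store in cache right away
                stats["cache_writes"] += 1
            else:
                cache.pop(key, None)          # only invalidate
        elif action == "read":
            if key in cache:
                stats["hits"] += 1
            else:
                stats["misses"] += 1
                cache[key] = database[key]    # assumes the key was created
                stats["cache_writes"] += 1
    return stats


workload: List[Op] = [
    ("create", "a"), ("create", "b"),
    ("read", "a"), ("read", "a"),
    ("update", "a"), ("read", "a"),
]
eager_stats = simulate(workload, eager=True)
lazy_stats = simulate(workload, eager=False)
print(eager_stats)  # {'hits': 3, 'misses': 0, 'cache_writes': 3}
print(lazy_stats)   # {'hits': 1, 'misses': 2, 'cache_writes': 2}
```

Replaying a recorded production workload instead of this toy list would give numbers that reflect the actual ratio of updates to accesses.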

Baqueta, thanks for your answer. I do appreciate it. – lurkerbelow Jan 20 '13 at 07:38
@lurkerbelow Glad to help. – vaughandroid Jan 21 '13 at 09:29

score 2 · Answer 2 · answered Jan 14 '13 at 13:03

memcached manages objects with its own policy: a cached object is dropped when it expires, or evicted when nobody has accessed it recently and memcached runs out of memory. Therefore your first approach is not a good idea, because objects in memcached would keep being evicted under memory pressure as you create new objects.
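The eviction concern can be illustrated with a deliberately simplified model. `TinyLRUCache` below is a hypothetical least-recently-used cache with a fixed number of slots; it is not how memcached is implemented internally, only a sketch of the effect described above.

```python
from collections import OrderedDict
from typing import Optional


class TinyLRUCache:
    """Simplified LRU cache with a fixed number of slots (not memcached itself)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)         # mark as recently used
        return self.entries[key]

    def set(self, key: str, value: str) -> None:
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used


cache = TinyLRUCache(capacity=3)
cache.set("hot", "frequently read object")

# Approach 1: every newly created object is written to the cache at once.
for i in range(3):
    cache.set(f"new-{i}", "just created, not requested yet")

hot_survived = cache.get("hot") is not None
print(hot_survived)  # False: the new objects pushed the hot entry out
```

How much this matters in practice depends on how large the cache is relative to the rate at which objects are created.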

Q1. Approach 2 would be better in terms of performance because it doesn't send every new object to memcached, although the performance improvement is very small.

Q2. It is hard to say. If you already know where the bottleneck is and are drafting approaches to address it, it would not be premature.

Q3. There are other approaches, such as caching in memcached only.
