I'd like to know what would be a better approach to invalidate/update cache objects.

Prerequisites

  • A remote memcached server (serving as a cache for multiple applications)
  • All servers are hosted on Azure (same affinity region, same data center)
  • Cache object size ranges from 200 bytes up to 50 kilobytes


Approach 1 (store in cache asap)

  1. Object A is created -> store in database and store in cache
  2. Object A requested by client -> check cache for existence, otherwise fetch from database and store in cache
  3. Object A gets updated -> store in database, store in cache

Approach 1 seems more straightforward: if something is created, put it in the cache as soon as possible, regardless of whether anyone will need it.
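To make the three steps of Approach 1 concrete, here is a minimal sketch. It assumes plain dicts as stand-ins for the database and for memcached; the names `database`, `cache`, `create_or_update` and `read` are hypothetical, and with a real memcached client the dict operations would become that client's set/get calls.

```python
# Hypothetical stand-ins: plain dicts play the roles of the database and memcached.
database = {}
cache = {}

def create_or_update(key, value):
    """Steps 1 and 3: write to the database, then write the same value to the cache."""
    database[key] = value
    cache[key] = value

def read(key):
    """Step 2: try the cache first; on a miss, load from the database and fill the cache."""
    value = cache.get(key)
    if value is None:
        value = database.get(key)
        if value is not None:
            cache[key] = value
    return value

create_or_update("A", {"name": "first"})
assert "A" in cache  # cached immediately, before any client asked for it
create_or_update("A", {"name": "second"})
print(read("A"))  # {'name': 'second'}
```

In this sketch the cache is written on every create and update, whether or not the object is ever read.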


Approach 2 (lazy cache store)

  1. Object A is created -> store in database
  2. Object A requested by client -> check cache for existence, otherwise fetch from database and store in cache
  3. Object A gets updated -> store in database, delete key in cache

Approach 2 seems more memory-aware: only requested items go into the cache.
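The same kind of sketch for Approach 2, again assuming plain dicts in place of the database and memcached. The names `create`, `read` and `update` are hypothetical; the only differences from Approach 1 are that creation skips the cache and an update deletes the cached key instead of overwriting it.

```python
# Hypothetical stand-ins: plain dicts play the roles of the database and memcached.
database = {}
cache = {}

def create(key, value):
    """Step 1: store in the database only."""
    database[key] = value

def read(key):
    """Step 2: try the cache first; on a miss, load from the database and fill the cache."""
    value = cache.get(key)
    if value is None:
        value = database.get(key)
        if value is not None:
            cache[key] = value
    return value

def update(key, value):
    """Step 3: store in the database and delete the cached key."""
    database[key] = value
    cache.pop(key, None)  # the next read repopulates the cache

create("A", 1)
assert "A" not in cache   # nothing is cached until someone asks
read("A")
assert cache["A"] == 1    # first read filled the cache
update("A", 2)
assert "A" not in cache   # update invalidated the entry
print(read("A"))  # 2
```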


Question 1: In terms of performance, which approach would be better? Neither memory nor CPU is a concern yet.

Question 2: Are my thoughts a kind of premature optimization?

Question 3: Any other thoughts? Other approaches?

Joseph Quinsey
lurkerbelow

2 Answers

  1. Is unanswerable, except to say it depends. There are a lot of factors which will determine which approach is going to be the best in your case, e.g.: Is it normal for created objects to be retrieved shortly after they are created? What's the ratio of updates to accesses?
  2. Re. deciding you need a cache: If you're optimising without data then yes, it's technically premature optimisation. I say technically since experience/conventional wisdom may tell you you're going to need a cache of some sort. Re. deciding how the cache will best work: yes, it's definitely premature optimisation.
    • Optimisation often isn't about finding the best/most optimal solution. It should go as follows:
      1. Find the bottlenecks in the system.
      2. Find where you can make the biggest difference with the least amount of work.
      3. Do the least amount of work!
      4. Is it fast enough yet? If not, go to #1.
      5. Done!
    • Honestly, neither of the approaches you describe sounds complicated. Why not implement both and see which works best?
    • Step 3 in approach #2 could be changed to "Object A gets updated -> store in database, update entry in cache".
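One way to "implement both and see" before touching real infrastructure is to replay the same workload against each strategy and count cache traffic. The sketch below is only an illustration under stated assumptions: dicts stand in for the database and memcached, the cache never evicts, the workload (three reads per write over 50 keys) is made up, and the names `simulate`, `eager` and `lazy` are hypothetical. It counts hits and cache writes, not latency, so real measurements would still be needed.

```python
import random

def simulate(strategy, operations):
    """Replay operations against dict stand-ins and count cache traffic."""
    database, cache = {}, {}
    stats = {"hits": 0, "misses": 0, "cache_writes": 0}
    for op, key in operations:
        if op == "read":
            if key in cache:
                stats["hits"] += 1
            else:
                stats["misses"] += 1
                if key in database:
                    cache[key] = database[key]  # fill the cache on a miss
                    stats["cache_writes"] += 1
        else:  # "write" covers both create and update
            database[key] = database.get(key, 0) + 1
            if strategy == "eager":   # Approach 1: store in cache on every write
                cache[key] = database[key]
                stats["cache_writes"] += 1
            else:                     # Approach 2: delete the key on write
                cache.pop(key, None)
    return stats

# Made-up workload: roughly three reads per write, spread over 50 keys.
rng = random.Random(42)
operations = [
    (rng.choice(["read", "read", "read", "write"]), rng.randrange(50))
    for _ in range(10_000)
]

eager = simulate("eager", operations)
lazy = simulate("lazy", operations)
print("eager:", eager)
print("lazy: ", lazy)
```

Changing the read/write mix in `operations` is the quickest way to see how the answer to question 1 depends on the update-to-access ratio.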
vaughandroid

memcached manages objects with its own eviction policy: an object that nobody accesses is the first to be evicted once memcached runs out of memory. Therefore, your first approach is not a good idea, because objects already in memcached would keep being evicted under memory pressure as you create new ones.
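The eviction concern can be illustrated with a toy model. This is a sketch under strong assumptions: `TinyLRU` is a hypothetical three-slot least-recently-used cache, far simpler than memcached's real memory management, and the hot objects are not read between creations, which is the worst case.

```python
from collections import OrderedDict

class TinyLRU:
    """Toy fixed-capacity LRU cache; a simplification, not memcached's actual algorithm."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry

cache = TinyLRU(capacity=3)
for key in ("hot1", "hot2", "hot3"):
    cache.set(key, "frequently read")

# Approach 1: every newly created object is pushed into the cache.
for n in range(3):
    cache.set(f"new{n}", "just created, never read")

print(list(cache.items))  # ['new0', 'new1', 'new2']
```

In a real workload the hot objects would usually be read between creations and so stay near the front of the queue; how much eviction actually hurts depends on cache size and access pattern.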

Q1. Approach 2 would be better in terms of performance because it does not send every newly created object to memcached, although the improvement is very small.

Q2. It is hard to say. If you already know where the bottleneck is and are drafting approaches to address it, it would not be premature.

Q3. There are other approaches, such as caching in memcached only.

neo