2

I have a specific programming need where I need to efficiently store large sorted sets in memory, query them for ranges, and intersect them against other sets that are also queried for ranged.

I am looking at Redis, but I can't see a range slice command. MongoDB can only use 1 index, so it has to perform row-level scans, whereas I wish to process using columns that are intersected.

I'm also looking at Counchbase, but can't easily determine from the documentation if it is suited to this. I know it uses Memcached, which is AFAIK not suited to this usage.

Could anyone share potential solutions for this specific problem?

EDIT For example, I need to perform the following:

Get the IDs of all cars where the price is between 2000 and 3000, and intersect that will all cars where the engine capacity is between 3000 and 4000.

IamIC
  • 151
  • 6
  • 1
    What amounts of data are we talking about? (And why not a plain old RDBMS?) – Mat Jul 08 '12 at 14:10
  • 1
    Not really my area, but I think you are trying to solve a relational problem with a non relational storage. That's not really going to work, it can, but... – yannis Jul 08 '12 at 14:12
  • Redis has [sorted sets](http://redis.io/commands#sorted_set). – dan_waterworth Jul 08 '12 at 14:47
  • The data is not suited to a RDBMS. I built this in SQL Server, PostgreSQL, and MongoDB. MongoDB was 6X faster. – IamIC Jul 08 '12 at 14:51
  • This is a relational problem in that the relations are multidimensional. Not something SQL Server enjoys since it is only solvable by EAV. – IamIC Jul 08 '12 at 14:52
  • I found the Redis Range command that will do what I want. Now I need to compare it to Couchbase. – IamIC Jul 08 '12 at 14:52

1 Answers1

1

Java provides Treemap. In that language, I'd put the cars in one Map keyed by price and one keyed by engine size. Pull the desired set from one of the "tables" and put it in an easily searchable collection. (A TreeMap, TreeSet, or HashSet might do.) Then scan the desired range of the other Treemap and see which of its cars are in your searchable collection. (Or you could just run retainAll() against your searchable collection, passing it the desired subset of cars from the other collection. But check that the Sun/Oracle code is as fast as anything you'd write.)

C# seems to have all the pieces to do this also. Endless options, in fact. (But it's harder to guess at efficiency.) I'd expect other languages to have the same. Whether they're better than Redis would be a question. In fact, you might be able to use my Java Collections Framework tricks with Redis.

Note: in memory, a straight linear search might not take too long, especially if the data is in an array, an array-based collection, or some kind of linked list. It might not be worth it to get fancy.

You'll want to blend in the data structures you use for this solution with whatever else you're doing in memory.

RalphChapin
  • 3,270
  • 1
  • 14
  • 16