Problems with evaluating results of search engine by comparison

Question

We're building a search engine for a client. To evaluate the results, the client is comparing the top N results of our search engine to the top N results of a competitor's, and they want at least "X" percent of our results to be in common with the competitor's. That's the only metric they're using to decide the success of the project.

I tried telling them that it's not a good idea, for one simple reason: the data we have is significantly different from the data the competitor has. But they're conveniently ignoring this fact. (In a sample of results, 50% of the competitor's documents are NOT indexed in our database.)

Is it a good idea to evaluate results by comparing them with another search engine's? If yes, how do we handle the fact that their dataset will naturally be different from ours? If it's not a good idea, why not? What are the problems with that approach?

EDIT: 1. I took the top "N" results for a fixed set of queries. It appears that around 50% of their documents are not indexed in our database!

We're only comparing the presence or absence of results; we're not concerned with ranking as of now. But I still feel that doesn't make it a good metric.
Out of our top N results, only 19% also appear in their top N results. In other words, the intersection of our results and their results yields a 19% match.
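
A minimal Python sketch of how these two numbers could be computed, assuming each engine's results are available as a ranked list of document IDs and our index can be checked by ID. The function names and the sample IDs are hypothetical, chosen only for illustration.

```python
def overlap_at_n(ours, theirs, n):
    """Fraction of our top-n results that also appear in their top-n.

    Ranking is ignored: only presence or absence counts.
    """
    ours_top = ours[:n]
    theirs_top = set(theirs[:n])
    if not ours_top:
        return 0.0
    return sum(1 for doc in ours_top if doc in theirs_top) / len(ours_top)


def coverage_at_n(theirs, our_index, n):
    """Fraction of their top-n results that exist in our index at all."""
    theirs_top = theirs[:n]
    if not theirs_top:
        return 0.0
    return sum(1 for doc in theirs_top if doc in our_index) / len(theirs_top)


# Hypothetical data: document IDs are made up for illustration.
ours = ["a", "b", "c", "d", "e"]
theirs = ["b", "x", "y", "z", "w"]
our_index = {"a", "b", "c", "d", "e", "x", "y"}

print(overlap_at_n(ours, theirs, 5))        # 0.2
print(coverage_at_n(theirs, our_index, 5))  # 0.6
```

Reporting the coverage figure next to the overlap figure makes it visible how much of a low overlap is explained by documents we simply do not have.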

Which search engine are you comparing with? If it's Google, you're competing with an army of PhDs. — Robert Harvey, Aug 07 '18 at 23:20
Fortunately it's not against Google! It's a domain-specific search engine! — tired and bored dev, Aug 08 '18 at 00:59

Lewis Pringle · Answer 1 · 2018-08-08T14:38:20.167

Comparison with an oracle is always a good idea, when available. Comparison with a principal competitor is also a good idea.

Your metric for comparison doesn't appear well thought out (what if the results come in a much different order? What if they come in a slightly different order?).

If you are indexing one set of URLs (documents) and your oracle (the comparison engine) indexes another set (they overlap but are not identical), and you wish to compare them, that's easy: just throw away any results not in the intersection. In other words, before comparing, check whether each of your search engine's results is in the other search engine's index at all, and if not, throw it away (that may or may not be possible, given how the other search engine works). More easily, you can throw away any results from the other engine that don't appear in your index. You can then take enough extra results from each engine to get back up to "N" on both sides to compare.

That should give you a fairer, apples-to-apples comparison of search engine results. It's not perfect, because an engine may drop some results for looking just like earlier ones, but it's the best I can think of given your constraints.
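
The filtering described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: each engine returns a ranked list of document IDs longer than N, your own index can be checked by ID, and the competitor's index usually cannot (so `their_index` is optional). The names `filtered_top_n` and `fair_overlap` are hypothetical.

```python
def filtered_top_n(ranked, allowed, n):
    """Return the first n results from `ranked` that are in `allowed`.

    Results outside `allowed` are thrown away, and later results move up
    to fill the gap, so the list is topped back up to n where possible.
    """
    kept = []
    for doc in ranked:
        if doc in allowed:
            kept.append(doc)
            if len(kept) == n:
                break
    return kept


def fair_overlap(ours_ranked, theirs_ranked, our_index, n, their_index=None):
    """Overlap of top-n lists after discarding documents outside the shared set."""
    # Always possible: drop their results that we have not indexed.
    theirs_top = filtered_top_n(theirs_ranked, our_index, n)
    # Only possible if we can somehow test membership in their index.
    if their_index is not None:
        ours_top = filtered_top_n(ours_ranked, their_index, n)
    else:
        ours_top = ours_ranked[:n]
    if not ours_top or not theirs_top:
        return 0.0
    common = set(ours_top) & set(theirs_top)
    return len(common) / min(len(ours_top), len(theirs_top))


# Hypothetical data: "x", "y", "z" are documents we have not indexed.
ours_ranked = ["a", "b", "c", "d"]
theirs_ranked = ["x", "b", "y", "a", "z", "c"]
our_index = {"a", "b", "c", "d"}

print(fair_overlap(ours_ranked, theirs_ranked, our_index, n=2))  # 1.0
```

On this made-up data the unfiltered top-2 lists share only one document ("b"), while the filtered lists agree completely, which is the kind of gap a dataset mismatch can create. Like the presence/absence metric in the question, this sketch still ignores result order.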

I've added the extra information needed. Obviously we cannot know what their entire database is. But in a sample of results, we could find only 50% of their results indexed in our database. — tired and bored dev, Aug 08 '18 at 01:10
