We're building a search engine for a client. To evaluate the results, the client compares the top N results of our search engine with the top N results of a competitor's. They want at least X percent of our results to be in common with the competitor's, and that is the only metric they're using to decide the success of the project.
I tried telling them that this is not a good idea, for one simple reason: the data we have is significantly different from the data the competitor has. But they're conveniently ignoring this. (On a sample of results, 50% of the competitor's documents are NOT indexed in our database.)
Is it a good idea to evaluate results by comparing them with another search engine's? If yes, how do we handle the fact that their dataset will naturally be different from ours? If it's not a good idea, why not? What are the problems with that approach?
EDIT: I took the top N results for a fixed number of queries. It appears that around 50% of the competitor's documents are not indexed in our database!
We're only comparing the presence or absence of results; we're not concerned with ranking for now. But I still feel that doesn't make it a good metric.
Out of our top N results, only 19% also appear in their top N results. In other words, the intersection of our results and their results yields a 19% match.
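To make the numbers above concrete, here is a minimal sketch of how the overlap figure, and the ceiling imposed by the missing documents, could be computed. The function names, the document IDs, and the `indexed_ids` set are all hypothetical and the sample data is made up purely for illustration. It assumes each engine returns an ordered list of document IDs that can be matched across the two systems, and that both lists contain at least N distinct results.

```python
def overlap_at_n(ours, theirs, n):
    """Fraction of our top-n results that also appear in their top-n.

    Rank is ignored: only presence or absence is compared.
    """
    our_top = set(ours[:n])
    their_top = set(theirs[:n])
    if not our_top:
        return 0.0
    return len(our_top & their_top) / len(our_top)


def indexed_share_at_n(theirs, indexed_ids, n):
    """Fraction of their top-n results that exist in our index at all.

    When both lists hold n distinct results, this is an upper bound on
    overlap_at_n: we cannot return a document we never indexed.
    """
    their_top = set(theirs[:n])
    if not their_top:
        return 0.0
    return len(their_top & set(indexed_ids)) / len(their_top)


# Made-up example data (hypothetical document IDs, not real results).
ours = ["a", "b", "c", "d"]
theirs = ["b", "x", "e", "d"]
indexed_ids = {"a", "b", "c", "d", "e"}  # everything our engine has indexed

print(overlap_at_n(ours, theirs, 4))               # 0.5
print(indexed_share_at_n(theirs, indexed_ids, 4))  # 0.75
```

Averaging both values over the fixed set of queries would show how much of the gap to X percent is explained by documents we simply don't have, as opposed to differences in retrieval quality.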