Evaluating the retrieval effectiveness of Web search engines using a representative query sample

Lewandowski, D. (2015). Evaluating the retrieval effectiveness of web search engines using a representative query sample. Journal of the Association for Information Science and Technology, 66(9), 1763–1775.


Search engine retrieval effectiveness studies are usually small-scale, using only limited query samples. Furthermore, the queries are selected by the researchers themselves. We address these issues by taking a random representative sample of 1,000 informational and 1,000 navigational queries from a major German search engine and comparing Google’s and Bing’s results based on this sample. Jurors were recruited through crowdsourcing, and data was collected using specialised software, the Relevance Assessment Tool (RAT). We found that while Google outperforms Bing on both query types, the difference in performance for informational queries was rather small. For navigational queries, however, Google found the correct answer in 95.3% of cases, whereas Bing found it in only 76.6% of cases. We conclude that search engine performance on navigational queries is of great importance, because for this query type users can clearly tell whether a query has returned the correct result. Performance on navigational queries may therefore help explain user satisfaction with search engines.
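
To put the navigational result in perspective, the gap between the two reported success rates can be worked out directly. The following is a minimal sketch, not the analysis used in the paper: the function name `success_rate_gap` is hypothetical, and it assumes that all 1,000 sampled navigational queries were judged for each engine. It also treats the two samples as independent, which is a simplification, since both engines were evaluated on the same queries and a paired test would be more appropriate.

```python
import math


def success_rate_gap(p_a, p_b, n_a, n_b):
    """Return (gap in percentage points, two-proportion z statistic).

    Hypothetical helper for illustration; treats the two samples as
    independent, which simplifies the paired design described above.
    """
    gap_points = (p_a - p_b) * 100
    # Pooled success rate under the null hypothesis of equal rates.
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return gap_points, (p_a - p_b) / standard_error


# Success rates are taken from the abstract (Google 95.3%, Bing 76.6%).
# ASSUMPTION: n = 1,000 judged navigational queries per engine.
gap, z = success_rate_gap(0.953, 0.766, 1000, 1000)
print(f"gap = {gap:.1f} percentage points, z = {z:.1f}")
```

Under these assumptions the gap is 18.7 percentage points, and the z statistic is far above conventional significance thresholds, which is consistent with the abstract's emphasis on navigational queries.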