Why we need an independent index of the web

Lewandowski, D.: Why We Need an Independent Index of the Web. In: König, R.; Rasch, M. (eds.): Society of the Query Reader. Amsterdam: INC, 2014, pp. 50-58.

Introduction

Search engine indexes function as a ‘local copy of the web’, forming the foundation
of every search engine. Search engines need to look for new documents constantly,
detect changes made to existing documents, and remove documents from the index
when they are no longer available on the web. When one considers that the web comprises
many billions of documents that are constantly changing, the challenge search
engines face becomes clear. It is impossible to maintain a perfectly complete and
current index. The pool of data changes thousands of times each second. No search
engine can keep up with this rapid pace of change. A complete ‘local copy of the web’ is
thus best viewed as the Holy Grail of web indexing; in practice, different search
engines will always attain varying degrees of success in pursuing this goal.
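
The three maintenance tasks named above (adding new documents, detecting changes to existing ones, and removing documents that are no longer available) can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not a description of how any real search engine works: the index is a plain dictionary from URL to content hash, one crawl pass is a dictionary from URL to the current content (or None if the document is gone), and the names fingerprint and refresh_index are hypothetical.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Return a stable hash used to detect whether a document has changed."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def refresh_index(index: dict, crawl: dict) -> dict:
    """Update `index` (URL -> fingerprint) in place from one crawl pass.

    `crawl` maps each visited URL to its current content, or to None when
    the document is no longer available (an assumption of this sketch).
    Returns counts of added, updated, and removed documents.
    """
    stats = {"added": 0, "updated": 0, "removed": 0}
    for url, content in crawl.items():
        if content is None:
            # The document has disappeared from the web: drop it if indexed.
            if index.pop(url, None) is not None:
                stats["removed"] += 1
            continue
        digest = fingerprint(content)
        if url not in index:
            stats["added"] += 1      # newly discovered document
        elif index[url] != digest:
            stats["updated"] += 1    # known document whose content changed
        else:
            continue                 # unchanged, nothing to do
        index[url] = digest
    return stats


# Hypothetical example data: one changed, one unchanged, one deleted, one new.
index = {
    "http://example.org/a": fingerprint("old text"),
    "http://example.org/b": fingerprint("same text"),
    "http://example.org/c": fingerprint("soon gone"),
}
crawl = {
    "http://example.org/a": "new text",
    "http://example.org/b": "same text",
    "http://example.org/c": None,
    "http://example.org/d": "fresh page",
}
stats = refresh_index(index, crawl)
print(stats)
```

Even in this toy form, the index is only as current as the most recent crawl pass: any document that changes after it was last visited is stale in the index until the next visit, which is why a perfectly complete and current copy remains out of reach.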