
Version 1 (modified by antonak, 14 years ago)

Index search

"The purpose of storing an index is to optimize speed and performance in finding relevant documents for a search query. Without an index, the search engine would scan every document in the corpus, which would require considerable time and computing power. For example, while an index of 10,000 documents can be queried within milliseconds, a sequential scan of every word in 10,000 large documents could take hours. The additional computer storage required to store the index, as well as the considerable increase in the time required for an update to take place, are traded off for the time saved during information retrieval."
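
The trade-off described in the quote can be illustrated with a minimal sketch in Python. It is not the plugin's implementation: the function names (`build_index`, `search_index`, `search_scan`) and the sample documents are hypothetical, and tokenization is assumed to be simple whitespace splitting. The index costs extra memory and must be rebuilt or updated when documents change, but a lookup no longer has to read every document.

```python
from collections import defaultdict


def build_index(docs):
    """Build an inverted index: each lowercased word maps to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_index(index, word):
    """Answer a query with a single dictionary lookup in the inverted index."""
    return sorted(index.get(word.lower(), set()))


def search_scan(docs, word):
    """Answer the same query without an index by scanning every word of every document."""
    word = word.lower()
    return sorted(
        doc_id for doc_id, text in docs.items() if word in text.lower().split()
    )


# Hypothetical sample corpus, used only for illustration.
docs = {
    1: "gene expression in mouse liver",
    2: "protein binding assay",
    3: "mouse protein expression levels",
}
index = build_index(docs)

print(search_index(index, "mouse"))
print(search_scan(docs, "mouse"))
```

Both functions return the same document ids; the difference is that `search_scan` does work proportional to the size of the whole corpus on every query, while `search_index` pays that cost once, when the index is built.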

Documentation

http://www.molgenis.org/wiki/IndexBasedSearch

Index Design Factors

Merge factors

Which features are selected to enter the index. This is configurable through the plugin's configuration file.

Storage techniques
How to store the index data, that is, whether the information should be compressed or filtered.
Index size
How much computer storage is required to support the index.
Lookup speed
How quickly a word can be found in the inverted index. The speed of finding an entry in a data structure, compared with how quickly it can be updated or removed, is a central focus of computer science.
Maintenance
How the index is maintained over time. [6]
Fault tolerance