Friday, February 4, 2011

Unit 4: Matching Models, Ranked Boolean and Vector Space

Boolean search is a powerful tool in information retrieval: by combining terms with operators such as AND and OR, the end user can narrow or broaden the set of results.

The task for the creator of an IR system is to make the system perform Boolean searches as efficiently as possible. Without ranked indexes, Boolean search is simple enough to implement with a basic algorithm, but accuracy may be sacrificed. Users who use the AND operator will get focused results, while users who use the OR operator may get many irrelevant results.
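To make the "basic algorithm" concrete, here is a minimal sketch of unranked Boolean search over an inverted index. The function names (`build_index`, `boolean_and`, `boolean_or`) and the three sample documents are made up for illustration, and tokenizing by whitespace is a simplifying assumption; a real system would do more.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, terms):
    """Documents that contain every term (intersection of posting sets)."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

def boolean_or(index, terms):
    """Documents that contain at least one term (union of posting sets)."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.union(*postings) if postings else set()

# Hypothetical sample collection, for illustration only.
docs = {
    1: "boolean search in information retrieval",
    2: "vector space model for retrieval",
    3: "cooking with boolean logic",
}
index = build_index(docs)
and_hits = boolean_and(index, ["boolean", "retrieval"])  # only document 1
or_hits = boolean_or(index, ["boolean", "retrieval"])    # documents 1, 2 and 3
```

Note how OR pulls in document 3, which is about cooking, while AND keeps only the document that mentions both terms. The results come back as an unordered set, so nothing tells the user which hit is the best one.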

This is where weighting terms helps accuracy. A document that uses a term frequently is not necessarily more important than a document that uses the term a single time, so raw term counts alone can be misleading. The vector space model addresses this problem by representing documents and queries as vectors of weighted terms and ranking documents by how similar they are to the query.
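As a rough illustration, the sketch below ranks documents with one common weighting scheme: log-scaled term frequency multiplied by inverse document frequency (tf-idf), compared by cosine similarity. This particular formula is my assumption, not something prescribed by the unit, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build a sparse tf-idf vector (term -> weight) for each document."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for d, tokens in tokenized.items():
        tf = Counter(tokens)
        # Log scaling dampens repeated use; idf down-weights common terms.
        vectors[d] = {t: (1 + math.log(c)) * math.log(n / df[t])
                      for t, c in tf.items()}
    return vectors, df, n

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query, docs):
    """Return document ids ordered from most to least similar to the query."""
    vectors, df, n = tfidf_vectors(docs)
    q_tf = Counter(query.lower().split())
    # Query terms that never occur in the collection are ignored.
    q_vec = {t: (1 + math.log(c)) * math.log(n / df[t])
             for t, c in q_tf.items() if t in df}
    scores = {d: cosine(q_vec, vec) for d, vec in vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical sample collection, for illustration only.
docs = {
    1: "retrieval retrieval retrieval retrieval cooking",
    2: "boolean retrieval",
    3: "cooking recipes",
}
ranking = rank("boolean retrieval", docs)
```

Here document 1 repeats "retrieval" four times, yet document 2 ranks first because it matches both query terms and the rarer term "boolean" carries more weight.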
