Large-scale cluster-based retrieval experiments on Turkish texts

2007-11-30
Altıngövde, İsmail Sengör
Ozcan, Rifat
Ocalan, Huseyin Cagdas
Can, Fazli
Ulusoy, Özgür
We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. Our experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare the effectiveness of CBR with that of full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings reveal that CBR yields effectiveness comparable to that of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index, we significantly improve the in-memory query processing efficiency of CBR compared with other traditional CBR techniques and even FS.
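
To illustrate the general idea behind a cluster-skipping inverted index, the following minimal Python sketch groups each term's postings by cluster, so that query processing can skip whole blocks belonging to clusters that were not selected for the query. This is not the authors' implementation: the function names `build_index` and `search`, the in-memory layout, and the raw term-frequency scoring are all assumptions made for illustration, and the step that selects the best-matching clusters is left out and passed in as `best_clusters`.

```python
from collections import defaultdict


def build_index(docs, doc_cluster):
    """Build a toy cluster-grouped inverted index (hypothetical layout).

    docs:        {doc_id: [term, ...]}
    doc_cluster: {doc_id: cluster_id}
    Returns {term: [(cluster_id, [(doc_id, tf), ...]), ...]},
    with one block per cluster, sorted by cluster id.
    """
    tmp = defaultdict(lambda: defaultdict(dict))
    for doc_id, terms in docs.items():
        cluster_id = doc_cluster[doc_id]
        for term in terms:
            block = tmp[term][cluster_id]
            block[doc_id] = block.get(doc_id, 0) + 1

    index = {}
    for term, by_cluster in tmp.items():
        index[term] = [
            (cluster_id, sorted(by_cluster[cluster_id].items()))
            for cluster_id in sorted(by_cluster)
        ]
    return index


def search(index, query_terms, best_clusters):
    """Score only documents in best_clusters, skipping all other blocks.

    Scoring is a plain sum of term frequencies (an assumption, chosen
    only to keep the sketch short).
    """
    scores = defaultdict(int)
    for term in query_terms:
        for cluster_id, postings in index.get(term, []):
            if cluster_id not in best_clusters:
                continue  # skip the whole block for this cluster
            for doc_id, tf in postings:
                scores[doc_id] += tf
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {1: ["kedi", "kedi", "ev"], 2: ["kedi"], 3: ["ev", "kedi"], 4: ["ev"]}
doc_cluster = {1: 0, 2: 0, 3: 1, 4: 1}
index = build_index(docs, doc_cluster)
print(search(index, ["kedi", "ev"], best_clusters={0}))
```

In this sketch the saving comes from never touching postings outside the selected clusters. A real index would store explicit skip pointers in a compact posting-list format rather than nested Python lists.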