Faster Dynamic Pruning via Reordering of Documents in Inverted Indexes


Yafay E., Altingövde I. S.

46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023, Taipei, Tayvan, 23 - 27 Temmuz 2023, ss.2001-2005 identifier

  • Yayın Türü: Bildiri / Tam Metin Bildiri
  • Doi Numarası: 10.1145/3539618.3591987
  • Basıldığı Şehir: Taipei
  • Basıldığı Ülke: Tayvan
  • Sayfa Sayıları: ss.2001-2005
  • Anahtar Kelimeler: dynamic pruning, inverted index, search engines
  • Orta Doğu Teknik Üniversitesi Adresli: Evet

Özet

Widely used dynamic pruning algorithms (such as MaxScore, WAND and BMW) keep track of the k-th highest score (i.e., heap threshold) among the documents that are scored so far, to avoid scoring the documents that cannot get into the top-k result list. Obviously, the faster the heap threshold converges to its final value, the larger will be the number of skipped documents and hence, the efficiency gains of the pruning algorithms. In this paper, we tailor approaches that reorder the documents in the inverted index based on their access counts and ranks for previous queries. By storing such frequently retrieved documents at front of the postings lists, we aim to compute the heap threshold earlier during the query processing. Our approach yields substantial speedups (up to 1.33x) for all three dynamic pruning algorithms and outperforms two strong baselines that have been employed for document reordering in the literature.