Cost-Aware Strategies for Query Result Caching in Web Search Engines



Ozcan R., Altingovde İ. S., Ulusoy O.

ACM TRANSACTIONS ON THE WEB, vol.5, no.2, 2011 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 5 Issue: 2
  • Publication Date: 2011
  • DOI: 10.1145/1961659.1961663
  • Journal Name: ACM TRANSACTIONS ON THE WEB
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Keywords: Algorithms, Performance, Experimentation, Query result caching, Web search engines
  • Middle East Technical University Affiliated: No

Abstract

Search engines and large-scale IR systems need to cache query results for efficiency and scalability. Static and dynamic caching techniques (as well as their combinations) are employed to cache query results effectively. In this study, we propose cost-aware strategies for static and dynamic caching setups. Our research is motivated by two key observations: (i) query processing costs may vary significantly among different queries, and (ii) the processing cost of a query is not proportional to its popularity (i.e., frequency in the previous logs). The first observation implies that cache misses have different, that is, nonuniform, costs in this context. The latter observation implies that typical caching policies, based solely on query popularity, cannot always minimize the total cost. Therefore, we propose to explicitly incorporate the query costs into the caching policies. Simulation results using two large Web crawl datasets and a real query log reveal that the proposed approach improves overall system performance in terms of the average query execution time.
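The two observations in the abstract can be illustrated with a small sketch of static cache selection. The abstract does not spell out the actual policies, so everything below is an assumption made for illustration: the toy query statistics, the function names, and in particular the score `frequency * cost`, which is just one simple way to fold cost into a popularity-based policy. The sketch also evaluates on the same statistics it selects from, which a real simulation would not do.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy statistics: query -> (frequency in a past log, processing cost in ms).
QueryStats = Dict[str, Tuple[int, float]]


def select_static_cache(stats: QueryStats, capacity: int, cost_aware: bool) -> Set[str]:
    """Pick `capacity` queries for a static cache.

    With cost_aware=False the score is frequency only (popularity-based).
    With cost_aware=True the score is frequency * cost (an assumed, simple
    cost-aware score, not necessarily the one used in the paper).
    """
    def score(query: str) -> float:
        freq, cost = stats[query]
        return freq * cost if cost_aware else freq

    # Sort by descending score; break ties by query string for determinism.
    ranked = sorted(stats, key=lambda q: (-score(q), q))
    return set(ranked[:capacity])


def total_miss_cost(stats: QueryStats, cached: Set[str]) -> float:
    """Total processing cost paid for queries that are not in the cache."""
    return sum(freq * cost for q, (freq, cost) in stats.items() if q not in cached)


stats: QueryStats = {
    "weather": (100, 1.0),          # popular but cheap
    "news": (90, 2.0),
    "maps": (80, 1.5),
    "rare long query": (20, 50.0),  # unpopular but very expensive
}

freq_cache = select_static_cache(stats, capacity=2, cost_aware=False)
cost_cache = select_static_cache(stats, capacity=2, cost_aware=True)

print(sorted(freq_cache), total_miss_cost(stats, freq_cache))
print(sorted(cost_cache), total_miss_cost(stats, cost_cache))
```

In this toy setting the popularity-only cache keeps the two most frequent queries and still pays for every miss of the expensive one, while the cost-aware score admits the expensive query and lowers the total miss cost. The numbers are invented and carry no weight beyond showing why misses with nonuniform costs can change which queries are worth caching.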