Predicting the size of candidate document set for implicit web search result diversification


Ulu Y. B., Altingövde I. S.

42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, 14-17 April 2020, pp. 410-417

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number:
  • DOI: 10.1007/978-3-030-45442-5_51
  • City of Publication: Lisbon
  • Country of Publication: Portugal
  • Pages: pp. 410-417
  • Affiliated with Middle East Technical University: Yes

Abstract

© Springer Nature Switzerland AG 2020. Implicit result diversification methods exploit the content of the documents in the candidate set, i.e., the initial retrieval results for a query, to obtain a relevant and diverse ranking. As our first contribution, we explore whether recently introduced word embeddings can be exploited to represent documents and thereby improve diversification, and we report a positive result. As a second improvement, we propose automatically predicting the size of the candidate set on a per-query basis. Experimental evaluations using our BM25 runs as well as the best-performing ad hoc runs submitted to TREC (2009–2012) show that our approach improves the performance of implicit diversification by up to 5.4% with respect to the initial ranking.
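
To make the role of the candidate set size concrete, the following is a minimal sketch of one well-known implicit diversification scheme, Maximal Marginal Relevance (MMR), applied to documents represented as dense vectors. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the diversification algorithm, the embedding model, or the predictor used. The function names, the `trade_off` parameter, and the toy vectors are all hypothetical, and `candidate_size` simply stands in for whatever value a per-query predictor would supply.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_diversify(candidates, candidate_size, k, trade_off=0.5):
    """Re-rank the top `candidate_size` documents with MMR and return k doc ids.

    candidates: list of (doc_id, relevance, vector), sorted by initial
                relevance in descending order (e.g. a BM25 run).
    candidate_size: how many top documents to consider; assumed here to be
                    given, e.g. by a hypothetical per-query predictor.
    trade_off: weight on relevance versus novelty (illustrative default).
    """
    pool = list(candidates[:candidate_size])
    selected = []
    while pool and len(selected) < k:

        def score(item):
            _, relevance, vector = item
            # Redundancy is the highest similarity to any already selected document.
            redundancy = max((cosine(vector, s[2]) for s in selected), default=0.0)
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return [doc_id for doc_id, _, _ in selected]


# Toy data: d1 and d2 are near-duplicates; d3 and d4 cover another aspect.
toy_run = [
    ("d1", 1.0, [1.0, 0.0]),
    ("d2", 0.9, [1.0, 0.0]),
    ("d3", 0.8, [0.0, 1.0]),
    ("d4", 0.7, [0.0, 1.0]),
]

narrow = mmr_diversify(toy_run, candidate_size=2, k=2)
wide = mmr_diversify(toy_run, candidate_size=3, k=2)
```

In this toy run, a candidate set of two documents leaves the re-ranker nothing to choose from except the near-duplicate pair, whereas a candidate set of three lets it promote a document on the second aspect. This shows why the cut-off can matter per query, although the sketch says nothing about how the paper actually predicts it.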