Predicting the size of candidate document set for implicit web search result diversification

Ulu Y. B., Altingövde I. S.

42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, 14 - 17 April 2020, pp.410-417, (Full-Text Paper)

Publication Type: Conference Paper / Full-Text Paper
Volume Number:
DOI Number: 10.1007/978-3-030-45442-5_51
City of Publication: Lisbon
Country of Publication: Portugal
Pages: pp.410-417
Affiliated with Middle East Technical University: Yes

Abstract

© Springer Nature Switzerland AG 2020. Implicit result diversification methods exploit the content of the documents in the candidate set, i.e., the initial retrieval results of a query, to obtain a relevant and diverse ranking. As our first contribution, we explore whether recently introduced word embeddings can be exploited to represent documents and thereby improve diversification, and we show a positive result. As a second improvement, we propose to automatically predict the size of the candidate set on a per-query basis. Experimental evaluations using our BM25 runs as well as the best-performing ad hoc runs submitted to TREC (2009–2012) show that our approach improves the performance of implicit diversification by up to 5.4% with respect to the initial ranking.
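
The abstract does not name the diversification algorithm, the embedding model, or the predictor used for the candidate set size. As a rough illustration only, the sketch below uses Maximal Marginal Relevance (MMR) as a generic stand-in for an implicit diversification method: it re-ranks the top `candidate_size` results of an initial ranking using relevance scores and pairwise similarity between document vectors. The function names, the `lam` trade-off parameter, and the toy two-dimensional vectors are all assumptions made for this example; in the setting the abstract describes, the vectors would come from word embeddings and `candidate_size` would be predicted per query.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_diversify(ranked, candidate_size, k, lam=0.5):
    """Re-rank the top `candidate_size` results with MMR and return k doc ids.

    ranked: list of (doc_id, relevance_score, vector) in initial retrieval order.
    candidate_size: how many top results form the candidate set (hypothetical
        stand-in for a per-query predicted value).
    lam: weight on relevance versus novelty (an assumed default, not from the paper).
    """
    candidates = list(ranked[:candidate_size])
    selected = []
    while candidates and len(selected) < k:

        def mmr_score(item):
            _, relevance, vector = item
            # Penalize similarity to the closest already-selected document.
            max_sim = max((cosine(vector, s[2]) for s in selected), default=0.0)
            return lam * relevance - (1 - lam) * max_sim

        best = max(candidates, key=mmr_score)
        candidates.remove(best)
        selected.append(best)
    return [doc_id for doc_id, _, _ in selected]


# Toy initial ranking: d1 and d2 cover one aspect, d3 and d4 another.
ranked = [
    ("d1", 1.0, [1.0, 0.0]),
    ("d2", 0.9, [1.0, 0.0]),
    ("d3", 0.8, [0.0, 1.0]),
    ("d4", 0.7, [0.0, 1.0]),
]
print(mmr_diversify(ranked, candidate_size=2, k=2))
print(mmr_diversify(ranked, candidate_size=3, k=2))
```

In this toy ranking, a candidate set of size 2 contains only documents on the first aspect, so no diversification is possible, whereas a candidate set of size 3 lets the second-aspect document `d3` be promoted. This is the kind of sensitivity to candidate set size that motivates choosing the size per query.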