Predicting the size of candidate document set for implicit web search result diversification

Ulu Y. B., Altingövde I. S.

42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, 14 - 17 April 2020, pp.410-417

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1007/978-3-030-45442-5_51
  • City: Lisbon
  • Country: Portugal
  • Page Numbers: pp.410-417
  • Middle East Technical University Affiliated: Yes


© Springer Nature Switzerland AG 2020. Implicit result diversification methods exploit the content of the documents in the candidate set, i.e., the initial retrieval results for a query, to obtain a relevant and diverse ranking. As our first contribution, we explore whether recently introduced word embeddings can be exploited to represent documents and thereby improve diversification, and we report a positive result. As a second improvement, we propose to automatically predict the size of the candidate set on a per-query basis. Experimental evaluations using our BM25 runs as well as the best-performing ad hoc runs submitted to TREC (2009–2012) show that our approach improves the performance of implicit diversification by up to 5.4% with respect to the initial ranking.
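
To illustrate why the size of the candidate set matters for implicit diversification, the following is a minimal sketch, not the authors' implementation. It assumes a Maximal Marginal Relevance (MMR) style greedy re-ranker, which the abstract does not name, and it assumes each document is already represented by a dense vector (for example, an average of word embeddings). The function names, the trade-off parameter `lam`, and the toy vectors and scores are all hypothetical. The per-query size prediction itself is not shown; `candidate_size` simply stands in for its output.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_diversify(candidates, candidate_size, lam=0.5, top_k=10):
    """Greedy MMR-style re-ranking (an assumed, generic implicit method).

    candidates: list of (doc_id, relevance, vector) in initial ranking order.
    candidate_size: how many top documents to consider; a stand-in for the
        per-query predicted size described in the abstract.
    """
    pool = list(candidates[:candidate_size])
    selected = []
    while pool and len(selected) < top_k:

        def score(item):
            _, relevance, vector = item
            # Penalize similarity to the most similar already-selected document.
            max_sim = max((cosine(vector, s[2]) for s in selected), default=0.0)
            return lam * relevance - (1 - lam) * max_sim

        best = max(pool, key=score)
        pool.remove(best)
        selected.append(best)
    return [doc_id for doc_id, _, _ in selected]


# Toy data (hypothetical): two near-duplicate topics, two documents each.
toy_candidates = [
    ("d1", 1.00, [1.0, 0.0]),
    ("d2", 0.90, [1.0, 0.0]),
    ("d3", 0.80, [0.0, 1.0]),
    ("d4", 0.95, [0.0, 1.0]),
]

ranking_small = mmr_diversify(toy_candidates, candidate_size=2, top_k=2)
ranking_medium = mmr_diversify(toy_candidates, candidate_size=3, top_k=2)
ranking_large = mmr_diversify(toy_candidates, candidate_size=4, top_k=2)
```

In this toy setup, a candidate set that is too small contains only documents on one topic, so the re-ranker cannot diversify at all, while larger sets expose documents on the second topic. A poorly chosen size can also admit low-quality documents, which is one plausible motivation for choosing the size per query rather than fixing it globally.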