Limitations and Improvement Opportunities for Implicit Result Diversification in Search Engines


Thesis Type: Master's

Institution: Middle East Technical University, Graduate School of Natural and Applied Sciences, Turkey

Thesis Approval Date: 2019

Thesis Language: English

Student: YAŞAR BARIŞ ULU

Advisor: İsmail Sengör Altıngövde

Abstract:

Search engine users expect to find results relevant to their query. In addition, the results should cover the different possible intents behind the query, which leads to the well-known problem of search result diversification. Our work first investigates the limitations of implicit search result diversification and, in particular, reveals that typical optimization tricks (such as clustering) do not necessarily improve diversification effectiveness. Second, we explore whether recently introduced word embeddings can be exploited to represent documents for diversification, and report a positive result. Third, because our detailed analysis reveals that the candidate set size plays a critical role in implicit diversification, we propose to automatically predict the size of the candidate set on a per-query basis. To this end, we use a rich set of features based on the inter-similarity of documents and the similarity between queries and documents. Finally, we propose caching the similarities of document pairs to improve the processing-time efficiency of implicit result diversification.
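
To make the ideas in the abstract concrete, the following is a minimal sketch of implicit diversification with cached pairwise similarities. It assumes Maximal Marginal Relevance (MMR) as the diversification method and cosine similarity over dense document vectors; the abstract does not say which algorithm or similarity measure the thesis uses. The names `PairSimilarityCache`, `mmr_diversify`, `trade_off`, and `candidate_size`, as well as the toy vectors and relevance scores, are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


class PairSimilarityCache:
    """Hypothetical cache: stores each document-pair similarity after first use."""

    def __init__(self, vectors: Dict[str, Vector]) -> None:
        self.vectors = vectors
        self.cache: Dict[Tuple[str, str], float] = {}
        self.misses = 0  # number of similarities actually computed

    def similarity(self, a: str, b: str) -> float:
        # Similarity is symmetric, so store one entry per unordered pair.
        key = (a, b) if a <= b else (b, a)
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = cosine(self.vectors[a], self.vectors[b])
        return self.cache[key]


def mmr_diversify(
    relevance: Dict[str, float],
    sims: PairSimilarityCache,
    k: int,
    trade_off: float = 0.5,
    candidate_size: Optional[int] = None,
) -> List[str]:
    """Greedy MMR re-ranking (assumed method, not necessarily the thesis's).

    relevance      : query-document relevance scores
    trade_off      : 1.0 = pure relevance, 0.0 = pure novelty
    candidate_size : how many top-ranked documents to consider for re-ranking
    """
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    candidates = ranked[:candidate_size] if candidate_size else ranked
    selected: List[str] = []
    while candidates and len(selected) < k:

        def score(doc: str) -> float:
            # Redundancy is the highest similarity to any already selected document.
            redundancy = max(
                (sims.similarity(doc, s) for s in selected), default=0.0
            )
            return trade_off * relevance[doc] - (1.0 - trade_off) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (illustrative only): d1 and d2 are near-duplicates, d3 is different.
vectors = {"d1": [1.0, 0.0], "d2": [0.9, 0.1], "d3": [0.0, 1.0]}
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.5}
sims = PairSimilarityCache(vectors)

diverse_top2 = mmr_diversify(relevance, sims, k=2)
print(diverse_top2)
```

In this sketch the document vectors could, for instance, be averaged word embeddings, and `candidate_size` is the quantity that the abstract proposes to predict per query; here it is simply passed in by the caller. Re-running the diversification with the same `sims` object reuses the stored pair similarities instead of recomputing them.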