Thesis Type: Master's
Institution: Middle East Technical University (Orta Doğu Teknik Üniversitesi), Graduate School of Natural and Applied Sciences, Turkey
Approval Date: 2019
Thesis Language: English
Student: YAŞAR BARIŞ ULU
Advisor: İsmail Sengör Altıngövde
Abstract: Search engine users expect to find results relevant to their query. In addition, the results should cover the different possible intents behind the query, which leads to the well-known problem of search result diversification. Our work first investigates the limitations of implicit search result diversification and, in particular, reveals that typical optimization tricks (such as clustering) do not necessarily improve diversification effectiveness. Second, we explore whether recently introduced word embeddings can be exploited to represent documents for diversification, and show a positive result. Third, because our detailed analysis reveals that the candidate set size plays a critical role in implicit diversification, we propose to automatically predict the candidate set size on a per-query basis. To this end, we use a rich set of features based on the similarity among documents and the similarity between queries and documents. Finally, we propose caching the similarities of document pairs to improve the processing-time efficiency of implicit result diversification.
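
To make the last two points concrete, the following is a minimal sketch of implicit diversification with cached pairwise similarities. The abstract does not name the diversification algorithm, the similarity function, or the cache design, so everything here is an assumption for illustration: it uses a greedy Maximal Marginal Relevance (MMR) style re-ranker, cosine similarity over document vectors (which could be averaged word embeddings), and a plain dictionary as the cache. The names `mmr_diversify`, `cosine`, `lam`, and `cache` are hypothetical. The candidate set is simply the set of documents passed in, so its size is controlled by the caller.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_diversify(relevance, vectors, k, lam=0.5, cache=None):
    """Greedy MMR-style re-ranking (an assumed stand-in, not the thesis's method).

    relevance: dict mapping document id (str) to a relevance score
    vectors:   dict mapping document id (str) to a list of floats
    k:         number of documents to return
    lam:       trade-off between relevance (1.0) and diversity (0.0)
    cache:     optional dict reused across calls to store pair similarities
    """
    if cache is None:
        cache = {}

    def sim(a, b):
        # Similarity is symmetric, so store each unordered pair only once.
        key = (a, b) if a <= b else (b, a)
        if key not in cache:
            cache[key] = cosine(vectors[a], vectors[b])
        return cache[key]

    candidates = set(relevance)
    selected = []
    while candidates and len(selected) < k:

        def score(doc):
            # Penalize a document by its similarity to the closest selected one.
            redundancy = max((sim(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1.0 - lam) * redundancy

        # Sorting makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: d1 and d2 are near-duplicates, d3 covers a different aspect.
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.5}
vectors = {"d1": [1.0, 0.0], "d2": [1.0, 0.0], "d3": [0.0, 1.0]}
shared_cache = {}
ranking = mmr_diversify(relevance, vectors, k=3, cache=shared_cache)
```

In this toy run the less relevant but dissimilar `d3` is promoted above the near-duplicate `d2`. Passing the same `cache` dictionary to later calls avoids recomputing similarities for document pairs that recur across queries, which is the general idea behind the caching proposal; the thesis's actual cache structure may differ.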