Utilizing Word Embeddings for Result Diversification in Tweet Search


Onal K. D., ALTINGÖVDE İ. S., KARAGÖZ P.

11th Asia Information Retrieval Societies Conference (AIRS), Brisbane, Australia, 2-4 December 2015, vol. 9460, pp. 366-378

  • Volume number: 9460
  • DOI: 10.1007/978-3-319-28940-3_29
  • City of publication: Brisbane
  • Country of publication: Australia
  • Pages: pp. 366-378

Abstract

The performance of result diversification for tweet search suffers from the well-known vocabulary mismatch problem, as tweets are too short and usually informal. As a remedy, we propose to adopt a query and tweet expansion strategy that utilizes automatically generated word embeddings. Our experiments using state-of-the-art diversification methods on the Tweets2013 corpus reveal encouraging results for expanding queries and/or tweets based on the word embeddings to improve the diversification performance in tweet search. We further show that the expansions based on the word embeddings may be as useful as those based on a manually constructed knowledge base, namely, ConceptNet.
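
As a rough illustration of the general idea of embedding-based expansion mentioned in the abstract, the sketch below appends to a query or tweet the nearest neighbours of each of its terms in an embedding space. It is not the authors' method: the toy vectors, the function names `cosine` and `expand`, and the parameters `k` and `min_sim` are all assumptions made for this example, and the abstract does not say how expansion terms are selected or weighted.

```python
import math

# Toy 3-dimensional embeddings, invented purely for illustration.
# Real embeddings would be learned from a large corpus.
EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "auto": [0.85, 0.15, 0.05],
    "vehicle": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
    "fruit": [0.1, 0.2, 0.85],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand(tokens, embeddings, k=2, min_sim=0.8):
    """Return tokens plus up to k similar words per token (hypothetical scheme).

    Words without an embedding are kept but contribute no expansions.
    """
    expanded = list(tokens)
    for tok in tokens:
        if tok not in embeddings:
            continue
        # Score every other vocabulary word against the current token.
        scored = [
            (cosine(embeddings[tok], vec), word)
            for word, vec in embeddings.items()
            if word != tok
        ]
        scored.sort(reverse=True)
        # Keep only the top-k neighbours that clear the similarity threshold.
        for sim, word in scored[:k]:
            if sim >= min_sim and word not in expanded:
                expanded.append(word)
    return expanded


query = expand(["car"], EMBEDDINGS)
tweet = expand(["banana", "unknown"], EMBEDDINGS)
```

The same function can be applied to both queries and tweets, so that a short tweet sharing no surface term with the query may still match through an added neighbour. The similarity threshold is one simple way to limit topic drift; other selection schemes are equally plausible.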