Characterizing, Predicting, and Handling Web Search Queries That Match Very Few or No Results


Creative Commons License

Sarigil E., Altingovde I. S., BLANCO R., Barla Cambazoglu B., ÖZCAN R., Ulusoy O.

JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, vol.69, no.2, pp.256-270, 2018 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 69 Issue: 2
  • Publication Date: 2018
  • DOI Number: 10.1002/asi.23955
  • Title of Journal: JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY
  • Page Numbers: pp.256-270

Abstract

A non-negligible fraction of user queries end up with very few or even no matching results in leading commercial web search engines. In this work, we provide a detailed characterization of such queries and show that search engines try to improve such queries by showing the results of related queries. Through a user study, we show that these query suggestions are usually perceived as relevant. Also, through a query log analysis, we show that users are dissatisfied at least 88.5% of the time after submitting a query that matches no results. As a first step towards solving these no-answer queries, we devised a large number of features that can be used to identify such queries and built machine-learning models. These models can be useful for scenarios such as mobile search or meta-search, where identifying a query that will retrieve no results at the client device (i.e., even before submitting it to the search engine) may yield gains in terms of bandwidth usage, power consumption, and/or monetary costs. Experiments over query logs indicate that, despite the heavy skew in class sizes, our models achieve good prediction quality, with accuracy (in terms of area under the curve) up to 0.95.
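The abstract reports prediction quality as area under the ROC curve rather than plain accuracy, which is a common choice when one class is much rarer than the other. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name `roc_auc`, the toy labels, and the toy scores are all hypothetical and chosen only for illustration; label 1 stands for a no-answer query and each score stands for a model's predicted likelihood that the query returns no results.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation.

    labels: 0/1 values, where 1 marks the rare class (assumed here to be
            a no-answer query).
    scores: classifier scores; higher means "more likely class 1".
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one example of each class")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical, skewed toy data: 2 no-answer queries among 10.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.15, 0.30, 0.05, 0.40, 0.80, 0.25, 0.70, 0.90]

# A trivial baseline that always predicts "has results" looks accurate
# on skewed data even though it never detects a no-answer query.
baseline_accuracy = sum(1 for label in labels if label == 0) / len(labels)

auc = roc_auc(labels, scores)
print(f"always-negative accuracy: {baseline_accuracy:.2f}")
print(f"AUC of the toy scores:    {auc:.4f}")
```

On this toy data the always-negative baseline is correct for 8 of 10 queries while identifying no rare-class query at all, whereas the AUC measures how often a no-answer query is scored above a query with results. That difference is one reason AUC is generally preferred over accuracy for heavily skewed classes.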