Predicting the Trending Research Topics by Deep Neural Network Based Content Analysis


Creative Commons License

Yukselen M., MUTLU A., KARAGÖZ P.

IEEE ACCESS, vol.10, pp.90887-90902, 2022 (Peer-Reviewed Journal)

  • Publication Type: Article
  • Volume: 10
  • Publication Date: 2022
  • DOI Number: 10.1109/access.2022.3202654
  • Journal Name: IEEE ACCESS
  • Journal Indexes: Science Citation Index Expanded, Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Page Numbers: pp.90887-90902
  • Keywords: Market research, Predictive models, Analytical models, Feature extraction, Data models, Semantics, Operating systems, Trend prediction, keyword popularity prediction, deep learning, document vector, classification, paper embedding

Abstract

Tracking trends and taking early steps accordingly is important in academia, as well as in other domains such as technology and finance. In this work, we focus on the problem of predicting trending research topics from a collection of academic papers. Previous efforts model the problem in different ways and mostly apply classical approaches such as correlation analysis and clustering. There are also several recent neural-model-based solutions; however, they rely on feature vectors and additional information for trend prediction. Given a collection of publications within the observation time window, we predict whether the use of a keyword will increase, decrease, or remain steady in the future time window (the prediction window). As a solution, we propose a family of deep neural architectures that focus on generating summary representations for paper collections under the query keyword. Due to the sequence-based nature of the data, the Long Short-Term Memory (LSTM) module plays a core role, but it is combined with different layers in a novel way. The first group of proposed neural architectures considers each paper as a sequence of keywords and uses word embeddings to construct paper collection representations. In this group, the proposed architectures differ from each other in the way year-based and overall summary representations are constructed. In the second group, each paper is directly represented as a vector, and the use of different paper embedding techniques is explored. The models are analyzed on a variety of paper collections belonging to different academic venues, obtained from the Microsoft Academic Graph data set. Experiments conducted against baseline methods show that the proposed deep neural models achieve higher overall trend prediction performance than the baseline models. Among the proposed models, paper-embedding-based models provide better results in most cases.
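
The three-class target described in the abstract (increase, decrease, steady) can be illustrated with a small sketch. This is not the authors' labeling procedure: the abstract does not say how "steady" is delimited, so the relative-change rule and the `tolerance` parameter below are assumptions, as are the function names and the `(year, keywords)` input format.

```python
from collections import Counter


def keyword_counts_by_year(papers, keyword):
    """Count, per year, the papers whose keyword list contains `keyword`.

    `papers` is an iterable of (year, keywords) pairs (hypothetical format).
    """
    counts = Counter()
    for year, keywords in papers:
        if keyword in keywords:
            counts[year] += 1
    return counts


def trend_label(counts, observation_years, prediction_years, tolerance=0.1):
    """Compare mean yearly usage in the two windows and return a class label.

    Assumption: a relative change within +/- `tolerance` counts as "steady".
    """
    obs = sum(counts[y] for y in observation_years) / len(observation_years)
    pred = sum(counts[y] for y in prediction_years) / len(prediction_years)
    if obs == 0:
        # No usage in the observation window: any later usage is an increase.
        return "increase" if pred > 0 else "steady"
    change = (pred - obs) / obs
    if change > tolerance:
        return "increase"
    if change < -tolerance:
        return "decrease"
    return "steady"
```

A label produced this way would serve only as the classification target; the models described in the abstract predict it from the content of the papers in the observation window rather than from the raw counts.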