Event prediction from news text using subgraph embedding and graph sequence mining


ÇEKİNEL R. F., KARAGÖZ P.

World Wide Web, 2022 (Journal Indexed in SCI)

  • Publication Type: Article
  • Publication Date: 2022
  • DOI Number: 10.1007/s11280-021-01002-1
  • Title of Journal: World Wide Web
  • Keywords: Graph mining, Sequential rule mining, Frequent subgraph mining, Graph embeddings, News prediction

Abstract

© 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Event detection from textual content using text mining is a well-researched field. Recent graph modeling and graph embedding techniques, in turn, make it possible to represent textual content as graphs. In a graph, text can be enriched with additional attributes, and complex relationships can be captured. In this paper, we focus on news prediction and model the problem as subgraph prediction. More specifically, we aim to predict the news skeleton in the form of a subgraph. To this end, graph-based representations of news articles are constructed and a graph mining based pattern extraction method is proposed. The proposed method consists of three main steps. First, a graph representation of the news text is constructed. Next, frequent subgraph mining and sequential rule mining algorithms are adapted for pattern prediction on graph sequences. We consider that a subgraph captures the main story of the content, and that the sequential rules indicate the temporal relationships among the subgraph patterns. Finally, the extracted sequential patterns are used to predict the future news skeleton (i.e., the main features of the news). Graph embedding techniques are also employed to measure similarity. The proposed method is evaluated against baseline methods on both a collection of news from an online newspaper and a benchmark news dataset.
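
The three steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it (`frequent_edges`, `sequential_rules`, `predict_next`, `jaccard`, the sample triples) is hypothetical. It assumes each news item is already a small graph given as a set of (source, relation, target) edges. It uses frequent single edges as a stand-in for frequent subgraph mining, and plain Jaccard overlap as a stand-in for the embedding-based similarity.

```python
from collections import Counter

# Hypothetical toy edges; a real system would extract these from news text.
STORM = ("storm", "hits", "coast")
EMERGENCY = ("government", "declares", "emergency")
FLIGHTS = ("flights", "are", "cancelled")


def frequent_edges(graphs, min_support):
    """Edges present in at least min_support graphs.
    Simplification: single edges stand in for frequent subgraphs."""
    counts = Counter(edge for g in graphs for edge in g)
    return {edge for edge, c in counts.items() if c >= min_support}


def sequential_rules(graph_sequence, patterns, min_confidence):
    """Rules 'a at step t -> b at step t+1', scored by confidence."""
    antecedent = Counter()
    pair = Counter()
    for cur, nxt in zip(graph_sequence, graph_sequence[1:]):
        for a in patterns & cur:
            antecedent[a] += 1
            for b in patterns & nxt:
                pair[(a, b)] += 1
    rules = {}
    for (a, b), n in pair.items():
        confidence = n / antecedent[a]
        if confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def predict_next(current_graph, rules):
    """Predicted skeleton: consequents of rules whose antecedent is present."""
    return {b for (a, b) in rules if a in current_graph}


def jaccard(a, b):
    """Edge-set overlap; a stand-in for embedding-based similarity."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 1.0


# A toy, time-ordered sequence of news graphs.
g1 = frozenset({STORM})
g2 = frozenset({EMERGENCY})
g3 = frozenset({STORM, FLIGHTS})
g4 = frozenset({EMERGENCY})
sequence = [g1, g2, g3, g4]

patterns = frequent_edges(sequence, min_support=2)
rules = sequential_rules(sequence, patterns, min_confidence=0.6)
predicted = predict_next(g3, rules)
print(predicted, jaccard(predicted, g4))
```

In this toy run, the rule from the storm edge to the emergency edge is learned from the sequence, so the skeleton predicted from `g3` matches `g4`. Real frequent subgraph mining operates on connected multi-edge patterns, and the paper's similarity is computed with graph embeddings, so this sketch only conveys the overall flow.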