BoVSG: bag of visual SubGraphs for remote sensing scene classification

Amiri K., Farah M., LELOĞLU U. M.

INTERNATIONAL JOURNAL OF REMOTE SENSING, vol.41, no.5, pp.1986-2003, 2019 (Peer-Reviewed Journal)

  • Publication Type: Article
  • Volume: 41 Issue: 5
  • Publication Date: 2019
  • Doi Number: 10.1080/01431161.2019.1681602
  • Journal Indexes: Science Citation Index Expanded, Scopus
  • Page Numbers: pp.1986-2003


Remote sensing scene classification has attracted growing interest in recent years in strategic fields such as security and land cover and land use monitoring. The methods proposed in the literature can be divided into three main classes according to the features they use: handcrafted features, features obtained by unsupervised learning, and features obtained by deep learning. Handcrafted features are generally time-consuming and suboptimal. Unsupervised learning-based features, proposed later, gave better results, but their performance is still limited because they mainly rely on shallow networks and cannot extract powerful features. Deep learning-based features have been investigated more recently and have given interesting results, but they often cannot be used because labelled remote sensing images are scarce, and they are also computationally expensive. Most importantly, whatever kind of feature is used, the neighbourhood information of the features is ignored. In this paper, we propose a novel remote sensing scene representation and classification approach called Bag of Visual SubGraphs (BoVSG). First, each image is segmented into superpixels in order to summarize the image content while retaining relevant information. Then, the superpixels from all images are clustered according to their colour and texture features, and a random label is assigned to each cluster, which probably corresponds to some material or land cover type. Superpixels belonging to the same cluster thus share the same label. Afterwards, each image is modelled as a graph whose nodes correspond to labelled superpixels and whose edges model spatial neighbourhoods. Finally, each image is represented by a histogram of the most frequent subgraphs, which correspond to land cover adjacency patterns. In this way, local spatial relations between the nodes are also taken into account. The resulting feature vectors are classified using standard classification algorithms.
The proposed approach is tested on three popular datasets and outperforms state-of-the-art methods, including deep learning methods. Besides being accurate, it is computationally much less expensive than deep learning methods.
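
The graph and histogram steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the superpixel segmentation and the clustering have already been done, so the image arrives as a small grid of superpixel ids (`segments`) plus a mapping from each superpixel to its cluster label (`cluster_of`). It also makes a strong simplification: instead of mining the most frequent subgraphs, it counts only the smallest possible subgraph, a single edge between two labelled nodes. All function and variable names are hypothetical.

```python
from collections import Counter


def region_adjacency_edges(segments):
    """Return undirected edges between superpixels that touch (4-connectivity)."""
    edges = set()
    rows, cols = len(segments), len(segments[0])
    for r in range(rows):
        for c in range(cols):
            a = segments[r][c]
            # Looking only right and down visits every neighbouring pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = segments[rr][cc]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    return edges


def edge_pattern_histogram(segments, cluster_of, vocabulary):
    """Count label-pair adjacency patterns and order them by `vocabulary`.

    Assumption: one-edge patterns stand in for the frequent subgraphs
    of the paper, which may be larger.
    """
    edges = region_adjacency_edges(segments)
    counts = Counter(
        tuple(sorted((cluster_of[a], cluster_of[b]))) for a, b in edges
    )
    return [counts.get(pattern, 0) for pattern in vocabulary]


# Toy image: each number is a superpixel id (hypothetical data).
segments = [
    [0, 0, 1],
    [2, 2, 1],
    [2, 3, 3],
]
# Cluster label of each superpixel; superpixels 0 and 2 share label 0.
cluster_of = {0: 0, 1: 1, 2: 0, 3: 2}
# Fixed pattern order so that every image yields a vector of equal length.
vocabulary = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]

histogram = edge_pattern_histogram(segments, cluster_of, vocabulary)
print(histogram)  # [1, 2, 1, 0, 1, 0]
```

Because the vocabulary is shared across images, the resulting vectors have the same length for every image and can be passed to any standard classifier, as the abstract describes for the final step.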