Fine-grained recognition of maritime vessels and land vehicles by deep feature embedding

Solmaz B., Gundogdu E., Yucesoy V., Koc A., Alatan A. A.

IET COMPUTER VISION, vol.12, pp.1121-1132, 2018 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 12
  • Publication Date: 2018
  • Doi Number: 10.1049/iet-cvi.2018.5187
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.1121-1132
  • Keywords: marine vehicles, image classification, object recognition, learning (artificial intelligence), statistical analysis, traffic engineering computing, video retrieval, fine-grained maritime vessel recognition, fine-grained land vehicle recognition, deep feature embedding, large-scale image analysis, large-scale video analysis, visual surveillance systems, deep learning-based approaches, computer vision problems, fine-grained object recognition, maritime vessel classification, maritime vessel identification, land vehicle classification, land vehicle identification, visual recognition, coarse-grained classification task, fine-grained classification task, coarse-grained retrieval task, fine-grained retrieval task, verification task, multitask learning framework, loss function, global statistics, hierarchical individual sample label, data pairs, MARVEL data set, Stanford Cars data set, IMAGE SIMILARITY
  • Middle East Technical University Affiliated: Yes


Recent advances in large-scale image and video analysis have expanded the capabilities of visual surveillance systems. In particular, deep learning-based approaches bring substantial benefits in solving certain computer vision problems such as fine-grained object recognition. Here, the authors concentrate on the classification and identification of maritime vessels and land vehicles, which are key constituents of visual surveillance systems. Employing publicly available data sets for maritime vessels and land vehicles, the authors aim to improve visual recognition. Specifically, the authors focus on five visual recognition tasks: coarse-grained classification, fine-grained classification, coarse-grained retrieval, fine-grained retrieval, and verification. To increase performance in these tasks, the authors utilise a multi-task learning framework and present a novel loss function that considers deep feature learning and classification simultaneously, by exploiting the available hierarchical labels of individual samples and the global statistics of distances between data pairs. The authors observe that the proposed multi-task learning model improves fine-grained recognition performance on the MARVEL and Stanford Cars data sets, compared with training a model that targets a single recognition task.
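
The abstract does not give the form of the loss function, so the following is only an illustrative sketch of how a multi-task objective of this general kind could be assembled, not the authors' formulation. It assumes a cross-entropy term at each of two label levels (coarse and fine) plus a pair term built from the mean and variance of squared embedding distances. All function names, the margin, and the weight w_embed are hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy for integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def global_pair_loss(embeddings, labels, margin=1.0):
    """Pair term from global statistics (assumed form, not from the paper).

    Penalises the variance of same-class and different-class squared
    distances, plus a hinge on the gap between their means.
    """
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    i, j = np.triu_indices(len(labels), k=1)  # each unordered pair once
    same = labels[i] == labels[j]
    pos, neg = dists[i, j][same], dists[i, j][~same]
    if pos.size == 0 or neg.size == 0:
        return 0.0  # the statistics are undefined without both pair types
    overlap = max(0.0, pos.mean() - neg.mean() + margin)
    return float(pos.var() + neg.var() + overlap)

def multitask_loss(coarse_logits, fine_logits, embeddings,
                   coarse_labels, fine_labels, w_embed=0.5):
    """Sum of both classification terms and the weighted pair term."""
    cls = (softmax_cross_entropy(coarse_logits, coarse_labels)
           + softmax_cross_entropy(fine_logits, fine_labels))
    return cls + w_embed * global_pair_loss(embeddings, fine_labels)
```

In this sketch the pair term is driven by the fine-grained labels only; using the coarse labels as well, or a different distance statistic, would be equally valid design choices under the same assumptions.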