Mouse face tracking using convolutional neural networks

Akkaya I. B. , HALICI U.

IET COMPUTER VISION, vol.12, no.2, pp.153-161, 2018 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 12 Issue: 2
  • Publication Date: 2018
  • Doi Number: 10.1049/iet-cvi.2017.0084
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.153-161
  • Keywords: object tracking, face recognition, feedforward neural nets, feature extraction, computer vision, medical image processing, image colour analysis, emotion recognition, graphics processing unit, feature adaptation network, high spatial resolution, low-level features, high-level features, semantic features, training dataset, hierarchical feature extraction, MFTN architecture, CNN-based tracker network, computer vision tasks, CNN, deep neural networks, training datasets, medical purposes, pain assessment, facial expressions, biomedical studies, laboratory mice, convolutional neural networks, mouse face tracking network
  • Middle East Technical University Affiliated: Yes


Facial expressions of laboratory mice provide important information for pain assessment when exploring the effects of drugs under development for medical purposes. Automatic pain assessment requires a mouse face tracker to extract face regions from videos recorded during pain experiments. However, tracking a mouse's face is challenging because the body and face are the same colour and mice move fast. In recent years, deep learning, with its ability to learn from data, has provided effective solutions to a wide variety of problems. In particular, convolutional neural networks (CNNs) are very successful in computer vision tasks. In this study, a CNN-based tracker network called the mouse face tracking network (MFTN) is proposed. CNNs are good at extracting hierarchical features from the training dataset: high-level features carry semantic information, while low-level features retain high spatial resolution. In the proposed MFTN architecture, target information is extracted from a combination of low- and high-level features by a sub-network, namely the feature adaptation network (FAN), to achieve a robust and accurate tracker. Among the MFTN versions, the MFTN/c tracker achieved an accuracy of 0.8, a robustness of 0.67, and a throughput of 213 fps on a workstation with a GPU.
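The core idea of fusing high-resolution low-level features with semantic high-level features can be sketched as follows. This is a minimal illustrative example, not the paper's actual implementation: all shapes, the nearest-neighbour upsampling, and the 1x1-convolution stand-in for the FAN are assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature maps from a CNN backbone (shapes are illustrative):
#  - low-level:  high spatial resolution, few channels
#  - high-level: low spatial resolution, many semantic channels
low_feat = rng.standard_normal((64, 64, 16))   # H x W x C_low
high_feat = rng.standard_normal((16, 16, 64))  # h x w x C_high

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling along the two spatial axes."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

# Bring the high-level map to the low-level resolution, then concatenate
# the two feature maps along the channel axis.
high_up = upsample_nearest(high_feat, 64 // 16)       # 64 x 64 x 64
fused = np.concatenate([low_feat, high_up], axis=-1)  # 64 x 64 x 80

# A 1x1 convolution (a per-pixel linear map followed by ReLU) stands in
# here for the feature adaptation sub-network that adapts the fused
# features to the tracked target.
w = rng.standard_normal((fused.shape[-1], 32)) * 0.1  # C_in x C_out
adapted = np.maximum(fused @ w, 0.0)                  # 64 x 64 x 32
```

The combined map keeps the spatial resolution of the low-level features while carrying the semantic content of the high-level ones, which is the property the abstract attributes to the FAN fusion.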