Facial expressions of laboratory mice provide important information for pain assessment when evaluating the effects of drugs under development for medical use. Automatic pain assessment requires a mouse face tracker that extracts the face region from videos recorded during pain experiments. Tracking a mouse's face is challenging, however, because the face and body are the same colour and mice move quickly. In recent years, deep learning, with its ability to learn from data, has provided effective solutions to a wide variety of problems; convolutional neural networks (CNNs) in particular have been very successful in computer vision tasks. In this study, a CNN-based tracker network called MFTN is proposed for mouse face tracking. CNNs extract hierarchical features from the training data: high-level features carry semantic information, while low-level features retain high spatial resolution. In the proposed MFTN architecture, a sub-network called the Feature Adaptation Network (FAN) extracts target information from a combination of low- and high-level features to achieve a robust and accurate tracker. Among the MFTN versions, the MFTN/c tracker achieved an accuracy of 0.8, a robustness of 0.67, and a throughput of 213 fps on a workstation with a GPU.
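
The idea of combining low- and high-level features can be illustrated with a minimal sketch. It is not the FAN itself, whose internal design is not given here. It assumes one common fusion scheme: the coarse high-level map is upsampled by nearest-neighbour interpolation to the resolution of the low-level map, and the two are concatenated along the channel axis. The names `upsample_nearest` and `fuse_features`, and the list-based feature map layout, are hypothetical and chosen only for illustration.

```python
from typing import List

# Assumed layout (hypothetical): a feature map is [channels][height][width].
FeatureMap = List[List[List[float]]]


def upsample_nearest(fmap: FeatureMap, factor: int) -> FeatureMap:
    """Nearest-neighbour upsampling of every channel by an integer factor."""
    return [
        [
            # Repeat each value `factor` times along the width...
            [row[x // factor] for x in range(len(row) * factor)]
            for row in channel
            # ...and each row `factor` times along the height.
            for _ in range(factor)
        ]
        for channel in fmap
    ]


def fuse_features(low: FeatureMap, high: FeatureMap) -> FeatureMap:
    """Concatenate a low-level map with an upsampled high-level map.

    `low` has high spatial resolution; `high` is coarser but semantically
    richer. The result keeps the resolution of `low` and stacks the
    channels of both inputs.
    """
    low_h, low_w = len(low[0]), len(low[0][0])
    high_h, high_w = len(high[0]), len(high[0][0])
    if low_h % high_h or low_w % high_w or low_h // high_h != low_w // high_w:
        raise ValueError("spatial sizes must differ by a single integer factor")
    factor = low_h // high_h
    return low + upsample_nearest(high, factor)


# Toy example: one 4x4 low-level channel, two 2x2 high-level channels.
low_level = [[[float(4 * r + c) for c in range(4)] for r in range(4)]]
high_level = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
fused = fuse_features(low_level, high_level)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

In a real tracker the fused map would be passed to further learned layers. This sketch shows only how the two feature levels can be brought to a common resolution before they are combined.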