UTILIZING VIDEO COLORIZATION AS A SELF-SUPERVISED AUXILIARY TASK FOR OBJECT TRACKING


Thesis Type: Master's

Institution: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey

Thesis Approval Date: 2021

Thesis Language: English

Student: ENGİN FIRAT

Advisor: Emre Akbaş

Abstract:

In this thesis, we study combining a Siamese-network-based object tracker with a second model trained under the self-supervised learning paradigm. We define grayscale video colorization as the pretext task for self-supervised learning and select similarity-based object tracking as the downstream task. Both the Siamese tracker and the colorization network rely on the similarity between subsequent video frames, and the spatio-temporal coherence between the frames of a video enables a network to learn this similarity. We study different ways of combining the two networks. Since the colorization framework is built on similarity learning, we cross-correlate the output features of the colorization network in the same way as in the Siamese tracker. We then combine the two methods by taking a weighted average of their score maps to obtain a combined score map, and we search for the optimal value of this weight through several experiments. We also experiment with different neural network architectures for the colorization framework. Our experimental results show that utilizing the self-supervised pretext task improves the overall success rate when the combined network is further trained in a supervised manner. We also show that the self-supervised video colorization network offers an alternative way to use modern, deeper networks in Siamese architectures by alleviating the strict translation invariance restriction that these architectures require.
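The score-map combination described in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names `cross_correlate` and `combine_score_maps`, the `(C, H, W)` feature layout, and the convention that `weight` multiplies the tracker map while `1 - weight` multiplies the colorization map are all assumptions made here for illustration. The sketch also assumes both score maps already share the same shape and a comparable value range.

```python
import numpy as np


def cross_correlate(template, search):
    """Slide a (C, h, w) template over a (C, H, W) search feature map.

    Returns a (H - h + 1, W - w + 1) score map ("valid" cross-correlation,
    summed over channels). Hypothetical helper, not taken from the thesis.
    """
    c, h, w = template.shape
    c_s, H, W = search.shape
    if c != c_s:
        raise ValueError("template and search must have the same channel count")
    scores = np.empty((H - h + 1, W - w + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            window = search[:, i:i + h, j:j + w]
            scores[i, j] = np.sum(window * template)
    return scores


def combine_score_maps(tracker_map, color_map, weight):
    """Weighted average of two equally shaped score maps.

    Assumed convention: `weight` applies to the tracker map and
    `1 - weight` to the colorization map.
    """
    if tracker_map.shape != color_map.shape:
        raise ValueError("score maps must have the same shape")
    return weight * tracker_map + (1.0 - weight) * color_map


# Illustrative usage with random stand-ins for the two networks' features.
rng = np.random.default_rng(0)
tracker_scores = cross_correlate(rng.normal(size=(4, 3, 3)), rng.normal(size=(4, 8, 8)))
color_scores = cross_correlate(rng.normal(size=(4, 3, 3)), rng.normal(size=(4, 8, 8)))
combined = combine_score_maps(tracker_scores, color_scores, weight=0.5)
# Location of the highest combined score, i.e. the predicted target offset.
peak = np.unravel_index(np.argmax(combined), combined.shape)
```

In this sketch the weight is a single scalar, so searching for its best value amounts to repeating the evaluation over a set of candidate values in [0, 1] and keeping the one with the best tracking metric.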