Semantic video analysis for surveillance systems

Tezin Türü: Doktora

Tezin Yürütüldüğü Kurum: Orta Doğu Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Fen Bilimleri Enstitüsü, Türkiye

Tezin Onay Tarihi: 2018

Öğrenci: KARANİ KARDAŞ

Eş Danışman: AHMET COŞAR, FEHİME NİHAN ÇİÇEKLİ

Özet:

This thesis presents novel studies about semantic inference of video events. In this respect, a surveillance video analysis system, called SVAS is introduced for surveillance domain, in which semantic rules and the definition of event models can be learned or defined by the user for automatic detection and inference of complex video events. In the scope of SVAS, an event model method named Interval-Based Spatio-Temporal Model (IBSTM) is proposed. SVAS can learn action models and event models without any predefined threshold values and generates human readable and manageable IBSTM event models. The thesis proposes hybrid machine learning methods. A set of feature models named Threshold Model, which reflects the spatio-temporal motion analysis of an event, is kept as the first model. As the second model, Bag of Actions (BoA) model is used in order to reduce the search space in the detection phase. Markov Logic Network (MLN) model, which provides understandable and manageable logic predicates for users, is kept as the third model. SVAS has high performance event detection capability due to its interval-based hierarchical approach. It determines related candidate intervals for each main model of IBSTM and uses the related main model when needed rather than using all models as a whole. The main contribution of this study is to fill the semantic gap between humans and video computer systems such that, on one hand it decreases human intervention through its learning capabilities, but on the other hand it also enables human intervention when necessary through its manageable event model method. The study achieves all of them in the most efficient way through its machine learning methods. The proposed system is applied to different event datasets from CAVIAR, BEHAVE, CANTATA and our synthetic datasets. The experimental results show that our approach improves the event recognition performance and precision as compared to the current state-of-the-art approaches.