Robust content-based copy detection and information theoretic indexing strategies


Thesis Type: Doctorate

Institution Where the Thesis Was Conducted: Orta Doğu Teknik Üniversitesi (Middle East Technical University), Faculty of Engineering, Department of Electrical and Electronics Engineering, Turkey

Thesis Approval Date: 2015

Student: AHMET SARACOĞLU

Supervisor: ABDULLAH AYDIN ALATAN

Abstract:

Today, 100 hours of video are uploaded to YouTube every minute. By the end of 2015, 500 billion hours of video will be viewable from a wide range of sources such as on-demand video, Internet-based television, and social networks. As a result, important and unavoidable problems arise: management of copyrights, numerous duplicates, and content discovery. These problems may cause tremendous losses for content owners and broadcasting/hosting companies while diminishing user satisfaction. Efficient duplicate video detection can therefore be used to address them. Content-Based Copy Detection (CBCD) emerges as a viable alternative to watermarking, the active duplicate detection methodology. In this thesis, the building blocks of a content-based copy detection system are investigated. A novel spatio-temporal global representation is initially proposed that exploits visual features independent of spatial information. This system is improved by a local interest point-based detection pipeline, which is shown to outperform global representation approaches through extensive simulations. On the other hand, it is observed that the accuracy of local feature approaches is often limited by the presence of uninformative and redundant features extracted from the frame. Moreover, at large scale, the index size and the corresponding amount of memory become a significant bottleneck. In order to decrease the index size while increasing the discriminativeness of the reference feature database, a novel information-theoretic indexing method is proposed and improved further by a newly introduced entropy estimator. This estimator is shown to yield more robust results than naïve frequentist techniques.
Furthermore, comprehensive experiments using the proposed method show that the same detection performance is achieved with only a fraction of the reference features, and that for some transformations a Normalized Detection Cost Rate (NDCR) of 0.00 is achieved, which was not previously possible with full indexing. Extending this foundation, another method that exploits the distributions of local features in a temporal volume is also provided. With this temporal approach, a 31% to 83% improvement in NDCR is observed for most of the transformations. Finally, in order to capture the dependence of multiple features in a given frame, the fundamentals of interaction information are discussed and a visual phrase representation for content-based copy detection is introduced. Experimental evaluations show that the proposed visual phrase representation and multivariate feature selection approaches are competitive with the state of the art.
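
To illustrate why the abstract contrasts its entropy estimator with naïve frequentist techniques, the following minimal sketch compares the plug-in (frequency-count) entropy estimate with a bias-corrected one. The abstract does not describe the thesis's own estimator, so the Miller-Madow correction is swapped in here purely as a well-known example; the function names and the idea of treating samples as quantized visual word IDs are assumptions made for illustration.

```python
import math
from collections import Counter


def plugin_entropy(samples):
    """Naive frequentist (plug-in) entropy estimate, in bits.

    Probabilities are replaced by relative frequencies, which
    systematically underestimates entropy when samples are few.
    """
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def miller_madow_entropy(samples):
    """Plug-in estimate plus the Miller-Madow bias correction, in bits.

    NOTE: this is a stand-in for illustration, not the estimator
    proposed in the thesis.
    """
    n = len(samples)
    k = len(set(samples))  # number of distinct symbols actually observed
    # The correction (k - 1) / (2n) is in nats; divide by ln 2 for bits.
    return plugin_entropy(samples) + (k - 1) / (2 * n * math.log(2))


if __name__ == "__main__":
    # Hypothetical visual word IDs observed for one reference feature.
    words = [3, 3, 7, 1, 3, 7]
    print("plug-in:     ", plugin_entropy(words))
    print("Miller-Madow:", miller_madow_entropy(words))
```

With small sample counts, as is typical for rarely occurring local features, the corrected estimate is always at least as large as the plug-in value, which is one reason an information-based feature ranking can change when a better estimator is used.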