A Flexible and Scalable Audio Information Retrieval System for Mixed-Type Audio Signals

Dogan E., SERT M., YAZICI A.

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, vol.26, no.10, pp.952-970, 2011 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 26 Issue: 10
  • Publication Date: 2011
  • DOI Number: 10.1002/int.20508
  • Page Numbers: pp.952-970


The content-based classification and retrieval of real-world audio clips is one of the challenging tasks in multimedia information retrieval. Although the problem has been well studied over the last two decades, most current retrieval systems cannot provide flexible querying of audio clips because audio information in the real world is often of mixed type (e.g., speech over music and speech over environmental sound). We present here a complete, scalable, and extensible content-based classification and retrieval system for mixed-type audio clips. The system lets users query audio data semantically and flexibly in four alternative ways: querying by mixed-type audio classes, querying by domain-based fuzzy classes, querying by temporal information and temporal relationships, and querying by example (QBE). To reduce retrieval time, a hash-based indexing technique is introduced. Two kinds of experiments were conducted on the audio tracks of the TRECVID news broadcasts to evaluate the performance of the proposed system. The results demonstrate that the Audio Spectrum Flatness feature in the MPEG-7 standard performs better on music audio samples than on other kinds of audio samples and that the system is robust under different conditions. (C) 2011 Wiley Periodicals, Inc.