In recent years, there has been growing interest in the management of multimedia data. The ISO MPEG group has recently finalized a new standard, MPEG-7, for describing audio-visual data to support interoperable indexing, searching, and browsing. Following this standard, a state-of-the-art video management system, BilVMS, has been designed and implemented. The system temporally segments video into shots and groups these shots into semantically meaningful units, i.e., scenes. Scene decomposition is achieved with an HMM-based formulation over multimodal features. Keyframes serve as shot representatives, and their visual descriptions are used to answer similarity queries. These low-level descriptors are also mapped to a number of semantic visual classes using support vector machines. Finally, automatic detection of human faces via skin color filtering, together with videotext recognition, increases the indexing capabilities of BilVMS.