Current solutions are still far from the ultimate goal of enabling users to retrieve a desired video clip from massive amounts of visual data in a semantically meaningful manner. In this study we propose a video database model that provides nearly automatic object, event, and concept extraction. It offers a reasonable approach to bridging the gap between low-level representative features and high-level semantic content as perceived by humans. Low-level feature values for objects, and the relations between objects, are determined using training sets and expert opinions. At the top level sits an ontology of objects, events, and concepts; the extracted objects and events, together with all of this information, are used to generate higher-level events and concepts. The system rests on a reliable video data model that gives the user the ability to pose ontology-supported fuzzy queries. Queries containing objects, events, spatio-temporal clauses, concepts, and low-level features can all be handled.
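To make the idea of ontology-supported fuzzy querying concrete, the sketch below shows one common way such a bridge between low-level features and semantic labels can be realized: a fuzzy membership function, tuned from a training set, maps a numeric feature to a degree of membership in a semantic label, and a query ranks clips by that degree. All names (the `mean_red` feature, the "red object" label, the thresholds) are hypothetical illustrations, not the paper's actual model.

```python
# Minimal sketch of fuzzy querying over low-level features.
# Assumption: a triangular membership function stands in for the
# training-set-derived mapping described in the abstract.

def triangular(x, a, b, c):
    """Triangular fuzzy membership: rises from a, peaks at b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def redness_membership(mean_red):
    # Hypothetical mapping from a low-level feature (mean red channel,
    # 0..255) to the fuzzy semantic label "red object".
    return triangular(mean_red, 120, 200, 256)

# Toy clip index: clip id -> extracted low-level feature values.
clips = {
    "clip1": {"mean_red": 210},
    "clip2": {"mean_red": 90},
    "clip3": {"mean_red": 150},
}

def fuzzy_query(clips, membership, threshold=0.5):
    """Return clips whose membership degree exceeds the threshold,
    ranked by degree (highest first)."""
    scored = [(cid, membership(f["mean_red"])) for cid, f in clips.items()]
    return sorted(((c, s) for c, s in scored if s >= threshold),
                  key=lambda t: -t[1])

print(fuzzy_query(clips, redness_membership))
```

In a full system, such membership functions would be attached to ontology nodes so that a high-level query term ("red object") is resolved to the appropriate low-level feature tests automatically.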