Presentation by Catalin Calistru about his PhD work in multimedia databases.

What Lab Meeting
When 2009-02-12
from 14:00 to 15:00
Where INESC Porto, Auditorium
Contact Name Catalin Calistru
Contact Email
The wealth of multimedia items and their increasing complexity make data organization and search essential. Without efficient storage, accurate search and informative retrieval it is hard to explore a multimedia repository to its full potential.

Multimedia items exhibit strong connections between their content and their metadata.

The content can offer, by itself or through automatic content analysis, important low-level information about colors, shapes or sounds; this is called content metadata. Other metadata that are unlikely to be extracted automatically from the content are equally important. This common set, called contextual metadata, includes title, author, date, origin details and annotations. Both content and metadata are essential in multimedia information retrieval.

There are many standardization efforts that embed metadata in text wrappers to ease its processing throughout the life cycle of a multimedia item. By far the most important problem in multimedia retrieval is the so-called “semantic gap”: the lack of a direct semantic channel between object features such as color, texture or shape and the concepts that one has in mind when formulating a query. Besides the “semantic gap”, multimedia retrieval systems must also tackle the problems that come from the heterogeneity of the metadata standards and the nature of the datatypes they introduce. Arising from different multimedia communities and embodying different perspectives, the metadata standards define sets of concepts for their domains of activity. However, it is not easy to find mappings between these sets of concepts, even where they overlap. Indexing a wide range of metadata datatypes is another challenge. The vectorial datatypes, although embedded in text, are numeric in nature. For instance, the Scalable Color Descriptor defined by the MPEG-7 standard is a vector of up to 256 values that requires high-dimensional indexing methods.

We argue that the problems of managing large and heterogeneous multimedia repositories can be alleviated by improvements in three areas: the storage model, high-dimensional indexing, and retrieval. Three main contributions are proposed accordingly: a database model, a high-dimensional indexing method and a faceted retrieval system.

The proposed database model accounts for content organization and its association to metadata. It is a hybrid relational-XML database model. The model allows content segments and subsegments to be arranged in configurable hierarchical structures. The associations between content and metadata are based on a set of concepts from archival description and multimedia standards.
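As a rough illustration of the hybrid idea, the sketch below keeps a segment hierarchy and contextual metadata in relational tables and a vector descriptor as an XML fragment. It is only a minimal sketch under stated assumptions, not the schema proposed in the thesis: the table names, the column names, the XML element names and the sample rows are all hypothetical, and SQLite with an XML text column stands in for whatever database engine was actually used.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_db():
    """Create an in-memory database with a hypothetical hybrid layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Relational part: segment hierarchy plus contextual metadata.
        CREATE TABLE segment (
            id        INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES segment(id),
            title     TEXT,
            author    TEXT
        );
        -- XML part: one descriptor document per row, kept as text.
        CREATE TABLE descriptor (
            segment_id INTEGER REFERENCES segment(id),
            xml        TEXT
        );
    """)
    # Invented sample data: an item, a subsegment, and a sub-subsegment.
    conn.executemany(
        "INSERT INTO segment VALUES (?, ?, ?, ?)",
        [(1, None, "News broadcast", "Archive A"),
         (2, 1, "Opening scene", "Archive A"),
         (3, 2, "Key frame 1", "Archive A")],
    )
    conn.execute(
        "INSERT INTO descriptor VALUES (?, ?)",
        (3, "<Descriptor name='color'><Values>12 0 7 3</Values></Descriptor>"),
    )
    return conn


def ancestors(conn, segment_id):
    """Return titles from the given segment up to the root of its hierarchy."""
    rows = conn.execute("""
        WITH RECURSIVE chain(id, parent_id, title, depth) AS (
            SELECT id, parent_id, title, 0 FROM segment WHERE id = ?
            UNION ALL
            SELECT s.id, s.parent_id, s.title, c.depth + 1
            FROM segment s JOIN chain c ON s.id = c.parent_id
        )
        SELECT title FROM chain ORDER BY depth
    """, (segment_id,)).fetchall()
    return [row[0] for row in rows]


def descriptor_vector(conn, segment_id):
    """Parse the numeric vector embedded as text in the XML descriptor."""
    (xml_text,) = conn.execute(
        "SELECT xml FROM descriptor WHERE segment_id = ?", (segment_id,)
    ).fetchone()
    values = ET.fromstring(xml_text).find("Values").text
    return [int(v) for v in values.split()]


if __name__ == "__main__":
    db = build_demo_db()
    print(ancestors(db, 3))
    print(descriptor_vector(db, 3))
```

The self-referencing `parent_id` column is one simple way to allow hierarchies of arbitrary depth; the vector stays textual inside the XML and is converted to numbers only when it is needed.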

Contextual metadata, together with the structure of the multimedia items, is stored in the relational part of the model. The metadata descriptors that contain high-dimensional data are stored in the XML part of the model. For search within the high-dimensional descriptors, an indexing method called BitMatrix is proposed. It constructs bit signatures that can be processed efficiently with bitwise operations. Our experiments have shown that using BitMatrix as a high-dimensional indexing method is beneficial for the retrieval process.
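The general idea of filtering with bit signatures can be sketched as follows. This is not the BitMatrix construction itself, whose details are not given here; it is a generic signature filter under simple assumptions: one bit per dimension, set when the value reaches a per-dimension threshold, with candidates compared by XOR and a bit count and then ranked by exact squared Euclidean distance. The function names, thresholds and sample vectors are all hypothetical.

```python
def signature(vector, thresholds):
    """Pack one bit per dimension into an int: bit i is set if vector[i] >= thresholds[i]."""
    sig = 0
    for i, (value, threshold) in enumerate(zip(vector, thresholds)):
        if value >= threshold:
            sig |= 1 << i
    return sig


def hamming(a, b):
    """Number of differing bits between two signatures (XOR, then count the ones)."""
    return bin(a ^ b).count("1")


def build_index(vectors, thresholds):
    """Precompute the signature of every stored vector."""
    return [(signature(v, thresholds), v) for v in vectors]


def search(query, index, thresholds, max_bits_differ, k):
    """Filter by signature distance, then rank survivors by exact squared distance."""
    q_sig = signature(query, thresholds)
    candidates = [v for sig, v in index if hamming(sig, q_sig) <= max_bits_differ]
    candidates.sort(key=lambda v: sum((a - b) ** 2 for a, b in zip(v, query)))
    return candidates[:k]


if __name__ == "__main__":
    thresholds = [5, 5, 5, 5]  # assumed per-dimension cut points
    stored = [[9, 1, 7, 0], [8, 2, 6, 1], [0, 9, 0, 9]]
    index = build_index(stored, thresholds)
    print(search([9, 0, 8, 0], index, thresholds, max_bits_differ=1, k=1))
```

The cheap bitwise comparison discards most vectors before any full distance is computed, which is where a signature-based index can save work in high dimensions. The filter is approximate: a true neighbor whose signature differs in more than `max_bits_differ` bits is missed.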

Finally, the MetaMedia retrieval system was developed. Built as a web application, MetaMedia has a user interface (the client) and a server that hosts the retrieval system itself. The proposed database model and the BitMatrix index are both instantiated in MetaMedia.

In a first set of experiments, MetaMedia was deployed in two case studies. The first is a historic documentation center of “Santa Maria da Feira”, which allows the visualization and search of its documents based on their textual content and contextual metadata. The second is “Enthrone”, a multimedia distribution framework that used MetaMedia as a multimedia repository; the main functionalities were storing and searching multimedia items based on contextual metadata. MetaMedia has also been evaluated in TRECVID, a well-known video retrieval benchmark, as an automatic and interactive video retrieval system that combined content features such as color, texture, shape and audio features with annotations. The queries included natural language, image and video components. The comparative results illustrate satisfactory performance. Separate evaluations of the BitMatrix index were performed in a custom-designed multidimensional indexing evaluation framework.

