METU-MMDS: An Intelligent Multimedia Database System for Multimodal Content Extraction and Querying

Yazıcı, Adnan
Yılmaz, Turgay
Gulen, Elvan
Koyuncu, Murat
Sert, Mustafa
Managing a large volume of multimedia data, which contains various modalities (visual, audio, and text), reveals the need for a specialized multimedia database system (MMDS) to efficiently model, process, store and retrieve video shots based on their semantic content. This demo introduces METU-MMDS, an intelligent MMDS which employs both machine learning and database techniques. The system extracts semantic content automatically from visual, audio and textual data, stores the extracted content in an appropriate format and uses this content to efficiently retrieve video shots. The system architecture supports various multimedia query types, including unimodal querying, multimodal querying, query-by-concept, and query-by-example, and uses a multimedia index structure to query multi-dimensional multimedia data efficiently. We demonstrate METU-MMDS for semantic data extraction from videos and for complex multimedia querying, considering content-based and concept-based queries that span all modalities.
22nd International Conference on MultiMedia Modeling, MMM 2016 (4 January 2016 through 6 January 2016)


Multimodal query-level fusion for efficient multimedia information retrieval
Sattari, Saeid; Yazıcı, Adnan (2018-10-01)
Managing a large volume of multimedia data containing various modalities such as visual, audio, and text reveals the necessity for efficient methods for modeling, processing, storing, and retrieving complex data. In this paper, we propose a fusion-based approach at the query level to improve query retrieval performance of multimedia data. We discuss various flexible query types including the combination of content as well as concept-based queries that provide users with the ability to efficiently perform mu...
Multimedia Information Retrieval Using Fuzzy Cluster-Based Model Learning
Sattari, Saeid; Yazıcı, Adnan (2017-07-12)
Multimedia data, particularly digital videos, which contain various modalities (visual, audio, and text), are complex and time-consuming to model, process, and retrieve. Therefore, efficient methods are required for retrieval of such complex data. In this paper, we propose a multimodal query-level fusion approach using a fuzzy cluster-based learning method to improve the retrieval performance of multimedia data. Experimental results on a real dataset demonstrate that employing fuzzy clustering achieves notab...
End-to-end learned image compression with conditional latent space modelling for entropy coding
Yeşilyurt, Aziz Berkay; Kamışlı, Fatih; Department of Electrical and Electronics Engineering (2019)
This thesis presents a lossy image compression system based on an end-to-end trainable neural network. Traditional compression algorithms use linear transformation, quantization and entropy coding steps that are designed based on simple models of the data and are intended to have low complexity. In neural-network-based image compression methods, the processing steps, such as transformation and entropy coding, are performed using neural networks. The use of neural networks enables transforms or probability models...
Named Entity Recognition with Conditional Random Fields on Turkish News Dataset: Revisiting the Features
Çekinel, Recep Fırat; Karagöz, Pınar (2019-04-24)
Named entity recognition is a natural language processing problem that aims to mark entity names, such as person, place, organization, date, time, money and percentage, in different types of text. Solutions to this problem enable applications such as location estimation, event time estimation, and identification of important people in a text. The number of named entity recognition studies on Turkish texts is quite limited compared to those on English. In this study, the use of the t...
Güneyi, Eylem Tuğçe; Vural, Elif; Department of Electrical and Electronics Engineering (2021-9-8)
Graph models provide efficient tools for analyzing data defined over irregular domains such as social networks, sensor networks, and transportation networks. Real-world graph signals are usually time-varying signals. The characterization of the joint behavior of time-varying graph signals in the time and the vertex domains has recently arisen as an interesting research problem, in contrast to the independent processing of graph signals acquired at different time instants. The concept of wide sense stationari...
Citation Formats
A. Yazıcı, T. Yılmaz, E. Gulen, M. Koyuncu, and M. Sert, “METU MMDS An Intelligent Multimedia Database System for Multimodal Content Extraction and Querying,” Miami; United States, 2016, vol. 9517, p. 354, Accessed: 00, 2021. [Online]. Available: