Indexing both content and concept for high-dimensional multimedia data

2018
Arslan, Serdar
While humans grasp the semantic meaning of multimedia content immediately, computers do not. This problem is commonly known as the semantic gap: the difference between human perception of a multimedia object and the low-level features extracted from it. It is one of the main problems in multimedia retrieval. Thus, to achieve better retrieval performance, low-level content features should be combined with semantic features in an efficient way. Another critical task in this domain is efficient similarity search for multimedia objects in large collections. According to various studies in the literature, using query-by-content and query-by-concept approaches together may enhance not only the performance but also the functionality of the overall system. In this study, we focus on the retrieval of multimedia data by combining semantic information with the content of the data in order to address the semantic gap problem efficiently. The low-level content features are extracted and mapped from a high-dimensional space into a low-dimensional space by a fast dimension reduction algorithm. We show that this approach reduces the retrieval problem to a spatial-indexing task, and that the accuracy of retrieval performed in the low-dimensional space is comparable to that of retrieval performed in the original space. High-level concept descriptors are combined with these low-level content descriptors as a new dimension, and both are indexed together in a single structure. To show the effectiveness of our novel approach, we also propose another index structure that uses a spatial indexing method for low-level features. We show that our approach improves query response time when retrieving large multimedia objects, since it indexes content and conceptual data together for fast retrieval.
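The idea of reducing content features and then adding the concept as one more dimension can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not name its dimension reduction algorithm or index structure, so a Gaussian random projection stands in for the former and a linear scan stands in for a spatial index such as an R-tree. The function names, the numeric `concept_id` encoding, and the `concept_weight` parameter are all hypothetical.

```python
import math
import random


def make_projection(in_dim, out_dim, seed=0):
    """Build a random Gaussian projection matrix (out_dim rows, in_dim columns).

    Assumption: random projection is only a stand-in for the unnamed
    "fast dimension reduction algorithm" mentioned in the abstract.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def reduce_vector(projection, features):
    """Map a high-dimensional feature vector into the low-dimensional space."""
    return [sum(w * x for w, x in zip(row, features)) for row in projection]


def combined_key(projection, features, concept_id, concept_weight=10.0):
    """Append the concept as one extra dimension after the reduced features.

    Assumption: concepts are encoded as numeric ids, and concept_weight
    (hypothetical) pushes objects with different concepts apart.
    """
    return reduce_vector(projection, features) + [concept_weight * concept_id]


def nearest(index, query_key, k=1):
    """Return the k entries closest to query_key by Euclidean distance.

    A linear scan is used for brevity; a spatial index would replace it
    in a real system.
    """
    return sorted(index, key=lambda item: math.dist(item[1], query_key))[:k]


# Toy collection: name -> (8-dimensional content features, concept id).
projection = make_projection(in_dim=8, out_dim=2, seed=1)
objects = {
    "a": ([1.0] * 8, 0),
    "b": ([0.0] * 8, 1),
}
index = [(name, combined_key(projection, feats, concept))
         for name, (feats, concept) in objects.items()]

# Query by content and concept together.
query = combined_key(projection, [1.0] * 8, 0)
best_name, best_key = nearest(index, query, k=1)[0]
```

Because content and concept live in one key, a single nearest-neighbour lookup answers a combined query. How the thesis actually scales or encodes the concept dimension is not stated in the abstract.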

Suggestions

Fusion of multimodal information for multimedia information retrieval
Yılmaz, Turgay; Yazıcı, Adnan; Department of Computer Engineering (2014)
An effective retrieval of multimedia data is based on its semantic content. In order to extract the semantic content, the nature of multimedia data should be analyzed carefully and the information contained should be used completely. Multimedia data usually has a complex structure containing multimodal information. Noise in the data, non-universality of any single modality, and performance upper bound of each modality make it hard to rely on a single modality. Thus, multimodal fusion is a practical approach...
Optical flow based video frame segmentation and segment classification
Akpınar, Samet; Alpaslan, Ferda Nur; Department of Computer Engineering (2018)
Video information retrieval is a field of multimedia research enabling us to extract desired semantic information from video data. In content-based video information retrieval, visual content obtained from video scenes is utilized. For developing methods to cope with content-based video information retrieval in terms of temporal concepts such as action, event, etc., representation of temporal information becomes critical. In this thesis, action detection is tackled based on a temporal video representation m...
DATA-DRIVEN IMAGE CAPTIONING WITH META-CLASS BASED RETRIEVAL
Kilickaya, Mert; Erdem, Erkut; Erdem, Aykut; İKİZLER CİNBİŞ, NAZLI; Çakıcı, Ruket (2014-04-25)
Automatic image captioning, the process of producing a description for an image, is a very challenging problem which has only recently received interest from the computer vision and natural language processing communities. In this study, we present a novel data-driven image captioning strategy, which, for a given image, finds the most visually similar image in a large dataset of image-caption pairs and transfers its caption as the description of the input image. Our novelty lies in employing a recently pr...
Optimizing core signal processing functions on a superscalar SIMD architecture
Uslu, Çağrı; Bazlamaçcı, Cüneyt Fehmi; Department of Electrical and Electronics Engineering (2019)
Digital Signal Processing (DSP) is the basis of many technologies, such as Image Processing, Speech Recognition, Radars, etc. Use of electronic devices such as smartphones, smartwatches, self-driving cars and autonomous robots that take advantage of these technologies becomes widespread and hence it is more critical than ever for these technologies to be realized with high efficiency on cheaper and less power-hungry devices. Cortex-A15 processor architecture is one of the solutions from ARM to this requi...
UTILIZATION OF SPATIAL INFORMATION FOR POINT CLOUD SEGMENTATION
Akman, Oytun; Bayramoglu, Neslihan; Alatan, Abdullah Aydın; Jonker, Pieter (2010-06-09)
Object segmentation has an important role in the field of computer vision for semantic information inference. Many applications such as 3DTV archive systems, 3D/2D model fitting, object recognition and shape retrieval are strongly dependent on the performance of the segmentation process. In this paper we present a new algorithm for object localization and segmentation based on the spatial information obtained via a Time-of-Flight (TOF) camera. 3D points obtained via a TOF camera are projected onto the major...
Citation Formats
S. Arslan, “Indexing both content and concept for high-dimensional multimedia data,” Ph.D. - Doctoral Program, Middle East Technical University, 2018.