Image classification for content based indexing

Taner, Serdar
As image databases grow over time, content-based image indexing and retrieval becomes increasingly important. Image classification is a key to content-based image indexing. In this thesis, supervised learning with feed-forward back-propagation artificial neural networks is used for image classification. Low-level features derived from the images are used to classify them, as a way of interpreting the high-level features that carry semantics. The features are derived from detail histogram correlations obtained by the Wavelet Transform, directional edge information obtained by the Fourier Transform, and color histogram correlations. An image database consisting of 357 color images of various sizes is used for training and testing the structure. The database is indexed into seven classes that represent scenery contents and are not mutually exclusive. The ground truth data is formed in a supervised fashion and is used both to train the neural network and to test its performance. The performance of the structure is tested using the leave-one-out method, comparing the simulation outputs with the ground truth data. Success, mean square error, and the class recall rates are used as the performance measures. Using the same structure, the performance of the derived features is compared with that of the color and texture descriptors of MPEG-7. The results show that the performance of the method is comparable to or better than that of the MPEG-7 descriptors. This method of classification is a reliable and valid approach to content-based image indexing and retrieval, especially for scenery images.
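The evaluation protocol described in the abstract (leave-one-out testing, mean square error, and per-class recall over classes that are not mutually exclusive) can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the nearest-neighbor function stands in for the neural network, the 0.5 decision threshold is an assumption, and every function name is hypothetical.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Labels = Sequence[int]  # one 0/1 flag per class; an image may belong to several classes


def nearest_neighbor_predict(train_x: List[Vector], train_y: List[Labels],
                             query: Vector) -> List[float]:
    """Stand-in classifier (NOT the thesis's neural network):
    return the labels of the closest training sample."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return [float(v) for v in train_y[best]]


def leave_one_out(features: Sequence[Vector], labels: Sequence[Labels],
                  predict: Callable = nearest_neighbor_predict) -> List[List[float]]:
    """Hold out each sample in turn, fit on the rest, and predict the held-out one."""
    xs, ys = list(features), list(labels)
    outputs = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        outputs.append(predict(train_x, train_y, xs[i]))
    return outputs


def mean_square_error(outputs: Sequence[Sequence[float]],
                      labels: Sequence[Labels]) -> float:
    """Average squared difference between outputs and ground truth, over all classes."""
    count = sum(len(row) for row in labels)
    total = sum((o - t) ** 2
                for out, lab in zip(outputs, labels)
                for o, t in zip(out, lab))
    return total / count


def class_recall(outputs: Sequence[Sequence[float]], labels: Sequence[Labels],
                 threshold: float = 0.5) -> List[float]:
    """Per-class recall: fraction of true members of each class that were detected.
    The threshold is an assumed value, not one taken from the thesis."""
    recalls = []
    for c in range(len(labels[0])):
        positives = [i for i, lab in enumerate(labels) if lab[c] == 1]
        hits = sum(1 for i in positives if outputs[i][c] >= threshold)
        recalls.append(hits / len(positives) if positives else float("nan"))
    return recalls
```

Because the classes overlap, recall is computed independently for each class rather than from a single predicted label per image.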


Metadata extraction from text in soccer domain
Göktürk, Özkan; Çiçekli, Fehime Nihan; Department of Computer Engineering (2008)
Video databases and content-based retrieval in these databases have become popular with the improvements in technology. Metadata extraction techniques are used for providing data to video content. One popular metadata extraction technique for multimedia is information extraction from text. For some domains, it is possible to find accompanying text with the video, such as soccer domain, movie domain and news domain. In this thesis, we present an approach of metadata extraction from match reports for soccer d...
Haroon, Hammad; Pekeriçli, Mehmet Koray; Department of Building Science in Architecture (2022-7-07)
As the use of data collection in the built environment has increased, data pertaining to building occupancy has gained considerable importance in realms such as energy optimization and spatial usage analytics. However, many data collection approaches infringe on individuals’ rights to privacy, and subsequently their comfort. This thesis aims to address the tension between the proliferation of smart building technologies and individual privacy and autonomy, specifically focusing on image-based sensing. It explor...
Image compression method based on learned lifting-based dwt and learned zerotree-like entropy model
Şahin, Uğur Berk; Kamışlı, Fatih; Department of Electrical and Electronics Engineering (2022-8)
The success of deep learning in computer vision has sparked great interest in investigating deep learning-based algorithms also in many image processing applications, including image compression. The most popular end-to-end learned image compression approaches are based on auto-encoder architectures, where the image is mapped via convolutional neural networks (CNNs) into a transform (latent) representation that is quantized and processed again with CNNs to obtain the reconstructed image. The quantized laten...
Automatic video text localization and recognition
Saracoglu, Ahmet; Alatan, Abdullah Aydın (2006-01-01)
For the indexing and management of large-scale video databases, the text in the digital media would be an important tool. In this work, the localization performance for overlay text is analyzed using different feature extraction methods with different classifiers. In addition, improving the text recognition rate by using multiple hypotheses obtained from multilevel segmentation, together with a statistical language model, is investigated.
Fusion of multimodal information for multimedia information retrieval
Yılmaz, Turgay; Yazıcı, Adnan; Department of Computer Engineering (2014)
An effective retrieval of multimedia data is based on its semantic content. In order to extract the semantic content, the nature of multimedia data should be analyzed carefully and the information contained should be used completely. Multimedia data usually has a complex structure containing multimodal information. Noise in the data, non-universality of any single modality, and performance upper bound of each modality make it hard to rely on a single modality. Thus, multimodal fusion is a practical approach...
Citation Formats
S. Taner, “Image classification for content based indexing,” M.S. - Master of Science, Middle East Technical University, 2003.