A NOVEL BOVW MIMICKING END-TO-END TRAINABLE CNN CLASSIFICATION FRAMEWORK USING OPTIMAL TRANSPORT THEORY

2019-01-01
An end-to-end trainable convolutional neural network (CNN) framework that mimics bag of visual words (BoVW) is proposed for image classification. To this end, a new paradigm for histogram-like image representation is introduced, and optimal transport (OT) distance is used for similarity assessment. Each patch of an image is treated as a unique visual word, and the image is represented as a uniform histogram over these visual words, with each histogram bin associated with an embedding vector that reflects the semantic meaning of the corresponding visual word. Thus, in the CNN framework, the output of the last convolutional block is taken as the global representation of the image, and the embeddings are learned inherently within the classification framework. With the proposed formulation, the undesired quantization step of the BoVW representation is no longer required; moreover, the learned CNN features are naturally interpretable. Experiments on the CIFAR-10, CIFAR-100 and SVHN datasets show that replacing the global pooling and fully connected layers with the proposed representation, together with OT distance, improves on the baseline CNN framework.
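As a rough illustration of the idea in the abstract, the sketch below compares two sets of patch embeddings, each carrying uniform histogram weights, using an optimal transport cost. Several things here are assumptions and are not taken from the thesis: the entropic (Sinkhorn) approximation of OT, the squared Euclidean ground cost, the function names `pairwise_sq_dists` and `sinkhorn_ot`, and the parameters `reg` and `n_iters`. The random arrays merely stand in for the feature vectors a last convolutional block might produce.

```python
import numpy as np


def pairwise_sq_dists(x, y):
    """Squared Euclidean cost between rows of x (n, d) and rows of y (m, d)."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sum(diff * diff, axis=-1)


def sinkhorn_ot(x, y, reg=0.1, n_iters=200):
    """Entropy-regularized OT cost between two uniformly weighted point sets.

    Assumption: Sinkhorn iterations are used here only as a convenient
    approximation; the thesis may compute or differentiate OT differently.
    """
    n, m = x.shape[0], y.shape[0]
    a = np.full(n, 1.0 / n)  # uniform histogram over the patches of x
    b = np.full(m, 1.0 / m)  # uniform histogram over the patches of y

    cost = pairwise_sq_dists(x, y)
    scale = cost.max()
    if scale == 0:
        scale = 1.0  # all points coincide; avoid dividing by zero
    # Rescale the cost inside the kernel so that exp() does not underflow.
    kernel = np.exp(-(cost / scale) / reg)

    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)

    plan = u[:, None] * kernel * v[None, :]  # approximate transport plan
    return float(np.sum(plan * cost)), plan


# Hypothetical stand-ins for the patch embeddings of two images.
rng = np.random.default_rng(0)
image_a = rng.normal(size=(16, 8))  # 16 patches, 8-dimensional embeddings
image_b = rng.normal(size=(16, 8))
distance, plan = sinkhorn_ot(image_a, image_b)
print(distance, plan.shape)
```

Because every patch keeps its own embedding and an equal weight, no codebook or quantization step appears anywhere in this sketch, which is the property the abstract highlights. A smaller `reg` brings the result closer to the exact OT cost at the price of slower convergence.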

Suggestions

A temporal neural network model for constructing connectionist expert system knowledge bases
Alpaslan, Ferda Nur (Elsevier BV, 1996-04-01)
This paper introduces a temporal feedforward neural network model that can be applied to a number of neural network application areas, including connectionist expert systems. The neural network model has a multi-layer structure, i.e. the number of layers is not limited. Also, the model has the flexibility of defining output nodes in any layer. This is especially important for connectionist expert system applications.
A Novel Bag-of-Visual-Words Approach for Geospatial Object Detection
Aytekin, Caglar; Alatan, Abdullah Aydın (2011-04-29)
A novel bag-of-visual-words algorithm is presented with two extensions compared to its classical version: exploiting scale information and weighting visual words. The scale information that is already extracted with SIFT detector is included as an additional element to the SIFT key-point descriptor, while the visual words are weighted during histogram assignment proportional to their importance which is measured by the ratio of their occurrences in the object to the occurrences in the background. The algori...
A Novel Neural Network Method for Direction of Arrival Estimation with Uniform Cylindrical 12-Element Microstrip Patch Array
Caylar, Selcuk; Dural, Guelbin; Leblebicioğlu, Mehmet Kemal (2008-01-01)
In this study a new neural network algorithm is proposed for real time multiple source tracking problem with cylindrical patch antenna array based on a previously reported Modified Neural Multiple Source Tracking Algorithm (MN-MUST). The proposed algorithm, namely Cylindrical Microstrip Patch Array Modified Neural Multiple Source Tracking Algorithm (CMN-MUST) implements W-MUST algorithm on a cylindrical microstrip patch array structure. CMN-MUST algorithm uses the advantage of directive pattern of microstrip...
A NOVEL LEARNING-BASED IMAGE MATCHING APPROACH BASED ON MUTUAL NEAREST NEIGHBOR SEARCH WITH RATIO TEST
Efe, Ufuk; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2021-09-09)
This thesis proposes a novel image matching method that utilizes learned features extracted by an off-the-shelf deep neural network to obtain a promising performance. The proposed method simply uses a pre-trained VGG architecture as a feature extractor and does not require any additional training to improve matching. Inspired by well-established concepts in the psychology area, such as the Mental Rotation paradigm, an initial warping step is also performed by the help of a preliminary geometric transformati...
A novel electromagnetic target recognition method by MUSIC algorithm
Secmen, Mustafa; Sayan, Gönül (2006-12-01)
This paper introduces a novel method for aspect invariant electromagnetic target recognition based on the use of multiple signal classification (MUSIC) algorithm to extract late-time resonant target features from the ultra-wideband scattered data. This approach achieves very high accuracy rates even at very low signal-to-noise ratio (SNR) values although it needs scattered data for classifier design at only a few different aspects and makes use of the MUSIC algorithm in a simple and computationally efficien...
Citation Formats
Y. Z. Gürbüz, “A NOVEL BOVW MIMICKING END-TO-END TRAINABLE CNN CLASSIFICATION FRAMEWORK USING OPTIMAL TRANSPORT THEORY,” 2019, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/43450.