Improvements on one-stage object detection by visual reasoning
Download
MSc_Thesis_Tolga_Aksoy.pdf
Date
2022-05-09
Author
Aksoy, Tolga
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Item Usage Stats
415 views, 618 downloads
Current state-of-the-art one-stage object detectors are limited because they treat each image region separately, without considering possible relations between objects. As a result, successful detection depends solely on high-quality convolutional feature representations, which may not be attainable under challenging conditions. In this thesis, a new architecture is proposed for one-stage object detection that reasons about the relations between image regions using self-attention. The proposed reasoning method considers the semantic coherence between image regions and enhances the features of these regions. The spatially and semantically enhanced features are fused with the original features to improve performance. The proposed approach is applied to current state-of-the-art real-time one-stage object detectors such as YOLOv3, YOLOv4 and YOLOR, and then evaluated on COCO in terms of mAP.
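The idea described in the abstract, self-attention across image regions followed by fusion with the original features, can be illustrated with a minimal NumPy sketch. This is not the architecture from the thesis. The function name `reason_over_regions`, the random projection matrices, the treatment of each feature-map cell as one region, and the use of residual addition for fusion are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def reason_over_regions(feat, w_q, w_k, w_v):
    """Apply single-head self-attention over the cells of a feature map.

    feat: (H, W, C) convolutional feature map; each cell is treated as one
          image region (an assumption of this sketch).
    w_q, w_k, w_v: (C, C) projection matrices (hypothetical, random here).
    Returns the fused feature map (H, W, C) and the attention matrix (N, N),
    where N = H * W.
    """
    h, w, c = feat.shape
    regions = feat.reshape(h * w, c)            # flatten the grid to N regions

    q = regions @ w_q                           # queries
    k = regions @ w_k                           # keys
    v = regions @ w_v                           # values

    # Pairwise region-to-region affinities, scaled as in dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)             # each row sums to 1

    enhanced = attn @ v                         # features enriched by relations
    fused = regions + enhanced                  # fusion by addition (assumed)
    return fused.reshape(h, w, c), attn


rng = np.random.default_rng(0)
H, W, C = 4, 4, 8
feat = rng.standard_normal((H, W, C))
w_q, w_k, w_v = (rng.standard_normal((C, C)) * 0.1 for _ in range(3))

fused, attn = reason_over_regions(feat, w_q, w_k, w_v)
print(fused.shape, attn.shape)  # (4, 4, 8) (16, 16)
```

In a real detector the projections would be learned, and the fused map would feed the detection head in place of the original features. The sketch omits positional information, so it captures only the semantic side of the reasoning.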
Subject Keywords
Object detection, One-stage object detection, Visual reasoning
URI
https://hdl.handle.net/11511/97326
Collections
Graduate School of Natural and Applied Sciences, Thesis
Suggestions
Rescoring detections based on contextual scores in object detection
Zorlu, Ersan Vural; Akbaş, Emre; Department of Computer Engineering (2019)
To detect objects in an image, current state-of-the-art object detectors firstly define candidate object locations, and then classify each of them into one of the predefined categories or as background. They do so by using the visual features extracted locally from the candidate locations; omitting the rich contextual information embedded in the whole image. Contextual information can be utilized to complement the information extracted locally and thereby to improve object detection accuracy. Researchers have p...
New models and inference techniques for Gaussian process-based extended object tracking
Kumru, Murat; Özkan, Emre; Department of Electrical and Electronics Engineering (2022-9-09)
In this thesis, we consider the problem of tracking dynamic objects with unknown shapes using point cloud measurements generated by, e.g., lidars, radars, and depth cameras. The point measurements do not only convey information about the object pose, i.e., position and orientation, but they also naturally reveal the characteristics of its latent extent. Aiming to harness the full potential of the available information, we investigate the Gaussian process-based extended object tracking (GPEOT) framework. W...
A Computationally Efficient Appearance-Based Algorithm for Geospatial Object Detection
Arslan, Duygu; Alatan, Abdullah Aydın (2012-04-27)
A computationally efficient appearance-based algorithm for geospatial object detection is presented and evaluated specifically for aircraft detection from satellite imagery. An aircraft operator exploiting the edge information via gray level differences between the aircraft and its background is constructed with Haar-like polygon regions by using the shape information of the aircraft as an invariant. Fast evaluation of the aircraft operator is achieved by means of integral image. Rotated integral images are...
Scale invariant representation of 2.5D data
AKAGUNDUZ, Erdem; ULUSOY PARNAS, İLKAY; BOZKURT, Nesli; Halıcı, Uğur (2007-06-13)
In this paper, a scale and orientation invariant feature representation for 2.5D objects is introduced, which may be used to classify, detect and recognize objects even under the cases of cluttering and/or occlusion. With this representation a 2.5D object is defined by an attributed graph structure, in which the nodes are the pit and peak regions on the surface. The attributes of the graph are the scales, positions and the normals of these pits and peaks. In order to detect these regions a "peakness" (or pi...
A multimodal approach for individual tracking of people and their belongings
Beyan, Çiğdem; Temizel, Alptekin (2015-04-01)
In this study, a fully automatic surveillance system for indoor environments which is capable of tracking multiple objects using both visible and thermal band images is proposed. These two modalities are fused to track people and the objects they carry separately using their heat signatures and the owners of the belongings are determined. Fusion of complementary information from different modalities (for example, thermal images are not affected by shadows and there is no thermal reflection or halo effect in...
Citation Formats
IEEE
T. Aksoy, “Improvements on one-stage object detection by visual reasoning,” M.S. thesis, Middle East Technical University, 2022.