An analysis of stereo depth estimation utilizing attention mechanisms, self-supervised pose estimators & temporal predictions
Date
2022-5-18
Author
Oğuzman, Utku
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Following the recent success of deep learning, real-world applications of stereo depth estimation algorithms have attracted the interest of many researchers, who use the available synthetic and real-world datasets to evaluate their ideas for practical applications. This thesis presents a thorough analysis with that aim. It attempts to improve state-of-the-art stereo depth estimation algorithms by incorporating attention mechanisms into existing networks and by using better initialization strategies over time. First, different numbers of attention modules are applied to one of the most successful stereo depth estimation networks. The proposed attention-based networks, trained on synthetic stereo datasets under a supervised setting, are compared against a baseline algorithm and yield superior results. When the networks are finetuned on a small annotated real-world dataset, however, the baseline algorithm performs better. Secondly, the temporal information available in the synthetic datasets is leveraged by teaching the proposed neural network to initialize the current iteration from its previous predictions. Finally, to finetune the network better for real-world use with temporal information, a large unannotated real-world dataset is utilized under a self-supervised training setting that relies on ego-pose estimation and optical flow networks. Overall, these settings yield better results than state-of-the-art methods in the synthetic-to-real supervised training setting and comparable results after finetuning.
Subject Keywords
Stereo depth estimation, Attention modules, Self-supervised learning, Finetuning
URI
https://hdl.handle.net/11511/97789
Collections
Graduate School of Natural and Applied Sciences, Thesis
Citation Formats
U. Oğuzman, “An analysis of stereo depth estimation utilizing attention mechanisms, self-supervised pose estimators & temporal predictions,” M.S. - Master of Science, Middle East Technical University, 2022.