A Transformer-Based Approach for Fusing Infrared and Visible Band Images
Download
Aytekin_Erdogan_thesis.pdf
Date
2023-08-29
Author
Erdogan, Aytekin
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Item Usage Stats
386 views, 268 downloads
Abstract
Image fusion combines images obtained from different sensors into a single image that benefits from their complementary information. Although deep learning methods are widely employed in state-of-the-art techniques to extract meaningful features for image fusion, they primarily integrate local features and disregard the broader context within the image. Transformer-based models have emerged as a promising remedy, capturing global context dependencies through attention mechanisms. Inspired by this, we propose a novel image fusion approach built on a transformer-based multi-scale fusion strategy that effectively considers both local and global context, enhancing the overall fusion process. The method follows a two-stage training scheme: an auto-encoder is first trained to extract deep features at multiple scales, and these multi-scale features are then fused using a combination of Convolutional Neural Networks (CNNs) and Transformers, with the CNNs capturing local features and the Transformer integrating global context. Notably, in contrast to similar methods, we propose novel loss functions to address the difficulty of defining a loss when no ground truth for the fusion exists. Extensive experiments on various benchmark datasets show that the proposed method, together with the novel loss functions, outperforms competitive fusion algorithms. Overall, this thesis advances image fusion techniques, offering innovative approaches and contributing to the state of the art in this field.
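The two-stage pipeline described in the abstract can be sketched in PyTorch-style Python. Everything below (module names, channel widths, the single Transformer encoder layer, and the L1 stand-in for the SSIM-based loss) is an illustrative assumption for exposition, not the thesis' actual implementation:

```python
# Minimal sketch of the two-stage fusion scheme from the abstract.
# Assumed names and sizes throughout; not the author's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Stage 1: auto-encoder backbone extracting deep features at two scales."""
    def __init__(self, ch=32):
        super().__init__()
        self.conv1 = nn.Conv2d(1, ch, 3, padding=1)             # full resolution
        self.conv2 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)  # half resolution

    def forward(self, x):
        f1 = F.relu(self.conv1(x))
        f2 = F.relu(self.conv2(f1))
        return [f1, f2]                                         # multi-scale features


class FusionBlock(nn.Module):
    """Stage 2 at one scale: a CNN branch captures local features, a
    Transformer encoder layer integrates global context over all positions."""
    def __init__(self, ch=32, nhead=4):
        super().__init__()
        self.local = nn.Conv2d(2 * ch, ch, 3, padding=1)
        self.global_ctx = nn.TransformerEncoderLayer(
            d_model=ch, nhead=nhead, batch_first=True)

    def forward(self, f_ir, f_vis):
        local = F.relu(self.local(torch.cat([f_ir, f_vis], dim=1)))
        b, c, h, w = local.shape
        tokens = local.flatten(2).transpose(1, 2)   # (B, H*W, C) token sequence
        fused = self.global_ctx(tokens).transpose(1, 2).reshape(b, c, h, w)
        return fused + local                        # merge global and local paths


def no_reference_fusion_loss(fused, ir, vis, alpha=0.5):
    """Stand-in for the thesis' novel loss: with no fusion ground truth,
    pull the fused image toward both inputs. A real structural-similarity
    (SSIM) term would replace or augment these L1 terms."""
    return alpha * F.l1_loss(fused, ir) + (1 - alpha) * F.l1_loss(fused, vis)


# Toy usage on random single-channel infrared/visible images
ir = torch.rand(1, 1, 64, 64)
vis = torch.rand(1, 1, 64, 64)
enc = Encoder()
fuse = FusionBlock()
fused_feat = fuse(enc(ir)[0], enc(vis)[0])  # fuse the full-resolution scale
```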
Subject Keywords
Image Fusion, Visual Infrared Image Fusion, Transformer Based Image Fusion, Structural Similarity Metric
URI
https://hdl.handle.net/11511/105259
Collections
Graduate School of Informatics, Thesis
Citation Formats
IEEE
A. Erdogan, “A Transformer-Based Approach for Fusing Infrared and Visible Band Images,” M.S. - Master of Science, Middle East Technical University, 2023.