GRAPH-BASED HIERARCHICAL TRACKLET MERGE FOR MULTIPLE OBJECT TRACKING
Download: cagri_thesis_final_v1.pdf
Date: 2024-04-05
Author: Bilgi, Halil Çağrı
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Item Usage Stats: 29 views, 16 downloads
Abstract: The past decade has seen significant advancements in multi-object tracking, particularly with the rise of deep learning. However, many studies in online tracking have focused primarily on enhancing track management or extracting visual features, often leading to hybrid approaches with limited effectiveness, especially under severe occlusion or in crowded scenes. Conversely, offline tracking has placed little emphasis on robust motion cues. This thesis proposes a novel solution to offline tracking that hierarchically merges tracklets, leveraging recent promising learning-based architectures. Our approach integrates motion cues and social interactions among targets using a joint Transformer and Graph Neural Network (GNN) encoder. The proposed solution is an end-to-end trainable model that does not require any handcrafted short-term or long-term matching processes. By representing tracklets across multiple frames with a graph structure, we enable collective reasoning over targets at different timestamps, leveraging advances in graph representation learning. Furthermore, the Transformer encoder effectively captures the motion of each tracklet. By enabling bi-directional information propagation between these two modules, the Transformer and the GNN, we allow motion modeling to depend on interactions and, conversely, interaction modeling to depend on the motion of each target. Experimental results demonstrate the effectiveness of our approach, indicating that graph representation learning equipped with a joint Transformer encoder achieves results comparable to state-of-the-art algorithms. These promising results emphasize the potential of the joint Transformer-GNN encoder architecture in multi-object tracking.
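To make the hierarchical tracklet-merge idea concrete, here is a minimal sketch in plain Python. A hand-crafted linear-motion affinity stands in for the thesis's learned joint Transformer-GNN encoder, and greedy pairwise merging repeated over rounds stands in for the hierarchy; all function names, the tracklet representation `(frame, x, y)`, and the threshold are illustrative assumptions, not the thesis's actual method.

```python
# Hierarchical tracklet merging, sketched with a simple linear-motion
# affinity instead of the learned Transformer-GNN encoder described in
# the thesis. Tracklets are lists of (frame, x, y) tuples.

def predict_next(tracklet):
    """Linearly extrapolate the next (frame, x, y) from the last two points."""
    (f0, x0, y0), (f1, x1, y1) = tracklet[-2], tracklet[-1]
    dt = f1 - f0
    return (f1 + dt, x1 + (x1 - x0), y1 + (y1 - y0))

def affinity(a, b):
    """Score for appending tracklet b after tracklet a: negative distance
    between a's predicted next point and b's first point. Invalid (-inf)
    unless b starts strictly after a ends."""
    if b[0][0] <= a[-1][0]:
        return float("-inf")
    _, px, py = predict_next(a)
    _, bx, by = b[0]
    return -((px - bx) ** 2 + (py - by) ** 2) ** 0.5

def merge_round(tracklets, threshold=-5.0):
    """One round: greedily merge the best-scoring compatible pairs."""
    pairs = sorted(
        ((affinity(a, b), i, j)
         for i, a in enumerate(tracklets)
         for j, b in enumerate(tracklets) if i != j),
        reverse=True)
    used, merged = set(), []
    for score, i, j in pairs:
        if score < threshold or i in used or j in used:
            continue
        used.update((i, j))
        merged.append(tracklets[i] + tracklets[j])
    merged += [t for k, t in enumerate(tracklets) if k not in used]
    return merged

def hierarchical_merge(tracklets):
    """Repeat merge rounds until no further merges occur, so short
    tracklets grow into longer tracks level by level."""
    while True:
        merged = merge_round(tracklets)
        if len(merged) == len(tracklets):
            return merged
        tracklets = merged
```

In the thesis, the pairwise affinity would instead come from the joint encoder, which lets each tracklet's motion model attend to the motion of its neighbors before scoring; the hierarchical outer loop is the part this sketch shares with that design.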
Subject Keywords: Multi-object Tracking, Offline Tracking, Graph Neural Networks, Transformer
URI: https://hdl.handle.net/11511/109417
Collections: Graduate School of Natural and Applied Sciences, Thesis
Citation (IEEE):
H. Ç. Bilgi, “GRAPH-BASED HIERARCHICAL TRACKLET MERGE FOR MULTIPLE OBJECT TRACKING,” M.S. - Master of Science, Middle East Technical University, 2024.