TRAINING-FREE IMAGE-LEVEL AFFORDANCE DISCOVERY AND LABELING USING PRE-TRAINED DEEP NETWORKS

2025-3-4
Özçil, İsmail
Advances in computing power have significantly reduced training times for deep learning, enabling the rapid development of networks designed for object recognition. However, the study of object utility, that is, an object's affordances, has received comparatively little attention. Existing object affordance models have shortcomings, including limited robustness across diverse architectures and insufficient performance in complex environments. This work uses networks pre-trained on object classification datasets to explore object affordances. While such networks have proven instrumental in transfer learning for classification tasks, the approach presented in this study diverges from conventional object classification methods by labeling affordances without modifying the final layers. Instead, the pre-trained networks are used to determine affordance labels without requiring specialized classification layers. Two approaches are tested: the Subspace Projection Method and the Manifold Curvature Method. Both methods were evaluated using nine distinct pre-trained networks on two different affordance datasets. With its best-performing networks, the Subspace Projection Method achieved True Positive Rates of up to 94% and 96% on the two datasets, while the Manifold Curvature Method attained True Positive Rates exceeding 98% and 99% with its top-performing networks. Furthermore, a methodology for integrating and ranking affordance estimates using Cross Sinkhorn Distance Matrices is introduced. This approach enables the discovery of new affordances and provides a logical ordering of affordances for a given object, even if the object is not part of the original training dataset. Additionally, human feedback or self-experience is incorporated to refine the affordance ordering.
By combining object detection, segmentation, and the proposed affordance labeling techniques, the affordances of objects in real-world scenes are identified and categorized after an automatic threshold is applied to the combined and ranked affordance estimates.
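The abstract does not spell out how the Cross Sinkhorn Distance Matrices are built. As a rough illustration only, the sketch below assumes that each entry is an entropy-regularized optimal transport (Sinkhorn) cost between two affordance score histograms. The function names `ground_cost`, `sinkhorn_distance`, and `cross_distance_matrix`, the regularization value, and the toy histograms are all hypothetical and are not taken from the thesis.

```python
import math


def ground_cost(n):
    """Hypothetical ground cost between histogram bins: |i - j| scaled to [0, 1]."""
    return [[abs(i - j) / (n - 1) for j in range(n)] for i in range(n)]


def sinkhorn_distance(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized transport cost between histograms a and b.

    a and b are non-negative lists that each sum to 1. This is the generic
    Sinkhorn iteration, not necessarily the variant used in the thesis.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K[i][j] = exp(-cost[i][j] / reg)
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v); return the cost <P, cost>.
    return sum(
        u[i] * K[i][j] * v[j] * cost[i][j] for i in range(n) for j in range(m)
    )


def cross_distance_matrix(histograms, cost, reg=0.1):
    """Pairwise Sinkhorn costs between all histograms (assumed construction)."""
    k = len(histograms)
    return [
        [sinkhorn_distance(histograms[p], histograms[q], cost, reg) for q in range(k)]
        for p in range(k)
    ]


# Toy score histograms for three hypothetical affordance estimates.
h_a = [0.7, 0.2, 0.1]
h_b = [0.6, 0.3, 0.1]
h_c = [0.1, 0.2, 0.7]
D = cross_distance_matrix([h_a, h_b, h_c], ground_cost(3))
```

In this toy setup, `D[0][1]` comes out smaller than `D[0][2]` because `h_a` and `h_b` place their mass in nearly the same bins. A ranking step could order estimates by such distances, though how the thesis actually combines and thresholds them is not stated in this abstract.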
Citation Formats
İ. Özçil, “TRAINING-FREE IMAGE-LEVEL AFFORDANCE DISCOVERY AND LABELING USING PRE-TRAINED DEEP NETWORKS,” Ph.D. - Doctoral Program, Middle East Technical University, 2025.