SegIns: A simple extension to instance discrimination task for better localization learning

Date

2024-04-01

Author

Baydar, Melih
Akbaş, Emre

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Recent self-supervised learning methods, in which the instance discrimination task is a fundamental way of pretraining convolutional neural networks (CNNs), excel in transfer learning performance. Although the instance discrimination task is well suited to pretraining for classification because it learns at the image level, its lack of dense representation learning makes it sub-optimal for localization tasks such as object detection. In this paper, we aim to mitigate this shortcoming by extending the instance discrimination task to jointly learn dense representations alongside image-level representations. We add a segmentation branch, parallel to the image-level learning branch, that predicts class-agnostic masks, enhancing the location-awareness of the representations. We show the effectiveness of our pretraining approach on localization tasks by transferring the learned representations to object detection and segmentation tasks, obtaining relative improvements of up to 1.7% AP on PASCAL VOC and 0.8% AP on COCO object detection, 0.8% AP on COCO instance segmentation, and 3.6% mIoU on PASCAL VOC semantic segmentation.
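
As a rough illustration of what jointly learning image-level and dense representations can look like, the sketch below combines an image-level instance discrimination loss with a per-pixel loss on a class-agnostic mask. The abstract does not state the loss functions, the source of the target masks, or how the two terms are weighted, so everything here is an assumption: InfoNCE stands in for the image-level objective, binary cross-entropy for the segmentation branch, and the names info_nce, mask_bce, joint_loss, temperature, and mask_weight are hypothetical. It is a toy calculation in plain Python, not the authors' implementation.

```python
import math


def info_nce(query, keys, positive_index, temperature=0.2):
    """InfoNCE loss for one query embedding against a list of key embeddings.

    Assumption: stands in for the image-level instance discrimination loss.
    All embeddings must be nonzero vectors of equal length.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def normalize(v):
        norm = math.sqrt(dot(v, v))
        return [x / norm for x in v]

    q = normalize(query)
    # Cosine similarity between the query and every key, scaled by temperature.
    logits = [dot(q, normalize(k)) / temperature for k in keys]
    # Numerically stable log-sum-exp over all logits.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    # Cross-entropy with the positive key as the target class.
    return log_sum_exp - logits[positive_index]


def mask_bce(pred_probs, target_mask, eps=1e-7):
    """Mean per-pixel binary cross-entropy for a class-agnostic mask.

    Assumption: stands in for the segmentation-branch loss.
    pred_probs and target_mask are 2D lists of the same shape.
    """
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred_probs, target_mask):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def joint_loss(query, keys, positive_index, pred_probs, target_mask, mask_weight=1.0):
    """Image-level loss plus weighted dense loss (the weighting is hypothetical)."""
    image_term = info_nce(query, keys, positive_index)
    dense_term = mask_bce(pred_probs, target_mask)
    return image_term + mask_weight * dense_term


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[1.0, 0.0], [0.0, 1.0]]   # keys[0] is the positive view
    pred = [[0.9, 0.1], [0.8, 0.2]]   # predicted foreground probabilities
    target = [[1, 0], [1, 0]]         # binary class-agnostic mask
    print(joint_loss(query, keys, 0, pred, target, mask_weight=0.5))
```

In this toy setup both terms are minimized together, so the shared features are pushed to be discriminative at the image level while also carrying enough spatial information to separate foreground from background.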

Subject Keywords

Contrastive learning, Instance discrimination task, Non-contrastive learning, Object detection, Self-supervised learning, Semantic segmentation

URI

https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85188630078&origin=inward
https://hdl.handle.net/11511/109354

Journal

Journal of Visual Communication and Image Representation

DOI

https://doi.org/10.1016/j.jvcir.2024.104122

Collections

Department of Computer Engineering, Article

Citation Formats

M. Baydar and E. Akbaş, “SegIns: A simple extension to instance discrimination task for better localization learning,” Journal of Visual Communication and Image Representation, vol. 100, 2024. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85188630078&origin=inward.