Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection

2021-01-01
Alamri, Faisal
Kalkan, Sinan
Pugeault, Nicolas
Deep neural network approaches have demonstrated high performance in object recognition (CNN) and detection (Faster-RCNN) tasks, but experiments have shown that such architectures are vulnerable to adversarial attacks (FFF, UAP): low-amplitude perturbations, barely perceptible to the human eye, can lead to a drastic reduction in labelling performance. This article proposes a new context module, called the Transformer-Encoder Detector Module, that can be applied to an object detector to (i) improve the labelling of object instances; and (ii) improve the detector's robustness to adversarial attacks. The proposed model achieves mAP, F1 and average AUC scores up to 13% higher than the baseline Faster-RCNN detector, and an mAP score 8 points higher on images subjected to FFF or UAP attacks, owing to the inclusion of both contextual and visual features extracted from the scene and encoded into the model. The results demonstrate that a simple ad-hoc context module can significantly improve the reliability of object detectors.
25th International Conference on Pattern Recognition (ICPR)
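
The abstract does not specify the internals of the context module, so the following is only a minimal sketch of the general idea it names: letting each detected region attend to every other region in the scene so that its representation combines visual and contextual features. It is not the authors' implementation. The function names, the toy feature vectors, the use of a single attention head with identity query/key/value projections, and the residual sum are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(regions):
    """Single-head scaled dot-product self-attention over region features.

    Assumption: queries, keys and values are the raw features themselves
    (identity projections); a trained encoder would learn these projections.
    """
    d = len(regions[0])
    attended = []
    for q in regions:
        # Similarity of this region to every region in the scene.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in regions]
        weights = softmax(scores)
        # Context vector: attention-weighted average of all region features.
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, regions)) for i in range(d)]
        )
    return attended


def contextualise(regions):
    """Add the attended context to each region's own visual feature."""
    context = self_attention(regions)
    return [[x + c for x, c in zip(r, ctx)] for r, ctx in zip(regions, context)]


# Hypothetical 2-D features for three detected regions.
regions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
mixed = contextualise(regions)
```

In a detector, vectors such as `mixed` would be passed to a classification head to re-label each region; that step is omitted here because the abstract gives no detail about it.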

Citation Formats
F. Alamri, S. Kalkan, and N. Pugeault, “Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection,” presented at the 25th International Conference on Pattern Recognition (ICPR), ELECTR NETWORK, 2021, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/91977.