Does estimated depth help object detection?

Çetinkaya, Bedrettin
With the widespread use of RGB-D cameras, depth information has improved solutions to many computer vision problems, including object detection. Object detection can exploit depth information and different encodings obtained from the depth map. Although previous works have shown that depth information can improve object detection results, this thesis investigates the effects of the depth map on object detection from different aspects through detailed experiments. To clarify these effects, we examine the following three questions: (i) Should depth be used in its raw form, or should it be processed to obtain different encodings and color spaces? (ii) How and when should depth information be integrated into the object detection pipeline? (iii) How does estimated depth affect object detection results? In addition, we propose a novel method to integrate depth features into the processing pipeline of a modern two-stage object detector. Compared to previous methods, our method produces better results and uses fewer parameters. In this thesis, we also explore new loss functions to better handle the consistency between RGB and depth discontinuities. We propose both hand-crafted and learning-based loss functions, which we call the "bound loss". Using the bound loss, we were able to improve the mean average and absolute errors for depth estimation.