Endoscopic artefact detection with ensemble of deep neural networks and false positive elimination

2020-01-01
Polat, Gorkem
Sen, Deniz
Inci, Alperen
Temizel, Alptekin
Video frames obtained through endoscopic examination can be corrupted by many artefacts. These artefacts adversely affect the diagnosis process and make the examination of the underlying tissue difficult for professionals. In addition, detection of these artefacts is essential for further automated analysis of the images and high-quality frame restoration. In this study, we propose an endoscopic artefact detection framework based on an ensemble of deep neural networks, class-agnostic non-maximum suppression, and false-positive elimination. We have used different ensemble techniques and combined both one-stage and two-stage networks to obtain a heterogeneous solution exploiting the distinctive properties of different approaches. Faster R-CNN and Cascade R-CNN, which are two-stage detectors, and RetinaNet, which is a single-stage detector, have been used as base models. The best results have been obtained using the consensus of their predictions, which were passed through class-agnostic non-maximum suppression and false-positive elimination.
2nd International Workshop and Challenge on Computer Vision in Endoscopy, EndoCV 2020 (3 April 2020)
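
The class-agnostic non-maximum suppression step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the detection tuple layout, the class labels, and the 0.5 IoU threshold are all assumptions chosen for illustration. The idea shown is that detections pooled from several models are sorted by score, and a box is dropped when it overlaps an already kept box too strongly, regardless of the class label of either box.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), hypothetical layout
Detection = Tuple[Box, float, str]        # (box, confidence score, class label)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_agnostic_nms(dets: List[Detection],
                       iou_threshold: float = 0.5) -> List[Detection]:
    """Greedy NMS that ignores class labels when comparing boxes.

    The threshold value is an assumption, not taken from the paper.
    """
    kept: List[Detection] = []
    # Visit detections from highest to lowest confidence.
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        # Keep the box only if it does not overlap any kept box too much,
        # whatever class those kept boxes belong to.
        if all(iou(det[0], k[0]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    # Illustrative detections pooled from several hypothetical models.
    pooled = [
        ((0, 0, 10, 10), 0.9, "specularity"),
        ((1, 1, 10, 10), 0.8, "bubbles"),      # overlaps the first box
        ((20, 20, 30, 30), 0.7, "blur"),
    ]
    for box, score, label in class_agnostic_nms(pooled):
        print(label, score, box)
```

In standard per-class NMS the second box above would survive because its label differs from the first; the class-agnostic variant removes it, which is one way overlapping predictions from heterogeneous detectors can be reconciled.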

Citation Formats
G. Polat, D. Sen, A. Inci, and A. Temizel, “Endoscopic artefact detection with ensemble of deep neural networks and false positive elimination,” Iowa, United States, 2020, vol. 2595, p. 8, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/84781.