Streaming Multiscale Deep Equilibrium Models

Date

2022-01-01

Author

Ertenli, Can Ufuk
Akbaş, Emre
Cinbiş, Ramazan Gökberk

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

We present StreamDEQ, a method that infers frame-wise representations on videos with minimal per-frame computation. In contrast to conventional methods, whose compute time grows at least linearly with network depth, we aim to update the representations continuously. For this purpose, we build on recently emerging implicit layer models, which infer the representation of an image by solving a fixed-point problem. Our main insight is to exploit the slowly changing nature of videos and use the previous frame's representation as the initial condition for each frame. This scheme effectively recycles recent inference computations and greatly reduces the processing time needed. Through extensive experimental analysis, we show that StreamDEQ recovers near-optimal representations within a few frames and maintains an up-to-date representation throughout the video. Our experiments on video semantic segmentation and video object detection show that StreamDEQ achieves accuracy on par with the baseline (standard MDEQ) while being more than 3× faster. Code and additional results are available at https://ufukertenli.github.io/streamdeq/.
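
The warm-start idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scalar layer `f`, the plain forward iteration, the weight `w`, and the helper names (`solve_fixed_point`, `stream`, `make_frames`, `max_error`) are all hypothetical choices made for illustration, and the real model is a multiscale network whose solver is not described on this page. The sketch only shows why starting each frame from the previous frame's representation lets a small, fixed number of iterations per frame track the fixed point of slowly changing inputs.

```python
import math

def f(z, x, w=0.5):
    """Toy contractive layer (an assumption): z_next = tanh(w * z + x)."""
    return [math.tanh(w * zi + xi) for zi, xi in zip(z, x)]

def solve_fixed_point(x, z_init, max_iters, tol=1e-6):
    """Plain forward iteration z <- f(z, x); returns (z, iterations used)."""
    z = list(z_init)
    for k in range(max_iters):
        z_next = f(z, x)
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if residual < tol:
            return z, k + 1
    return z, max_iters

def stream(frames, iters_per_frame):
    """Warm start: each frame begins from the previous frame's result."""
    z = [0.0] * len(frames[0])  # only the first frame starts from zeros
    outputs = []
    for x in frames:
        # tol=0.0 forces exactly iters_per_frame iterations per frame
        z, _ = solve_fixed_point(x, z, iters_per_frame, tol=0.0)
        outputs.append(z)
    return outputs

def make_frames(n):
    """Hypothetical slowly drifting inputs standing in for video frames."""
    return [[0.3 + 0.001 * t, -0.2 + 0.001 * t] for t in range(n)]

def max_error(a, b):
    return max(abs(ai - bi) for ai, bi in zip(a, b))

if __name__ == "__main__":
    frames = make_frames(30)
    zeros = [0.0, 0.0]
    # Reference: solve the last frame to convergence from scratch.
    reference, _ = solve_fixed_point(frames[-1], zeros, 200, tol=1e-12)
    # Same per-frame budget (2 iterations), with and without warm start.
    warm = stream(frames, iters_per_frame=2)[-1]
    cold, _ = solve_fixed_point(frames[-1], zeros, 2, tol=0.0)
    print("warm-start error:", max_error(warm, reference))
    print("cold-start error:", max_error(cold, reference))
```

In this toy setting the warm-started estimate stays close to the converged solution because the fixed point moves only slightly between consecutive inputs, whereas the cold start with the same two-iteration budget is still far from it.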

Subject Keywords

Implicit layer models, Video analysis and understanding, Video object detection, Video semantic segmentation

URI

https://hdl.handle.net/11511/101838

DOI

https://doi.org/10.1007/978-3-031-20083-0_12

Conference Name

17th European Conference on Computer Vision, ECCV 2022

Collections

Department of Computer Engineering, Conference / Seminar

Suggestions

Graph-based multilevel temporal segmentation of scripted content videos Sakarya, Ufuk; TELATAR, ZİYA (2007-06-13) This paper concentrates on a graph-based multilevel temporal segmentation method for scripted content videos. In each level of the segmentation, a similarity matrix of frame strings, which are series of consecutive video frames, is constructed by using temporal and spatial contents of frame strings. A strength factor is estimated for each frame string by using a priori information of a scripted content. According to the similarity matrix reevaluated from a strength function derived by the strength factors, ...
QUALITY EVALUATION OF STEREOSCOPIC VIDEOS USING DEPTH MAP SEGMENTATION Sarikan, Selim S.; Olgun, Ramazan F.; Akar, Gözde (2011-09-09) This paper presents a new quality evaluation model for stereoscopic videos using depth map segmentation. This study includes both objective and subjective evaluation. The goal of this study is to understand the effect of different depth levels on the overall 3D quality. Test sequences with different coding schemes are used. The results show that overall quality has a strong correlation with the quality of the background, where disparity is smaller relative to the foreground. The results also showed that con...
Depth assisted object segmentation in multi-view video Cigla, Cevahir; Alatan, Abdullah Aydın (2008-01-01) In this work, a novel and unified approach for multi-view video (MVV) object segmentation is presented. In the first stage, a region-based graph-theoretic color segmentation algorithm is proposed, in which the popular Normalized Cuts segmentation method is improved with some modifications on its graph structure. Segmentation is obtained by recursive bi-partitioning of a weighted graph of an initial over-segmentation mask. The available segmentation mask is also utilized during dense depth map estimation ste...
Bi-directional 2-D mesh representation for video object rendering, editing and superresolution in the presence of occlusion Eren, Pekin Erhan; Tekalp, AM (2003-05-01) In this paper, we propose a new bi-directional 2-D mesh representation of video objects, which utilizes forward and backward reference frames (keyframes). This framework extends the previous uni-directional mesh representation to enable efficient rendering, editing, and superresolution of video objects in the presence of occlusion by allowing bidirectional texture mapping as in MPEG B-frames. The video object of interest is tracked between two successive keyframes (which can be automatically or interactivel...
Stability analysis of recurrent neural networks with piecewise constant argument of generalized type Akhmet, Marat; Yılmaz, Elanur (2010-09-01) In this paper, we apply the method of Lyapunov functions for differential equations with piecewise constant argument of generalized type to a model of recurrent neural networks (RNNs). The model involves both advanced and delayed arguments. Sufficient conditions are obtained for global exponential stability of the equilibrium point. Examples with numerical simulations are presented to illustrate the results.

Citation Formats

C. U. Ertenli, E. Akbaş, and R. G. Cinbiş, “Streaming Multiscale Deep Equilibrium Models,” Tel-Aviv-Yafo, Israel, 2022, vol. 13671 LNCS, Accessed: 00, 2023. [Online]. Available: https://hdl.handle.net/11511/101838.