A pixel-by-pixel learned lossless image compression method with parallel decoding

2022-7
Gümüş, Sinem
The success of deep learning in computer vision applications has led to the use of learning-based algorithms in image compression as well. Learning-based lossless image compression algorithms can be divided into three categories: pixel-by-pixel (or masked-convolution-based) algorithms, prior-based algorithms, and latent-representation-based algorithms. In pixel-by-pixel algorithms, each pixel’s probability distribution is obtained by processing the previously coded left and upper neighbouring pixels with a neural network (NN), and this distribution is then used by an arithmetic coder for lossless compression. In prior-based algorithms, the probability distribution of the image is conditioned on a prior that is obtained with an NN and transmitted to the decoder. In latent-representation-based algorithms, the image is transformed to a latent domain with a learned invertible mapping, and the latent representation is losslessly compressed. This thesis studies a learned lossless image compression method that falls into the pixel-by-pixel (or masked-convolution-based) category. The method models each pixel’s probability distribution with a Gaussian Mixture Model (GMM), whose parameters are obtained by processing the pixel’s causal neighbourhood (i.e. previously compressed pixels) with a relatively simple NN. This causal dependency forces the decoder to operate sequentially, i.e. the NN has to be run for each pixel in turn, which increases decoding time significantly. At the encoder, where all pixels are available at once, the dependency is easily handled with masked convolutions. To reduce the decoding time, parallel encoding and decoding algorithms are studied and implemented. The resulting lossless image compression performance is competitive and is compared with both state-of-the-art traditional and learning-based methods.
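As a rough illustration of the GMM-based probability model described in the abstract, the sketch below turns a set of mixture parameters into a probability mass function over 8-bit pixel values and reports the ideal arithmetic-coding cost of a pixel. It is not the thesis's implementation: the function names, the example weights, means and standard deviations, and the choice to discretise by integrating each Gaussian over unit-width bins are all assumptions made for illustration. In the method described, the mixture parameters would come from an NN applied to the pixel's causal neighbourhood.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian, evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def gmm_pixel_pmf(weights, means, stds, levels=256):
    """Discretise a GMM onto integer pixel values 0..levels-1.

    Each value v receives the mixture mass on [v - 0.5, v + 0.5]; the two
    edge bins absorb the tails so the result sums to one (an assumption of
    this sketch, not a detail taken from the thesis).
    """
    pmf = []
    for v in range(levels):
        lo = -math.inf if v == 0 else v - 0.5
        hi = math.inf if v == levels - 1 else v + 0.5
        mass = sum(
            w * (gaussian_cdf(hi, m, s) - gaussian_cdf(lo, m, s))
            for w, m, s in zip(weights, means, stds)
        )
        pmf.append(mass)
    return pmf


def code_length_bits(pmf, value):
    """Ideal code length, -log2 p(value), that an arithmetic coder approaches."""
    p = pmf[value]
    return math.inf if p == 0.0 else -math.log2(p)


# Hypothetical mixture parameters standing in for the output of the NN.
example_pmf = gmm_pixel_pmf(
    weights=[0.7, 0.3], means=[100.0, 180.0], stds=[4.0, 10.0]
)
print(f"bits for pixel value 100: {code_length_bits(example_pmf, 100):.2f}")
print(f"bits for pixel value 140: {code_length_bits(example_pmf, 140):.2f}")
```

A pixel value near a mixture mean costs few bits, while a value the model considers unlikely costs many. This is why the accuracy of the predicted distribution determines the compressed size.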

Suggestions

End-to-end learned image compression with conditional latent space modeling for entropy coding
Yesilyurt, Aziz Berkay; Kamışlı, Fatih (2021-01-24)
The use of neural networks in image compression enables transforms and probability models for entropy coding which can process images based on much more complex models than the simple Gauss-Markov models in traditional compression methods, all at the expense of higher computational complexity. In the neural-network based image compression literature, various methods to model the dependencies in the transform domain/latent space are proposed. This work uses an alternative method to exploit the dependencies o...
A low-complexity image compression approach with single spatial prediction mode and transform
Kamışlı, Fatih (2016-11-01)
The well-known low-complexity JPEG and the newer JPEG-XR systems are based on block-based transform and simple transform-domain coefficient prediction algorithms. Higher complexity image compression algorithms, obtainable from intra-frame coding tools of video coders H.264 or HEVC, are based on multiple block-based spatial-domain prediction modes and transforms. This paper explores an alternative low-complexity image compression approach based on a single spatial-domain prediction mode and transform, which ...
Adaptive mean-shift for automated multi object tracking
Beyan, C.; Temizel, Alptekin (2012-01-01)
Mean-shift tracking plays an important role in computer vision applications because of its robustness, ease of implementation and computational efficiency. In this study, a fully automatic multiple-object tracker based on the mean-shift algorithm is presented. Foreground is extracted using a mixture of Gaussians followed by shadow and noise removal to initialise the object trackers and also used as a kernel mask to make the system more efficient by decreasing the search area and the number of iterations to conve...
AN ABSTRACTION BASED REDUCED REFERENCE DEPTH PERCEPTION METRIC FOR 3D VIDEO
NUR YILMAZ, GÖKÇE; Akar, Gözde (2012-10-03)
In order to speed up the widespread proliferation of 3D video technologies (e.g., coding, transmission, display, etc.), the effect of these technologies on 3D perception should be efficiently and reliably investigated. Using Full-Reference (FR) objective metrics for this investigation is not practical, especially for "on the fly" 3D perception evaluation. Thus, a Reduced Reference (RR) metric is proposed to predict the depth perception of 3D video in this paper. The color-plus-depth 3D video representati...
A NOVEL SHADOW RESTORATION ALGORITHM BASED ON ATMOSPHERIC EFFECTS FOR AERIAL IMAGES
Aytekin, Caglar; Alatan, Abdullah Aydın (2010-09-29)
In aerial images, the performance of the segmentation and object recognition algorithms could suffer due to shadows in the scene. This effort describes a novel shadow restoration algorithm based on atmospheric effects and characteristics of sunlight for aerial images. Firstly, shadow regions are detected exploiting the Rayleigh scattering phenomenon and the well-known fact related to the low illumination intensity in the shadow regions. After detection, shadow restoration is achieved by first restoring part...
Citation Formats
S. Gümüş, “A pixel-by-pixel learned lossless image compression method with parallel decoding,” M.S. - Master of Science, Middle East Technical University, 2022.