A pixel-by-pixel learned lossless image compression method with parallel decoding

Gümüş, Sinem
The success of deep learning in computer vision applications has led to the use of learning-based algorithms in image compression as well. Learning-based lossless image compression algorithms can be divided into three categories: pixel-by-pixel (or masked convolution based) algorithms, prior-based algorithms and latent representation based algorithms. In the pixel-by-pixel algorithms, each pixel’s probability distribution is obtained by processing the previously coded left and upper neighbouring pixels with a neural network (NN), and this distribution is then used by an arithmetic coder for lossless compression. In the prior-based algorithms, the probability distribution of the image is conditioned on a prior that is obtained with an NN and transmitted to the decoder. In the latent representation based algorithms, the image is transformed to a latent domain with a learned invertible mapping, and the latent representation is losslessly compressed. This thesis studies a learned lossless image compression method that falls into the pixel-by-pixel (or masked convolution based) category. The study models each pixel’s probability distribution with a Gaussian Mixture Model (GMM), whose parameters are obtained by processing the pixel’s causal neighbourhood (i.e. previously compressed pixels) with a relatively simple NN. This causal dependency forces the decoder to operate sequentially, i.e. the NN has to be run once for each pixel in turn, which increases decoding time significantly. At the encoder, where all pixels are already available, the dependency is easily handled with masked convolutions, so the distributions of all pixels can be computed in a single pass. To reduce the decoding time, parallel encoding and decoding algorithms are studied and implemented. The obtained lossless image compression performance is competitive when compared with both state-of-the-art traditional and learning-based methods.
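As a rough illustration of how a per-pixel GMM can feed an arithmetic coder, the sketch below turns mixture parameters into a probability table over the 256 values of an 8-bit pixel and reports the ideal code length of one value. It is a minimal sketch under stated assumptions, not the thesis implementation: the function names, the bin-integration scheme (mass on [v - 0.5, v + 0.5], with edge bins absorbing the tails) and the example parameters are all hypothetical, and in the actual method the parameters would come from the NN applied to the causal neighbourhood.

```python
import math


def gaussian_cdf(x, mean, std):
    """CDF of a normal distribution with the given mean and std, at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def discretized_gmm_pmf(weights, means, stds, levels=256):
    """Return P(pixel = v) for v in 0..levels-1 under a discretized GMM.

    Assumption: each integer v receives the mixture mass on
    [v - 0.5, v + 0.5]; the first and last bins absorb the tails so
    that the table sums to 1 when the weights sum to 1.
    """
    pmf = []
    for v in range(levels):
        p = 0.0
        for w, m, s in zip(weights, means, stds):
            upper = 1.0 if v == levels - 1 else gaussian_cdf(v + 0.5, m, s)
            lower = 0.0 if v == 0 else gaussian_cdf(v - 0.5, m, s)
            p += w * (upper - lower)
        pmf.append(p)
    return pmf


def code_length_bits(pmf, value, eps=1e-12):
    """Ideal arithmetic-coding cost, in bits, of coding `value`."""
    return -math.log2(max(pmf[value], eps))


# Hypothetical mixture parameters standing in for the NN output of one pixel.
weights = [0.7, 0.3]
means = [120.0, 200.0]
stds = [4.0, 10.0]

pmf = discretized_gmm_pmf(weights, means, stds)
likely = code_length_bits(pmf, 120)    # value near a mixture mean
unlikely = code_length_bits(pmf, 160)  # value between the two modes
```

A value close to a mixture mean costs far fewer bits than one lying between the modes, which is why a sharper predicted distribution translates directly into a smaller compressed file.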


Image compression method based on learned lifting-based dwt and learned zerotree-like entropy model
Şahin, Uğur Berk; Kamışlı, Fatih; Department of Electrical and Electronics Engineering (2022-8)
The success of deep learning in computer vision has sparked great interest in investigating deep learning-based algorithms also in many image processing applications, including image compression. The most popular end-to-end learned image compression approaches are based on auto-encoder architectures, where the image is mapped via convolutional neural networks (CNNs) into a transform (latent) representation that is quantized and processed again with CNNs to obtain the reconstructed image. The quantized laten...
End-to-end learned image compression with conditional latent space modeling for entropy coding
Yesilyurt, Aziz Berkay; Kamışlı, Fatih (2021-01-24)
The use of neural networks in image compression enables transforms and probability models for entropy coding which can process images based on much more complex models than the simple Gauss-Markov models in traditional compression methods, albeit at the expense of higher computational complexity. In the neural-network-based image compression literature, various methods have been proposed to model the dependencies in the transform domain/latent space. This work uses an alternative method to exploit the dependencies o...
A low-complexity image compression approach with single spatial prediction mode and transform
Kamışlı, Fatih (2016-11-01)
The well-known low-complexity JPEG and the newer JPEG-XR systems are based on block-based transform and simple transform-domain coefficient prediction algorithms. Higher complexity image compression algorithms, obtainable from intra-frame coding tools of video coders H.264 or HEVC, are based on multiple block-based spatial-domain prediction modes and transforms. This paper explores an alternative low-complexity image compression approach based on a single spatial-domain prediction mode and transform, which ...
NUR YILMAZ, GÖKÇE; Akar, Gözde (2012-10-03)
In order to speed up the widespread proliferation of 3D video technologies (e.g., coding, transmission, display, etc.), the effect of these technologies on 3D perception should be efficiently and reliably investigated. Using Full-Reference (FR) objective metrics for this investigation is not practical, especially for "on the fly" 3D perception evaluation. Thus, a Reduced Reference (RR) metric is proposed in this paper to predict the depth perception of 3D video. The color-plus-depth 3D video representati...
Adaptive mean-shift for automated multi object tracking
Beyan, C.; Temizel, Alptekin (2012-01-01)
Mean-shift tracking plays an important role in computer vision applications because of its robustness, ease of implementation and computational efficiency. In this study, a fully automatic multiple-object tracker based on the mean-shift algorithm is presented. The foreground is extracted using a mixture of Gaussians followed by shadow and noise removal to initialise the object trackers, and is also used as a kernel mask to make the system more efficient by decreasing the search area and the number of iterations to conve...
Citation Formats
S. Gümüş, “A pixel-by-pixel learned lossless image compression method with parallel decoding,” M.S. - Master of Science, Middle East Technical University, 2022.