End-to-end learned image compression with normalizing flows for latent space enhancement

Yavuz, Fatih
Learning-based methods for image compression have recently received considerable attention and demonstrated promising performance, surpassing many commonly used codecs. Learning-based architectures typically comprise three components: a nonlinear analysis transform, which maps the input image to a latent representation; a synthesis transform, which maps the quantized latent representation back to the image domain; and a model for the probability distribution of the latent representation. Accurate modelling of this distribution is critical to compression performance. Inspired by the success of normalizing flows as generative models, this work proposes a framework that uses flow-based neural networks to improve the modelling of the probability distribution of the latent representation and, consequently, the performance of a well-known learned image compression network that serves as the benchmark. Normalizing flows implement an invertible mapping from one distribution to another, which allows the latent representation to be mapped to another domain in which its probability distribution better matches an intended probability distribution. The proposed networks are trained end-to-end and can outperform the benchmark in rate-distortion performance.
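The invertible mapping at the heart of a normalizing flow can be illustrated with a single affine coupling layer. The sketch below is only an illustration of the general idea and is not the architecture used in the thesis, which the abstract does not specify. The function names, the scalar parameters `a` and `b` (stand-ins for a small neural network), and the standard normal target distribution are all assumptions made for this example.

```python
import math

def coupling_forward(y, a, b):
    """Map latent y to z with one affine coupling layer.

    a and b are hypothetical scalar parameters standing in for a network.
    Returns z and the log-determinant of the Jacobian of the mapping.
    """
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s = [math.tanh(a * v) for v in y1]   # log-scale, computed from y1 only
    t = [b * v for v in y1]              # shift, computed from y1 only
    z2 = [v * math.exp(si) + ti for v, si, ti in zip(y2, s, t)]
    return y1 + z2, sum(s)

def coupling_inverse(z, a, b):
    """Exactly undo coupling_forward, recovering y from z."""
    half = len(z) // 2
    z1, z2 = z[:half], z[half:]
    s = [math.tanh(a * v) for v in z1]   # same s and t, since z1 equals y1
    t = [b * v for v in z1]
    y2 = [(v - ti) * math.exp(-si) for v, si, ti in zip(z2, s, t)]
    return z1 + y2

def log_density(y, a, b):
    """Log-density of y via change of variables, assuming a standard normal target for z."""
    z, log_det = coupling_forward(y, a, b)
    log_pz = sum(-0.5 * (v * v + math.log(2.0 * math.pi)) for v in z)
    return log_pz + log_det

y = [0.5, -1.2, 2.0, 0.3]                # toy latent vector of even length
z, log_det = coupling_forward(y, a=0.8, b=0.1)
y_rec = coupling_inverse(z, a=0.8, b=0.1)
```

Because the first half of the vector passes through unchanged, the scale and shift can be recomputed during inversion, so `y_rec` matches `y` up to floating-point error. Note that `log_density` returns a continuous density; a practical codec operates on quantized latents and needs probability masses for entropy coding.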


End-to-end learned image compression with conditional latent space modeling for entropy coding
Yesilyurt, Aziz Berkay; Kamışlı, Fatih (2021-01-24)
The use of neural networks in image compression enables transforms and probability models for entropy coding which can process images based on much more complex models than the simple Gauss-Markov models in traditional compression methods, all at the expense of higher computational complexity. In the neural-network-based image compression literature, various methods have been proposed to model the dependencies in the transform domain/latent space. This work uses an alternative method to exploit the dependencies o...
Joint utilization of fixed and variable-length codes for improving synchronization immunity for image transmission
Alatan, Abdullah Aydın (1998-01-01)
Robust transmission of images is achieved by using fixed and variable-length coding together without much loss in compression efficiency. The probability distribution function of a DCT coefficient can be divided into two regions using a threshold, so that one portion contains roughly equiprobable transform coefficients. While fixed-length coding, which is a powerful solution to the synchronization problem, is used in this inner equiprobable region without sacrificing compression, the outer (saturating) regi...
End-to-end learned image compression with conditional latent space modelling for entropy coding
Yeşilyurt, Aziz Berkay; Kamışlı, Fatih; Department of Electrical and Electronics Engineering (2019)
This thesis presents a lossy image compression system based on an end-to-end trainable neural network. Traditional compression algorithms use linear transformation, quantization and entropy coding steps that are designed based on simple models of the data and are aimed to be low complexity. In neural network based image compression methods, the processing steps, such as transformation and entropy coding, are performed using neural networks. The use of neural networks enables transforms or probability models...
Lossless Image and Intra-Frame Compression With Integer-to-Integer DST
Kamışlı, Fatih (2019-02-01)
Video coding standards are primarily designed for efficient lossy compression, but it is also desirable to support efficient lossless compression within video coding standards using small modifications to the lossy coding architecture. A simple approach is to skip transform and quantization, and simply entropy code the prediction residual. However, this approach is inefficient at compression. A more efficient and popular approach is to skip transform and quantization but also process the residual block in s...
Intra prediction with 3-tap filters for lossless and lossy video coding
Ranjbar Alvar, Saeed; Kamışlı, Fatih; Department of Electrical and Electronics Engineering (2016)
Video coders are primarily designed for lossy compression. The basic steps in modern lossy video compression are block-based spatial or temporal prediction, transformation of the prediction error block, quantization of the transform coefficients and entropy coding of the quantized coefficients together with other side information. In some cases, this lossy coding architecture may not be efficient for compression. For example, when lossless video compression is desirable, the transform and quantization steps...
Citation Formats
F. Yavuz, “End-to-end learned image compression with normalizing flows for latent space enhancement,” M.S. - Master of Science, Middle East Technical University, 2022.