End-to-end learned image compression with conditional latent space modelling for entropy coding

Date

2019

Author

Yeşilyurt, Aziz Berkay

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

This thesis presents a lossy image compression system based on an end-to-end trainable neural network. Traditional compression algorithms use linear transformation, quantization and entropy coding steps that are designed around simple models of the data and are intended to have low complexity. In neural network based image compression methods, processing steps such as transformation and entropy coding are performed by neural networks. Neural networks enable transforms and probability models for entropy coding that can process or represent data with much more complex dependencies than simple models allow, at the expense of higher computational complexity than traditional methods.

One major line of work on neural network based lossy image compression uses an autoencoder-type neural network for the transform and inverse transform of the compression system. The quantization of the latent variables (i.e. the transform coefficients) and the arithmetic coding of the quantized latent variables are done with traditional methods. However, the probability distribution of the latent variables, which the arithmetic encoder works with, is also represented by a neural network. The parameters of all neural networks in the system are learned jointly from a training set of real images by minimizing the rate-distortion cost.

One major work assumes that the latent variables within a single channel (i.e. feature map or signal band) are independent and learns a single distribution model for each channel. The same authors then extend their work by incorporating a hyperprior neural network that captures the dependencies in the latent representation and improves the compression performance significantly.

This thesis uses an alternative method to exploit the dependencies in the latent representation. The joint density of the latent representation is modeled as a product of conditional densities, which are learned using neural networks. However, each latent variable is not conditioned on all previous latent variables, as in the chain rule for factoring joint distributions, but only on a few previous variables: its left, upper and upper-left spatial neighbors, based on a Markov property assumption. The compression performance is on par with the hyperprior based work, but the conditional densities require a much simpler network than the hyperprior network in the literature. Because of this simplicity and their smaller number of parameters, the conditional densities require much less training time than the hyperprior based neural network, but their inference time is longer.
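
The conditioning scheme described above can be illustrated with a small sketch. It is not the thesis implementation: the learned neural network is replaced here by a hypothetical fixed linear predictor with a Gaussian of constant scale, and the names `causal_context`, `estimate_bits`, `weights` and `scale` are assumptions made for illustration only. The sketch shows how the left, upper and upper-left neighbors of each quantized latent variable are gathered, and how the code length follows from summing the negative log-probabilities of the conditional densities.

```python
import math
import numpy as np

# Element-wise error function built from the standard library.
_erf = np.vectorize(math.erf)


def causal_context(latent: np.ndarray) -> np.ndarray:
    """Return an (H, W, 3) array with the left, upper and upper-left
    neighbours of every position; positions outside the array read as 0."""
    padded = np.pad(latent, ((1, 0), (1, 0)))  # one zero row on top, one zero column on the left
    left = padded[1:, :-1]
    upper = padded[:-1, 1:]
    upper_left = padded[:-1, :-1]
    return np.stack([left, upper, upper_left], axis=-1)


def gaussian_cdf(x: np.ndarray, mean: np.ndarray, scale: float) -> np.ndarray:
    """Cumulative distribution function of a Gaussian, element-wise."""
    return 0.5 * (1.0 + _erf((x - mean) / (scale * math.sqrt(2.0))))


def estimate_bits(latent: np.ndarray, weights: np.ndarray, scale: float = 1.0) -> float:
    """Estimate the code length in bits of an integer-valued latent channel.

    ASSUMPTION: the conditional density of each variable is a Gaussian whose
    mean is a linear combination of its three causal neighbours. This toy
    predictor stands in for the learned network described in the abstract.
    """
    latent = latent.astype(np.float64)
    mean = causal_context(latent) @ weights  # shape (H, W)
    # Probability mass of each integer bin [q - 0.5, q + 0.5].
    pmf = gaussian_cdf(latent + 0.5, mean, scale) - gaussian_cdf(latent - 0.5, mean, scale)
    pmf = np.maximum(pmf, 1e-9)  # guard against log(0)
    return float(-np.log2(pmf).sum())


if __name__ == "__main__":
    channel = np.full((4, 4), 5.0)  # a smooth, strongly correlated toy channel
    print("bits without context:", estimate_bits(channel, np.zeros(3)))
    print("bits with left neighbour as predictor:", estimate_bits(channel, np.array([1.0, 0.0, 0.0])))
```

Because every context is built from already coded values, a decoder following this scheme would have to recover the latent variables one at a time in raster order, which is one plausible reason for the longer inference time mentioned above.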

Subject Keywords

Image compression, transform coding, deep learning, conditional modelling.

URI

http://etd.lib.metu.edu.tr/upload/12623753/index.pdf
https://hdl.handle.net/11511/44158

Collections

Graduate School of Natural and Applied Sciences, Thesis

Citation Formats

A. B. Yeşilyurt, “End-to-end learned image compression with conditional latent space modelling for entropy coding,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences. Electrical and Electronics Engineering., Middle East Technical University, 2019.