ON DISENTANGLED REPRESENTATION LEARNING

2024-7
MOĞULTAY ÖZCAN, HAZAL
Disentanglement is the problem of obtaining a representation in which the underlying sources of variation that generate the data are captured independently. A pioneering disentangled representation learning (DRL) method is the Beta Variational Autoencoder (β-VAE), which introduces a hyperparameter β, optimized empirically, to weight the disentanglement term of the VAE loss function. In this thesis, we make three contributions to improve the disentanglement capacity of β-VAE. First, to estimate the β parameter automatically, we propose the Learnable VAE (L-VAE). L-VAE mitigates the hyperparameter-optimization problem of β-VAE by learning the relative weights of the terms in the loss function, dynamically controlling the trade-off between disentanglement and reconstruction. These weights and the parameters of the model architecture are learned concurrently, eliminating the cost of empirical hyperparameter search. Both β-VAE and L-VAE apply the same weight to every dimension of the representation; however, we show that each dimension exhibits a different degree of disentanglement. To learn an independent weight per dimension dynamically, we propose the Multi-Dimensional Learnable VAE (mdL-VAE) as an extension of L-VAE. We show that both methods achieve an on-par or better disentanglement-reconstruction trade-off without tuning a β hyperparameter, and that mdL-VAE provides useful insights into the entanglement between the underlying factors of variation. Finally, we introduce a novel correlation-based disentanglement (CbD) measure that allows the degree of disentanglement of each dimension of the representation to be interpreted robustly. We demonstrate that CbD provides complementary and useful insights into the disentanglement performance of different disentanglement methods.
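The β-weighted objective described above can be sketched as follows. This is a minimal illustrative sketch, not the thesis's implementation: the function names, the closed-form Gaussian KL term, and the reduction over batch and latent dimensions are all assumptions. Passing a scalar β recovers the β-VAE objective, while passing a per-dimension β vector mimics the mdL-VAE idea of one weight per latent dimension.

```python
import numpy as np

def gaussian_kl_per_dim(mu, logvar):
    # Closed-form KL( N(mu, sigma^2) || N(0, 1) ) for each latent dimension,
    # with logvar = log(sigma^2).
    return 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar)

def beta_vae_loss(recon_err, mu, logvar, beta):
    # recon_err: scalar reconstruction loss (illustrative; any metric works).
    # mu, logvar: (batch, latent_dim) Gaussian posterior parameters.
    # beta: scalar (beta-VAE style) or (latent_dim,) vector (mdL-VAE style);
    #       numpy broadcasting handles both cases in the product below.
    kl = gaussian_kl_per_dim(mu, logvar)            # (batch, latent_dim)
    return recon_err + np.sum(beta * kl, axis=-1).mean()
```

With a standard-normal posterior (mu = 0, logvar = 0) the KL term vanishes and the loss reduces to the reconstruction error alone, regardless of β; a per-dimension β vector simply re-weights each latent dimension's KL contribution before summation.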
Citation Formats
H. MOĞULTAY ÖZCAN, “ON DISENTANGLED REPRESENTATION LEARNING,” Ph.D. - Doctoral Program, Middle East Technical University, 2024.