NEUROCOMPUTING, vol. 650, 2025 (SCI-Expanded, Scopus)
The major goal of disentangled representation learning is to form a representation space which independently captures the underlying sources of variation responsible for generating the data. A pioneering approach is suggested by the group of methods based on Autoencoders (AE), such as Variational Autoencoders (VAE), β-Variational Autoencoders (β-VAE), α-VAE, Control-VAE, Dynamic-VAE, and Learnable-VAE (L-VAE). These methods incorporate a disentanglement term, mostly expressed as a Kullback-Leibler divergence, together with several hyperparameters and regularization terms in the loss function. They assume an equal degree of disentanglement for the sources in different dimensions of the representation by using an empirically adjusted and fixed β parameter (β = 0 or β ≥ 1) across all dimensions. However, given the unobservable nature of the data-generating process and potential entanglements of different sources, we expect distinct dimensions of the learned representation to exhibit varying degrees of disentanglement. In this study, we generalize the variational autoencoder and its variants by introducing a set of flexible weight functions and regularization terms for different dimensions. This generalization enables us to disentangle each latent dimension by learning the weight function of each dimension independently. We also propose a special case of the generalized VAE, called the Multidimensional Learnable Variational Autoencoder (mdL-VAE), which provides a better disentanglement-reconstruction trade-off without empirically tuning the hyperparameters of the loss function. The learned weight functions of mdL-VAE provide useful insights into the degree of entanglement among the underlying factors of variation.
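To make the core idea concrete, the sketch below shows a per-dimension weighted KL term for a diagonal-Gaussian VAE encoder, where the single fixed β of β-VAE is replaced by one learnable weight per latent dimension. This is only an illustrative sketch of the general idea described in the abstract, not the paper's actual mdL-VAE formulation; the class name `PerDimWeightedKL` and the parameterization via `log_beta` are assumptions introduced here for illustration.

```python
# Minimal sketch of a per-dimension weighted KL term, assuming a
# diagonal-Gaussian encoder q(z|x) = N(mu, diag(exp(logvar))) and a
# standard normal prior. The learnable weights stand in for the flexible
# per-dimension weight functions described in the abstract; the paper's
# exact parameterization is not specified here.
import torch
import torch.nn as nn

class PerDimWeightedKL(nn.Module):
    def __init__(self, latent_dim: int):
        super().__init__()
        # One learnable weight per latent dimension, beta_d = exp(log_beta_d) > 0,
        # replacing the single fixed beta shared by all dimensions in beta-VAE.
        self.log_beta = nn.Parameter(torch.zeros(latent_dim))

    def forward(self, mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
        # Analytic KL( N(mu_d, sigma_d^2) || N(0, 1) ) per dimension,
        # averaged over the batch: 0.5 * (mu^2 + sigma^2 - 1 - log sigma^2).
        kl_per_dim = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).mean(dim=0)
        # Weight each dimension's KL independently, then sum over dimensions.
        return (self.log_beta.exp() * kl_per_dim).sum()

# Usage (hypothetical): total = reconstruction_loss + weighted_kl(mu, logvar),
# with the weights optimized jointly with the encoder/decoder parameters.
```

Because each weight is trained rather than hand-tuned, dimensions whose factors are harder to separate can receive a different effective β, which is the kind of varying per-dimension disentanglement pressure the abstract motivates.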