We propose a forward-only multi-layer encoding-decoding framework based on the principle of Maximal Coding Rate Reduction (MCR$^2$), an information-theoretic metric that measures the statistical distance between two sets of feature vectors up to the second moment. The encoder transforms the data vectors directly by gradient ascent, maximizing the MCR$^2$ distance between different classes in the feature space, which yields class-wise mutually orthogonal subspace representations. The decoder mirrors the encoder: it transforms the subspace feature vectors by gradient descent, minimizing the MCR$^2$ distance between the reconstructed data and the original data. We show that the encoder maps data to linear discriminative representations without breaking the higher-order manifolds, and that the decoder reconstructs the data with high fidelity.
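To make the encoder objective concrete, the following is a minimal NumPy sketch of the coding rate reduction between classes and one gradient-ascent step on the features. It assumes the commonly used form of the coding rate, $R(Z) = \frac{1}{2}\log\det\left(I + \frac{d}{n\epsilon^2} Z Z^\top\right)$ for $Z \in \mathbb{R}^{d \times n}$, with the rate reduction taken as $R(Z)$ minus the size-weighted sum of the per-class rates. The function names, the distortion `eps`, and the step size `eta` are illustrative assumptions and are not taken from the paper; the sketch is not the authors' implementation.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Assumed form: R(Z) = 1/2 logdet(I + d/(n eps^2) Z Z^T), Z of shape (d, n)."""
    d, n = Z.shape
    gram = np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T)
    return 0.5 * np.linalg.slogdet(gram)[1]

def rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole set minus the size-weighted rates of each class."""
    n = Z.shape[1]
    within = sum(
        (np.sum(labels == c) / n) * coding_rate(Z[:, labels == c], eps)
        for c in np.unique(labels)
    )
    return coding_rate(Z, eps) - within

def rate_reduction_grad(Z, labels, eps=0.5):
    """Gradient of rate_reduction with respect to Z (same shape as Z)."""
    d, n = Z.shape
    alpha = d / (n * eps**2)
    # Expansion term: pushes all features apart.
    grad = alpha * np.linalg.solve(np.eye(d) + alpha * (Z @ Z.T), Z)
    for c in np.unique(labels):
        mask = labels == c
        Zc = Z[:, mask]
        n_c = Zc.shape[1]
        alpha_c = d / (n_c * eps**2)
        # Compression term: pulls features of the same class together.
        grad[:, mask] -= (n_c / n) * alpha_c * np.linalg.solve(
            np.eye(d) + alpha_c * (Zc @ Zc.T), Zc
        )
    return grad

def ascent_step(Z, labels, eta=0.1, eps=0.5):
    """One hypothetical encoder layer: a single gradient-ascent update of Z."""
    return Z + eta * rate_reduction_grad(Z, labels, eps)

# Two classes in orthogonal subspaces versus two classes sharing one subspace.
labels = np.array([0, 0, 1, 1])
Z_orth = np.eye(4)
Z_same = np.hstack([np.eye(4)[:, :2], np.eye(4)[:, :2]])
print(rate_reduction(Z_orth, labels), rate_reduction(Z_same, labels))
```

Under these assumptions, the rate reduction is positive when the two classes occupy orthogonal subspaces and zero when they occupy the same subspace, which is the sense in which maximizing it drives classes toward mutually orthogonal subspaces. A decoder in the spirit of the abstract would apply the analogous update with the opposite sign to an MCR$^2$ distance between reconstructed and original data.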
Subject: Computer Science and Mathematics - Computer Science
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.