A Study of the Classification of Low-Dimensional Data with Supervised Manifold Learning

2018-01-01
Supervised manifold learning methods learn data representations by preserving the geometric structure of data while enhancing the separation between data samples from different classes. In this work, we present a theoretical study of supervised manifold learning for classification. We consider nonlinear dimensionality reduction algorithms that yield linearly separable embeddings of training data and derive generalization bounds for this type of algorithm. A necessary condition for satisfactory generalization performance is that the embedding allow the construction of a sufficiently regular interpolation function in relation to the separation margin of the embedding. We show that for supervised embeddings satisfying this condition, the classification error decays at an exponential rate with the number of training samples. Finally, we examine the separability of supervised nonlinear embeddings that aim to preserve the low-dimensional geometric structure of data based on graph representations. The proposed analysis is supported by experiments on several real data sets.
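The pipeline the abstract describes (a supervised graph-based embedding of the training data, an interpolation function that extends it to new points, and a classifier in the embedded domain) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the Gaussian graph weights, the objective L_w - mu * L_b, the regularized Gaussian RBF interpolator, the nearest-class-mean classifier, the synthetic two-class data, and all function names and parameter values.

```python
import numpy as np

def sq_dists(A, B):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d2, 0.0)

def gaussian_kernel(A, B, sigma):
    return np.exp(-sq_dists(A, B) / (2.0 * sigma**2))

def laplacian(W):
    # Unnormalized graph Laplacian L = D - W.
    return np.diag(W.sum(axis=1)) - W

def supervised_embedding(X, y, dim=1, sigma=2.0, mu=0.5):
    # Assumed objective: keep same-class neighbors close (L_w) while
    # pushing different-class samples apart (L_b), with unit-norm coordinates.
    W = gaussian_kernel(X, X, sigma)
    np.fill_diagonal(W, 0.0)
    same = y[:, None] == y[None, :]
    L_w = laplacian(W * same)
    L_b = laplacian(W * (~same))
    # eigh returns eigenvalues in ascending order; keep the smallest ones.
    _, vecs = np.linalg.eigh(L_w - mu * L_b)
    return vecs[:, :dim]

def fit_rbf_interpolator(X, Y, sigma=2.0, reg=1e-2):
    # Regularized Gaussian RBF fit of the training embedding (assumed choice).
    K = gaussian_kernel(X, X, sigma)
    C = np.linalg.solve(K + reg * np.eye(len(X)), Y)
    return lambda X_new: gaussian_kernel(X_new, X, sigma) @ C

def nearest_class_mean(Y_train, y_train, Y_new):
    # Assign each embedded point to the class with the closest mean.
    classes = np.unique(y_train)
    means = np.stack([Y_train[y_train == c].mean(axis=0) for c in classes])
    return classes[np.argmin(sq_dists(Y_new, means), axis=1)]

def demo(seed=0, n_per_class=40):
    # Synthetic data: two well-separated Gaussian blobs in R^3 (hypothetical).
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]])

    def sample(n):
        X = np.vstack([c + 0.5 * rng.standard_normal((n, 3)) for c in centers])
        return X, np.repeat([0, 1], n)

    X_tr, y_tr = sample(n_per_class)
    X_te, y_te = sample(n_per_class)
    Y_tr = supervised_embedding(X_tr, y_tr)
    f = fit_rbf_interpolator(X_tr, Y_tr)
    y_hat = nearest_class_mean(Y_tr, y_tr, f(X_te))
    return Y_tr, y_tr, float((y_hat == y_te).mean())

if __name__ == "__main__":
    _, _, acc = demo()
    print(f"test accuracy: {acc:.3f}")
```

In this toy setting the kernel width sigma controls how smooth the interpolator is, which loosely mirrors the abstract's requirement that the interpolation function be sufficiently regular relative to the separation margin of the embedding. The sketch does not reproduce the paper's bounds or experiments.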
JOURNAL OF MACHINE LEARNING RESEARCH

Suggestions

Out-of-Sample Generalizations for Supervised Manifold Learning for Classification
Vural, Elif (2016-03-01)
Supervised manifold learning methods for data classification map high-dimensional data samples to a lower dimensional domain in a structure-preserving way while increasing the separation between different classes. Most manifold learning methods compute the embedding only of the initially available data; however, the generalization of the embedding to novel points, i.e., the out-of-sample extension problem, becomes especially important in classification applications. In this paper, we propose a semi-supervis...
A Theoretical Analysis of Multi-Modal Representation Learning with Regular Functions
Vural, Elif (2021-01-07)
Multi-modal data analysis methods often learn representations that align different modalities in a new common domain, while preserving the within-class compactness and within-modality geometry and enhancing the between-class separation. In this study, we present a theoretical performance analysis for multi-modal representation learning methods. We consider a quite general family of algorithms learning a nonlinear embedding of the data space into a new space via regular functions. We derive sufficient condit...
PROGRESSIVE CLUSTERING OF MANIFOLD-MODELED DATA BASED ON TANGENT SPACE VARIATIONS
Gokdogan, Gokhan; Vural, Elif (2017-09-28)
An important research topic in recent years has been to understand and analyze manifold-modeled data for clustering and classification applications. Most clustering methods developed for data of non-linear and low-dimensional structure are based on local linearity assumptions. However, clustering algorithms based on locally linear representations can tolerate difficult sampling conditions only to some extent, and may fail for sparsely sampled data manifolds or at high-curvature regions. In this paper, w...
A neuro-fuzzy MAR algorithm for temporal rule-based systems
Sisman, NA; Alpaslan, Ferda Nur; Akman, V (1999-08-04)
This paper introduces a new neuro-fuzzy model for constructing a knowledge base of temporal fuzzy rules obtained by the Multivariate Autoregressive (MAR) algorithm. The model described contains two main parts, one for fuzzy-rule extraction and one for the storage of extracted rules. The fuzzy rules are obtained from time series data using the MAR algorithm. Time-series analysis basically deals with tabular data. It interprets the data obtained for making inferences about future behavior of the variables. Fu...
A Survey of Constrained Clustering
Dinler, Derya; Tural, Mustafa Kemal (Springer-Verlag, 2016-04-01)
Traditional data mining methods for clustering only use unlabeled data objects as input. The aim of such methods is to find a partition of these unlabeled data objects in order to discover the underlying structure of the data. In some cases, there may be some prior knowledge about the data in the form of (a small number of) labels or constraints. Applying traditional clustering methods while ignoring this prior knowledge may extract information that is irrelevant to the user. Constrained clustering, i.e.,...
Citation Formats
E. Vural, “A Study of the Classification of Low-Dimensional Data with Supervised Manifold Learning,” JOURNAL OF MACHINE LEARNING RESEARCH, pp. 1–55, 2018, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/52793.