Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
ASISO: A Robust and Stable Data Synthesis Method to Optimize Samples
Received: 18 August 2023 / Approved: 18 August 2023 / Online: 18 August 2023 (12:00:13 CEST)
A peer-reviewed article of this Preprint also exists.
Du, Y.; Cai, Y.; Jin, X.; Wang, H.; Li, Y.; Lu, M. ASIDS: A Robust Data Synthesis Method for Generating Optimal Synthetic Samples. Mathematics 2023, 11, 3891.
Abstract
Most existing data synthesis methods are designed to tackle problems such as dataset imbalance, data anonymization, and insufficient sample size. Effective synthesis methods are lacking for expanding datasets that have a limited number of samples, a large number of features, and unknown noise. We propose a data synthesis method named Adaptive Subspace Interpolation for Sample Optimization (ASISO). The idea is to divide the original feature space into several subspaces, each containing an equal number of samples, and then to interpolate between samples in adjacent subspaces. The method can adaptively adjust the size of a dataset containing unknown noise, and the expanded data typically deviate only minimally from the actual values. Moreover, it adjusts the structure of the samples, which can significantly reduce the proportion of samples with large errors. In addition, the hyperparameters of the method have an intuitive interpretation and usually require little calibration. Experimental results on artificial data and benchmark datasets demonstrate that ASISO is a robust and stable method for optimizing samples.
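The core idea stated in the abstract (equal-sized subspaces, then interpolation between adjacent ones) can be illustrated with a minimal Python sketch. This is not the authors' ASISO algorithm, whose details are not given on this page. The function name `interpolate_adjacent_subspaces`, the choice to order samples along the first feature, the index-wise pairing of samples, and the random convex weights are all assumptions made purely for illustration.

```python
import numpy as np

def interpolate_adjacent_subspaces(X, y, n_subspaces=4, seed=0):
    """Illustrative sketch only; NOT the authors' ASISO procedure."""
    if n_subspaces < 2:
        raise ValueError("need at least two subspaces to interpolate between")
    rng = np.random.default_rng(seed)
    # Assumption: "adjacent" is defined by ordering samples along feature 0.
    order = np.argsort(X[:, 0], kind="stable")
    # Split the ordered indices into groups of (nearly) equal size.
    groups = np.array_split(order, n_subspaces)
    new_X, new_y = [], []
    for left, right in zip(groups[:-1], groups[1:]):
        m = min(len(left), len(right))
        # Assumption: pair the i-th sample of one group with the i-th of the next.
        a, b = left[:m], right[:m]
        # Assumption: one random convex weight in [0, 1) per pair.
        t = rng.uniform(0.0, 1.0, size=(m, 1))
        new_X.append((1 - t) * X[a] + t * X[b])
        new_y.append((1 - t[:, 0]) * y[a] + t[:, 0] * y[b])
    return np.vstack(new_X), np.concatenate(new_y)

# Toy usage: 12 samples, 2 features, a target that is linear in feature 0.
X = np.arange(24.0).reshape(12, 2)
y = 2.0 * X[:, 0]
X_syn, y_syn = interpolate_adjacent_subspaces(X, y, n_subspaces=4)
```

In this toy case the target is linear in the first feature, so every interpolated point lies on the same line. With noisy data, interpolating between neighbouring samples would tend to average out part of the noise, which is one plausible reading of the abstract's claim about reduced error.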
Keywords
data synthesis; unknown noise; interpolation; sample optimization; robust
Subject
Computer Science and Mathematics, Probability and Statistics
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.