Preprint
Article

Stratified Finite Empirical Bernstein Sampling

This version is not peer-reviewed

Submitted: 31 May 2019

Posted: 31 May 2019

Abstract
We derive a concentration inequality for the uncertainty in the mean computed by stratified random sampling, and provide an online sampling method based on this inequality. Our concentration inequality is versatile and accounts for a range of factors, including the data ranges, weights, and sizes of the strata, the number of samples taken, the estimated sample variances, and whether strata are sampled with or without replacement. Sequentially choosing samples to minimize the bound given by this inequality leads to an online method for choosing samples from a stratified population. We evaluate the effectiveness of our method and compare it against others on synthetic data sets and in approximating the Shapley value of cooperative games. Results show that our method is competitive with Neyman sampling with perfect variance information, even without prior information on strata variances. We also provide a multidimensional extension of our inequality and discuss future applications.
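For orientation, the Neyman sampling baseline mentioned above can be sketched in a few lines. This is not the authors' online method or their inequality. It is only the classical allocation rule, which assigns samples to each stratum in proportion to the stratum size times its (assumed known) standard deviation, together with the usual size-weighted stratified mean. The function names and the fallback to proportional allocation when every standard deviation is zero are illustrative assumptions, and rounding the allocation to whole samples is deliberately left out.

```python
def neyman_allocation(sizes, std_devs, total_samples):
    """Classical Neyman allocation: n_h proportional to N_h * sigma_h.

    sizes         -- stratum population sizes N_h
    std_devs      -- stratum standard deviations sigma_h (assumed known)
    total_samples -- overall sampling budget n
    Returns fractional sample counts, one per stratum.
    """
    products = [n * s for n, s in zip(sizes, std_devs)]
    total = sum(products)
    if total == 0:
        # Assumption for this sketch: if no stratum has any spread,
        # fall back to allocation proportional to stratum size.
        products = list(sizes)
        total = sum(products)
    return [total_samples * p / total for p in products]


def stratified_mean(sizes, sample_means):
    """Population mean estimate: stratum sample means weighted by N_h / N."""
    population = sum(sizes)
    return sum(n * m for n, m in zip(sizes, sample_means)) / population
```

With two strata of sizes 100 and 300 and standard deviations 3.0 and 1.0, the products N_h * sigma_h are equal, so a budget of 60 samples is split evenly despite the difference in size. The method summarized in the abstract aims to be competitive with this allocation without being given the standard deviations in advance.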
Keywords: 
Subject: Computer Science and Mathematics  -   Probability and Statistics
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated