A General Result For Selecting Balanced Unequal Probability Samples From a Stream
Abstract Probability sampling methods were developed in the framework of survey statistics. Recently, sampling methods have attracted renewed interest as a way to reduce the size of large data sets. A particular application is sampling from a data stream.
The stream is assumed to be so large that the data cannot be stored. When a new unit appears, the decision to keep it or not must be made immediately, without examining all the units that have already appeared in the stream. In this paper, we examine the existing methods for sampling with unequal probabilities from a stream. Next, we propose a general result about sampling in several phases from a balanced sample, which enables us to propose several new solutions for sampling and multi-phase sampling from a stream. Several new applications of this general result are developed.
   
Keywords balanced sampling; Chao method; two phases; sampling; stream
   
Citation Tillé, Y. (2019). A General Result For Selecting Balanced Unequal Probability Samples From a Stream. Information Processing Letters, 105840(152), 1-6.
   
Type Journal article (English)
Publication date 1-8-2019
Journal name Information Processing Letters
Volume 105840
Issue 152
Pages 1-6