Showing items 1 - 3 of 3
  • Publication
    Open access
    An Efficient Approach for Statistical Matching of Survey Data Through Calibration, Optimal Transport and Balanced Sampling
    Statistical matching aims to integrate two statistical sources. These sources can be two samples or a sample and the entire population. If two samples have been selected from the same population and information has been collected on different variables of interest, then it is interesting to match the two surveys to analyse, for example, contingency tables or correlations. In this paper, we propose an efficient method for matching two samples that may each contain a weighting scheme. The method matches the records of the two sources. Several variants are proposed in order to create a directly usable file integrating data from both information sources.
  • Publication
    Open access
    Enhanced cube implementation for highly stratified population
    A balanced sampling design should always be the adopted strategy if auxiliary information is available. In addition, integrating a stratified structure of the population in the sampling process can considerably reduce the variance of the estimators. We propose here a new method to handle the selection of a balanced sample in a highly stratified population. The method substantially improves on the commonly used sampling designs and reduces the time-consuming problem that can arise if inclusion probabilities within strata do not sum to an integer.
  • Publication
    Temporary restriction
    Spatial Spread Sampling Using Weakly Associated Vectors
    Geographical data are generally autocorrelated. In this case, it is preferable to select spread units. In this paper, we propose a new method for selecting well-spread samples from a finite spatial population with equal or unequal inclusion probabilities. The proposed method is based on the definition of a spatial structure by using a stratification matrix. Our method exactly satisfies given inclusion probabilities and provides samples that are very well spread. A set of simulations shows that our method outperforms other existing methods such as the generalized random tessellation stratified or the local pivotal method. Analysis of the variance on a real dataset shows that our method is more accurate than these two. Furthermore, a variance estimator is proposed.