Showing items 1 - 5 of 5
  • Publication
    Open access
    Some Thoughts on Official Statistic and its Future
    (2021-10-19)
    In this article, we share some reflections on the state of statistical science and its evolution in the production systems of official statistics. Data sources and methods are evolving, raising questions about the future of official statistics. The history of the methods used deserves a closer look at the changes that are taking place in the world of official statistics.
  • Publication
    Open access
    A General Result For Selecting Balanced Unequal Probability Samples From a Stream
    (2019-08-01)
    Probability sampling methods were developed in the framework of survey statistics. Recently, sampling methods have attracted renewed interest as a way to reduce the size of large data sets. A particular application is sampling from a data stream. The stream is assumed to be so large that the data cannot be stored. When a new unit appears, the decision to keep it or not must be made immediately, without examining all the units that have already appeared in the stream. In this paper, we examine the existing methods for sampling with unequal probabilities from a stream. Next, we propose a general result about sampling in several phases from a balanced sample, which enables us to propose several new solutions for sampling and multi-phase sampling from a stream. Several new applications of this general result are developed.
  • Publication
    Metadata only
    Balanced k-Nearest Neighbor Imputation
    In order to overcome the problem of item nonresponse, random imputation methods are often used because they tend to preserve the distribution of the imputed variable. Among the random imputation methods, the random hot-deck has the interesting property of imputing observed values. A new random hot-deck imputation method is proposed. The key innovation of this method is that the selection of donors is viewed as a sampling problem and uses calibration and balanced sampling. This approach makes it possible to select donors such that if the auxiliary variables were imputed, their estimated totals would not change. As a consequence, very accurate and stable estimates of totals can be obtained. Moreover, donors are selected in neighborhoods of recipients. In this way, the missing value of a recipient is replaced with an observed value of a similar unit. This second approach can greatly improve the quality of the estimates. Finally, these two approaches imply underlying models, and the method is resistant to model misspecification.
  • Publication
    Metadata only
    Variance estimation of the Gini index: Revisiting a result several times published
    Since Corrado Gini suggested the index that bears his name as a way of measuring inequality, the computation of the variance of the Gini index has been the subject of numerous publications. In this paper, we survey a large part of the literature related to the topic and show that the same results, as well as the same errors, have been republished several times, often with a clear lack of reference to previous work. Whereas the existing literature on the subject is very fragmented, we bring together papers from various fields and attempt to give a wider view of the problem. Moreover, we try to explain how this situation occurred and the main issues involved when trying to perform inference on the Gini index, especially under complex sampling designs. The interest of several linearization methods is discussed and the contribution of recent papers is evaluated. Also, a general result to linearize a quadratic form is given, allowing the approximation of the variance to be computed in only a few lines of calculation. Finally, the relevance of the regression-based approach is evaluated and an empirical comparison is proposed.
  • Publication
    Open access
    Inference by linearization for Zenga’s new inequality index: a comparison with the Gini index
    Zenga’s new inequality curve and index are two recent tools for measuring inequality. Proposed in 2007, they should not be mistaken for earlier measures suggested by the same author. This paper focuses on the new measures only, which are hereafter referred to simply as the Zenga curve and Zenga index. The Zenga curve Z(alpha) involves the ratio of the mean income of the 100 alpha% poorest to that of the 100(1-alpha)% richest. The Zenga index can also be expressed by means of the Lorenz curve, and some of its properties make it an interesting alternative to the Gini index. As with most other inequality measures, inference on the Zenga index is not straightforward. Some research on its properties and on estimation has already been conducted, but inference in the sampling framework is still needed. In this paper, we propose an estimator and a variance estimator for the Zenga index when estimated from a complex sampling design. The proposed variance estimator is based on linearization techniques and more specifically on the direct approach presented by Demnati and Rao. The quality of the resulting estimators is evaluated in Monte Carlo simulation studies on real sets of income data. Finally, the advantages of the Zenga index relative to the Gini index are discussed.
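
To make the two inequality measures named in the last two abstracts concrete, the following is a minimal sketch of unweighted point estimates on a small list of incomes. It is not the estimator or variance estimator proposed in either paper: sampling weights and complex designs are ignored, the function names `gini_index` and `zenga_index` are hypothetical, and the Zenga index is approximated here by averaging 1 minus the ratio of the lower-group mean to the upper-group mean over the n - 1 split points of the sorted data, which is only one possible discretization.

```python
def gini_index(incomes):
    """Unweighted sample Gini index (no sampling weights; illustrative only)."""
    x = sorted(incomes)
    n = len(x)
    total = sum(x)
    if n == 0 or total <= 0:
        raise ValueError("need at least one income and a positive total")
    # Rank-weighted sum over the sorted incomes, ranks starting at 1.
    ranked_sum = sum(rank * value for rank, value in enumerate(x, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n


def zenga_index(incomes):
    """Assumed discretization of the Zenga index: average over the n - 1
    split points of 1 - (mean of the poorer group) / (mean of the richer group)."""
    x = sorted(incomes)
    n = len(x)
    if n < 2 or x[-1] <= 0:
        raise ValueError("need at least two incomes and a positive maximum")
    total = sum(x)
    lower_sum = 0.0
    curve_points = []
    for i in range(1, n):  # the i poorest units versus the n - i richest
        lower_sum += x[i - 1]
        lower_mean = lower_sum / i
        upper_mean = (total - lower_sum) / (n - i)
        curve_points.append(1.0 - lower_mean / upper_mean)
    return sum(curve_points) / len(curve_points)


if __name__ == "__main__":
    sample = [1, 3]
    print(gini_index(sample), zenga_index(sample))
```

With equal incomes both functions return 0. In the extreme case where one unit holds all the income, this Zenga approximation returns 1 while the unweighted Gini formula returns (n - 1)/n.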