Showing items 1 - 10 of 279
  • Publication
    Open access
    Gender wage difference estimation at quantile levels using sample survey data
    (2023-09-19)
    Mihaela-Cătălina Anastasiade-Guinand
    This paper is motivated by the growing interest in estimating gender wage differences in official statistics. The wage of an employee is hypothetically a reflection of her or his characteristics, such as education level or work experience. It is possible that men and women with the same characteristics earn different wages. Our goal is to estimate the differences between wages at different quantiles, using sample survey data within a superpopulation framework. To do this, we use a parametric approach based on conditional distributions of the wages as a function of some auxiliary information, as well as a counterfactual distribution. We show in our simulation studies that using auxiliary information that is well correlated with the wages reduces the variance of the counterfactual quantile estimates compared to those of the competitors. Since wage distributions are generally heavy-tailed, it is of interest to model wages with heavy-tailed distributions such as the GB2 (generalized beta of the second kind) distribution. We illustrate the approach using this distribution and the wages for men and women, using simulated and real data from the Swiss Federal Statistical Office.
  • Publication
    Open access
    An Efficient Approach for Statistical Matching of Survey Data Through Calibration, Optimal Transport and Balanced Sampling
    Statistical matching aims to integrate two statistical sources. These sources can be two samples or a sample and the entire population. If two samples have been selected from the same population and information has been collected on different variables of interest, then it is interesting to match the two surveys to analyse, for example, contingency tables or correlations. In this paper, we propose an efficient method for matching two samples that may each contain a weighting scheme. The method matches the records of the two sources. Several variants are proposed in order to create a directly usable file integrating data from both information sources.
  • Publication
    Open access
    Some Thoughts on Official Statistics and its Future
    (2021-10-19)
    In this article, we share some reflections on the state of statistical science and its evolution in the production systems of official statistics. Data sources and methods are evolving, raising questions about the future of official statistics. The history of the methods used deserves a closer look at the changes that are taking place in the world of official statistics.
  • Publication
    Open access
    Linearization and Variance Estimation of the Bonferroni Inequality Index
    (Neuchâtel Institut de Statistique Faculté des sciences, 2021)
    Giorgi, Giovanni M.; Guandalini, Alessio
    The study of income inequality is important for predicting the wealth of a country. A growing number of publications call for the use of several indices simultaneously to better account for the wealth distribution. Because income data are usually collected through sample surveys, the sampling properties of income inequality measures should not be overlooked. The most widely used inequality measure is the Gini index, and its inferential aspects have been deeply investigated. An alternative is the Bonferroni inequality index, although its inference has received less attention in the literature. The aim of this paper is to address the inference of the Bonferroni index in a finite population framework. The Bonferroni index is linearized by differentiation with respect to the sample indicators, which allows for conducting a valid inference. Furthermore, the linearized variables are used to evaluate the effects of the different observations on the Bonferroni and Gini indices. The result demonstrates once and for all that the former is more sensitive to the lowest incomes in the distribution than the latter.
  • Publication
    Open access
    Enhanced cube implementation for highly stratified population
    A balanced sampling design should always be the adopted strategy if auxiliary information is available. In addition, integrating a stratified structure of the population in the sampling process can considerably reduce the variance of the estimators. We propose here a new method to handle the selection of a balanced sample in a highly stratified population. The method substantially improves on the commonly used sampling designs and alleviates the long computation times that can arise when inclusion probabilities within strata do not sum to an integer.
  • Publication
    Temporary restriction
    Spatial Spread Sampling Using Weakly Associated Vectors
    Geographical data are generally autocorrelated. In this case, it is preferable to select spread units. In this paper, we propose a new method for selecting well-spread samples from a finite spatial population with equal or unequal inclusion probabilities. The proposed method is based on the definition of a spatial structure by using a stratification matrix. Our method exactly satisfies given inclusion probabilities and provides samples that are very well spread. A set of simulations shows that our method outperforms other existing methods such as the generalized random tessellation stratified or the local pivotal method. Analysis of the variance on a real dataset shows that our method is more accurate than these two. Furthermore, a variance estimator is proposed.
  • Publication
    Temporary restriction
    Small area estimation methods under cut-off sampling
    (2020-6-1)
    Guadarrama, María; Molina, Isabel
    Cut-off sampling is applied when obtaining the required information for a subset of units in the population is too expensive or difficult, so that these units are deliberately excluded from sample selection. If the excluded units differ from the sampled units in the characteristics of interest, naive estimators may be severely biased. Calibration estimators have been proposed to reduce the design bias. However, in small area estimation they can be inefficient even in the absence of cut-off sampling. Model-based small area estimation methods can help reduce the bias caused by cut-off sampling if the assumed model holds for the whole population. At the same time, for small areas, these methods provide more efficient estimators than calibration methods. Since model-based properties are obtained by assuming that the model holds, but no model is exactly true, we analyze here the design properties of calibration and model-based procedures for estimating small area characteristics under cut-off sampling. Our results confirm that model-based estimators reduce the bias caused by cut-off sampling and perform significantly better in terms of design mean squared error.