Convention Université de Neuchâtel/Office fédéral de la statistique
Project title
Convention Université de Neuchâtel/Office fédéral de la statistique
Description
Framework agreement between the University of Neuchâtel and the Swiss Federal Statistical Office. A work programme is established each year. The research concerns the development of statistical methods in the field of official statistics.
Principal investigator
Status
Completed
Start date
1 January 2001
End date
1 January 2016
Organizations
Internal identifier
16871
49 results
Showing items 1 - 10 of 49
- Publication (metadata only): Decomposition of Gender Wage Inequalities through Calibration: Application to the Swiss Structure of Earnings Survey (2017-12-21)
  This paper proposes a new approach to decomposing the wage difference between men and women, based on a calibration procedure. This approach generalizes two current decomposition methods, re-expressed using survey weights: the Blinder-Oaxaca method and a reweighting method proposed by DiNardo, Fortin and Lemieux. The new approach provides a weighting system that enables the estimation of parameters of interest such as quantiles. An application to data from the Swiss Structure of Earnings Survey demonstrates the value of this method.
- Publication (metadata only): La variance sous calage: Mode d'emploi de la macro SURVEYCAL
  The SAS macro SURVEYCAL, programmed by Monique Graf, is the result of a mandate entrusted to the Institute of Statistics of the University of Neuchâtel by the METH section of the Swiss Federal Statistical Office. Its purpose is to extend the results provided by the SAS procedure SURVEYMEANS to the case of calibration on margins. That procedure provides design-based estimation methods for surveys with a fixed-size (stratified, clustered) design. SURVEYCAL handles practically all the cases covered by SURVEYMEANS. This document is first of all a user guide for SURVEYCAL. It is complemented by illustrations using data from the SILC 2009 survey, and by recommendations for choosing, on the one hand, the calibration method and, on the other, the linearization-based variance computation. An original method is introduced for computing the calibration bounds in the truncated linear and logit cases.
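The wage-decomposition abstract above builds on the Blinder-Oaxaca method. As background, here is a minimal sketch of the classical two-fold Blinder-Oaxaca decomposition on simulated data (not the calibration-based generalization the paper proposes); all variable names and the simulated coefficients are illustrative assumptions:

```python
import numpy as np

def blinder_oaxaca(X_m, y_m, X_f, y_f):
    """Two-fold Blinder-Oaxaca decomposition of the mean wage gap.

    Splits mean(y_m) - mean(y_f) into an explained part (differences in
    characteristics) and an unexplained part (differences in returns),
    using the male coefficient vector as the reference wage structure.
    """
    # OLS fit within each group; X columns include an intercept
    beta_m, *_ = np.linalg.lstsq(X_m, y_m, rcond=None)
    beta_f, *_ = np.linalg.lstsq(X_f, y_f, rcond=None)
    xbar_m, xbar_f = X_m.mean(axis=0), X_f.mean(axis=0)
    explained = (xbar_m - xbar_f) @ beta_m     # endowment effect
    unexplained = xbar_f @ (beta_m - beta_f)   # coefficient effect
    return explained, unexplained

# Illustrative simulated data: log wages driven by years of education
rng = np.random.default_rng(0)
n = 500
X_m = np.column_stack([np.ones(n), rng.normal(12, 2, n)])
X_f = np.column_stack([np.ones(n), rng.normal(11, 2, n)])
y_m = X_m @ np.array([1.0, 0.08]) + rng.normal(0, 0.1, n)
y_f = X_f @ np.array([0.9, 0.08]) + rng.normal(0, 0.1, n)
explained, unexplained = blinder_oaxaca(X_m, y_m, X_f, y_f)
```

Because each within-group OLS fit (with intercept) reproduces the group mean, the two parts sum exactly to the raw mean gap; the calibration approach in the paper replaces this regression machinery with survey weights, which is what makes quantile decompositions possible.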
- Publication (open access): Imputation of income variables in a survey context and estimation of variance for indicators of poverty and social exclusion (2014-11-25)
  We present an imputation method for income variables that allows direct analysis of the distribution of such data, in particular the estimation of complex statistics such as indicators of poverty and social exclusion, together with the estimation of their precision.
- Publication (metadata only): Weighted distributions
  In a super-population statistical model, a variable of interest, defined on a finite population of size N, is considered as a set of N independent realizations of the model. The log-likelihood at the population level is then written as a sum. If only a sample is observed, drawn according to a design with unequal inclusion probabilities, the log-pseudo-likelihood is the Horvitz-Thompson estimate of the population log-likelihood. In general, the extrapolation weights are multiplied by a normalization factor, in such a way that the normalized weights sum to the sample size. In a single-level design, the values of the estimated model parameters are unchanged by the scaling of the weights, but this is in general not the case for multi-level models. The problem of choosing the normalization factors in cluster sampling has been widely addressed in the literature, but no clear recommendations have been issued. It is proposed here to compute the factors in such a way that the pseudo-likelihood becomes a proper likelihood. The super-population model can be written equivalently for the variable of interest or for a transformation of this variable. It is shown that the pseudo-likelihood is not invariant under transformation of the variable of interest.
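The single-level invariance claim in the abstract above can be checked directly. A minimal sketch, assuming a normal super-population model (chosen because its weighted pseudo-MLE has a closed form) and simulated inclusion probabilities:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(50, 10, size=200)        # sampled values of the study variable
pi = rng.uniform(0.1, 0.9, size=200)    # inclusion probabilities (illustrative)
w = 1.0 / pi                            # Horvitz-Thompson extrapolation weights

def weighted_normal_mle(y, w):
    """Maximize the weighted (pseudo) log-likelihood sum_i w_i * log phi(y_i; mu, sigma).

    For the normal model the maximizer has closed form: the weighted mean
    and the weighted variance.
    """
    mu = np.sum(w * y) / np.sum(w)
    sigma2 = np.sum(w * (y - mu) ** 2) / np.sum(w)
    return mu, sigma2

# Normalizing the weights to sum to the sample size rescales every w_i by
# the same constant, which cancels in the ratios above: in a single-level
# design the pseudo-MLE is unchanged.
w_norm = w * len(y) / np.sum(w)
```

In a multi-level model the weights enter the likelihood at several levels and the constant no longer cancels, which is exactly why the choice of normalization factor matters there.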
- Publication (metadata only): Variance Estimation for Regression Imputed Quantiles, a First Step towards Variance Estimation for Inequality Indicators (2014-08-20)
  In a sample survey, only a sub-part of the selected sample responds (total non-response, treated by re-weighting). Moreover, some respondents do not answer all questions (partial non-response, treated through imputation). We are interested in income-type variables and suppose that the imputation is carried out by regression. We take up the idea presented by Deville and Särndal in 1994, which consists in constructing an unbiased estimator of the variance of a total based solely on the known information (the selected sample and the subset of respondents). While these authors dealt with a conventional total of a variable of interest y, we reproduce a similar development for the case where the total considered is that of the linearized variable of quantiles or of inequality indicators, and is, furthermore, computed from the imputed variable y. By means of simulations on real survey data, we show that regression imputation can have a substantial impact on the bias and variance estimates of inequality indicators. This leads to a method capable of accounting for the variance due to imputation, in addition to that due to the sampling design, in the case of quantiles.
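To make the setting of the abstract above concrete, here is a minimal sketch of deterministic regression imputation of an income variable followed by a quantile computed on the completed data. The data, response rate, and regression model are illustrative assumptions; this is the imputation step only, not the Deville-Särndal variance estimator itself:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x = rng.lognormal(10, 0.3, n)             # auxiliary variable, e.g. register income
y = 0.8 * x + rng.normal(0, 2000, n)      # income of interest
respond = rng.random(n) < 0.7             # item response indicator (partial non-response)

# Regression imputation: fit on respondents, predict for non-respondents
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X[respond], y[respond], rcond=None)
y_imp = y.copy()
y_imp[~respond] = X[~respond] @ beta      # imputed values lie on the regression line

# Median of the completed variable. Deterministic regression imputation
# compresses the distribution around the fitted line, which is why quantile
# and inequality-indicator variance estimation must account for imputation.
median_imp = np.quantile(y_imp, 0.5)
```

Treating `y_imp` as if it were fully observed understates the uncertainty; the paper's contribution is a variance estimator that adds the imputation component to the design component for such quantiles.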
- Publication (metadata only): Quelques Remarques sur un Petit Exemple de Jean-Claude Deville au Sujet de la Non-Réponse Non-Ignorable (2016-12-20)
  A small example presented by Jean-Claude Deville in 2005 is subjected to three estimation methods: the method of moments, maximum likelihood, and generalized calibration. The three methods give exactly the same results for both non-response models. We then discuss how to choose the most appropriate model.
- Publication (metadata only): Estimation of poverty indicators in small areas under skewed distributions (2014)
  Marin, Juan Miguel; Molina, Isabel
  The standard methods for poverty mapping at the local level assume that incomes follow a log-normal model. However, the log-normal distribution is not always well suited to modelling income, which often shows skewness even on the log scale. As an alternative, we propose a much more flexible distribution called the generalized beta distribution of the second kind (GB2). The flexibility of the GB2 distribution arises from the fact that it has four parameters, in contrast with the two parameters of the log-normal. One parameter of the GB2 controls the shape of the left tail and another controls the shape of the right tail, making it suitable for modelling different forms of skewness. In particular, it includes the log-normal distribution as a limiting case; in this sense, it can be seen as an extension of the log-normal model that handles potential atypical or extreme values more adequately, and it has been successfully applied to modelling income. We propose a small area model for incomes based on a multivariate extension of the GB2 distribution. Under this model, we define empirical best (EB) estimators of general non-linear area parameters, in particular poverty indicators, and we describe how to obtain Monte Carlo approximations of the EB estimators. A parametric bootstrap procedure is proposed for estimating the mean squared error.
- Publication (metadata only): Regression for Compositions based on a Generalization of the Dirichlet Distribution
  Consider a positive random vector following a compound distribution in which the compounding parameter multiplies non-random scale parameters. The associated composition is the vector divided by the sum of its components. The conditions under which the composition depends on the distribution of the compounding parameter are given.
  When the original vector follows a compound distribution based on independent Generalized Gamma components, the Simplicial Generalized Beta (SGB) is the most general distribution of the composition that is invariant with respect to the distribution of the compounding parameter. Some properties and moments of the SGB are derived. Conditional moments given a sub-composition provide a way to impute missing parts when only a sub-composition is known. Distributional checks are made possible through the marginal distributions of functions of the parts, which should be Beta distributed. A multiple SGB regression procedure is set up and applied to data from the United Kingdom Time Use Survey.
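The invariance mechanism behind the SGB result above can be seen in a few lines: a compounding parameter that multiplies every component cancels when the vector is closed to a composition. A minimal simulation sketch, using plain Gamma components and a log-normal compounding parameter as illustrative assumptions (the paper works with Generalized Gamma components):

```python
import numpy as np

rng = np.random.default_rng(3)
D, n = 4, 1000
# Independent Gamma components with fixed (non-random) scale parameters
shape = np.array([2.0, 1.0, 3.0, 0.5])
scale = np.array([1.0, 2.0, 0.5, 1.5])
z = rng.gamma(shape, scale, size=(n, D))

# Compounding parameter: one positive draw per vector, multiplying all parts
u = rng.lognormal(0.0, 1.0, size=(n, 1))

def closure(v):
    """Composition: each vector divided by the sum of its components."""
    return v / v.sum(axis=1, keepdims=True)

# u appears in both numerator and denominator of each part, so the
# composition of u * z is identical to that of z: the distribution of the
# composition does not depend on the distribution of u at all.
comp = closure(u * z)
```

This is why compositional models such as the Dirichlet (and its SGB generalization) can ignore the overall size of the vector: only the relative parts carry information after closure.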
- Publication (metadata only): Discretizing a compound distribution with application to categorical modelling (2017-02-17)
  Many probability distributions can be represented as compound distributions. Consider some parameter vector as random. The compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of the random parameters so that the expected density of the variable of interest can be represented as a finite mixture of conditional densities. We then model the mixture probabilities of the conditional densities using information on population categories, thus modifying the original overall model. We thereby obtain specific models for sub-populations that stem from the overall model. The distribution of a sub-population of interest is then completely specified in terms of mixing probabilities. All characteristics of interest can be derived from this distribution, and comparisons between sub-populations follow easily from comparisons of the mixing probabilities. A real example based on EU-SILC data is given, and the methodology is then investigated through simulation.
- Publication (metadata only): Une propriété intéressante de l'entropie de certains plans d'échantillonnage (2010-12-21)
  Haziza, David