Graf, Monique
Name
Graf, Monique
Main affiliation
Position
Former staff member
Identifiers
Search results
Showing items 1 - 8 of 8
- Publication (metadata only): Regression for Compositions based on a Generalization of the Dirichlet Distribution
  Consider a positive random vector following a compound distribution where the compounding parameter multiplies non-random scale parameters. The associated composition is the vector divided by the sum of its components. The conditions under which the composition depends on the distribution of the compounding parameter are given. When the original vector follows a compound distribution based on independent Generalized Gamma components, the Simplicial Generalized Beta (SGB) is the most general distribution of the composition that is invariant with respect to the distribution of the compounding parameter. Some properties and moments of the SGB are derived. Conditional moments given a sub-composition provide a way to impute missing parts when only a sub-composition is known. Distributional checks are made possible through the marginal distributions of functions of the parts that should be Beta distributed. A multiple SGB regression procedure is set up and applied to data from the United Kingdom Time Use survey.
- Publication (metadata only): SGB R-package: Simplicial Generalized Beta Regression
  Package SGB contains a generalization of the Dirichlet distribution, called the Simplicial Generalized Beta (SGB). It is a new distribution on the simplex (i.e. on the space of compositions, or positive vectors whose components sum to 1). The Dirichlet distribution can be constructed from a random vector of independent Gamma variables divided by their sum. The SGB follows the same construction with generalized Gamma instead of Gamma variables. The Dirichlet exponents are supplemented by an overall shape parameter and a vector of scales. The scale vector is itself a composition and can be modeled with auxiliary variables through a log-ratio transformation.
- Publication (metadata only): A distribution on the simplex of the Generalized Beta type (2018)
  Consider a random vector with positive components following a compound distribution where the compounding parameter multiplies fixed scale parameters. The closed random vector is the vector divided by the sum of its components. We make explicit the conditions under which the distribution of the closed random vector does not depend on the mixing distribution. When the original vector has independent generalized Gamma components, it is shown that the independence of the distribution of the closed random vector from the compounding distribution depends on the parameters of the generalized Gamma. This fact is exemplified with the multivariate Generalized Beta distribution of the second kind (MGB2), in which the compounding parameter follows an inverse Gamma distribution. We call the most general distribution of the closed random vector for which the compounding parameter has no influence the simplicial Generalized Beta (SGB). Some properties and moments of the SGB are derived. Conditional moments given a sub-composition provide a way to impute missing parts when only a sub-composition is known. Maximum likelihood estimators of the parameters are obtained. The method is applied to several examples.
- Publication (metadata only): Weighted distributions
  In a super-population statistical model, a variable of interest, defined on a finite population of size N, is considered as a set of N independent realizations of the model. The log-likelihood at the population level is then written as a sum. If only a sample is observed, drawn according to a design with unequal inclusion probabilities, the log-pseudo-likelihood is the Horvitz-Thompson estimate of the population log-likelihood. In general, the extrapolation weights are multiplied by a normalization factor, in such a way that the normalized weights sum to the sample size. In a single-level design, the values of the estimated model parameters are unchanged by the scaling of weights, but this is in general not the case for multi-level models. The problem of the choice of the normalization factors in cluster sampling has been largely addressed in the literature, but no clear recommendations have been issued. It is proposed here to compute the factors in such a way that the pseudo-likelihood becomes a proper likelihood. The super-population model can be written equivalently for the variable of interest or for a transformation of this variable. It is shown that the pseudo-likelihood is not invariant under transformation of the variable of interest.
- Publication (metadata only): Bibliographie sur la stratification (Bibliography on stratification)
  The stratification of a population for a survey is generally based on a key variable that is known for the whole population and well correlated with the main survey variables. The problem of defining the stratification can be broken down into three parts: 1. finding the number of classes; 2. computing the class boundaries for a given number of classes; 3. allocation, i.e. determining the number of units to sample in each class. Multivariate stratification bases the definition of the classes on several variables. The report is being written and includes an annotated bibliography of about 160 articles.
- Publication (metadata only): Discretizing a compound distribution with application to categorical modelling. Part I: Methods (Neuchâtel: Université de Neuchâtel, Institut de Statistique, 2014)
  Many probability distributions can be represented as compound distributions. Consider some parameter vector as random. The compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of definition of the random parameters, so that we can represent the expected density of the variable of interest as a finite mixture of conditional densities. We then model the probabilities of the conditional densities using information on population categories, thus modifying the original overall model. Our examples use the European Union Statistics on Income and Living Conditions (EU-SILC) data. For each country, we estimate a mixture model derived from the GB2 in which the probability weights are predicted with household categories. Comparisons across countries are carried out using compositional data analysis tools. Our method also offers an indirect estimation of inequality and poverty indices.
- Publication (metadata only): Estimation of poverty indicators in small areas under skewed distributions (2014)
  Marin, Juan Miguel; Molina, Isabel
  The standard methods for poverty mapping at the local level assume that incomes follow a log-normal model. However, the log-normal distribution is not always well suited for modeling income, which often shows skewness even on the log scale. As an alternative, we propose to consider a much more flexible distribution called the generalized beta distribution of the second kind (GB2). The flexibility of the GB2 distribution arises from the fact that it has four parameters, in contrast with the two parameters of the log-normal. One of the parameters of the GB2 controls the shape of the left tail and another controls the shape of the right tail, making it suitable for modeling different forms of skewness. In particular, it includes the log-normal distribution as a limiting case. In this sense, it can be seen as an extension of the log-normal model that handles potential atypical or extreme values more adequately, and it has been successfully applied to model income. We propose a small area model for incomes based on a multivariate extension of the GB2 distribution. Under this model, we define empirical best (EB) estimators of general non-linear area parameters, in particular poverty indicators, and we describe how to obtain Monte Carlo approximations of the EB estimators. A parametric bootstrap procedure is proposed for estimation of the mean squared error.
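
The construction mentioned in the SGB R-package entry above (independent Gamma variables divided by their sum give a Dirichlet vector, and the SGB replaces them with generalized Gamma variables) can be illustrated with a minimal Python sketch. This is not the SGB package's API, which is written in R. The function names `closure`, `dirichlet_draw` and `sgb_like_draw` are hypothetical, and the parametrization `Y_i = b_i * G_i**(1/a)` with `G_i ~ Gamma(p_i, 1)` is an assumption about how the overall shape `a`, the scales `b` and the Dirichlet exponents `p` enter.

```python
import random


def closure(x):
    """Divide a positive vector by the sum of its components."""
    total = sum(x)
    return [v / total for v in x]


def dirichlet_draw(p, rng):
    """One Dirichlet(p) draw: independent Gamma(p_i, 1) variables, closed."""
    g = [rng.gammavariate(p_i, 1.0) for p_i in p]
    return closure(g)


def sgb_like_draw(a, b, p, rng):
    """One draw following the same construction with generalized Gamma parts.

    Assumed parametrization (illustrative only):
    Y_i = b_i * G_i ** (1 / a), with G_i ~ Gamma(p_i, 1).
    The scale vector b is closed first, since the abstract describes it
    as a composition.
    """
    scales = closure(b)
    g = [rng.gammavariate(p_i, 1.0) for p_i in p]
    y = [s * g_i ** (1.0 / a) for s, g_i in zip(scales, g)]
    return closure(y)


if __name__ == "__main__":
    rng = random.Random(0)
    p = [2.0, 3.0, 5.0]
    print(dirichlet_draw(p, rng))
    print(sgb_like_draw(a=2.0, b=[1.0, 2.0, 3.0], p=p, rng=rng))
```

Under these assumptions, setting `a = 1` with equal scales reduces `sgb_like_draw` to `dirichlet_draw`, which matches the description of the SGB as a generalization of the Dirichlet distribution.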