Showing items 1 - 10 of 30
  • Publication
    Metadata only
    Discretizing a compound distribution with application to categorical modelling
    Many probability distributions can be represented as compound distributions. Consider some parameter vector as random. The compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of definition of the random parameters, so that we can represent the expected density of the variable of interest as a finite mixture of conditional densities. We then model the mixture probabilities of the conditional densities using information on population categories, thus modifying the original overall model. We thus obtain specific models for sub-populations that stem from the overall model. The distribution of a sub-population of interest is thus completely specified in terms of mixing probabilities. All characteristics of interest can be derived from this distribution and the comparison between sub-populations easily proceeds from the comparison of the mixing probabilities. A real example based on EU-SILC data is given. Then the methodology is investigated through simulation.
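The partition idea in this abstract can be illustrated with a toy compound model. The sketch below is only an assumption-laden illustration, not the authors' GB2 decomposition: it takes X given theta to be exponential with rate theta, and theta to be Exp(1), so that the compound density is 1/(1+x)^2. The cut points and all function names are hypothetical.

```python
import math

def partition_probs(cuts):
    """P(theta in each cell) for theta ~ Exp(1) (assumed toy prior)."""
    edges = [0.0] + list(cuts) + [math.inf]
    return [math.exp(-lo) - math.exp(-hi)
            for lo, hi in zip(edges[:-1], edges[1:])]

def component_integral(x, lo, hi):
    """Integral over theta in [lo, hi] of theta*exp(-theta*x) * exp(-theta)."""
    s = 1.0 + x

    def antiderivative(t):
        # Antiderivative of theta*exp(-s*theta); it vanishes as theta -> infinity.
        if math.isinf(t):
            return 0.0
        return -(t / s + 1.0 / s ** 2) * math.exp(-s * t)

    return antiderivative(hi) - antiderivative(lo)

def conditional_densities(x, cuts):
    """Density of X given that theta falls in each cell of the partition."""
    edges = [0.0] + list(cuts) + [math.inf]
    probs = partition_probs(cuts)
    return [component_integral(x, lo, hi) / p
            for lo, hi, p in zip(edges[:-1], edges[1:], probs)]

def mixture_density(x, cuts, weights=None):
    """Finite mixture of the conditional densities.

    With the default weights (the cell probabilities) this reproduces the
    compound density; other weights give a model for a sub-population.
    """
    if weights is None:
        weights = partition_probs(cuts)
    return sum(w * f for w, f in zip(weights, conditional_densities(x, cuts)))
```

Calling `mixture_density(1.0, [0.5, 1.5])` recovers the compound density at x = 1, while passing a different `weights` vector (for example one predicted from population categories) shifts mass between the conditional densities without changing them.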
  • Publication
    Metadata only
    Modeling of income and indicators of poverty and social exclusion using the Generalized Beta Distribution of the Second Kind
    There are three reasons why estimation of parametric income distributions may be useful when empirical data and estimators are available: to stabilize estimation; to gain insight into the relationships between the characteristics of the theoretical distribution and a set of indicators, e.g. by sensitivity plots; and to deduce the whole distribution from known empirical indicators, when the raw data are not available. The European Union Statistics on Income and Living Conditions (EU-SILC) survey is used to address these issues. In order to model the income distribution, we consider the generalized beta distribution of the second kind (GB2). A pseudo-likelihood approach for fitting the distribution is considered, which takes into account the design features of the EU-SILC survey. An ad-hoc procedure for robustification of the sampling weights, which improves estimation, is presented. This method is compared to a non-linear fit from the indicators. Variance estimation within a complex survey setting of the maximum pseudo-likelihood estimates is done by linearization (a sandwich variance estimator), and a simplified formula for the sandwich variance, which accounts for clustering, is given. Performance of the fit and estimated indicators is evaluated graphically and numerically.
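As a rough illustration of the pseudo-likelihood idea mentioned in this abstract, the sketch below weights each observation's GB2 log-density by its sampling weight. It uses the standard GB2 parameterization (shape a, scale b, shape p, shape q); the function names are hypothetical, and the code does not reproduce the weight robustification or the variance estimation described by the authors.

```python
import math

def gb2_logpdf(x, a, b, p, q):
    """Log-density of the GB2 at x > 0 (a, p, q shape parameters; b scale)."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    z = (x / b) ** a
    return (math.log(a) + (a * p - 1.0) * math.log(x) - a * p * math.log(b)
            - log_beta - (p + q) * math.log1p(z))

def pseudo_loglik(incomes, weights, a, b, p, q):
    """Weighted (pseudo) log-likelihood: sum of w_i * log f(y_i)."""
    return sum(w * gb2_logpdf(y, a, b, p, q)
               for y, w in zip(incomes, weights))
```

Maximizing `pseudo_loglik` over (a, b, p, q) with a numerical optimizer would give pseudo-maximum-likelihood estimates under these assumptions; the optimizer is left out to keep the sketch self-contained.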
  • Publication
    Metadata only
    Discretizing a compound distribution with application to categorical modelling. Part I: Methods
    (Neuchâtel: Université de Neuchâtel, Institut de Statistique, 2014)
    Many probability distributions can be represented as compound distributions. Consider some parameter vector as random. The compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of definition of the random parameters, so that we can represent the expected density of the variable of interest as a finite mixture of conditional densities. We then model the probabilities of the conditional densities using information on population categories, thus modifying the original overall model. Our examples use the European Union Statistics on Income and Living Conditions (EU-SILC) data. For each country, we estimate a mixture model derived from the GB2 in which the probability weights are predicted with household categories. Comparisons across countries are processed using compositional data analysis tools. Our method also offers an indirect estimation of inequality and poverty indices.
  • Publication
    Metadata only
    Compositional analysis of a mixture distribution with application to categorical modelling
    Many probability distributions can be represented as compound distributions. Consider some parameter vector as random. The compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of definition of the random parameters, so that we can represent the expected density of the variable of interest as a finite mixture of conditional densities. We then model the probabilities of the conditional densities using information on population categories, thus modifying the original overall model. Our examples use the European Union Statistics on Income and Living Conditions (EU-SILC) data. For each country, we estimate a mixture model derived from the GB2 in which the probability weights are predicted with household categories. Comparisons across countries are processed using compositional data analysis tools. Our method also offers an indirect estimation of inequality and poverty indices.
  • Publication
    Metadata only
    GB2: Generalized Beta Distribution of the Second Kind: properties, likelihood, estimation
    (Vienna, Austria: R Foundation for Statistical Computing, 2012-11-20)
    GB2 is a simple package that explores the Generalized Beta distribution of the second kind. Density, cumulative distribution function, quantiles and moments of the distribution are given. Functions for the full log-likelihood, the profile log-likelihood and the scores are provided. Formulae for various indicators of inequality and poverty under the GB2 are implemented. The GB2 is fitted by maximum pseudo-likelihood estimation, using the full and profile log-likelihood, and by non-linear least squares estimation of the model parameters. Various plots for the visualization and analysis of the results are provided. Variance estimation of the parameters is provided for the method of maximum pseudo-likelihood estimation. A compound distribution based on the GB2 is presented. This compound distribution can be calculated by a left or right tail decomposition. Density, cumulative distribution function, moments and quantiles for the compound distribution can be calculated. The compound distribution is fitted using the method of maximum likelihood estimation. The fit can also be adapted for the use of auxiliary information.
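For reference, the density and moments named in this abstract take the following standard form. The symbols are an assumption of this note, since the abstract does not fix a notation: $a$, $p$, $q$ are the shape parameters, $b$ is the scale parameter, and $B(\cdot,\cdot)$ is the beta function.

```latex
f(x; a, b, p, q) = \frac{a\, x^{ap-1}}{b^{ap}\, B(p, q)\, \left[1 + (x/b)^{a}\right]^{p+q}},
\qquad x > 0,

\mathrm{E}\left[X^{k}\right] = b^{k}\, \frac{B\!\left(p + k/a,\; q - k/a\right)}{B(p, q)},
\qquad -ap < k < aq .
```

The moment of order $k$ exists only inside the stated range, so the mean ($k = 1$) requires $aq > 1$.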
  • Publication
    Metadata only
    Bias Robustness and Efficiency in Model-Based Inference
    In model-based inference, the selection of balanced samples has been considered to give protection against misspecification of the model. A recent development in finite population sampling is that balanced samples can be randomly selected. There are several possible strategies that use balanced samples. We give a definition of balanced sample that embodies overbalanced, mean-balanced, and $\pi$-balanced samples, and we derive strategies in order to equalize a $d$-weighted estimator with the best linear unbiased estimator. We show the value of selecting a balanced sample with inclusion probabilities proportional to the standard deviations of the errors with the Horvitz-Thompson estimator. This is a strategy that is design-robust and efficient. We show its superiority compared to other strategies that use balanced samples in the model-based framework. In particular, we show that this strategy is preferable to the use of overbalanced samples in the polynomial model. The problem of bias-robustness is also discussed, and we show how overspecifying the model can protect against misspecification.
  • Publication
    Metadata only
    Parametric modelling of income and indicators of poverty and social exclusion from EU-SILC data and simulated universe
    In the context of the AMELI project, we aim at developing reliable and efficient methodologies for the estimation of a certain set of indicators, computed within the EU-SILC survey, i.e. the median, the at-risk-of-poverty rate (ARPR), the relative median poverty gap (RMPG), the quintile share ratio (QSR) and the Gini index. The reason why parametric estimation may be useful when empirical data and estimators are available is threefold: 1. to stabilize estimation; 2. to get insight into the relationships between the characteristics of the theoretical distribution and a set of indicators, e.g. by sensitivity plots; 3. to deduce the whole distribution from known empirical indicators, when the raw data are not available. Special emphasis is laid on the Generalized Beta distribution of the second kind (GB2), derived by McDonald (1984). Apart from the scale parameter, this distribution has three shape parameters: the first governing the overall shape, the second the lower tail and the third the upper tail of the distribution. These characteristics give the GB2 great flexibility for fitting a wide range of empirical distributions, and it has been established that it outperforms other four-parameter distributions for income data (Kleiber and Kotz, 2003). We have studied different types of estimation methods, taking into account the design features of the EU-SILC surveys. Pseudo-maximum likelihood estimation of the parameters is compared with a nonlinear fit from the indicators. Variance estimation is done by linearization and different types of simplified formulas for the variance proposed in the literature are evaluated by simulation. The GB2 can be represented as a compound distribution, the compounding parameter being the scale parameter. We use this property to decompose the GB2 distribution into a mixture of component distributions, to refine the GB2 fit of the income variable for subgroups and to improve the GB2 estimates of the indicators.
    Computations are made on the synthetic universe AMELIA constructed from the EU-SILC data (Alfons et al., 2011) and the simulation is done with the R package simFrame (Alfons, 2010). Both AMELIA and simFrame are developed in the context of the AMELI project. The parametric methods we have developed are made available in the R package GB2, which is part of the output of the AMELI project.
    References:
    1) AMELI website.
    2) Alfons, A. (2010). simFrame: Simulation framework. R package version 0.2. URL http://CRAN.R-project.org/package=simFrame.
    3) Alfons, A., Templ, M., Filzmoser, P., Kraft, S., Hulliger, B., Kolb, J.-P., and Münnich, R. (2011). Synthetic data generation of SILC data. Deliverable 6.2 of the AMELI project.
    4) Graf, M. and Nedyalkova, D. (2010). GB2: Generalized Beta Distribution of the Second Kind: properties, likelihood, estimation. R package version 1.0.
    5) Kleiber, C. and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences. John Wiley & Sons, Hoboken, NJ.
  • Publication
    Metadata only
    Quality of EU-SILC data
    (Trier: University of Trier, 2011)
    Wenger, A.
  • Publication
    Metadata only
    Report on the Simulation Results
    (Trier: University of Trier, 2011)
    Hulliger, B.; Alfons, A.; Bruch, Ch.; Filzmoser, P.; Kolb, J.-P.; Lehtonen, R.; Lussmann, D.; Meraner, A.; Münnich, R.; Schoch, T.; Templ, M.; Valaste, M.; Veijanen, A.; Zins, S.