Parametric modelling of income and indicators of poverty and social exclusion from EU-SILC data and simulated universe
Publication date
2011-5-23
Abstract
In the context of the AMELI project, we aim to develop reliable and efficient methods for estimating a set of indicators computed from the EU-SILC survey: the median, the at-risk-of-poverty rate (ARPR), the relative median poverty gap (RMPG), the quintile share ratio (QSR) and the Gini index.
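As an illustration of the empirical indicators listed above, here is a minimal sketch in Python. It assumes the usual EU-SILC conventions, which the abstract does not spell out: the poverty threshold is 60% of the weighted median, the RMPG compares the median income of the poor with that threshold, and the QSR is the income total of the top quintile divided by that of the bottom quintile. The function names and the trapezoidal Lorenz-curve formula for the Gini index are choices made for this sketch, not the official Eurostat computation.

```python
def weighted_quantile(values, weights, prob):
    """Smallest value whose cumulative weight share reaches prob."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for value, w in pairs:
        cum += w
        if cum >= prob * total:
            return value
    return pairs[-1][0]


def poverty_indicators(income, weights, threshold_share=0.6):
    """Weighted median, ARPR, RMPG, QSR and Gini index (illustrative)."""
    total_w = sum(weights)
    median = weighted_quantile(income, weights, 0.5)

    # At-risk-of-poverty threshold: assumed to be 60% of the median.
    arpt = threshold_share * median
    poor = [(y, w) for y, w in zip(income, weights) if y < arpt]
    arpr = sum(w for _, w in poor) / total_w

    # Relative gap between the threshold and the median income of the poor.
    if poor:
        median_poor = weighted_quantile(
            [y for y, _ in poor], [w for _, w in poor], 0.5
        )
        rmpg = (arpt - median_poor) / arpt
    else:
        rmpg = 0.0

    # Quintile share ratio: income total above the 80% quantile divided by
    # the income total at or below the 20% quantile.
    q20 = weighted_quantile(income, weights, 0.2)
    q80 = weighted_quantile(income, weights, 0.8)
    bottom = sum(w * y for y, w in zip(income, weights) if y <= q20)
    top = sum(w * y for y, w in zip(income, weights) if y > q80)
    qsr = top / bottom

    # Gini index: one minus twice the area under the Lorenz curve,
    # with the area computed by the trapezoidal rule.
    pairs = sorted(zip(income, weights))
    total_income = sum(w * y for y, w in pairs)
    cum_income = 0.0
    area = 0.0
    for y, w in pairs:
        previous = cum_income
        cum_income += w * y
        area += (w / total_w) * (previous + cum_income) / total_income
    gini = 1.0 - area

    return {"median": median, "arpr": arpr, "rmpg": rmpg,
            "qsr": qsr, "gini": gini}
```

Calling `poverty_indicators(income, weights)` with two equal-length lists of incomes and sampling weights returns the five indicators in a dictionary.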
Parametric estimation can be useful even when empirical data and estimators are available, for three reasons: 1. to stabilize estimation; 2. to gain insight into the relationships between the characteristics of the theoretical distribution and a set of indicators, e.g. through sensitivity plots; 3. to deduce the whole distribution from known empirical indicators when the raw data are not available.
Special emphasis is placed on the Generalized Beta distribution of the second kind (GB2), derived by McDonald (1984). Apart from the scale parameter, this distribution has three shape parameters: the first governs the overall shape, the second the lower tail and the third the upper tail of the distribution. These characteristics give the GB2 great flexibility in fitting a wide range of empirical distributions, and it has been shown to outperform other four-parameter distributions for income data (Kleiber and Kotz, 2003).
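For reference, the GB2 density can be written as follows. The symbols a, b, p and q do not appear in the abstract; they are assumed here, following the common convention in which b is the scale parameter and a, p, q are the three shape parameters in the order described above.

```latex
f(x; a, b, p, q) =
  \frac{a \,(x/b)^{ap-1}}
       {b \, B(p,q) \,\bigl[1 + (x/b)^{a}\bigr]^{p+q}},
  \qquad x > 0, \quad a, b, p, q > 0,
\qquad
B(p,q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)} .
```

The density behaves like x^{ap-1} near zero and like x^{-aq-1} for large x, which is why p is associated with the lower tail and q with the upper tail.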
We have studied different estimation methods that take into account the design features of the EU-SILC surveys. Pseudo-maximum likelihood estimation of the parameters is compared with a nonlinear fit from the indicators. Variance estimation is done by linearization, and several simplified variance formulas proposed in the literature are evaluated by simulation.
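To make the pseudo-likelihood idea concrete, the sketch below evaluates a weighted GB2 log-likelihood in Python. It assumes that the sampling weights are the only design feature entering the criterion and uses the common (a, b, p, q) parametrization; both are assumptions of this sketch, as are the function names. The maximization step and the linearization-based variance are not shown.

```python
import math


def gb2_logpdf(x, a, b, p, q):
    """Log-density of the GB2 at x > 0 (shape a, scale b, shapes p and q)."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    z = (x / b) ** a
    return (math.log(a)
            + (a * p - 1.0) * math.log(x)
            - a * p * math.log(b)
            - log_beta
            - (p + q) * math.log1p(z))


def pseudo_loglik(income, weights, a, b, p, q):
    """Weighted log-likelihood: each unit contributes its sampling weight
    times the log-density of its income."""
    return sum(w * gb2_logpdf(y, a, b, p, q)
               for y, w in zip(income, weights))
```

A pseudo-maximum likelihood estimate would be the parameter vector maximizing `pseudo_loglik`; any general-purpose numerical optimizer could be used for that step.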
The GB2 can be represented as a compound distribution, the compounding parameter being the scale parameter. We use this property to decompose the GB2 distribution into a mixture of component distributions, to refine the GB2 fit of the income variable for subgroups and to improve the GB2 estimates of the indicators.
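The compound representation can be illustrated by simulation. The sketch below assumes the standard form of the property: conditional on a random scale, income follows a generalized gamma distribution, and the scale itself follows an inverse generalized gamma distribution. The function name and the (a, b, p, q) notation are assumptions of this sketch, and the code does not reproduce the mixture decomposition used in the presentation.

```python
import random


def rgb2_compound(n, a, b, p, q, rng):
    """Draw n GB2(a, b, p, q) variates through the compound representation."""
    draws = []
    for _ in range(n):
        # Step 1: random scale, inverse generalized gamma:
        # theta = b * G2**(-1/a) with G2 ~ Gamma(q, 1).
        theta = b * rng.gammavariate(q, 1.0) ** (-1.0 / a)
        # Step 2: given theta, generalized gamma with scale theta:
        # X = theta * G1**(1/a) with G1 ~ Gamma(p, 1).
        draws.append(theta * rng.gammavariate(p, 1.0) ** (1.0 / a))
    return draws
```

Combining the two steps gives X = b (G1/G2)^(1/a), the usual way of generating GB2 variates. When p equals q, the ratio G1/G2 has median 1, so the sample median should be close to the scale b, which gives a quick sanity check.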
Computations are made on the synthetic universe AMELIA, constructed from the EU-SILC data (Alfons et al., 2011), and the simulation is done with the R package simFrame (Alfons, 2010). Both AMELIA and simFrame were developed in the context of the AMELI project. The parametric methods we have developed are available in the R package GB2, which is part of the output of the AMELI project.
References: 1) AMELI website
2) Alfons, A. (2010). simFrame: Simulation framework. R package version 0.2. URL http://CRAN.R-project.org/package=simFrame.
3) Alfons, A., Templ, M., Filzmoser, P., Kraft, S., Hulliger, B., Kolb, J.-P., and Münnich, R. (2011). Synthetic data generation of SILC data. Deliverable 6.2 of the AMELI project.
4) Graf, M. and Nedyalkova, D. (2010). GB2: Generalized Beta Distribution of the Second Kind: properties, likelihood, estimation. R package version 1.0.
5) Kleiber, C. and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences. John Wiley & Sons, Hoboken, NJ.
Notes
Seminar, OFS, Neuchâtel, Switzerland
Identifiers
Publication type
conference presentation