TY - UNPB
TI - Discretizing a compound distribution with application to categorical modelling. Part I: Methods
KW - GB2 distribution, Mixture distribution, Maximum pseudolikelihood estimation, Sandwich variance estimator, Income distribution, Inequality and poverty indicators.
LA - en
AU - Graf, M.
AU - Nedyalkova, D.
PY - 2014
AB - Many probability distributions can be represented as compound distributions. Consider some parameter vector as random. The compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of definition of the random parameters, so that we can represent the expected density of the variable of interest as a finite mixture of conditional densities. We then model the probabilities of the conditional densities using information on population categories, thus modifying the original overall model. Our examples use the European Union Statistics on Income and Living Conditions (EU-SILC) data. For each country, we estimate a mixture model derived from the GB2 in which the probability weights are predicted with household categories. Comparisons across countries are processed using compositional data analysis tools. Our method also offers an indirect estimation of inequality and poverty indices.
PB - Université de Neuchâtel
CY - Neuchâtel
T3 - Institut de Statistique
ER -
TY - CONF
TI - Compositional analysis of a mixture distribution with application to categorical modelling
UR - http://www.statistik.tuwien.ac.at/CoDaWork/CoDaWork2013Proceedings.pdf
AU - Graf, M.
AU - Nedyalkova, D.
LA - en
PY - 2013
DA - 10.12
C2 - 2013
AB - Many probability distributions can be represented as compound distributions. Consider some parameter vector as random. The compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of definition of the random parameters, so that we can represent the expected density of the variable of interest as a finite mixture of conditional densities. We then model the probabilities of the conditional densities using information on population categories, thus modifying the original overall model. Our examples use the European Union Statistics on Income and Living Conditions (EU-SILC) data. For each country, we estimate a mixture model derived from the GB2 in which the probability weights are predicted with household categories. Comparisons across countries are processed using compositional data analysis tools. Our method also offers an indirect estimation of inequality and poverty indices.
KW - GB2 distribution, Mixture distribution, Maximum pseudolikelihood estimation, Sandwich variance estimator, Income distribution, Inequality and poverty indicators.
T2 - CoDaWork 2013
CY - Vorau, Austria
SP - 53
ER -
TY - JOUR
TI - Modeling of income and indicators of poverty and social exclusion using the Generalized Beta Distribution of the Second Kind
LA - en
AU - Graf, M.
AU - Nedyalkova, D.
PY - 2013
DA - .
T2 - Review of Income and Wealth
VL - forthcoming
SP - 1
EP - 20
ER -
TY - JOUR
TI - Bias Robustness and Efficiency in Model-Based Inference
UR - http://www3.stat.sinica.edu.tw/statistica/oldpdf/A22n215.pdf
KW - Balanced sampling, finite population sampling, polynomial model, ratio model, robust estimation
LA - en
AU - Nedyalkova, D.
AU - Tillé, Y.
PY - 2012
DA - 4.9
AB - In model-based inference, the selection of balanced samples has been considered to give protection against misspecification of the model. A recent development in finite population sampling is that balanced samples can be randomly selected. There are several possible strategies that use balanced samples. We give a definition of balanced sample that embodies overbalanced, mean-balanced, and $\pi$-balanced samples, and we derive strategies in order to equalize a $d$-weighted estimator with the best linear unbiased estimator. We show the value of selecting a balanced sample with inclusion probabilities proportional to the standard deviations of the errors with the Horvitz-Thompson estimator. This is a strategy that is design-robust and efficient. We show its superiority compared to other strategies that use balanced samples in the model-based framework. In particular, we show that this strategy is preferable to the use of overbalanced samples in the polynomial model. The problem of bias-robustness is also discussed, and we show how overspecifying the model can protect against misspecification.
T2 - Statistica Sinica
VL - 22
SP - 777
EP - 794
ER -
TY - RPRT
TI - GB2: Generalized Beta Distribution of the Second Kind: properties, likelihood, estimation
UR - http://cran.r-project.org/web/packages/GB2/index.html
AU - Graf, M.
AU - Nedyalkova, D.
PY - 2012
DA - 20.11
AB - GB2 is a simple package that explores the Generalized Beta distribution of the second kind. Density, cumulative distribution function, quantiles and moments of the distribution are given. Functions for the full log-likelihood, the profile log-likelihood and the scores are provided. Formulae for various indicators of inequality and poverty under the GB2 are implemented. The GB2 is fitted by maximum pseudo-likelihood estimation, using the full and profile log-likelihood, and by non-linear least squares estimation of the model parameters. Various plots for the visualization and analysis of the results are provided. Variance estimation of the parameters is provided for the method of maximum pseudo-likelihood estimation. A compound distribution based on the GB2 is presented. This compound distribution can be calculated by a left or right tail decomposition. Density, cumulative distribution function, moments and quantiles for the compound distribution can be calculated. The compound distribution is fitted using the method of maximum likelihood estimation. The fit can also be adapted for the use of auxiliary information.
PB - R Foundation for statistical computing
CY - Vienna (Austria)
LA - en
SP - 44
ER -
TY - RPRT
TI - Report on the Simulation Results
AU - Hulliger, B.
AU - Alfons, A.
AU - Bruch, C.
AU - Filzmoser, P.
AU - Graf, M.
AU - Kolb, J. P.
AU - Lehtonen, R.
AU - Lussmann, D.
AU - Meraner, A.
AU - Münnich, R.
AU - Nedyalkova, D.
AU - Schoch, T.
AU - Templ, M.
AU - Valaste, M.
AU - Veijanen, A.
AU - Zins, S.
PY - 2011
DA - .
PB - University of Trier
CY - Trier
LA - en
T2 - AMELI - Advanced Methodology for European Laeken Indicators
M3 - Research Project Report
SN - WP7, D7.1, FP7-SSH-2007-217322
ER -
TY - RPRT
TI - Policy Recommendations and Methodological Report
AU - Münnich, R.
AU - Zins, S.
AU - Alfons, A.
AU - Bruch, C.
AU - Filzmoser, P.
AU - Graf, M.
AU - Hulliger, B.
AU - Kolb, J. P.
AU - Lehtonen, R.
AU - Lussmann, D.
AU - Meraner, A.
AU - Myrskylä, M.
AU - Nedyalkova, D.
AU - Schoch, T.
AU - Templ, M.
AU - Valaste, M.
AU - Veijanen, A.
PY - 2011
DA - .
PB - University of Trier
CY - Trier
LA - en
T2 - AMELI - Advanced Methodology for European Laeken Indicators
M3 - Research Project Report
SN - WP10, D10.1, D10.2, FP7-SSH-2007-217322
ER -
TY - RPRT
TI - Quality of EU-SILC data
AU - Graf, M.
AU - Wenger, A.
AU - Nedyalkova, D.
PY - 2011
DA - .
PB - University of Trier
CY - Trier
LA - en
T2 - AMELI - Advanced Methodology for European Laeken Indicators
M3 - Research Project Report
SN - WP5, D5.1, FP7-SSH-2007-217322
ER -
TY - SLIDE
TI - Parametric modelling of income and indicators of poverty and social exclusion from EU-SILC data and simulated universe
LA - en
AU - Nedyalkova, D.
AU - Graf, M.
PY - 2011
DA - 23.5
AB - In the context of the AMELI project, we aim at developing reliable and efficient methodologies for the estimation of a certain set of indicators, computed within the EU-SILC survey, i.e. the median, the at-risk-of-poverty rate (ARPR), the relative median poverty gap (RMPG), the quintile share ratio (QSR) and the Gini index.
The reason why parametric estimation may be useful when empirical data and estimators are available is threefold: 1. to stabilize estimation; 2. to get insight into the relationships between the characteristics of the theoretical distribution and a set of indicators, e.g. by sensitivity plots; 3. to deduce the whole distribution from known empirical indicators, when the raw data are not available.
Special emphasis is laid on the Generalized Beta distribution of the second kind (GB2), derived by McDonald (1984). Apart from the scale parameter, this distribution has three shape parameters: the first governs the overall shape, the second the lower tail, and the third the upper tail of the distribution. These characteristics give the GB2 great flexibility for fitting a wide range of empirical distributions, and it has been established that it outperforms other four-parameter distributions for income data (Kleiber and Kotz, 2003).
We have studied different types of estimation methods, taking into account the design features of the EU-SILC surveys. Pseudo-maximum likelihood estimation of the parameters is compared with a nonlinear fit from the indicators. Variance estimation is done by linearization and different types of simplified formulas for the variance proposed in the literature are evaluated by simulation.
The GB2 can be represented as a compound distribution, the compounding parameter being the scale parameter. We use this property to decompose the GB2 distribution into a mixture of component distributions, to refine the GB2 fit of the income variable for subgroups, and to improve the GB2 estimates of the indicators.
Computations are made on the synthetic universe AMELIA constructed from the EU-SILC data (Alfons et al., 2011) and the simulation is done with the R package simFrame (Alfons, 2010). Both AMELIA and simFrame were developed in the context of the AMELI project. The parametric methods we have developed are made available in the R package GB2, which is part of the output of the AMELI project.
Ref: 1) AMELI website
2) Alfons, A. (2010). simFrame: Simulation framework. R package version 0.2. URL http://CRAN.R-project.org/package=simFrame.
3) Alfons, A., Templ, M., Filzmoser, P., Kraft, S., Hulliger, B., Kolb, J.-P., and Münnich, R. (2011). Synthetic data generation of SILC data. Deliverable 6.2 of the AMELI project.
4) Graf, M. and Nedyalkova, D. (2010). GB2: Generalized Beta Distribution of the Second Kind: properties, likelihood, estimation. R package version 1.0.
5) Kleiber, C. and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences. John Wiley & Sons, Hoboken, NJ.
ER -
TY - CHAP
TI - Tirages coordonnés d'échantillons poissoniens
T2 - Pratiques et méthodes de sondage
CY - Paris
AU - Nedyalkova, D.
AU - Qualité, L.
AU - Tillé, Y.
A3 - M. E. Tremblay
A3 - P. Lavallée
A3 - M. El Haj Tirari
LA - fr
PY - 2011
T3 - Pratiques et méthodes de sondage
PB - Dunod
ER -
TY - SLIDE
TI - Tirages coordonnés d'échantillons poissoniens
LA - fr
AU - Qualité, L.
AU - Nedyalkova, D.
AU - Tillé, Y.
PY - 2010
DA - .3
ER -
TY - SLIDE
TI - Tirages coordonnés d'échantillons poissoniens
LA - fr
AU - Nedyalkova, D.
AU - Qualité, L.
AU - Tillé, Y.
PY - 2009
DA - .1
ER -
TY - THES
TI - Evaluation and development of strategies for sample coordination and statistical inference in finite population sampling
UR - http://doc.rero.ch/lm.php?url=1000,40,4,20090707141936-WZ/Th_nedyalkovaD.pdf
LA - en
AU - Nedyalkova, D.
PY - 2009
PB - Université de Neuchâtel
CY - Neuchâtel
T2 - Institut de statistique
M3 - PhD in Statistics
ER -
TY - JOUR
TI - General framework for the rotation of units in repeated survey sampling
UR - http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9574.2009.00423.x/abstract
KW - sampling algorithms;sample coordination;business surveys;systematic sampling
LA - en
AU - Nedyalkova, D.
AU - Qualité, L.
AU - Tillé, Y.
PY - 2009
DA - 23.3
AB - Coordination of probabilistic samples is a challenging theoretical problem faced by statistical institutes. One of their aims is to obtain good estimates for each wave while spreading the response burden across the entire population. There is a collection of existing solutions that try to attend to these needs. These solutions, which were developed independently, are integrated in a general framework and their corresponding longitudinal designs are computed. The properties of these longitudinal designs are discussed. It is also noted that there is an antagonism between a good rotation and control over the cross-sectional sampling design. A compromise needs to be reached between the quality of the sample coordination, which appears to be optimal for a systematic longitudinal sampling design, and the freedom of choice of the cross-sectional design. In order to reach such a compromise, an algorithm that uses a new method of longitudinal sampling is proposed.
T2 - Statistica Neerlandica
IS - 3
VL - 63
SP - 269
EP - 293
ER -
TY - RPRT
TI - Tirages coordonnés d'échantillons à entropie maximale
UR - https://www2.unine.ch/files/content/sites/statistics/files/shared/Publications/Ned_Qua_Til_09_Tirages_coord_%20d%27%C3%A9chant_%20%C3%A0_entropie_max.pdf
AU - Nedyalkova, D.
AU - Qualité, L.
AU - Tillé, Y.
PY - 2009
DA - 14.3
PB - Université de Neuchâtel
CY - Neuchâtel
LA - fr
ER -
TY - SLIDE
TI - General Framework for the Rotation of Units in Repeated Survey Sampling - Un Cadre général pour la rotation des individus dans les enquêtes répétées
LA - en
AU - Nedyalkova, D.
AU - Qualité, L.
AU - Tillé, Y.
PY - 2008
DA - .5
ER -
TY - SLIDE
TI - General Framework for the Rotation of Units in Repeated Survey Sampling
LA - en
AU - Nedyalkova, D.
AU - Qualité, L.
PY - 2008
DA - .6
ER -
TY - SLIDE
TI - Optimal Sampling and Estimation Strategies under Linear Model - Stratégies d'échantillonnage et d'estimation optimales sous un modèle linéaire
LA - en
AU - Nedyalkova, D.
AU - Tillé, Y.
PY - 2008
DA - .5
ER -
TY - JOUR
TI - Optimal sampling and estimation strategies under linear model
UR - http://biomet.oxfordjournals.org/content/95/3/521.abstract
KW - Balanced sampling, Design-based inference, Finite population sampling, Fully explainable heteroscedasticity, Model-assisted inference, Model-based inference, Optimal strategy
LA - en
AU - Nedyalkova, D.
AU - Tillé, Y.
PY - 2008
DA - 23.3
AB - In some cases model-based and model-assisted inferences can lead to very different estimators. These two paradigms are not so different if we search for an optimal strategy rather than just an optimal estimator, a strategy being a pair composed of a sampling design and an estimator. We show that, under a linear model, the optimal model-assisted strategy consists of a balanced sampling design with inclusion probabilities that are proportional to the standard deviations of the errors of the model and the Horvitz–Thompson estimator. If the heteroscedasticity of the model is 'fully explainable' by the auxiliary variables, then this strategy is also optimal in a model-based sense. Moreover, under balanced sampling and with inclusion probabilities that are proportional to the standard deviation of the model, the best linear unbiased estimator and the Horvitz–Thompson estimator are equal. Finally, it is possible to construct a single estimator for both the design and model variance. The inference can thus be valid under the sampling design and under the model.
T2 - Biometrika
IS - 3
VL - 95
SP - 521
EP - 537
ER -
TY - JOUR
TI - Sampling Procedures for Coordinating Stratified Samples: Methods Based on Microstrata
UR - http://onlinelibrary.wiley.com/doi/10.1111/j.1751-5823.2008.00057.x/abstract
LA - en
AU - Nedyalkova, D.
AU - Tillé, Y.
AU - Pea, J.
PY - 2008
DA - 23.3
AB - The aim of sampling coordination is to maximize or minimize the overlap between several samples drawn successively in a population that changes over time. Therefore, the selection of a new sample will depend on the samples previously drawn. In order to obtain a larger (or smaller) overlap of the samples than the one obtained by independent selection of samples, a dependence between the samples must be introduced. This dependence will emphasize (or limit) the number of common units in the selected samples. Several methods for coordinating stratified samples, such as the Kish & Scott method, the Cotton & Hesse method, and the Rivière method, have already been developed. Using simulations, we compare the optimality of these methods and their quality of coordination. We present six new methods based on permanent random numbers (PRNs) and microstrata. These new methods have the advantage of allowing us to choose between positive or negative coordination with each of the previous samples. Simulations are run to test the validity of each of them.
T2 - International Statistical Review
IS - 3
VL - 76
SP - 368
EP - 386
ER -
TY - CHAP
TI - Tirages coordonnés d'échantillons stratifiés : méthodes basées sur des microstrates
T2 - Méthodes d'enquêtes : applications aux enquêtes longitudinales, à la santé et aux enquêtes électorales
CY - Paris
UR - http://www.dunod.com/sciences-techniques/sciences-fondamentales/mathematiques/master-et-doctorat-capes-agreg/methodes-de-sondage
AU - Nedyalkova, D.
AU - Tillé, Y.
A3 - P. Guilbert
A3 - D. Haziza
A3 - A. Ruiz-Gazen
A3 - Y. Tillé
LA - fr
PY - 2008
T3 - Méthodes d'enquêtes : applications aux enquêtes longitudinales, à la santé et aux enquêtes électorales
PB - Dunod
ER -
TY - SLIDE
TI - Sampling procedures for coordinating stratified samples: methods based on microstrata
LA - en
AU - Nedyalkova, D.
AU - Tillé, Y.
AU - Pea, J.
PY - 2007
DA - .12
ER -
TY - SLIDE
TI - Tirages coordonnés d'échantillons stratifiés : méthodes basées sur des microstrates
LA - fr
AU - Nedyalkova, D.
AU - Pea, J.
PY - 2007
DA - .11
ER -
TY - RPRT
TI - A Review of Some Current Methods of Coordination of Stratified Samples. Introduction and Comparison of New Methods Based on Microstrata
UR - https://www2.unine.ch/files/content/sites/statistics/files/shared/Publications/Ned_Pea_Til_09_A_Review_of_Some_Curren_Methods_of_Coordination_of_Stratified%20Samples.pdf
AU - Nedyalkova, D.
AU - Pea, J.
AU - Tillé, Y.
PY - 2006
DA - 14.3
PB - Université de Neuchâtel
CY - Neuchâtel
LA - en
ER -
TY - CONF
TI - Comparaison entre l'inférence basée sur le modèle et sur le plan de sondage pour estimer un total dans une population finie
AU - Nedyalkova, D.
AU - Tillé, Y.
LA - fr
PY - 2006
DA - .
T2 - 38èmes journées de Statistique
ER -