Nedyalkova, Desislava
Search results
Showing items 1 - 10 of 30
- Publication (open access): Evaluation and development of strategies for sample coordination and statistical inference in finite population sampling

  This Ph.D. thesis concentrates on two important subjects in survey sampling theory: the foundation of statistical inference in finite population sampling, and the coordination of samples over time. The thesis is based on four articles; three of them have already been published in international journals and the fourth has been submitted for publication.

  First, we show that model-based and design-based inference can be reconciled if we search for an optimal strategy rather than just an optimal estimator, a strategy being a pair composed of a sampling design and an estimator. If we accept the idea that balanced samples are randomly selected, e.g. by the cube method, then we show that, under the linear model, an optimal strategy consists of a balanced sampling design with inclusion probabilities proportional to the standard deviations of the model errors, combined with the Horvitz-Thompson estimator. Moreover, if the heteroscedasticity of the model is fully explainable by the auxiliary variables, then the best linear unbiased estimator and the Horvitz-Thompson estimator coincide. We construct a single estimator of both the design variance and the model variance, so that the inference is valid under the sampling design and under the model. Finally, we show that this strategy remains robust and efficient when the model is misspecified, using an example under a polynomial model.

  Coordination of probabilistic samples is a challenging theoretical problem faced by statistical institutes. One of their aims is to maximize or minimize the overlap between several samples drawn successively in a population that changes over time; to do so, a dependence between the samples must be introduced. Several methods for coordinating stratified samples have already been developed. Using simulations, we compare their optimality and quality of coordination. We present new methods based on permanent random numbers (PRNs) and microstrata, which have the advantage of allowing us to choose between positive and negative coordination with each of the previous samples; simulations are run to test the validity of each of them. Another aim of sample coordination is to obtain good estimates for each wave while spreading the response burden across the entire population. We review the existing solutions, compute their corresponding longitudinal designs and discuss their properties. We note that there is an antagonism between good rotation and control over the cross-sectional sampling design. To reach a compromise between the quality of coordination and the freedom of choice of the cross-sectional design, we propose an algorithm that uses a new method of longitudinal sampling.
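The coordination part of this abstract rests on the classical permanent-random-number device for Poisson sampling; the microstrata methods proposed in the thesis build on it but are not reproduced here. The following Python sketch, with a made-up population, inclusion probabilities and study variable, illustrates positive versus negative coordination of two Poisson samples and the Horvitz-Thompson estimator mentioned in the first part of the abstract.

```python
import numpy as np

rng = np.random.default_rng(42)

N = 1000                      # hypothetical population size
prn = rng.uniform(size=N)     # permanent random numbers, one per unit

def poisson_sample(pi, u):
    """Poisson sampling driven by random numbers u: unit k is selected iff u[k] < pi[k]."""
    return np.flatnonzero(u < pi)

# Illustrative inclusion probabilities for two successive waves.
pi1 = np.full(N, 0.10)
pi2 = np.full(N, 0.12)

# Positive coordination: reuse the same PRNs, which maximises the overlap.
s1 = poisson_sample(pi1, prn)
s2_pos = poisson_sample(pi2, prn)

# Negative coordination: shift the PRNs modulo 1 before the second draw,
# which pushes units selected at wave 1 away from the selection zone.
s2_neg = poisson_sample(pi2, (prn + 0.5) % 1.0)

print("overlap under positive coordination:", np.intersect1d(s1, s2_pos).size)
print("overlap under negative coordination:", np.intersect1d(s1, s2_neg).size)

# Horvitz-Thompson estimate of the population total of a synthetic variable y,
# computed from the first-wave sample: sum over the sample of y_k / pi_k.
y = rng.gamma(shape=2.0, scale=50.0, size=N)
print("HT estimate:", np.sum(y[s1] / pi1[s1]), "  true total:", y.sum())
```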
- Publication (metadata only): Tirages coordonnés d'échantillons stratifiés : méthodes basées sur des microstrates (Paris: Dunod, 2008)
  ; ; ; Guilbert, Philippe; Haziza, David; Ruiz-Gazen, Anne

- Publication (metadata only): Discretizing a compound distribution with application to categorical modelling (2017-2-17)
  Many probability distributions can be represented as compound distributions. Consider some parameter vector as random: the compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of the random parameters, so that we can represent the expected density of the variable of interest as a finite mixture of conditional densities. We then model the mixing probabilities of the conditional densities using information on population categories, thus modifying the original overall model. We thereby obtain specific models for sub-populations that stem from the overall model. The distribution of a sub-population of interest is thus completely specified in terms of mixing probabilities; all characteristics of interest can be derived from this distribution, and the comparison between sub-populations proceeds directly from the comparison of the mixing probabilities. A real example based on EU-SILC data is given, and the methodology is then investigated through simulation.
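A minimal formal sketch of the decomposition described in the abstract above, under the assumption that the random parameter vector θ has density g and that its domain is partitioned into sets A_1, ..., A_J (the notation is illustrative, not quoted from the paper):

```latex
% Compound density of the variable of interest Y, with random parameter \theta ~ g:
f(y) \;=\; \int f(y \mid \theta)\, g(\theta)\, d\theta .

% Partitioning the domain of \theta into A_1, \dots, A_J turns the same density
% into a finite mixture of conditional densities:
f(y) \;=\; \sum_{j=1}^{J} p_j\, f_j(y),
\qquad
p_j = \Pr(\theta \in A_j),
\qquad
f_j(y) = \frac{1}{p_j} \int_{A_j} f(y \mid \theta)\, g(\theta)\, d\theta .

% Sub-population models are then obtained by replacing the mixing probabilities
% p_j with category-specific probabilities while keeping the f_j fixed.
```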
- Publication (metadata only)

- Publication (metadata only): Policy Recommendations and Methodological Report (Trier: University of Trier, 2011)
  ; Münnich, R.; Zins, S.; Alfons, A.; Bruch, Ch.; Filzmoser, P.; ; Hulliger, B.; Kolb, J.-P.; Lehtonen, R.; Lussmann, D.; Meraner, A.; Myrskylä, M.; ; Schoch, T.; Templ, M.; Valaste, M.; Veijanen, A.

- Publication (metadata only): Tirages coordonnés d'échantillons poissoniens (Paris: Dunod, 2011)
  ; ; ; ; Tramblay, Marie-Eve; Lavallée, Pierre; El Haj Tirari, Mohammed

- Publication (metadata only)
- Publication (metadata only): Parametric modelling of income and indicators of poverty and social exclusion from EU-SILC data and simulated universe (2011-5-23)

  In the context of the AMELI project, we aim at developing reliable and efficient methodologies for the estimation of a set of indicators computed within the EU-SILC survey: the median, the at-risk-of-poverty rate (ARPR), the relative median poverty gap (RMPG), the quintile share ratio (QSR) and the Gini index. Parametric estimation may be useful even when empirical data and estimators are available, for three reasons: 1. to stabilize estimation; 2. to get insight into the relationships between the characteristics of the theoretical distribution and a set of indicators, e.g. by sensitivity plots; 3. to deduce the whole distribution from known empirical indicators when the raw data are not available.

  Special emphasis is laid on the generalized beta distribution of the second kind (GB2), derived by McDonald (1984). Apart from the scale parameter, this distribution has three shape parameters: the first governs the overall shape, the second the lower tail and the third the upper tail of the distribution. These characteristics give the GB2 great flexibility for fitting a wide range of empirical distributions, and it has been established that it outperforms other four-parameter distributions for income data (Kleiber and Kotz, 2003). We have studied different types of estimation methods, taking into account the design features of the EU-SILC surveys. Pseudo-maximum-likelihood estimation of the parameters is compared with a nonlinear fit from the indicators. Variance estimation is done by linearization, and different simplified formulas for the variance proposed in the literature are evaluated by simulation. The GB2 can be represented as a compound distribution, the compounding parameter being the scale parameter. We use this property to decompose the GB2 distribution into a mixture of component distributions, to refine the GB2 fit of the income variable for subgroups and to improve the GB2 estimates of the indicators. Computations are made on the synthetic universe AMELIA, constructed from EU-SILC data (Alfons et al., 2011), and the simulation is done with the R package simFrame (Alfons, 2010). Both AMELIA and simFrame were developed in the context of the AMELI project. The parametric methods we have developed are made available in the R package GB2, which is part of the output of the AMELI project.

  References:
  1. AMELI website.
  2. Alfons, A. (2010). simFrame: Simulation framework. R package version 0.2. URL http://CRAN.R-project.org/package=simFrame.
  3. Alfons, A., Templ, M., Filzmoser, P., Kraft, S., Hulliger, B., Kolb, J.-P., and Münnich, R. (2011). Synthetic data generation of SILC data. Deliverable 6.2 of the AMELI project.
  4. Graf, M. and Nedyalkova, D. (2010). GB2: Generalized Beta Distribution of the Second Kind: properties, likelihood, estimation. R package version 1.0.
  5. Kleiber, C. and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences. John Wiley & Sons, Hoboken, NJ.
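For reference, a standard statement of the GB2 density and moments on which the abstract above relies (McDonald, 1984; Kleiber and Kotz, 2003); the parameterisation (scale b, shape parameters a, p, q) is the usual one and is assumed here rather than quoted from the record:

```latex
% GB2 density with scale b > 0 and shape parameters a, p, q > 0
% (a: overall shape, p: lower tail, q: upper tail):
f(y; a, b, p, q)
  \;=\; \frac{a\, y^{\,ap - 1}}
             {b^{\,ap}\, B(p, q)\,\bigl[\,1 + (y/b)^{a}\bigr]^{\,p + q}},
  \qquad y > 0,

% where B(\cdot,\cdot) is the beta function.  Moments of order k exist for
% -ap < k < aq:
\mathrm{E}\bigl[Y^{k}\bigr]
  \;=\; b^{k}\,\frac{B\bigl(p + k/a,\; q - k/a\bigr)}{B(p, q)} .
```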