Explainable Machine Learning: Approximating Shapley Values for Dependent Predictors

Author(s)
Kasperek, Jan 
Institut de statistique 
Editor(s)
Matei, Alina 
Institut de statistique 
Publication date
2024
Number of pages
61
Keywords
  • explainability
  • Shapley values
  • SHAP
  • KernelSHAP
  • machine learning
  • model interpretability
  • Monte Carlo
  • computational statistics
  • dependence modeling
  • copula
  • conditional inference trees
Abstract
Modern Machine Learning algorithms often outperform classical statistical methods in predictive accuracy, but this comes at the expense of model interpretability. As businesses and institutions increasingly rely on Machine Learning to support and automate decision-making processes and reap the benefits of more accurate predictions, explaining these models' outputs becomes more important. A universally applicable approach to explaining such complex models is based on the Shapley value, a concept originating from game theory. However, its exact calculation is computationally intensive, so approximations must be used. The state-of-the-art approach, KernelSHAP, assumes independence of the predictors, which is unrealistic in practice. Recent research has developed improvements that incorporate dependencies between predictors. After a review of the theoretical underpinnings, the original KernelSHAP method is compared with the improved versions in realistic settings, using three real-world datasets. While the improved versions are found to have smaller approximation errors relative to exact Shapley values, they are also more computationally demanding. Further improvements are discussed and possible research directions are suggested. The thesis is structured as follows: after introducing explainable machine learning in chapter 1, the Shapley value and its applications to model explainability are explored in chapter 2. Chapter 3 presents methods to approximate Shapley values as well as recent improvements to these methods, which are tested on real datasets in chapter 4. Possible directions for future research are pointed out in chapter 5, before a final conclusion in chapter 6. Code for the experiments of chapter 4 is given in the appendix.
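To make the computational issue concrete, the following sketch (not taken from the thesis; function names and the toy model are illustrative assumptions) approximates Shapley values by Monte Carlo sampling of feature permutations. "Absent" features are filled in from a background sample drawn marginally, which implicitly assumes independent predictors, the same simplifying assumption made by KernelSHAP that the improved methods relax.

```python
import numpy as np

def shapley_mc(f, x, background, n_perm=200, seed=None):
    """Monte Carlo estimate of Shapley values for one prediction f(x).

    For each sampled permutation, features are switched on one by one
    (from a random background row to the value in x) and the marginal
    change in f is credited to the switched feature. Marginal sampling
    of absent features assumes predictor independence.
    """
    rng = np.random.default_rng(seed)
    p = x.shape[0]
    phi = np.zeros(p)
    for _ in range(n_perm):
        perm = rng.permutation(p)
        z = background[rng.integers(len(background))].copy()
        prev = f(z)
        for j in perm:
            z[j] = x[j]            # switch feature j to its value in x
            cur = f(z)
            phi[j] += cur - prev   # marginal contribution of feature j
            prev = cur
    return phi / n_perm

# Toy linear model: here the exact Shapley values are known to be
# w_j * (x_j - mean(background_j)), so the estimate can be sanity-checked.
w = np.array([1.0, -2.0, 0.5])
f = lambda z: float(z @ w)
bg = np.random.default_rng(0).normal(size=(500, 3))
x = np.array([1.0, 1.0, 1.0])
phi = shapley_mc(f, x, bg, n_perm=500, seed=1)
print(phi)
```

By the telescoping sum inside each permutation, the estimates satisfy the efficiency property: the contributions sum (up to Monte Carlo error) to f(x) minus the average background prediction. The exact computation would instead enumerate all 2^p feature coalitions, which is what motivates sampling-based approximations in the first place.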
Notes
Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Statistics

Supervisor:
Prof. tit. Dr. Alina Matei
Université de Neuchâtel
Faculty of Science
Institute of Statistics
Identifiers
https://libra.unine.ch/handle/123456789/32728
Publication type
master thesis
File(s) to download
 Kasperek_Explainable Machine Learning_2024.pdf (5.1 MB)