https://github.com/bdwilliamson/spvim_supplementary
Reproduce analyses from "Efficient nonparametric statistical inference on population feature importance using Shapley values"
Keywords
machine-learning
nonparametric-statistics
statistical-inference
variable-importance
Repository
Statistics
- Stars: 5
- Watchers: 5
- Forks: 1
- Open Issues: 0
- Releases: 0
Archived
Created over 5 years ago · Last pushed over 5 years ago
https://github.com/bdwilliamson/spvim_supplementary/blob/master/
# `spvim_supplementary`: Supplementary materials for the SPVIM paper

This repository contains the supplementary material for and code to reproduce the analyses in ["Efficient nonparametric statistical inference on population feature importance using Shapley values"](https://arxiv.org/abs/2006.09481) by Williamson and Feng (*arXiv*, 2020; to appear in the Proceedings of the Thirty-seventh International Conference on Machine Learning [ICML 2020]). All analyses were implemented in the freely available software packages Python and R; specifically, Python version 3.7.4 and R version 3.6.3.

This README file provides an overview of the code available in the repository.

## Code directory

We have separated our code further into two sub-directories based on the two main objectives of the manuscript:

1. Numerical experiments to evaluate the operating characteristics of our proposed method (`sims`).
2. An analysis of patients' stays in the ICU from the Multiparameter Intelligent Monitoring in Intensive Care II ([MIMIC-II](https://mimic.physionet.org/)) database (`data_analysis`).

All analyses were performed on a Linux cluster using the Slurm batch scheduling system. The head node of the batch scheduler allows the shorthand "ml" in place of "module load". If you use a different batch scheduling system, the individual code files are flagged with the line where you can change batch variables. If you prefer to run the analyses locally, you may -- however, these analyses will then take a large amount of time.

-----

## Issues

If you encounter any bugs or have any specific questions about the analysis, please [file an issue](https://github.com/bdwilliamson/spvim_supplementary/issues).
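
A note on the "ml" shorthand described under the code directory: on a system where `module load` works but `ml` is not defined, scripts that call `ml` will fail. The snippet below is a minimal sketch of a fallback, not part of the repository. It assumes your environment already provides a `module` command, and it defines `ml` only if no command of that name exists.

```shell
# Hypothetical shim (an assumption, not taken from the repository's scripts):
# define `ml` as a thin wrapper around `module load`, but only when the
# current shell does not already provide an `ml` command.
if ! command -v ml >/dev/null 2>&1; then
    ml() {
        # Pass all arguments through unchanged to `module load`.
        module load "$@"
    }
fi
```

If you source this before running a script, a call such as `ml <name>` expands to `module load <name>`. The module names themselves depend on your cluster and are not specified here.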
Owner
- Name: Brian Williamson
- Login: bdwilliamson
- Kind: user
- Location: Seattle, Washington USA
- Company: Kaiser Permanente Washington Health Research Institute
- Website: https://bdwilliamson.github.io/
- Repositories: 46
- Profile: https://github.com/bdwilliamson
Assistant Investigator at Kaiser Permanente Washington Health Research Institute. Interested in inference in high-dimensional settings.