https://github.com/bdwilliamson/spvim_supplementary

Reproduce analyses from "Efficient nonparametric statistical inference on population feature importance using Shapley values"

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 11.3%, to scientific vocabulary)

Keywords

machine-learning nonparametric-statistics statistical-inference variable-importance

Last synced: 10 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: bdwilliamson
License: MIT
Language: Python
Default Branch: master
Homepage:
Size: 929 KB

Statistics

Stars: 5
Watchers: 5
Forks: 1
Open Issues: 0
Releases: 0

Archived

Created about 6 years ago · Last pushed almost 6 years ago

https://github.com/bdwilliamson/spvim_supplementary/blob/master/

# `spvim_supplementary`: Supplementary materials for the SPVIM paper

This repository contains the supplementary material for, and the code to reproduce the analyses in, ["Efficient nonparametric statistical inference on population feature importance using Shapley values"](https://arxiv.org/abs/2006.09481) by Williamson and Feng (*arXiv*, 2020; to appear in the Proceedings of the Thirty-seventh International Conference on Machine Learning [ICML 2020]). All analyses were implemented in the freely available languages Python (version 3.7.4) and R (version 3.6.3).

This README file provides an overview of the code available in the repository.

## Code directory

The code is organized into two sub-directories, one for each of the two main objectives of the manuscript:

1. Numerical experiments to evaluate the operating characteristics of our proposed method (`sims`).
2. An analysis of patients' stays in the ICU from the Multiparameter Intelligent Monitoring in Intensive Care II ([MIMIC-II](https://mimic.physionet.org/)) database (`data_analysis`).

All analyses were performed on a Linux cluster using the Slurm batch scheduling system. The head node of the batch scheduler allows the shorthand `ml` in place of `module load`. If you use a different batch scheduling system, each code file flags the line where you can change the batch variables. You can also run the analyses locally, but they will then take a large amount of time.
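
As a rough illustration of adapting a batch submission to another cluster, the sketch below renders the text of a Slurm batch script in Python and spells out `module load` rather than relying on the `ml` shorthand. It is not taken from this repository: the function `render_slurm_script`, the script name `run_sim.py`, the module name `Python/3.7.4`, the job name, and the array range are all hypothetical placeholders to be replaced with the values your cluster and the actual code files use.

```python
import shlex


def render_slurm_script(command, modules, job_name="spvim_sim",
                        array=None, time_limit="1-00:00:00"):
    """Return the text of a Slurm batch script that loads modules, then runs a command.

    All argument values are illustrative; none come from the repository.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --time={time_limit}",  # days-hours:minutes:seconds
    ]
    if array is not None:
        # A job array runs the same script once per index, e.g. one per replicate.
        lines.append(f"#SBATCH --array={array}")
    # Use the full "module load" form, since not every cluster defines "ml".
    lines.extend(f"module load {shlex.quote(m)}" for m in modules)
    # Quote each part of the command so that arguments with spaces survive the shell.
    lines.append(" ".join(shlex.quote(part) for part in command))
    return "\n".join(lines) + "\n"


script = render_slurm_script(
    command=["python", "run_sim.py", "--seed", "1"],  # hypothetical script and flag
    modules=["Python/3.7.4"],                         # hypothetical module name
    array="1-50",                                     # hypothetical array range
)
print(script)
```

Writing the returned text to a file and passing that file to `sbatch` would submit it on a Slurm cluster. On a different scheduler, the `#SBATCH` lines are the part to replace with that scheduler's own directives.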

-----

## Issues

If you encounter any bugs or have any specific questions about the analysis, please
[file an issue](https://github.com/bdwilliamson/spvim_supplementary/issues).

Owner

Name: Brian Williamson
Login: bdwilliamson
Kind: user
Location: Seattle, Washington USA
Company: Kaiser Permanente Washington Health Research Institute

Website: https://bdwilliamson.github.io/
Repositories: 46
Profile: https://github.com/bdwilliamson

Assistant Investigator at Kaiser Permanente Washington Health Research Institute. Interested in inference in high-dimensional settings.
