https://github.com/bdwilliamson/spvim_supplementary

Reproduce analyses from "Efficient nonparametric statistical inference on population feature importance using Shapley values"

Keywords

machine-learning nonparametric-statistics statistical-inference variable-importance

Repository

Basic Info
  • Host: GitHub
  • Owner: bdwilliamson
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 929 KB
Statistics
  • Stars: 5
  • Watchers: 5
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Archived
Created over 5 years ago · Last pushed over 5 years ago

# `spvim_supplementary`: Supplementary materials for the SPVIM paper

This repository contains the supplementary material for, and the code to reproduce the analyses in, ["Efficient nonparametric statistical inference on population feature importance using Shapley values"](https://arxiv.org/abs/2006.09481) by Williamson and Feng (*arXiv*, 2020; to appear in the Proceedings of the Thirty-seventh International Conference on Machine Learning [ICML 2020]). All analyses were implemented in the freely available software packages Python (version 3.7.4) and R (version 3.6.3).

This README file provides an overview of the code available in the repository.

## Code directory

The code is organized into two sub-directories, one for each main objective of the manuscript:

1. Numerical experiments to evaluate the operating characteristics of our proposed method (`sims`).
2. An analysis of patients' stays in the ICU from the Multiparameter Intelligent Monitoring in Intensive Care II ([MIMIC-II](https://mimic.physionet.org/)) database (`data_analysis`).

All analyses were performed on a Linux cluster using the Slurm batch scheduling system. The head node of the batch scheduler allows the shorthand `ml` in place of `module load`. If you use a different batch scheduling system, each code file flags the line where you can change batch variables. You may also run the analyses locally; however, they will then take a large amount of time.
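
If your system does not provide the `ml` shorthand mentioned above, one option is to expand it to `module load` in a copy of each batch script before submitting. The following is a minimal sketch, not part of this repository: the helper name `expand_ml_shorthand`, the module name `R/3.6.3`, and the file name `run_sim.R` are hypothetical, and the sketch assumes that every `ml` line in a script is a plain module load (on some systems `ml` accepts other subcommands, which this does not handle).

```python
import re

# Match `ml` only when it is the first word on a line (after optional
# indentation) and is followed by at least one argument on the same line.
_ML_LINE = re.compile(r"^([ \t]*)ml[ \t]+(?=\S)", flags=re.MULTILINE)


def expand_ml_shorthand(script_text: str) -> str:
    """Return script_text with each leading `ml` replaced by `module load`.

    Hypothetical helper: assumes `ml <name>` always means `module load <name>`.
    """
    return _ML_LINE.sub(r"\1module load ", script_text)


if __name__ == "__main__":
    # Hypothetical batch script contents, for illustration only.
    example = "#!/bin/bash\nml R/3.6.3\nRscript run_sim.R\n"
    print(expand_ml_shorthand(example))
```

Lines where `ml` is not the first word, or where it is only a prefix of another command name, are left unchanged.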

-----

## Issues

If you encounter any bugs or have any specific questions about the analysis, please
[file an issue](https://github.com/bdwilliamson/spvim_supplementary/issues).

Owner

  • Name: Brian Williamson
  • Login: bdwilliamson
  • Kind: user
  • Location: Seattle, Washington USA
  • Company: Kaiser Permanente Washington Health Research Institute

Assistant Investigator at Kaiser Permanente Washington Health Research Institute. Interested in inference in high-dimensional settings.
