spacebench
SpaCE, the Spatial Confounding Environment, loads benchmark datasets for causal inference methods tackling spatial confounding
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 5 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (17.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: NSAPH-Projects
- License: mit
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://nsaph-projects.github.io/space/
- Size: 7.11 MB
Statistics
- Stars: 17
- Watchers: 5
- Forks: 4
- Open Issues: 7
- Releases: 3
Metadata Files
README.md

🚀 Description
Spatial confounding poses a significant challenge in scientific studies involving spatial data, where unobserved spatial variables can influence both treatment and outcome, possibly leading to spurious associations. To address this problem, SpaCE provides realistic benchmark datasets and tools for systematically evaluating causal inference methods designed to alleviate spatial confounding. Each dataset includes training data, true counterfactuals, a spatial graph with coordinates, and smoothness and confounding scores that characterize the effect of a missing spatial confounder. The datasets cover real treatments and covariates from diverse domains, including climate, health, and social sciences. Realistic semi-synthetic outcomes and counterfactuals are generated using state-of-the-art machine learning ensembles, following best practices for causal inference benchmarks. SpaCE facilitates an automated end-to-end machine learning pipeline, simplifying data loading, experimental setup, and model evaluation.
🐍 Installation
Install the PyPI version:
```sh
pip install "spacebench[all]"
```
The option `[all]` installs all dependencies necessary for the spatial confounding algorithms and the examples. If you only want to use the SpaceDatasets, use `pip install spacebench` instead.
You can also install the latest 🔥 features from the development version:
```sh
pip install "git+https://github.com/NSAPH-Projects/space@dev#egg=spacebench[all]"
```
Python 3.10 or higher is required. See the docs and requirements.txt for more information.
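Before installing, it can help to confirm that the interpreter meets the stated requirement. The check below is a minimal sketch using only the standard library; the helper name `python_is_supported` is made up for this illustration and is not part of spacebench.

```python
import sys

# Minimum version stated in this README (Python 3.10 or higher).
REQUIRED = (3, 10)


def python_is_supported(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True if the (major, minor) version is at least `required`.

    Hypothetical helper for illustration only; not a spacebench function.
    """
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    if python_is_supported():
        print("This interpreter satisfies the Python >= 3.10 requirement.")
    else:
        print("Python 3.10 or higher is required to install spacebench.")
```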
🐢 Getting started
To obtain a benchmark dataset for spatial confounding you need to 1) create a SpaceEnv which contains real treatment and confounder data, and a realistic semi-synthetic outcome, 2) create a SpaceDataset which masks a spatially-varying confounder and facilitates the data loading pipeline for causal inference.
```python
from spacebench import SpaceEnv

# Create the environment, then a dataset with a masked spatial confounder
env = SpaceEnv('healthd_dmgrcs_mortality_disc')
dataset = env.make()
print(dataset)
```
```
SpaceDataset with a missing spatial confounder:
treatment: (3109,) (binary)
confounders: (3109, 30)
outcome: (3109,)
counterfactuals: (3109, 2)
confounding score of missing: 0.02
spatial smoothness score of missing: 0.11
graph edge list: (9237, 2)
graph node coordinates: (3109, 2)
parent SpaceEnv: healthd_dmgrcs_mortality_disc
WARNING ⚠️ : this dataset contains a (realistic) synthetic outcome!
By using it, you agree to understand its limitations. The variable
names have been masked to emphasize that no inferences can be made
about the source data.
```
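Because each dataset ships with true counterfactuals, an estimator's error can be measured directly. The sketch below illustrates the idea with synthetic NumPy arrays whose shapes mirror the summary printed above; it does not load any SpaCE data. It assumes, as an illustration only, that column `j` of the counterfactuals array holds the outcome under treatment value `j`. Check the docs for the actual layout and attribute names.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3109  # number of spatial units in the summary above

# Synthetic stand-ins, NOT SpaCE data. Shapes follow the printed summary.
treatment = rng.integers(0, 2, size=n)      # binary treatment, shape (n,)
counterfactuals = rng.normal(size=(n, 2))   # assumed: column j = outcome under treatment j
counterfactuals[:, 1] += 0.5                # build in an average effect of about 0.5

# The observed outcome is the counterfactual at the treatment actually received.
outcome = counterfactuals[np.arange(n), treatment]

# Ground-truth average treatment effect, available only because the
# benchmark provides both potential outcomes for every unit.
true_ate = float(np.mean(counterfactuals[:, 1] - counterfactuals[:, 0]))

# A naive difference-in-means estimate that uses observed data only.
naive_ate = float(outcome[treatment == 1].mean() - outcome[treatment == 0].mean())

# Evaluation metric: absolute error of the estimate against the truth.
abs_error = abs(naive_ate - true_ate)
print(f"true ATE={true_ate:.3f}  naive estimate={naive_ate:.3f}  error={abs_error:.3f}")
```

In this synthetic example the treatment is assigned at random, so the naive estimate lands close to the truth. With a masked spatial confounder, a naive estimator would generally be biased, which is the gap the benchmark is meant to expose.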
Available SpaceEnvs
The list of available environments can be found in the documentation or retrieved in an interactive session:
```python
from spacebench import DataMaster

dm = DataMaster()
dm.master.head()
```
| environments | treatmenttype | collection |
|:-------------------------------|:-----------------|:---------------------------------|
| healthd_dmgrcs_mortality_disc | binary | Air Pollution and Mortality |
| cdcsvilimtenghburdiccont | continuous | Social Vulnerability and Welfare |
| climaterelhumwfsmokecont | continuous | Heat Exposure and Wildfires |
| climatewfsmokeminrtydisc | binary | Heat Exposure and Wildfires |
| healthdhhincomortalitycont | continuous | Air Pollution and Mortality |
| healthdpollutnmortalitycont | continuous | Air Pollution and Mortality |
| countyeducatnelectioncont | continuous | Welfare and Elections |
| countyphyactivlifexpcycont | continuous | Welfare and Elections |
| countydmgrcselectiondisc | binary | Welfare and Elections |
| cdcsvinohsdppovertycont | continuous | Social Vulnerability and Welfare |
| cdcsvinohsdppovertydisc | binary | Social Vulnerability and Welfare |
To learn more about the data collections and the environments, see the docs. The data collections and environments are hosted at the Harvard Dataverse. Data "nutrition labels" for the collections can be found here. The environments are produced using the space-data repository from a data collection with a configuration file. Don't forget to read our paper.
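The environment table can be filtered like any pandas DataFrame, for example to keep only binary-treatment environments. The sketch below uses a small hand-built DataFrame as a stand-in for `dm.master`, so it runs without spacebench installed. The column names and most environment names are copied as they appear in the table above and may differ from the real index (for instance, they may contain underscores).

```python
import pandas as pd

# Hand-built stand-in for dm.master with three rows from the table above.
# Column names and the last two environment names are copied as displayed
# and are assumptions about the real DataFrame.
envs = pd.DataFrame(
    {
        "environments": [
            "healthd_dmgrcs_mortality_disc",
            "cdcsvilimtenghburdiccont",
            "climatewfsmokeminrtydisc",
        ],
        "treatmenttype": ["binary", "continuous", "binary"],
        "collection": [
            "Air Pollution and Mortality",
            "Social Vulnerability and Welfare",
            "Heat Exposure and Wildfires",
        ],
    }
)

# Keep only environments with a binary treatment.
binary_envs = envs[envs["treatmenttype"] == "binary"]

# Count environments per data collection.
per_collection = envs.groupby("collection").size()

print(binary_envs["environments"].tolist())
print(per_collection)
```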
🙉 Code of Conduct
Please note that the SpaCE project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
👽 Contact
We welcome contributions and feedback about spacebench. If you have any suggestions or ideas, please open an issue or submit a pull request.
Documentation
The documentation is hosted at https://nsaph-projects.github.io/space/.
Owner
- Name: NSAPH Projects
- Login: NSAPH-Projects
- Kind: organization
- Repositories: 24
- Profile: https://github.com/NSAPH-Projects
Citation (CITATION.cff)
cff-version: 1.2.0
title: "SpaCE: The Spatial Confounding Environment"
identifiers:
- description: "SpaCE Data GitHub repository."
type: url
value: "https://github.com/NSAPH-Projects/space-data"
- description: "SpaCE GitHub repository."
type: url
value: "https://github.com/NSAPH-Projects/space"
- description: "SpaCE Data Collection."
type: doi
value: 10.7910/DVN/SYNPBS
authors:
- family-names: Tec
given-names: Mauricio
- family-names: Trisovic
given-names: Ana
orcid: https://orcid.org/0000-0003-1991-0533
- family-names: Audirac
given-names: Michelle
- family-names: Woodward
given-names: Sophie
- family-names: Hu
given-names: Kate
- family-names: Khoshnevis
given-names: Naeem
- family-names: Dominici
given-names: Francesca
year: 2023
license: MIT
GitHub Events
Total
- Issues event: 3
- Watch event: 2
- Issue comment event: 5
- Push event: 1
Last Year
- Issues event: 3
- Watch event: 2
- Issue comment event: 5
- Push event: 1
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 79
- Total pull requests: 81
- Average time to close issues: 23 days
- Average time to close pull requests: 6 days
- Total issue authors: 8
- Total pull request authors: 7
- Average comments per issue: 1.05
- Average comments per pull request: 0.73
- Merged pull requests: 70
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 2
- Pull request authors: 0
- Average comments per issue: 2.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mauriciogtec (38)
- audiracmichelle (12)
- Naeemkh (8)
- atrisovic (8)
- sophi890 (4)
- jckitch (3)
- fresleven (1)
- zcalhoun (1)
Pull Request Authors
- mauriciogtec (24)
- Naeemkh (18)
- atrisovic (15)
- audiracmichelle (8)
- sophi890 (5)
- jckitch (3)
- zcalhoun (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 11 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 7
- Total maintainers: 1
pypi.org: spacebench
Spatial confounding poses a significant challenge in scientific studies where unobserved spatial variables influence both treatment and outcome, leading to spurious associations. SpaCE provides realistic benchmark datasets and tools for systematically evaluating causal inference methods for spatial confounding. Each dataset includes training data with spatial confounding, true counterfactuals, a spatial graph with coordinates, and realistic semi-synthetic outcomes.
- Homepage: https://github.com/NSAPH-Projects/space
- Documentation: https://spacebench.readthedocs.io/
- License: MIT
- Latest release: 0.1.4 (published about 2 years ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v2 composite
- actions/setup-python v2 composite
- s-weigand/setup-conda v1 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codecov/codecov-action v3 composite
- s-weigand/setup-conda v1 composite
- jsonlines >=3.1
- matplotlib >=3.4.3
- pysal >=2.5.0
- pytorch-lightning >=2.0.2
- scikit-learn >=1.2.2
- seaborn >=0.11.2
- torch_geometric >=2.3.1
- torchaudio >=2.0.2
- torchmetrics >=0.11.4
- torchvision >=0.15.2
- xgboost >=1.7.4
- networkx >=3.0
- numpy >=1.19.2
- pandas >=1.5.3
- pyDataverse >=0.3.1
- pyproj ==3.4.1
- pyyaml >=6.0
- requests >=2.28.1
- scipy >=1.10.1
- setuptools >=58.0.4
- tqdm >=4.62.3
- urllib3 >=1.26.11