empirical_proxy_measures
Beyond brain age: Empirically-derived proxy measures of mental health.
Repository
Basic Info
- Host: GitHub
- Owner: KamalakerDadi
- License: bsd-3-clause
- Language: Python
- Default Branch: master
- Size: 14.2 MB
Statistics
- Stars: 6
- Watchers: 2
- Forks: 2
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Empirical Proxy Measures Paper Code and Results
This repository includes code and aggregated results from the forthcoming paper:
Population modeling with machine learning can enhance measures of mental health. (2021). Kamalaker Dadi, Gaël Varoquaux, Josselin Houenou, Danilo Bzdok, Bertrand Thirion, Denis Engemann.
code and data sharing
The data shared with this repository are aggregated and reveal neither individual-specific inputs nor participant IDs.
To obtain model predictions needed to compute all results, the code can be run on the input data accessible from the UK Biobank.
For all main figures and most supplementary figures, we share aggregated results that allow readers to replot and inspect the figures without having to recompute all results.
structure of individual-specific results
model predictions
Some of the scripts refer to input files with a "post_predictive*.csv" naming pattern. These files contain intermediate model predictions (for both the left-out folds and the held-out generalization dataset) alongside the true values for age, fluid intelligence and neuroticism from each fold of the cross-validation procedure performed on the data used for model construction. These predictions can be computed using the scripts inside the folder compute_predictive_models, based on the original input data available from the UK Biobank.
The table below gives a summary of the structure and data types of the results.
|Unnamed: 0 |20016-2.0 |20127-0.0 |21022-0.0 |eid |evaluation |fold |predicted |target |test_indices |true |variable |Data type |
|:----------|:---------|:---------|:---------|:-------|:----------|:-------|:---------|:---------|:------------|:-------|:---------|:---------|
|integer |logical |logical |integer |integer |character |integer |double |character |integer |integer |character |character |
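As a minimal sketch of how a table with this structure could be summarized, the snippet below computes a mean absolute error per target and fold. It relies on several assumptions that are not confirmed by this repository: that `predicted` and `true` hold numeric values, that `target` names the predicted outcome, and that mean absolute error is a useful summary (it is not necessarily the metric used in the paper). The rows are synthetic, the function name is hypothetical, and only four of the listed columns are used.

```python
import csv
import io
from collections import defaultdict

# Synthetic rows mimicking a subset of the columns listed above.
# The values are made up; real files require access to the UK Biobank.
SAMPLE = """fold,target,predicted,true
0,age,61.5,60
0,age,58.0,59
1,age,70.0,72
1,age,65.0,64
"""


def mean_absolute_error_per_fold(handle):
    """Return {(target, fold): MAE} for rows read from an open CSV handle."""
    errors = defaultdict(list)
    for row in csv.DictReader(handle):
        key = (row["target"], int(row["fold"]))
        errors[key].append(abs(float(row["predicted"]) - float(row["true"])))
    return {key: sum(vals) / len(vals) for key, vals in errors.items()}


mae = mean_absolute_error_per_fold(io.StringIO(SAMPLE))
for (target, fold), value in sorted(mae.items()):
    print(f"{target} fold {fold}: MAE = {value:.2f}")
```

To apply the same function to a real file, an open file handle can be passed instead of the in-memory sample.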
validation against health data
For some of the analyses, model predictions were validated against external health data available from the UK Biobank. The tables below show the file structure for the respective inputs and health domains.
alcohol.csv
|eid |1558-0.0 |1568-0.0 |1578-0.0 |1588-0.0 |1598-0.0 |1608-0.0 |1618-0.0 |3731-0.0 |4407-0.0 |4418-0.0 |4429-0.0 |4440-0.0 |4451-0.0 |4462-0.0 |5364-0.0 |20117-0.0 |variable |
|:-------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:---------|:---------|
|integer |double |double |double |double |double |double |double |double |double |double |double |double |double |double |double |double |character |
sleep.csv
|eid |1160-0.0 |1170-0.0 |1180-0.0 |1190-0.0 |1200-0.0 |1210-0.0 |1220-0.0 |variable |
|:-------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:---------|
|integer |double |double |double |double |double |double |double |character |
physical_activity.csv
|eid |22032-0.0 |22033-0.0 |22034-0.0 |22035-0.0 |22036-0.0 |22037-0.0 |22038-0.0 |22039-0.0 |22040-0.0 |variable |
|:-------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|
|integer |double |double |double |double |double |double |double |double |double |character |
smoking.csv
|eid |20161-0.0 |20162-0.0 |variable |
|:-------|:---------|:---------|:---------|
|integer |double |double |character |
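The health tables and the prediction files both carry an `eid` column, so one plausible way to line them up is an inner join on that column. The sketch below illustrates this with synthetic rows. The join itself, the function name, and all values are assumptions made for illustration; the repository's own scripts may combine the data differently.

```python
import csv
import io

# Synthetic stand-ins for a prediction file and for smoking.csv.
# All identifiers and values are made up for illustration.
PREDICTIONS = """eid,predicted
1001,61.5
1002,58.0
1003,70.0
"""
SMOKING = """eid,20161-0.0,20162-0.0,variable
1001,10.0,0.5,smoking
1003,,1.0,smoking
1004,3.0,0.2,smoking
"""


def inner_join_on_eid(left_handle, right_handle):
    """Keep only participants present in both tables, merging their columns."""
    right = {row["eid"]: row for row in csv.DictReader(right_handle)}
    joined = []
    for row in csv.DictReader(left_handle):
        match = right.get(row["eid"])
        if match is not None:
            joined.append({**match, **row})
    return joined


rows = inner_join_on_eid(io.StringIO(PREDICTIONS), io.StringIO(SMOKING))
for row in rows:
    print(row["eid"], row["predicted"], repr(row["20161-0.0"]))
```

Missing health values arrive as empty strings when read this way, so they need explicit handling before any numeric analysis.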
plotting the figures
To plot the figures, please cd into the respective figure folder and run the plotting scripts.
Plotting scripts have the naming convention plot_*.r or plot_*.py.
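The naming convention makes it possible to discover the plotting scripts in a figure folder programmatically. The sketch below lists matching scripts and pairs each with an interpreter. The function name is hypothetical, and the choice of `Rscript` for `.r` files and the current Python interpreter for `.py` files is an assumption about the local setup. The demonstration uses a temporary folder with empty placeholder files rather than a real figure folder.

```python
import sys
import tempfile
from pathlib import Path


def plotting_commands(figure_dir):
    """Return one command (as an argument list) per plot_* script in figure_dir."""
    commands = []
    for script in sorted(Path(figure_dir).glob("plot_*")):
        suffix = script.suffix.lower()
        if suffix == ".py":
            commands.append([sys.executable, script.name])
        elif suffix == ".r":
            # Assumes Rscript is installed and available on the PATH.
            commands.append(["Rscript", script.name])
    return commands


# Demonstration on a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("plot_a.py", "plot_b.r", "notes.txt"):
        (Path(tmp) / name).touch()
    demo = plotting_commands(tmp)

print(demo)
```

Each command could then be executed with `subprocess.run(cmd, cwd=figure_dir, check=True)`, which mirrors the step of changing into the figure folder before running a script.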
Note that the final figures included in the paper have sometimes benefited from aesthetic customizations made with illustration software.
Also note that supplementary figure S2 and the concept figure (Figure 1) make use of individual-level data that is only accessible via the UK Biobank. Aggregation was not possible in these two exceptions.
Owner
- Login: KamalakerDadi
- Kind: user
- Repositories: 4
- Profile: https://github.com/KamalakerDadi
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this code, directly or for inspiration, please cite it as below"
authors:
  - family-names: "Dadi"
    given-names: "Kamalaker"
    orcid: "https://orcid.org/0000-0003-2214-1050"
  - family-names: "Engemann"
    given-names: "Denis"
    orcid: "https://orcid.org/0000-0002-7223-1014"
title: "Empirical Proxy Measures"
version: 0.0.1
date-released: 2021-09-10
url: "https://github.com/KamalakerDadi/empirical_proxy_measures"