empirical_proxy_measures

Beyond brain age: Empirically-derived proxy measures of mental health.

https://github.com/kamalakerdadi/empirical_proxy_measures

Repository

Basic Info

Host: GitHub
Owner: KamalakerDadi
License: bsd-3-clause
Language: Python
Default Branch: master
Size: 14.2 MB

Statistics

Stars: 6
Watchers: 2
Forks: 2
Open Issues: 0
Releases: 1

Created almost 6 years ago · Last pushed almost 5 years ago

Empirical Proxy Measures Paper Code and Results

This repository includes code and aggregated results from the forthcoming paper:

Population modeling with machine learning can enhance measures of mental health. (2021). Kamalaker Dadi, Gaël Varoquaux, Josselin Houenou, Danilo Bzdok, Bertrand Thirion, Denis Engemann.

code and data sharing

The data shared with this repository is aggregated and reveals neither individual-specific inputs nor participant IDs.

To obtain the model predictions needed to compute all results, the code can be run on the input data accessible from the UK Biobank.

For all main figures and most supplementary figures, we share aggregated results that allow readers to replot and inspect the figures without recomputing all results.

structure of individual-specific results

model predictions

Some of the scripts refer to input files with a "post_predictive*.csv" naming pattern. These files contain intermediate model predictions (for both the left-out folds and the held-out generalization dataset) alongside the true values for age, fluid intelligence and neuroticism, for each fold of the cross-validation procedure performed on the data used for model construction. These predictions can be computed from the original input data available from the UK Biobank using the scripts inside the folder compute_predictive_models.

The table below gives a summary of the structure and data types of the results.

|Unnamed: 0 |20016-2.0 |20127-0.0 |21022-0.0 |eid |evaluation |fold |predicted |target |test_indices |true |variable |Data type |
|:----------|:---------|:---------|:---------|:-------|:----------|:-------|:---------|:---------|:------------|:-------|:---------|:---------|
|integer |logical |logical |integer |integer |character |integer |double |character |integer |integer |character |character |
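As a rough illustration of how a file with this structure might be inspected, the sketch below computes the mean absolute error per target, evaluation type and fold using only the Python standard library. It assumes that `predicted` and `true` hold the model output and the observed value, and that `evaluation` separates left-out-fold from held-out predictions; the labels `validation` and `generalization` and all numbers in the synthetic example are made up for illustration and are not taken from the repository.

```python
import csv
import io
from collections import defaultdict


def mean_absolute_error_by_fold(csv_file):
    """Return {(target, evaluation, fold): MAE} for a post_predictive-style CSV."""
    abs_errors = defaultdict(list)
    for row in csv.DictReader(csv_file):
        key = (row["target"], row["evaluation"], int(row["fold"]))
        # Assumption: `predicted` is the model output and `true` the observed value.
        abs_errors[key].append(abs(float(row["predicted"]) - float(row["true"])))
    return {key: sum(errs) / len(errs) for key, errs in abs_errors.items()}


# Tiny synthetic example (made-up values and labels, not UK Biobank data).
example = io.StringIO(
    "eid,evaluation,fold,predicted,target,true\n"
    "1,validation,0,61.0,age,60\n"
    "2,validation,0,58.0,age,62\n"
    "3,generalization,0,70.5,age,70\n"
)
scores = mean_absolute_error_by_fold(example)
for key in sorted(scores):
    print(key, scores[key])
```

For a real file, the same function can be given an open file handle, for example one obtained with `open(path, newline="")`, where `path` points to one of the "post_predictive*.csv" files computed locally.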

validation against health data

For some of the analyses, model predictions were validated against external health data available from the UK Biobank. The tables below show the file structure for the respective inputs and health domains.

alcohol.csv

|eid |1558-0.0 |1568-0.0 |1578-0.0 |1588-0.0 |1598-0.0 |1608-0.0 |1618-0.0 |3731-0.0 |4407-0.0 |4418-0.0 |4429-0.0 |4440-0.0 |4451-0.0 |4462-0.0 |5364-0.0 |20117-0.0 |variable |
|:-------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:---------|:---------|
|integer |double |double |double |double |double |double |double |double |double |double |double |double |double |double |double |double |character |

sleep.csv

|eid |1160-0.0 |1170-0.0 |1180-0.0 |1190-0.0 |1200-0.0 |1210-0.0 |1220-0.0 |variable |
|:-------|:--------|:--------|:--------|:--------|:--------|:--------|:--------|:---------|
|integer |double |double |double |double |double |double |double |character |

physical_activity.csv

|eid |22032-0.0 |22033-0.0 |22034-0.0 |22035-0.0 |22036-0.0 |22037-0.0 |22038-0.0 |22039-0.0 |22040-0.0 |variable |
|:-------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|:---------|
|integer |double |double |double |double |double |double |double |double |double |character |

smoking.csv

|eid |20161-0.0 |20162-0.0 |variable |
|:-------|:---------|:---------|:---------|
|integer |double |double |character |
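Because the prediction files and the health files share the `eid` column, they can in principle be linked per participant. The following is a minimal sketch of such a join using only the standard library; it is not the procedure used in the repository's scripts. The function name `join_on_eid`, the `health_` prefix and all values are hypothetical. The prefix is there because both file types contain a `variable` column, which would otherwise collide.

```python
import csv
import io


def join_on_eid(predictions_file, health_file):
    """Inner-join prediction rows with health rows on the shared `eid` column."""
    health_by_eid = {row["eid"]: row for row in csv.DictReader(health_file)}
    joined = []
    for pred in csv.DictReader(predictions_file):
        health = health_by_eid.get(pred["eid"])
        if health is None:
            continue  # no health record for this participant
        merged = dict(pred)
        for name, value in health.items():
            if name != "eid":
                # Prefix health columns so `variable` does not overwrite itself.
                merged[f"health_{name}"] = value
        joined.append(merged)
    return joined


# Synthetic inputs (made-up values; column names follow the tables above).
predictions = io.StringIO(
    "eid,predicted,true,variable\n"
    "1,61.0,60,age\n"
    "2,58.0,62,age\n"
)
smoking = io.StringIO(
    "eid,20161-0.0,20162-0.0,variable\n"
    "2,10.5,0.4,smoking\n"
    "3,0.0,0.0,smoking\n"
)
joined_rows = join_on_eid(predictions, smoking)
print(joined_rows)
```

Only participants present in both files are kept, so the synthetic example yields a single row for `eid` 2.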

plotting the figures

To plot the figures, please cd into the respective figure folder and run the plotting scripts. Plotting scripts have the naming convention plot_*.r or plot_*.py.
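The naming convention makes it possible to discover the plotting scripts of a figure folder programmatically. The sketch below is one way to do this, assuming that `python` and `Rscript` are available on the PATH; the helper name `plotting_commands` and the script names in the demonstration are hypothetical and do not refer to actual files in the repository.

```python
import tempfile
from pathlib import Path


def plotting_commands(figure_dir):
    """List one command per plot_*.py / plot_*.r script found in figure_dir."""
    commands = []
    for script in sorted(Path(figure_dir).glob("plot_*")):
        suffix = script.suffix.lower()
        if suffix == ".py":
            commands.append(["python", script.name])
        elif suffix == ".r":
            # Assumption: R scripts are run non-interactively with Rscript.
            commands.append(["Rscript", script.name])
    return commands


# Demonstration on a throwaway folder with hypothetical script names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("plot_scores.py", "plot_scores.r", "notes.txt"):
        (Path(tmp) / name).touch()
    demo_commands = plotting_commands(tmp)

print(demo_commands)
```

Each returned command could then be executed with `subprocess.run(command, cwd=figure_dir, check=True)`, which mirrors changing into the figure folder before running the script.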

Note that some of the final figures included in the paper have benefitted from aesthetic customizations made with illustration software.

Also note that supplementary figure S2 and the concept figure (Figure 1) make use of individual-level data that is only accessible via the UK Biobank. Aggregation was not possible for these two exceptions.

Owner

Login: KamalakerDadi
Kind: user

Repositories: 4
Profile: https://github.com/KamalakerDadi

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this code, directly or for inspiration, please cite it as below"
authors:
- family-names: "Dadi"
  given-names: "Kamalaker"
  orcid: "https://orcid.org/0000-0003-2214-1050"
- family-names: "Engemann"
  given-names: "Denis"
  orcid: "https://orcid.org/0000-0002-7223-1014"
title: "Empirical Proxy Measures"
version: 0.0.1
date-released: 2021-09-10
url: "https://github.com/KamalakerDadi/empirical_proxy_measures"
