pgsc_calc

The Polygenic Score Catalog Calculator is a nextflow pipeline for polygenic score calculation

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 9 DOI reference(s) in README
✓ Academic publication links: Links to nature.com, zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (12.8%) to scientific vocabulary

Keywords

genetic-risk-score nextflow pgs pgs-catalog polygenic-risk-score polygenic-risk-scores polygenic-score polygenic-scores prs workflow

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: PGScatalog
License: apache-2.0
Language: Nextflow
Default Branch: main
Homepage: https://pgsc-calc.readthedocs.io/en/latest/
Size: 10.5 MB

Statistics

Stars: 134
Watchers: 12
Forks: 29
Open Issues: 25
Releases: 22

Created over 4 years ago · Last pushed 8 months ago

Metadata Files

Readme License Citation

The Polygenic Score Catalog Calculator (`pgsc_calc`)

Introduction

pgsc_calc is a bioinformatics best-practice analysis pipeline for calculating polygenic [risk] scores on samples with imputed genotypes using existing scoring files from the Polygenic Score (PGS) Catalog and/or user-defined PGS/PRS.

Pipeline summary

> [!IMPORTANT]
> * Whole genome sequencing (WGS) data are not currently supported by the calculator.
> * It's possible to create compatible gVCFs from WGS data. We plan to improve support for WGS data in the near future.

The workflow performs the following steps:

- Downloads scoring files using the PGS Catalog API in a specified genome build (GRCh37 or GRCh38).
- Reads custom scoring files (and performs a liftover if the genotyping data are in a different build).
- Automatically combines and creates scoring files for efficient parallel computation of multiple PGS.
- Matches variants in the scoring files against variants in the target dataset (in plink bfile/pfile or VCF format).
- Calculates PGS for all samples (linear sum of weights and dosages).
- Creates a summary report to visualize score distributions and pipeline metadata (variant matching QC).
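The "linear sum of weights and dosages" in the scoring step can be illustrated with a minimal Python sketch. This is not the pipeline's implementation (the pipeline delegates scoring to its own tooling); the variant IDs, weights, dosages, and the `polygenic_score` helper are hypothetical and chosen only to show the arithmetic.

```python
def polygenic_score(weights, dosages):
    """Sum effect weight * effect-allele dosage over variants found in both inputs.

    weights: dict mapping variant ID -> effect weight (hypothetical scoring file)
    dosages: dict mapping variant ID -> effect-allele dosage, 0.0 to 2.0
    Variants missing from the sample are skipped in this sketch.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical scoring file with three variants.
weights = {"1:1000:A:G": 0.5, "2:2000:C:T": -0.25, "3:3000:G:A": 1.0}

# Hypothetical dosages for two samples; sample2 lacks the third variant.
samples = {
    "sample1": {"1:1000:A:G": 2.0, "2:2000:C:T": 1.0, "3:3000:G:A": 0.0},
    "sample2": {"1:1000:A:G": 0.0, "2:2000:C:T": 2.0},
}

scores = {name: polygenic_score(weights, d) for name, d in samples.items()}
print(scores)
```

Here `sample1` scores 0.5 × 2 − 0.25 × 1 + 1.0 × 0 = 0.75, and `sample2` scores −0.25 × 2 = −0.5.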

And optionally:

- Genetic ancestry: calculates the similarity of target samples to populations in a reference dataset (1000 Genomes (1000G)) using principal components analysis (PCA).
- PGS normalization: uses reference population data and/or PCA projections to report individual-level PGS predictions (e.g. percentiles, z-scores) that account for genetic ancestry.
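As a rough illustration of the percentile and z-score reporting mentioned above, the sketch below compares one individual's PGS with the scores of a reference group. It is a simplified empirical comparison, assuming the reference scores already come from the most similar reference population; it does not reproduce the pipeline's PCA-based adjustment, and the function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def z_score(score, reference_scores):
    """Standardize a score against the reference mean and sample standard deviation."""
    return (score - mean(reference_scores)) / stdev(reference_scores)


def percentile(score, reference_scores):
    """Percentage of reference scores that are less than or equal to the score."""
    at_or_below = sum(1 for r in reference_scores if r <= score)
    return 100.0 * at_or_below / len(reference_scores)


# Hypothetical PGS values for a reference population.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]

z = z_score(4.0, reference)
p = percentile(4.0, reference)
print(f"z-score: {z:.3f}, percentile: {p:.1f}")
```

With this toy reference the individual sits at the 80th percentile, about 0.63 standard deviations above the reference mean.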

See documentation for a list of planned features under development.

PGS applications and libraries

pgsc_calc uses applications and libraries internally developed at the PGS Catalog, which can do helpful things like:

Query the PGS Catalog to bulk download scoring files in a specific genome build
Match variants from scoring files to target variants
Adjust calculated PGS in the context of genetic ancestry
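To make the variant matching idea concrete, here is a minimal Python sketch that matches a scoring-file variant to a target variant by position and alleles, with a fallback for strand-flipped alleles. It is an assumption-laden simplification, not the PGS Catalog libraries' API: the tuple layout, the `match_variant` and `flip` helpers, and the returned labels are all hypothetical, and real matching has to handle further cases such as strand-ambiguous and multiallelic variants.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip(allele):
    """Return the reverse complement of an allele (other strand)."""
    return "".join(COMPLEMENT[base] for base in reversed(allele))


def match_variant(score_var, target_var):
    """Classify how a scoring-file variant matches a target variant.

    score_var:  (chrom, pos, effect_allele, other_allele)
    target_var: (chrom, pos, ref, alt)
    Returns "match", "match_flipped", or None when there is no match.
    """
    chrom, pos, effect, other = score_var
    t_chrom, t_pos, ref, alt = target_var
    if (chrom, pos) != (t_chrom, t_pos):
        return None
    if {effect, other} == {ref, alt}:
        return "match"
    if {flip(effect), flip(other)} == {ref, alt}:
        return "match_flipped"
    return None


# Hypothetical variants: same position, alleles reported on the opposite strand.
result = match_variant(("1", 1000, "A", "G"), ("1", 1000, "T", "C"))
print(result)
```

Comparing allele sets rather than ordered pairs lets the sketch accept a target whose REF/ALT order is swapped relative to the scoring file's effect/other alleles; a real implementation would also record which allele is the effect allele so dosages are counted correctly.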

If you want to write Python code to work with PGS, check out the pygscatalog repository to learn more.

If you want a simpler way of working with PGS, ignore this section and continue below to learn more about pgsc_calc.

Quick start

1. Install Nextflow (>=23.10.0)
2. Install Docker or Singularity (v3.8.3 minimum) (please only use Conda as a last resort)
3. Download the pipeline and test it on a minimal dataset with a single command:

   ```console
   nextflow run pgscatalog/pgsc_calc -profile test,<docker/singularity/conda>
   ```

4. Start running your own analysis!

   ```console
   nextflow run pgscatalog/pgsc_calc -profile <docker/singularity/conda> --input samplesheet.csv --pgs_id PGS001229
   ```

See getting started for more details.

Documentation

Full documentation is available on Read the Docs

Credits

pgscatalog/pgsc_calc is developed as part of the PGS Catalog project, a collaboration between the University of Cambridge's Department of Public Health and Primary Care (Michael Inouye, Samuel Lambert) and the European Bioinformatics Institute (Helen Parkinson, Laura Harris).

The pipeline seeks to provide a standardized workflow for PGS calculation and ancestry inference, implemented in Nextflow and derived from an existing set of tools/scripts developed by the Inouye lab (Rodrigo Canovas, Scott Ritchie, Jingqin Wu) and PGS Catalog teams (Samuel Lambert, Laurent Gil).

The adaptation of the codebase, the Nextflow implementation, and the PGS Catalog features were written by Benjamin Wingfield, Samuel Lambert, and Laurent Gil, with additional input from Aoife McMahon (EBI). Development of new features, testing, and code review are ongoing and involve Inouye lab members (Rodrigo Canovas, Scott Ritchie) and others. If you use the tool, we ask you to cite our paper describing the software and the updated PGS Catalog resource:

>Lambert, Wingfield et al. (2024) Enhancing the Polygenic Score Catalog with tools for score calculation and ancestry normalization. Nature Genetics. doi:10.1038/s41588-024-01937-x.

This pipeline is distributed under an Apache License and uses code and infrastructure developed and maintained by the nf-core community (Ewels et al. Nature Biotech (2020) doi:10.1038/s41587-020-0439-x), reused here under the MIT license.

Additional references of open-source tools and data used in this pipeline are described in CITATIONS.md.

This work has received funding from EMBL-EBI core funds, the Baker Institute, the University of Cambridge, Health Data Research UK (HDRUK), and the European Union's Horizon 2020 research and innovation programme under grant agreement No 101016775 INTERVENE.

Owner

Name: The Polygenic Score (PGS) Catalog
Login: PGScatalog
Kind: organization
Location: Cambridge, UK

Website: https://www.pgscatalog.org/
Twitter: PGSCatalog
Repositories: 7
Profile: https://github.com/PGScatalog

Code repository for the Polygenic Score (PGS) Catalog, an open database of PGS and the relevant metadata needed to apply and evaluate them correctly

GitHub Events

Total

Create event: 7
Release event: 3
Issues event: 43
Watch event: 25
Delete event: 9
Issue comment event: 77
Push event: 31
Pull request event: 2
Pull request review event: 2
Fork event: 8

Last Year

Create event: 7
Release event: 3
Issues event: 43
Watch event: 25
Delete event: 9
Issue comment event: 77
Push event: 31
Pull request event: 2
Pull request review event: 2
Fork event: 8

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 17
Total pull requests: 1
Average time to close issues: about 2 months
Average time to close pull requests: N/A
Total issue authors: 15
Total pull request authors: 1
Average comments per issue: 1.18
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 17
Pull requests: 1
Average time to close issues: about 2 months
Average time to close pull requests: N/A
Issue authors: 15
Pull request authors: 1
Average comments per issue: 1.18
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

smlmbrt (18)
Fiwx (14)
nebfield (9)
Carldeboer (4)
ashenfernando1 (3)
Linlin1213a (3)
samreenzafer (3)
zl2860 (2)
csjohnson23 (2)
scienception (2)
peterjuv (2)
TravisMizeIGH (2)
Sabramow (2)
frahimov (2)
DarioS (2)

Pull Request Authors

nebfield (23)
smlmbrt (11)
uelandte (1)
Sabramow (1)
AWS-crafter (1)
llgg2024 (1)
mglev1n (1)
gajamatassa (1)
jlgao46 (1)

Top Labels

Issue Labels

bug (45) user-query (31) enhancement (24) documentation (7) wontfix (1)

Pull Request Labels

user-query (4) bug (3) documentation (3) enhancement (2)

Dependencies

.github/workflows/ci.yml actions

actions/cache v2 composite
actions/checkout v2 composite
actions/setup-python v2 composite
actions/upload-artifact v2 composite
conda-incubator/setup-miniconda v2.1.1 composite
eWaterCycle/setup-singularity v5 composite

environments/plink2/Dockerfile docker

debian stable-slim build

environments/report/Dockerfile docker

continuumio/miniconda3 latest build

docs/requirements.txt pypi

sphinx-book-theme >=0.3.3
sphinx-jsonschema >=1.19.1
sphinxemoji >=0.2.0

tests/requirements.txt pypi

pandas >=1.4.2 test
pytest-workflow >=1.6.0 test
requests >=2.27.1 test

.github/workflows/ancestry.yml actions

actions/cache v3 composite
actions/cache/restore v3 composite
actions/checkout v3 composite
actions/setup-python v3 composite
actions/upload-artifact v3 composite
nf-core/setup-nextflow v1 composite

.github/workflows/conda.yml actions

actions/checkout v3 composite
actions/setup-java v3 composite
conda-incubator/setup-miniconda v2 composite
nf-core/setup-nextflow v1 composite

.github/workflows/module.yml actions

actions/cache v3 composite
actions/cache/restore v3 composite
actions/checkout v3 composite
actions/setup-python v3 composite
actions/upload-artifact v2 composite
nf-core/setup-nextflow v1 composite

.github/workflows/preload-docker.yml actions

actions/cache v3 composite
actions/checkout v3 composite

.github/workflows/preload-reference.yml actions

actions/cache v3 composite

.github/workflows/preload-singularity.yml actions

actions/cache v3 composite
actions/checkout v3 composite
eWaterCycle/setup-singularity v7 composite

.github/workflows/standard-test.yml actions

actions/cache v3 composite
actions/cache/restore v3 composite
nf-core/setup-nextflow v1 composite

environments/pyyaml/Dockerfile docker

python 3.10.9-slim-bullseye build

environments/zstd/Dockerfile docker

debian bullseye-slim build

.github/workflows/ancestry-conda.yml actions

actions/cache/restore v3 composite
actions/checkout v3 composite
actions/setup-java v3 composite
actions/setup-python v3 composite
actions/upload-artifact v3 composite
conda-incubator/setup-miniconda v2 composite

.github/workflows/ancestry-vcf.yml actions

actions/cache v3 composite
actions/cache/restore v3 composite
actions/checkout v3 composite
actions/setup-python v3 composite
actions/upload-artifact v3 composite
nf-core/setup-nextflow v1 composite

.github/workflows/cleanup.yml actions

actions/checkout v4 composite

environments/fraposa/environment.yml conda

pip
python 3.10.*

environments/pyyaml/environment.yml conda

pip
python 3.10.*

environments/report/environment.yml conda

quarto 1.3.433.*
r-data.table
r-dplyr
r-dt
r-forcats
r-ggplot2
r-jsonlite
r-purrr
r-r.utils
r-readr
r-rmarkdown
r-tidyr

environments/zstd/environment.yml conda

zstd 1.4.8.*

environments/pgscatalog_utils/environment.yml pypi

pgscatalog_utils ==0.4.2
