https://github.com/broadinstitute/t2t-ace

Accurate CNV Evaluation Using Telomere-to-Telomere Assemblies

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.8%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: broadinstitute
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 21.3 MB
Statistics
  • Stars: 4
  • Watchers: 4
  • Forks: 1
  • Open Issues: 3
  • Releases: 2
Created over 2 years ago · Last pushed 7 months ago
Metadata Files
Readme License

README.md

T2T-ACE (beta version)

Accurate CNV Evaluation Using Telomere-to-Telomere Assemblies

Run T2T-ACE

This tool is designed to evaluate the accuracy of CNV calls using the T2T assembly as a reference. The tool will align the CNV calls to the T2T assembly and the hg38 assembly and compare the alignment results.

    python3 T2T_ACE/run_T2T-ACE.py --cnv_vcf <cnv_vcf> --t2t_ref <t2t_assembly.fa> --hg38_ref <hg38_assembly.fa>

Download Assembly Files

The T2T assembly and the hg38 assembly can be downloaded from the following links:

  • GENCODE hg38 primary assembly: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_46/GRCh38.primary_assembly.genome.fa.gz
  • HG002 T2T assembly v1.1: https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/HG002/assemblies/hg002v1.1.fasta.gz

Design Description

DEL evaluation

T2T-ACE aligns the left and right flanking regions of a DEL variant called against the reference genome (hg38) to the HG002-T2T reference. From the distance between the positions where the left and right flanking regions align in the HG002-T2T reference, we can determine the correctness and genotype of the DEL variant.

  • Correctness: If the distance between the aligned left and right flanking regions in the HG002-T2T reference falls between 0.8 * (length of the DEL variant) and 1.2 * (length of the DEL variant), we consider the DEL variant an FP.
  • Genotype: [figure: DEL]
  • Het DEL Example: [figure: DEL]
  • Hom DEL Example: [figure: DEL]
  • FP DEL Example: [figure: DEL]
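
The correctness rule above can be sketched in a few lines of Python. This is only an illustration, not the code in T2T_ACE: the function names `flank_distance` and `is_fp_del` are hypothetical, and it assumes both flanks align to the same HG002-T2T contig in the same orientation, so that the distance is simply the gap between the end of the left flank alignment and the start of the right flank alignment.

```python
def flank_distance(left_flank_end: int, right_flank_start: int) -> int:
    """Gap between the aligned flanks on one HG002-T2T contig.

    Assumption: both flanks hit the same contig and strand, so the gap is
    the right flank's start minus the left flank's end.
    """
    return right_flank_start - left_flank_end


def is_fp_del(distance: int, del_length: int,
              lower: float = 0.8, upper: float = 1.2) -> bool:
    """Return True when the flank distance falls within 0.8x to 1.2x of the
    called DEL length, which the README treats as a false positive."""
    return lower * del_length <= distance <= upper * del_length


# Hypothetical coordinates for a DEL called with length 5000 on hg38.
distance = flank_distance(left_flank_end=1_000_000, right_flank_start=1_004_900)
print(distance, is_fp_del(distance, del_length=5000))
```

The 0.8 and 1.2 bounds are taken from the text; they are exposed as parameters here only so the tolerance is easy to see.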

DUP evaluation

T2T-ACE aligns the DNA sequence representing a DUP variant called in the reference genome (hg38) to the HG002-T2T reference.

  • Correctness: A TP DUP event is characterized by a higher copy number in HG002-T2T than in hg38, while an FP DUP event shows fewer copies in HG002-T2T than in hg38. Because the hg38 assembly is haploid and the HG002-T2T assembly is diploid, we assume that a copy-neutral event has one copy in hg38 and two copies (one each for the maternal and paternal haplotypes) in HG002-T2T. A copy is defined as a minimap2 alignment whose length is at least 50% of the query length. This method identifies not only whether a CNV event is correct but, for duplications, also the precise number, locations, and genotype of the duplication events, even if they occur on chromosomes other than that of the original call on hg38.
  • Hom DUP Example: DUP
  • Het DUP Example: DUP
  • FP DUP Example: DUP
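
The copy definition above (an alignment covering at least 50% of the query length) can be illustrated with a short Python sketch. It is not the T2T-ACE implementation: the `Alignment` record and `count_copies` function are hypothetical, alignments are given as plain coordinates rather than produced by minimap2, and "alignment length" is assumed to mean the span of the hit on the target assembly.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Alignment:
    """One hit of the DUP sequence on a target assembly (hypothetical record)."""
    contig: str
    start: int  # 0-based start on the target
    end: int    # exclusive end on the target

    @property
    def length(self) -> int:
        return self.end - self.start


def count_copies(alignments: Iterable[Alignment], query_length: int,
                 min_fraction: float = 0.5) -> int:
    """Count hits whose length is at least min_fraction of the query length."""
    return sum(1 for a in alignments if a.length >= min_fraction * query_length)


# Hypothetical hits for a 10,000 bp DUP sequence on the diploid assembly:
# one per haplotype, one extra full-length hit, and one short partial hit.
t2t_hits = [
    Alignment("chr1_MATERNAL", 5_000_000, 5_010_000),
    Alignment("chr1_PATERNAL", 4_990_000, 5_000_000),
    Alignment("chr1_PATERNAL", 5_000_000, 5_009_800),
    Alignment("chr7_MATERNAL", 120_000, 123_000),
]
print(count_copies(t2t_hits, query_length=10_000))
```

Under the copy-neutral assumption in the text (one copy in hg38, two in HG002-T2T), the same count would be computed for the hg38 hits and the two numbers compared; how ties are classified is not specified in the README, so the sketch stops at counting.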

Notes

Not all FP CNV calls are due to errors in the CNV calling algorithm. Some are due to the limitations of the hg38 reference genome.

Owner

  • Name: Broad Institute
  • Login: broadinstitute
  • Kind: organization
  • Location: Cambridge, MA

Broad Institute of MIT and Harvard

GitHub Events

Total
  • Watch event: 2
  • Issue comment event: 1
  • Push event: 3
  • Pull request review event: 1
  • Pull request event: 2
  • Create event: 1
Last Year
  • Watch event: 2
  • Issue comment event: 1
  • Push event: 3
  • Pull request review event: 1
  • Pull request event: 2
  • Create event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • yueyaog (3)
  • MicahR-Y (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

Dockerfile docker
  • continuumio/miniconda3 latest build
.github/workflows/unit_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite