cnv_liftover_gvf

Lift over CNV VCF files from hg38 to hg19 (GRCh37), extract their data, and convert them to GVF files for import into third-party software for further annotation.

https://github.com/nawar82/cnv_liftover_gvf

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.2%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: nawar82
  • License: mit
  • Language: Python
  • Default Branch: master
  • Size: 9.65 MB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created over 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

Liftover CNV VCF Files from hg38 to hg19 and Convert to GVF

This script automates the processing of Copy Number Variation (CNV) VCF files generated by GATK gCNV. It can optionally perform liftover from hg38 to hg19 and convert the VCF data into GVF format.

Outputs

Corresponding GVF files are written to the results folder (at the same level as the scripts folder). In addition to the standard GVF fields (seqid, source, type, start, end, score, strand, phase, attributes), the attributes column carries Genotype, Copynumber, Numpoints, and the sample name.
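As an illustration of the output layout described above, the following minimal sketch assembles one tab-separated GVF record. The function name `format_gvf_line`, the attribute key spellings (including `Samplename`), the `gCNV` source label, and the example values are assumptions made for this sketch; the script's actual keys and values may differ.

```python
def format_gvf_line(seqid, source, var_type, start, end, attributes,
                    score=".", strand=".", phase="."):
    """Join the nine GVF columns with tabs.

    The attributes column is a ';'-separated list of key=value pairs.
    """
    attr_text = ";".join(f"{key}={value}" for key, value in attributes.items())
    columns = [seqid, source, var_type, str(start), str(end),
               score, strand, phase, attr_text]
    return "\t".join(columns)


# Hypothetical record: every value below is made up for illustration.
line = format_gvf_line(
    "chr1", "gCNV", "DEL", 1000, 5000,
    {
        "ID": "CNV_chr1_1000_5000",
        "Genotype": "1",
        "Copynumber": 1,
        "Numpoints": 12,
        "Samplename": "SAMPLE01",  # assumed key name for the sample name
    },
)
print(line)
```

Columns without a meaningful value (score, strand, phase here) are written as ".", which is the usual placeholder in GFF-style formats.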

Features:

  • Reads a list of VCF file paths from an input text file.
  • Optionally performs liftover of CNV regions using the UCSC liftOver tool.
  • Updates CNV IDs and the INFO field (END attribute) to reflect hg19 coordinates.
  • Generates output files with _hg19.gvf suffix.
  • Skips missing files and provides warnings.
  • Applies CN filtering rules only to chrX, depending on the sample's sex:
    • Males (chrX):
      • CN == 1 variants are skipped.
      • CN == 0 is classified as DEL.
      • CN ≥ 2 is classified as DUP.
    • Females (chrX):
      • CN == 2 variants are skipped.
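The chrX rules listed above can be sketched as a small decision function. This is an illustrative sketch, not the script's own code: the function name `classify_chrx_cnv` and the `"male"`/`"female"` labels are hypothetical, and the handling of female calls other than CN == 2 (below 2 as DEL, above 2 as DUP) is an assumption, since the README only states that CN == 2 is skipped.

```python
def classify_chrx_cnv(copy_number, sex):
    """Return "DEL", "DUP", or None (meaning: skip) for a chrX call."""
    if sex == "male":
        if copy_number == 1:
            return None   # expected single copy of chrX: skipped
        if copy_number == 0:
            return "DEL"
        return "DUP"      # CN >= 2
    if sex == "female":
        if copy_number == 2:
            return None   # expected two copies of chrX: skipped
        # Assumption, not stated in the README: classify by comparison with 2.
        return "DEL" if copy_number < 2 else "DUP"
    raise ValueError(f"unknown sex label: {sex!r}")


for sex in ("male", "female"):
    for cn in range(4):
        print(sex, cn, classify_chrx_cnv(cn, sex))
```

Returning None for skipped calls lets the caller filter records with a simple `if result is None: continue`.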

Dependencies:

  • Python libraries: vcfpy (third-party); argparse and subprocess (standard library)
  • External tools: UCSC liftOver (if liftover is enabled)
  • Required files:
    • UCSC chain file for hg38 to hg19 conversion (hg38ToHg19.over.chain.gz)
    • Valid CNV VCF files generated by gCNV tool
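For orientation, here is a hedged sketch of how the UCSC liftOver tool could be driven from Python with subprocess. It relies on the tool's documented positional arguments (input file, chain file, output file, unmapped file); the helper names, the BED-based intermediate files, and all file names are assumptions for illustration and are not taken from the script.

```python
import subprocess


def vcf_interval_to_bed(chrom, pos, end, name):
    """Convert a VCF interval (1-based POS, inclusive END) to a BED line.

    BED uses a 0-based start and an exclusive end, so only the start shifts.
    """
    return f"{chrom}\t{pos - 1}\t{end}\t{name}"


def build_liftover_command(bed_in, chain_file, bed_out, unmapped_out,
                           liftover_bin="liftOver"):
    """Assemble the argument list: liftOver oldFile map.chain newFile unMapped."""
    return [liftover_bin, bed_in, chain_file, bed_out, unmapped_out]


def run_liftover(bed_in, chain_file, bed_out, unmapped_out):
    """Run liftOver; raises CalledProcessError if the tool exits non-zero."""
    command = build_liftover_command(bed_in, chain_file, bed_out, unmapped_out)
    subprocess.run(command, check=True)


# Hypothetical file names; nothing is executed here, the command is only built.
print(build_liftover_command("cnv_hg38.bed", "hg38ToHg19.over.chain.gz",
                             "cnv_hg19.bed", "cnv_unmapped.bed"))
```

Regions that liftOver cannot map are written to the unmapped file, so a caller would typically check that file before updating IDs and END values.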

Usage:

  1. Prepare a text file (e.g., file_list.txt) listing the paths to VCF files, one per line.
  2. Run the script with liftover enabled (default):
     python Liftover_and_gvf.py -i file_list.txt
  3. Run the script without liftover (for hg19 VCFs):
     python Liftover_and_gvf.py -i file_list.txt --no-liftover
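The command-line interface used in the steps above could be wired up roughly as follows. This is a sketch under assumptions: only the `-i` and `--no-liftover` flags come from the README, while the function names, help texts, and warning wording are hypothetical.

```python
import argparse
import os
import sys


def parse_args(argv=None):
    """Parse the two flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        description="Convert gCNV VCF files to GVF, optionally lifting over to hg19.")
    parser.add_argument("-i", dest="input", required=True,
                        help="text file listing VCF paths, one per line")
    parser.add_argument("--no-liftover", action="store_true",
                        help="skip liftover (input VCFs are already hg19)")
    return parser.parse_args(argv)


def read_vcf_paths(list_file):
    """Return the listed paths that exist; warn about and skip the rest."""
    existing = []
    with open(list_file) as handle:
        for raw_line in handle:
            path = raw_line.strip()
            if not path:
                continue  # ignore blank lines
            if not os.path.isfile(path):
                print(f"Warning: {path} not found, skipping", file=sys.stderr)
                continue
            existing.append(path)
    return existing


# Example call with an explicit argument list instead of sys.argv.
args = parse_args(["-i", "file_list.txt", "--no-liftover"])
print(args.input, args.no_liftover)
```

Passing an explicit list to `parse_args` keeps the function easy to exercise in tests; calling it with no argument falls back to the real command line.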

Developed by Nawar Dalila

Owner

  • Name: Nawar Dalila
  • Login: nawar82
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as follows."
authors:
  - family-names: Dalila
    given-names: Nawar
title: "Liftover and GVF Conversion for CNV Reports"
version: 1.0.0
doi: 10.5281/zenodo.15019190
url: "https://github.com/nawar82/CNV_liftover_gvf"
license: "MIT"

GitHub Events

Total
  • Release event: 2
  • Watch event: 1
  • Delete event: 1
  • Public event: 1
  • Push event: 8
  • Pull request event: 4
  • Create event: 5
Last Year
  • Release event: 2
  • Watch event: 1
  • Delete event: 1
  • Public event: 1
  • Push event: 8
  • Pull request event: 4
  • Create event: 5