maizesnp

SNP analysis of different maize (Zea mays L.) inbred lines in comparison to B73 as a reference line

https://github.com/lu-vedder/maizesnp

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.4%) to scientific vocabulary

Keywords

bioinformatics genetics maize snp-analysis

Repository

Basic Info
  • Host: GitHub
  • Owner: lu-vedder
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 38.1 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created about 2 years ago · Last pushed about 2 years ago

README.md

MaizeSNP

The MaizeSNP workflow was specifically designed for the SNP analysis of different maize (Zea mays L.) inbred lines in comparison to B73 as a reference line. The final result is a TSV file containing the set of SNPs between the respective inbred line and B73 that are located in the protein-coding regions of maize. This file also includes the allele counts obtained from the original mapping.

Workflow.txt
The workflow file gives the exact order and parameter settings for running the other scripts. Please mind the following:
* Beforehand, all merged reads of one inbred line ('LINE') were mapped using Bowtie2 (v2.2.9) [1]. The resulting mapping files (SAM files) were prepared for SNP calling with Samtools (v1.3.1) [2] and Picard (v2.9.0) [3], as stated in the workflow.
* The SNP calling was performed using GATK (v3.7-0-gcfedb67) [4]. The input file is the prepared mapping result mentioned above.
* The reference genome of B73 was used in version 3. Newer versions may require some adaptations.

filterblacklistmerged_SNPs.py
Uses the SNPs between 'our' B73 and the reference genome as a blacklist when filtering for "true" SNPs. This step should reduce the mapping bias caused by variations between the reference B73 and 'our' B73 used in the lab experiments.
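
The blacklist idea can be illustrated with a minimal sketch. It is not the code of the actual script: the tab-separated column layout (chromosome, position, reference allele, SNP allele) and the function names `load_blacklist` and `filter_snps` are assumptions made for this example.

```python
# Sketch only: column layout and function names are assumptions,
# not taken from the actual script.

def load_blacklist(lines):
    """Return a set of (chromosome, position) keys from tab-separated lines."""
    keys = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, pos = line.split("\t")[:2]
        keys.add((chrom, int(pos)))
    return keys


def filter_snps(snp_lines, blacklist):
    """Keep only SNP lines whose (chromosome, position) is not blacklisted."""
    kept = []
    for line in snp_lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        chrom, pos = stripped.split("\t")[:2]
        if (chrom, int(pos)) not in blacklist:
            kept.append(stripped)
    return kept


# SNPs between 'our' B73 and the reference (made-up example data).
b73_snps = ["1\t100\tA\tG", "2\t500\tC\tT"]
# SNPs called for one inbred line (made-up example data).
line_snps = ["1\t100\tA\tG", "1\t250\tT\tC", "2\t500\tC\tT", "3\t42\tG\tA"]

kept = filter_snps(line_snps, load_blacklist(b73_snps))
print(kept)
```

With an empty blacklist, every SNP passes the filter unchanged.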

empty_blacklist.txt
This is just a dummy file. It can be used as an empty blacklist file in the 'filterblacklistmerged_SNPs.py' script.

countSNPalleles_mergedBAM.py - Python2 environment required!
Counts the number of reads matching the Ref/SNP allele using the merged BAM files.
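
The counting step itself can be sketched as follows. This example deliberately leaves out reading the BAM file and assumes that the bases observed at one SNP position have already been extracted into a string; the function name `count_alleles` is hypothetical and the sketch is independent of the actual script.

```python
from collections import Counter

# Sketch only: BAM parsing is omitted; `observed_bases` is assumed to hold
# one character per read covering the SNP position.

def count_alleles(observed_bases, ref, alt):
    """Return (ref_count, alt_count, other_count) for one SNP position."""
    counts = Counter(base.upper() for base in observed_bases)
    ref_count = counts[ref.upper()]
    alt_count = counts[alt.upper()]
    other_count = sum(counts.values()) - ref_count - alt_count
    return ref_count, alt_count, other_count


result = count_alleles("AAGGgANA", ref="A", alt="G")
print(result)
```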

collectgeneidsfromgtf.py
Collects the 39,469 gene IDs of the protein-coding genes of Zea mays, reference version 3 (GTF format).
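
A minimal sketch of such a collection step is shown below. It assumes that the GTF file marks protein-coding genes with a `gene_biotype "protein_coding"` attribute on `gene` features, which may differ from the annotation file the actual script expects. The gene IDs `GENE_A` and `GENE_B` and the function name `collect_gene_ids` are hypothetical.

```python
import re

# Sketch only: the attribute names are an assumption about the GTF file.
GENE_ID = re.compile(r'gene_id "([^"]+)"')
BIOTYPE = re.compile(r'gene_biotype "([^"]+)"')


def collect_gene_ids(gtf_lines):
    """Return the IDs of all 'gene' features annotated as protein_coding."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header or comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # malformed line or not a gene feature
        gene_id = GENE_ID.search(fields[8])
        biotype = BIOTYPE.search(fields[8])
        if gene_id and biotype and biotype.group(1) == "protein_coding":
            ids.add(gene_id.group(1))
    return ids


# Made-up example records with hypothetical gene IDs.
example = [
    "#!genome-build placeholder",
    "\t".join(["1", "src", "gene", "100", "900", ".", "+", ".",
               'gene_id "GENE_A"; gene_biotype "protein_coding";']),
    "\t".join(["1", "src", "exon", "100", "300", ".", "+", ".",
               'gene_id "GENE_A"; gene_biotype "protein_coding";']),
    "\t".join(["2", "src", "gene", "50", "400", ".", "-", ".",
               'gene_id "GENE_B"; gene_biotype "miRNA";']),
]

gene_ids = collect_gene_ids(example)
print(sorted(gene_ids))
```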

collectgeneidsfromgff3.py
Collects the gene IDs of the protein-coding genes of Zea mays based on reference version 4 (GFF3 format).
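
GFF3 stores attributes as `key=value` pairs separated by semicolons, so the parsing differs from the GTF case. The sketch below assumes `gene` features that carry `ID=gene:<id>` and `biotype=protein_coding` attributes; this layout is an assumption about the annotation file, and the gene IDs and function names are hypothetical.

```python
# Sketch only: attribute keys ("ID", "biotype") and the "gene:" prefix are
# assumptions about the GFF3 file, not taken from the actual script.

def parse_attributes(column):
    """Split a GFF3 attribute column into a dict of key/value pairs."""
    attrs = {}
    for pair in column.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def collect_gene_ids_gff3(gff3_lines):
    """Return the IDs of all 'gene' features annotated as protein_coding."""
    ids = set()
    for line in gff3_lines:
        if line.startswith("#"):
            continue  # directive or comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attrs = parse_attributes(fields[8])
        if attrs.get("biotype") == "protein_coding" and "ID" in attrs:
            gene_id = attrs["ID"]
            if gene_id.startswith("gene:"):
                gene_id = gene_id[len("gene:"):]  # drop the feature-type prefix
            ids.add(gene_id)
    return ids


# Made-up example records with hypothetical gene IDs.
example_gff3 = [
    "##gff-version 3",
    "\t".join(["1", "src", "gene", "100", "900", ".", "+", ".",
               "ID=gene:GENE_A;biotype=protein_coding"]),
    "\t".join(["1", "src", "mRNA", "100", "900", ".", "+", ".",
               "ID=transcript:GENE_A_T01;Parent=gene:GENE_A"]),
    "\t".join(["2", "src", "ncRNA_gene", "50", "400", ".", "-", ".",
               "ID=gene:GENE_B;biotype=lncRNA"]),
]

gene_ids_v4 = collect_gene_ids_gff3(example_gff3)
print(sorted(gene_ids_v4))
```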

filtermergedsnpsforcoding_genes.py
Filters the SNPs (TSV format) for positions located in the set of protein-coding genes. Allele counts may be included in the file.
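
One possible way to implement such a positional filter is sketched below. It assumes that gene coordinates are available as (chromosome, start, end) tuples with 1-based inclusive positions and that each SNP row starts with chromosome and position; both the data layout and the function names are assumptions for this example. Genes may overlap, so the index stores a running maximum of the end coordinates.

```python
from bisect import bisect_right

# Sketch only: data layout and function names are assumptions,
# not taken from the actual script.

def build_index(genes):
    """Build a per-chromosome lookup from (chrom, start, end) tuples."""
    by_chrom = {}
    for chrom, start, end in genes:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [start for start, _ in intervals]
        max_ends = []
        running = 0
        for _, end in intervals:
            running = max(running, end)  # largest end seen so far
            max_ends.append(running)
        index[chrom] = (starts, max_ends)
    return index


def in_gene(index, chrom, pos):
    """True if pos lies inside at least one gene on chrom."""
    if chrom not in index:
        return False
    starts, max_ends = index[chrom]
    i = bisect_right(starts, pos)  # number of genes starting at or before pos
    return i > 0 and max_ends[i - 1] >= pos


def filter_for_coding(snp_rows, index):
    """Keep rows whose first two fields (chrom, pos) fall inside a gene."""
    return [row for row in snp_rows if in_gene(index, row[0], int(row[1]))]


# Made-up gene coordinates and SNP rows (chrom, pos, ref, alt, ref_count, alt_count).
genes = [("1", 100, 200), ("1", 150, 400), ("2", 10, 20)]
snps = [
    ("1", "99", "A", "G", "12", "3"),
    ("1", "300", "T", "C", "8", "9"),
    ("2", "20", "C", "T", "5", "5"),
    ("3", "5", "G", "A", "7", "1"),
]

coding = filter_for_coding(snps, build_index(genes))
print(coding)
```

Extra columns such as allele counts are passed through untouched, since only the first two fields are inspected.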

Citation

Please cite via Zenodo: https://doi.org/10.5281/zenodo.10684044
Or as: Vedder, L. (2024). MaizeSNP (Version 1.0.0) [Computer software]. https://github.com/lu-vedder/MaizeSNP (for details see the CITATION file).

License

Copyright (c) 2024 Lucia Vedder
For details see the LICENSE file.


[1] https://bowtie-bio.sourceforge.net/bowtie2/index.shtml
[2] http://www.htslib.org
[3] https://broadinstitute.github.io/picard
[4] https://gatk.broadinstitute.org

Owner

  • Name: Lucia Vedder
  • Login: lu-vedder
  • Kind: user
  • Location: Bonn, Germany
  • Company: University of Bonn

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: MaizeSNP
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Lucia
    family-names: Vedder
    email: lvedder@uni-bonn.de
    affiliation: University of Bonn
    orcid: 'https://orcid.org/0000-0002-8924-9800'
identifiers:
  - type: doi
    value: 10.5281/zenodo.10684044
    description: Zenodo - Release v1.0.0
repository-code: 'https://github.com/lu-vedder/MaizeSNP'
abstract: >-
  The MaizeSNP workflow was specifically designed for the
  SNP analysis of different maize (Zea mays L.) inbred lines
  in comparison to B73 as a reference line. The final result
  is a TSV-file, containing a set of SNPs between the
  respective inbred line and B73, located in the
  protein-coding regions of maize. Further, this file
  includes the allele counts, obtained from the original
  mapping.
keywords:
  - Bioinformatics
  - Genetics
  - SNP analysis
  - NGS mapping
  - Maize
  - Zea mays
  - B73
license: MIT
version: 1.0.0
date-released: '2024-02-20'
