maizesnp
SNP analysis of different maize (Zea mays L.) inbred lines in comparison to B73 as a reference line
README.md
MaizeSNP
The MaizeSNP workflow was specifically designed for the SNP analysis of different maize (Zea mays L.) inbred lines in comparison to B73 as a reference line. The final result is a TSV file containing the set of SNPs between the respective inbred line and B73 that are located in the protein-coding regions of maize. This file also includes the allele counts obtained from the original mapping.
Workflow.txt
The workflow file gives the exact order and parameter settings for running the other scripts. Please note the following:
* Beforehand, all merged reads of one inbred line ('LINE') were mapped using Bowtie2 (v2.2.9) [1]. The resulting mapping files (SAM files) were prepared for SNP calling, as stated in the workflow, using Samtools (v1.3.1) [2] and Picard (v2.9.0) [3].
* SNP calling was performed using GATK (v3.7-0-gcfedb67) [4]. Its input file is the prepared mapping result mentioned above.
* The B73 reference genome was used in version 3. Using newer versions may require some adaptations.
filterblacklistmerged_SNPs.py
Uses the SNPs between 'our' B73 and the reference genome as a blacklist when filtering for "true" SNPs.
This step should reduce the mapping bias caused by variations between the reference B73 and 'our' B73 used in the lab experiments.
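The blacklist idea can be illustrated with a minimal Python 3 sketch. It is not the code of the script itself: the function names `load_blacklist` and `filter_snps`, the column layout (chromosome in column 1, position in column 2) and the toy data are assumptions made for illustration only.

```python
import csv
import io


def load_blacklist(handle):
    """Read (chromosome, position) pairs from a tab-separated blacklist."""
    blacklist = set()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment/header lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        blacklist.add((row[0], int(row[1])))
    return blacklist


def filter_snps(handle, blacklist):
    """Yield SNP rows whose (chromosome, position) is not blacklisted."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        if (row[0], int(row[1])) not in blacklist:
            yield row


# Toy data (made up): one blacklisted site, three SNPs of an inbred line.
blacklist_tsv = "1\t1050\tA\tG\n"
line_tsv = "1\t1050\tA\tG\n1\t2300\tC\tT\n2\t1050\tG\tA\n"

blacklist = load_blacklist(io.StringIO(blacklist_tsv))
kept = list(filter_snps(io.StringIO(line_tsv), blacklist))
print(kept)
```

In this sketch an empty blacklist yields an empty set, so every SNP is kept.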
empty_blacklist.txt
This is just a dummy file. It can be used as an empty blacklist file in the 'filterblacklistmerged_SNPs.py' script.
countSNPalleles_mergedBAM.py - Python2 environment required!
Counts the number of reads matching the reference (Ref) allele and the SNP allele, using the merged BAM files.
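The counting step itself can be sketched as follows. This is a simplified Python 3 illustration, not the Python 2 script: it assumes the bases that the aligned reads show at a SNP position have already been extracted from the BAM file, and the function name `count_alleles` is hypothetical.

```python
from collections import Counter


def count_alleles(read_bases, ref_allele, snp_allele):
    """Count reads supporting the reference and the SNP allele at one site.

    read_bases: iterable with one base per read covering the SNP position.
    Returns a tuple (reference count, SNP count); other bases are ignored.
    """
    counts = Counter(base.upper() for base in read_bases)
    return counts[ref_allele.upper()], counts[snp_allele.upper()]


# Toy example (made up): seven reads cover a site with Ref = A and SNP = G.
ref_count, snp_count = count_alleles("AAGGGgT", "A", "G")
print(ref_count, snp_count)
```

Reads showing a third base (the `T` above) count towards neither allele in this sketch.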
collectgeneidsfromgtf.py
Collects the 39,469 gene IDs of the protein-coding genes of Zea mays, reference version 3 (GTF format).
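A minimal sketch of such a collection step is shown below. It assumes an Ensembl-style GTF in which the ninth column carries `gene_id "..."` and `gene_biotype "..."` attributes. The actual annotation file may differ, and the function name and the toy records are made up for illustration.

```python
import re

GENE_ID = re.compile(r'gene_id "([^"]+)"')
BIOTYPE = re.compile(r'gene_biotype "([^"]+)"')


def protein_coding_gene_ids(lines):
    """Return the set of gene IDs annotated as protein_coding."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        gene_id = GENE_ID.search(fields[8])
        biotype = BIOTYPE.search(fields[8])
        if gene_id and biotype and biotype.group(1) == "protein_coding":
            ids.add(gene_id.group(1))
    return ids


# Toy records (made up): two features of one coding gene, one non-coding gene.
gtf_lines = [
    "#!genome-build example\n",
    '1\tsrc\texon\t100\t500\t.\t+\t.\tgene_id "GENE_A"; gene_biotype "protein_coding";\n',
    '1\tsrc\tCDS\t150\t450\t.\t+\t0\tgene_id "GENE_A"; gene_biotype "protein_coding";\n',
    '1\tsrc\texon\t900\t950\t.\t-\t.\tgene_id "GENE_B"; gene_biotype "miRNA";\n',
]
ids = protein_coding_gene_ids(gtf_lines)
print(sorted(ids))
```

Using a set means a gene that appears on several feature lines is reported once.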
collectgeneidsfromgff3.py
Collects the gene IDs of the protein-coding genes of Zea mays based on reference version 4 (GFF3 format).
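GFF3 stores attributes as `key=value` pairs separated by semicolons, so the parsing differs from the GTF case. The sketch below assumes that gene records carry `ID=gene:<id>` and `biotype=protein_coding` attributes (Ensembl-style). The real file may use other keys, and all names and records here are hypothetical.

```python
def parse_gff3_attributes(column):
    """Split a GFF3 attribute column into a dict of key/value pairs."""
    attrs = {}
    for item in column.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def protein_coding_gene_ids_gff3(lines):
    """Return IDs of 'gene' features whose biotype is protein_coding."""
    ids = []
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue
        attrs = parse_gff3_attributes(fields[8])
        if attrs.get("biotype") == "protein_coding" and "ID" in attrs:
            # Strip an optional "gene:" prefix (assumed Ensembl-style IDs).
            ids.append(attrs["ID"].split(":", 1)[-1])
    return ids


# Toy records (made up for illustration).
gff3_lines = [
    "##gff-version 3\n",
    "1\tsrc\tgene\t100\t500\t.\t+\t.\tID=gene:GENE_A;biotype=protein_coding\n",
    "1\tsrc\tmRNA\t100\t500\t.\t+\t.\tID=transcript:GENE_A_T1;Parent=gene:GENE_A\n",
    "1\tsrc\tgene\t900\t950\t.\t-\t.\tID=gene:GENE_B;biotype=lincRNA\n",
]
gff3_ids = protein_coding_gene_ids_gff3(gff3_lines)
print(gff3_ids)
```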
filtermergedsnpsforcoding_genes.py
Filters the SNPs (TSV format) for positions located in the set of protein-coding genes. Allele counts may be included in the file.
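One way to test whether a SNP position falls inside a gene is a sorted interval lookup, sketched below. This illustrates the idea and is not the script's actual method: gene coordinates are assumed to be 1-based and inclusive, and the names `build_index` and `in_gene` are hypothetical.

```python
from bisect import bisect_right


def build_index(genes):
    """Build per-chromosome sorted, merged intervals.

    genes: iterable of (chromosome, start, end), 1-based inclusive.
    """
    by_chrom = {}
    for chrom, start, end in genes:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping genes are merged into one interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_gene(index, chrom, pos):
    """Return True if pos lies within any gene interval on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos <= ends[i]


# Toy coordinates (made up for illustration).
genes = [("1", 100, 500), ("1", 400, 900), ("2", 50, 80)]
snps = [("1", 450), ("1", 950), ("2", 80), ("3", 10)]

index = build_index(genes)
coding_snps = [snp for snp in snps if in_gene(index, *snp)]
print(coding_snps)
```

Merging overlapping intervals first keeps each lookup to a single binary search per SNP.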
Citation
Please cite via Zenodo (DOI: 10.5281/zenodo.10684044, release v1.0.0),
or as: Vedder, L. (2024). MaizeSNP (Version 1.0.0) [Computer software]. https://github.com/lu-vedder/MaizeSNP (for details see the CITATION file).
License
Copyright (c) 2024 Lucia Vedder
For details see the LICENSE file.
[1] https://bowtie-bio.sourceforge.net/bowtie2/index.shtml
[2] http://www.htslib.org
[3] https://broadinstitute.github.io/picard
[4] https://gatk.broadinstitute.org
Owner
- Name: Lucia Vedder
- Login: lu-vedder
- Kind: user
- Location: Bonn, Germany
- Company: University of Bonn
- Repositories: 1
- Profile: https://github.com/lu-vedder
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: MaizeSNP
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Lucia
    family-names: Vedder
    email: lvedder@uni-bonn.de
    affiliation: University of Bonn
    orcid: 'https://orcid.org/0000-0002-8924-9800'
identifiers:
  - type: doi
    value: 10.5281/zenodo.10684044
    description: Zenodo - Release v1.0.0
repository-code: 'https://github.com/lu-vedder/MaizeSNP'
abstract: >-
  The MaizeSNP workflow was specifically designed for the
  SNP analysis of different maize (Zea mays L.) inbred lines
  in comparison to B73 as a reference line. The final result
  is a TSV-file, containing a set of SNPs between the
  respective inbred line and B73, located in the
  protein-coding regions of maize. Further, this file
  includes the allele counts, obtained from the original
  mapping.
keywords:
  - Bioinformatics
  - Genetics
  - SNP analysis
  - NGS mapping
  - Maize
  - Zea mays
  - B73
license: MIT
version: 1.0.0
date-released: '2024-02-20'