mosquito_pipeline

Pipeline for analysis of Anopheles mosquitoes

https://github.com/sophiemoss/mosquito_pipeline


Repository

Basic Info
  • Host: GitHub
  • Owner: sophiemoss
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 4.04 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed about 1 year ago

README.md

mosquito_pipeline

Pipeline for whole-genome sequence analysis of Anopheles mosquitoes.

Step 1: Acquire FASTQ reads for the samples and use fastq2matrix to generate BAM and VCF files for each sample.

Step 2: Compute basic statistics on the samples to check their quality.

Step 3: Build a genomics database VCF from the samples you judge to be of sufficient quality to keep.

Step 4: Filter the genomics database VCF to retain only good-quality SNPs.

Step 5: Conduct principal component analysis (PCA) on your filtered VCF.

Step 6: Create the PCA plot using the accompanying R script.

Step 7: Calculate and plot admixture.

Step 8: Create a maximum likelihood tree.

Step 9: Calculate FST (using Python and a Jupyter notebook).
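As an illustration of what an FST calculation involves, the following is a minimal sketch of Hudson's estimator, computed as a ratio of averages over biallelic sites. It is not the code from the repository's notebook, and the README does not say which estimator the pipeline uses; the function name `hudson_fst` and the input layout (one `(alt_allele_count, total_alleles)` pair per site and population) are assumptions made for this example.

```python
def hudson_fst(counts_pop1, counts_pop2):
    """Hudson's FST as a ratio of averages over biallelic sites.

    Each argument is a list with one (alt_allele_count, total_alleles)
    tuple per site; both lists must describe the same sites in order.
    """
    numerator = 0.0
    denominator = 0.0
    for (a1, n1), (a2, n2) in zip(counts_pop1, counts_pop2):
        if n1 < 2 or n2 < 2:
            continue  # the sample-size correction needs >= 2 alleles
        p1, p2 = a1 / n1, a2 / n2
        # Squared frequency difference, corrected for sampling noise.
        numerator += ((p1 - p2) ** 2
                      - p1 * (1 - p1) / (n1 - 1)
                      - p2 * (1 - p2) / (n2 - 1))
        # Probability that two alleles from different populations differ.
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        raise ValueError("no informative sites")
    return numerator / denominator
```

A site fixed for different alleles in the two populations gives an FST of 1, while identical allele frequencies give a value at or slightly below 0 because of the sample-size correction.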

Step 10: Calculate genetic diversity metrics: nucleotide diversity and Tajima's D.
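The two diversity metrics named in this step can be sketched with the standard library alone. The example below is independent of the repository's scripts: it computes the mean number of pairwise differences from per-site alternate allele counts, and Tajima's D from the constants defined in Tajima (1989). The function names and the input layout are assumptions for illustration.

```python
import math


def mean_pairwise_differences(alt_counts, n):
    """Mean number of pairwise differences among n sampled chromosomes.

    alt_counts holds the alternate allele count at each biallelic site.
    Divide the result by the number of accessible bases to obtain
    per-site nucleotide diversity.
    """
    return sum(2 * k * (n - k) / (n * (n - 1)) for k in alt_counts)


def tajimas_d(pi, num_segregating, n):
    """Tajima's D from pi (mean pairwise differences), S and sample size n."""
    if n < 4:
        raise ValueError("the variance term is zero for n < 4")
    if num_segregating == 0:
        return float("nan")  # D is undefined without segregating sites
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    s = num_segregating
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

Negative values of D indicate an excess of rare variants relative to the neutral expectation, and positive values indicate an excess of intermediate-frequency variants.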

Step 11: Selection: calculate Garud's H12 statistic (using Python and a Jupyter notebook).
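For readers unfamiliar with H12, the statistic combines the two most frequent haplotypes in a window into a single class before summing squared haplotype frequencies. The sketch below shows that calculation for one window; it is not the repository's notebook code, and the function name `garud_h12` and the representation of haplotypes as hashable values (here, strings) are assumptions for this example.

```python
from collections import Counter


def garud_h12(haplotypes):
    """Garud's H12 for the haplotypes observed in one genomic window.

    haplotypes is a sequence of hashable haplotypes, one per sampled
    chromosome, for example strings of alleles.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    # Haplotype frequencies, most common first.
    freqs = sorted((count / n for count in Counter(haplotypes).values()),
                   reverse=True)
    # Pool the two most common haplotypes, then add the squared
    # frequencies of all remaining haplotypes.
    return sum(freqs[:2]) ** 2 + sum(f * f for f in freqs[2:])
```

For example, haplotype frequencies of 0.5, 0.3 and 0.2 give (0.5 + 0.3)^2 + 0.2^2 = 0.68. In a genome scan the function would be applied to successive windows of a fixed number of SNPs.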

Step 12: Selection: calculate iHS (using Python and a Jupyter notebook).

Step 13: Selection: calculate XP-EHH (using Python and a Jupyter notebook).

Step 14: Use DELLY to analyse structural variants.

Other scripts:

calculate_n50.py can be used to calculate the N50 of a set of sequences, for example the contigs of a reference genome.
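The N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length. The following is a minimal sketch of that calculation, not the contents of calculate_n50.py; the helper names `n50` and `fasta_lengths` are assumptions for this example.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths supplied")


def fasta_lengths(lines):
    """Yield the length of each record from an iterable of FASTA lines."""
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                yield current
            current = 0  # start counting a new record
        elif current is not None:
            current += len(line)
    if current is not None:
        yield current
```

With an open file handle `fh` for a FASTA file, `n50(fasta_lengths(fh))` gives the N50 of the assembly it contains.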

check_sex.py can be used to check the sex of mosquito samples using the ratio of coverage between the X chromosome and an autosome.
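The idea behind a coverage-based sex check is that Anopheles males carry a single X chromosome, so their X coverage is roughly half their autosomal coverage, whereas females show a ratio close to 1. The sketch below illustrates this under stated assumptions: it is not the logic of check_sex.py, the function names are hypothetical, and the cut-off of 0.75 is an arbitrary midpoint chosen for the example rather than a value taken from the repository.

```python
def mean_depth(depths):
    """Mean of a collection of per-position or per-window read depths."""
    depths = list(depths)
    if not depths:
        raise ValueError("no depth values supplied")
    return sum(depths) / len(depths)


def infer_sex(x_depths, autosome_depths, male_max_ratio=0.75):
    """Classify a sample from its X-to-autosome coverage ratio.

    male_max_ratio is a hypothetical threshold: ratios below it are
    called male (one X copy), all others female (two X copies).
    """
    autosome_mean = mean_depth(autosome_depths)
    if autosome_mean == 0:
        raise ValueError("autosomal coverage is zero")
    ratio = mean_depth(x_depths) / autosome_mean
    sex = "male" if ratio < male_max_ratio else "female"
    return sex, ratio
```

Samples whose ratio falls close to the threshold are worth inspecting by hand, since uneven coverage or contamination can blur the two groups.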

chromocoverage.py and createcoverageplotssambambageneric.py are part of basicstatistics and can be used to assess the coverage of sequence data across the genome, including visualising this in plots.

generateadmixbarplot_colours.R is an R script used as part of 7.admixture.sh.

Owner

  • Login: sophiemoss
  • Kind: user

Citation (citation.cff)

cff-version: 1.0.0
message: "If you use this pipeline, please cite it as below."
authors:
- family-names: "MOSS"
  given-names: "S"
  orcid: "https://orcid.org/0000-0003-2843-9085"
title: "mosquito_pipeline"
version: 1.0.0
date-released: 2025-03-14
url: "https://github.com/sophiemoss/mosquito_pipeline"

GitHub Events

Total
  • Push event: 3
  • Public event: 1
  • Pull request event: 2
Last Year
  • Push event: 3
  • Public event: 1
  • Pull request event: 2