abundance_estimation

Mirrored repository of abundance_estimation

https://github.com/sanger-pathogens/abundance_estimation

Science Score: 52.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
    Organization sanger-pathogens has institutional domain (www.sanger.ac.uk)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.1%) to scientific vocabulary
Last synced: 6 months ago

Repository

Mirrored repository of abundance_estimation

Basic Info
  • Host: GitHub
  • Owner: sanger-pathogens
  • License: mit
  • Language: Nextflow
  • Default Branch: main
  • Size: 148 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 9 months ago · Last pushed 8 months ago
Metadata Files
Readme License Citation

README.aws.md

Amazon AWS cloud

You can run this pipeline on AWS cloud.

You need to set up AWS Batch first, as described here.

Then, copy nextflow.aws.config.template to nextflow.aws.config and modify the aws values to match your setup (if you do not want to put access keys into your config file, you can also use environment variables).
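The copy step and the environment-variable alternative might look like the sketch below. `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard AWS credential variable names; the values shown are placeholders, and the guard around `cp` is an addition so the snippet does nothing harmful when run outside the repository checkout.

```shell
# Create the AWS config from the template, unless it already exists.
# The guard is an assumption added for safety; a plain `cp` works too.
if [ -f nextflow.aws.config.template ] && [ ! -f nextflow.aws.config ]; then
    cp nextflow.aws.config.template nextflow.aws.config
fi

# Alternative to storing access keys in the config file: export the
# standard AWS credential variables before launching the pipeline.
# Values already set in the environment are kept; the fallbacks below
# are placeholders, not real keys.
export AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID:-REPLACE_WITH_ACCESS_KEY_ID}"
export AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY:-REPLACE_WITH_SECRET_ACCESS_KEY}"
```

Keeping the keys in the environment means the edited config file can be shared or committed without exposing credentials.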

Finally, upload your data to AWS. You can use S3 (cheaper) or EFS for stb_file, genome_dir, outdir, and workdir; for sourmash_db and bmtagger_db you need EFS. manifest can stay on your local disk.

Your input files (e.g. .bam) need to be copied into the cloud as well, ideally to S3. You will then have to update the paths in your manifest files accordingly; see the example.
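Updating the manifest usually comes down to swapping a local path prefix for the S3 location of the uploaded files. A minimal sketch, assuming every input path in the manifest starts with the same local directory: the directory `/data/bams/`, the bucket `my-bucket`, and the manifest columns in the demo are all hypothetical and not taken from the pipeline's own example manifest.

```shell
# Hypothetical locations: adjust both to your own setup.
LOCAL_PREFIX="/data/bams/"
S3_PREFIX="s3://my-bucket/bams/"

# Replace every occurrence of the local prefix with the S3 prefix.
# Reads the file given as argument, or standard input if none is given.
# '|' is used as the sed delimiter so the slashes need no escaping;
# note that sed treats LOCAL_PREFIX as a regular expression.
to_s3_manifest() {
    sed "s|${LOCAL_PREFIX}|${S3_PREFIX}|g" "$@"
}

# Demo on a throwaway manifest (column names are made up for illustration).
demo_manifest="$(mktemp)"
printf 'sample_id,bam\nsample1,/data/bams/sample1.bam\n' > "$demo_manifest"
to_s3_manifest "$demo_manifest"
rm -f "$demo_manifest"
```

Because the function only rewrites the prefix, it does not depend on the manifest's column layout; write the result to a new file and keep the original manifest for local runs.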

You can now run your pipeline from your local machine, like so:

nextflow -C nextflow.aws.config run main.nf --manifest manifest.csv

Owner

  • Name: Pathogen Informatics, Wellcome Sanger Institute
  • Login: sanger-pathogens
  • Kind: organization
  • Location: Hinxton, Cambs., UK

Citation (CITATIONS.md)

# abundance_estimation: Citations

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [Bowtie2](https://doi.org/10.1038/nmeth.1923)

  > Langmead B, Salzberg S. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012, 9:357-359.

- [inStrain](https://doi.org/10.1038/s41587-020-00797-0)

  > Olm, M.R., Crits-Christoph, A., Bouma-Gregson, K. et al. inStrain profiles population microdiversity from metagenomic data and sensitively detects shared microbial strains. Nat Biotechnol 39, 727–736 (2021).

## Software packaging/containerisation tools

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.

GitHub Events

Total
  • Delete event: 1
  • Push event: 16
  • Create event: 10
Last Year
  • Delete event: 1
  • Push event: 16
  • Create event: 10