automated-window-sliding
Bioinformatics pipeline using a sliding window approach to create a sequence of trees from multiple sequence alignments.
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Introduction
Automated-Window-Sliding is a bioinformatics pipeline that serves as a starting point for sliding-window-based phylogenetic analysis. The input alignment is split into several subalignments, either with a sliding window approach or with custom alignment ranges provided in a CSV file. A tree is reconstructed for each subalignment window, and all trees are then collected in a single Nexus/Newick file. This file can be used to study effects that change the phylogenetic signal along the alignment, e.g. recombination, reassortment and selection.
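The sliding window split described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `sliding_windows` and `split_alignment`, the 0-based half-open ranges, and the choice to truncate the last window at the alignment end are hypothetical and may differ from what the pipeline's own Python script does.

```python
def sliding_windows(alignment_length, window_size, step_size):
    """Return 0-based, half-open (start, end) column ranges.

    Assumption: windows start at column 0 and the last window is
    truncated at the end of the alignment.
    """
    if window_size <= 0 or step_size <= 0:
        raise ValueError("window_size and step_size must be positive")
    windows = []
    start = 0
    while start < alignment_length:
        end = min(start + window_size, alignment_length)
        windows.append((start, end))
        if end == alignment_length:
            break  # the alignment end is reached, no further windows
        start += step_size
    return windows


def split_alignment(sequences, window_size, step_size):
    """Slice an alignment (dict: name -> aligned sequence) into subalignments."""
    length = len(next(iter(sequences.values())))
    return [
        {name: seq[start:end] for name, seq in sequences.items()}
        for start, end in sliding_windows(length, window_size, step_size)
    ]


if __name__ == "__main__":
    toy = {"seqA": "ACGTACGTAC", "seqB": "ACGTTCGTAA"}
    for sub in split_alignment(toy, window_size=4, step_size=3):
        print(sub)
```

A step size smaller than the window size yields overlapping windows, which is the usual choice when the goal is to follow a gradually changing phylogenetic signal.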

- Split alignment into subalignments (Python Script)
- Find best-fit evolutionary model for whole alignment or subalignments (IQ-TREE ModelFinder)
- Run tree inference on each subalignment
- Collect reconstructed trees in a single file (Python Script)
  - Nexus
  - Newick
  - Both
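The final collection step can be illustrated with a minimal sketch that writes per-window Newick strings into either a Newick or a Nexus file body. It works on plain strings and does not use DendroPy. The function names, the window labels such as `window_1_100`, and the exact layout of the Nexus block are assumptions for illustration, not the pipeline's actual output format.

```python
def to_newick_text(trees):
    """Join Newick strings (dict: label -> Newick), one tree per line."""
    return "\n".join(newick.strip() for newick in trees.values()) + "\n"


def to_nexus_text(trees):
    """Wrap Newick strings in a minimal Nexus TREES block.

    Assumption: labels are valid Nexus tree names (no spaces or
    punctuation that would require quoting).
    """
    lines = ["#NEXUS", "BEGIN TREES;"]
    for label, newick in trees.items():
        lines.append(f"    TREE {label} = {newick.strip()}")
    lines.append("END;")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical window labels and toy trees.
    window_trees = {
        "window_1_100": "((A,B),C);",
        "window_51_150": "((A,C),B);",
    }
    print(to_newick_text(window_trees))
    print(to_nexus_text(window_trees))
```

Keeping the window label as the tree name in the Nexus block makes it possible to map each tree back to its alignment range later on.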
Installation
This pipeline runs on the Nextflow Workflow System. For the installation of Nextflow, please refer to this page.
To run the pipeline the following programs are required:
- Python (tested with 3.11)
- Python packages: DendroPy, Biopython
- IQ-TREE (RAxML-ng optional: if used for tree reconstruction)
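Before running with locally installed programs, it can help to check whether the requirements listed above are visible from the current environment. The following is a minimal sketch, not part of the pipeline. The executable names `nextflow` and `iqtree2` are assumptions (the IQ-TREE binary may be named differently on your system), while `dendropy` and `Bio` are the import names of DendroPy and Biopython.

```python
import importlib.util
import shutil


def check_dependencies(
    executables=("nextflow", "iqtree2"),  # assumed executable names
    modules=("dendropy", "Bio"),          # import names of the Python packages
):
    """Return a dict mapping each dependency name to True if it was found."""
    report = {}
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        report[exe] = shutil.which(exe) is not None
    for mod in modules:
        # find_spec returns None when a top-level module is not installed.
        report[mod] = importlib.util.find_spec(mod) is not None
    return report


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```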
If you do not want to install these dependencies manually, you can run the pipeline with Docker, Singularity, Podman, Apptainer, Conda or Mamba by using -profile <docker/singularity/podman/apptainer/conda/mamba>. To use your locally installed programs, omit this parameter.
The pipeline can be downloaded via the nextflow pull command
bash
nextflow pull ggruber193/automated-window-sliding
which automatically pulls the latest version of the pipeline into the folder $HOME/.nextflow/assets on your computer. This command is also used to update the pipeline to the latest version.
Alternatively the nextflow run command can be used to pull the pipeline and then run it immediately
bash
nextflow run ggruber193/automated-window-sliding <additional options>
You can also clone the repository and run the pipeline from the local copy. In this case, use the nextflow run command and provide the path to the main.nf file.
bash
git clone https://github.com/ggruber193/automated-window-sliding.git
bash
nextflow run <path/to/cloned/repository>/main.nf <additional options>
Usage
To check if everything works correctly the pipeline can be run on a minimal test case by using -profile test:
bash
nextflow run ggruber193/automated-window-sliding -profile test --outdir <OUTDIR>
You can use multiple profile options in a single run, for example -profile test,docker to run the minimal test case with docker.
To run the pipeline on your own data, provide a multiple sequence alignment with --input. Accepted formats are FASTA, PHYLIP, NEXUS, MSF and CLUSTAL:
bash
nextflow run ggruber193/automated-window-sliding --input <ALIGNMENT> --outdir <OUTDIR>
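Since the input has to be an alignment, a quick sanity check before starting a run is to confirm that all sequences have the same length. The sketch below does this for FASTA input only, using plain Python. The function names are hypothetical, and this check is not something the pipeline is documented to perform itself.

```python
def read_fasta(text):
    """Parse FASTA text into a dict: sequence name -> sequence string."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first word of the header as the name; fall back to a
            # generated name when the header is empty.
            name = header[0] if header else f"seq{len(records) + 1}"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def is_aligned(records):
    """True if there is at least one sequence and all have equal length."""
    lengths = {len(seq) for seq in records.values()}
    return len(records) > 0 and len(lengths) == 1


if __name__ == "__main__":
    example = ">seqA\nACGT-ACG\n>seqB\nACGTTACG\n"
    print(is_aligned(read_fasta(example)))
```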
To view available pipeline parameters use:
bash
nextflow run ggruber193/automated-window-sliding --help
For more information about the usage and output of the pipeline refer to the full Documentation of this project.
In addition to the pipeline-specific parameters, Nextflow provides several parameters of its own. These are invoked with a single dash, e.g. -resume to resume a previously failed pipeline run or -qs <int> to limit the number of parallel processes. For a full overview of Nextflow CLI parameters, please refer to this page or use nextflow run -h.
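The single-dash versus double-dash distinction is easy to get wrong when a run is launched from a script. As a minimal sketch, the helper below assembles a command line that keeps the two kinds of parameters apart. The function `build_command` and the file names are hypothetical; the flags shown (-resume, -qs, --input, --outdir) are the ones mentioned in this README.

```python
import shlex


def build_command(pipeline, nextflow_options=None, pipeline_params=None):
    """Build a `nextflow run` argument list.

    nextflow_options: dict of Nextflow options (single dash); a value of
        None marks a flag without an argument, e.g. {"resume": None}.
    pipeline_params: dict of pipeline parameters (double dash).
    """
    cmd = ["nextflow", "run", pipeline]
    for key, value in (nextflow_options or {}).items():
        cmd.append(f"-{key}")
        if value is not None:
            cmd.append(str(value))
    for key, value in (pipeline_params or {}).items():
        cmd.extend([f"--{key}", str(value)])
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "ggruber193/automated-window-sliding",
        nextflow_options={"resume": None, "qs": 4},
        pipeline_params={"input": "alignment.fasta", "outdir": "results"},
    )
    # Print the command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```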
Citations
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
Owner
- Login: ggruber193
- Kind: user
- Repositories: 1
- Profile: https://github.com/ggruber193
Citation (CITATIONS.md)
# ggruber193/automated-window-sliding: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

- [IQ-TREE](https://doi.org/10.1093/molbev/msaa015)

  > Bui Quang Minh, Heiko A Schmidt, Olga Chernomor, Dominik Schrempf, Michael D Woodhams, Arndt von Haeseler, Robert Lanfear, IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era, Molecular Biology and Evolution, Volume 37, Issue 5, May 2020, Pages 1530–1534, https://doi.org/10.1093/molbev/msaa015

- [IQ-TREE ModelFinder](https://doi.org/10.1038/nmeth.4285)

  > Kalyaanamoorthy, S., Minh, B., Wong, T. et al. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14, 587–589 (2017). https://doi.org/10.1038/nmeth.4285

- [RAxML-ng](https://doi.org/10.1093/bioinformatics/btz305)

  > Alexey M Kozlov, Diego Darriba, Tomáš Flouri, Benoit Morel, Alexandros Stamatakis, RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference, Bioinformatics, Volume 35, Issue 21, November 2019, Pages 4453–4455, https://doi.org/10.1093/bioinformatics/btz305

- [Biopython](https://doi.org/10.1093/bioinformatics/btp163)

  > Peter J. A. Cock, Tiago Antao, Jeffrey T. Chang, Brad A. Chapman, Cymon J. Cox, Andrew Dalke, Iddo Friedberg, Thomas Hamelryck, Frank Kauff, Bartek Wilczynski, Michiel J. L. de Hoon, Biopython: freely available Python tools for computational molecular biology and bioinformatics, Bioinformatics, Volume 25, Issue 11, June 2009, Pages 1422–1423, https://doi.org/10.1093/bioinformatics/btp163

- [DendroPy](https://doi.org/10.1093/bioinformatics/btq228)

  > Sukumaran J, Holder MT. DendroPy: a Python library for phylogenetic computing. Bioinformatics. 2010 Jun 15;26(12):1569-71. doi: 10.1093/bioinformatics/btq228. Epub 2010 Apr 25. PMID: 20421198.

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)

  > Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Feb. 2024. Web.

- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

  > Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

  > da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

  > Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

  > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
GitHub Events
Total
- Issues event: 1
- Watch event: 4
- Fork event: 2
Last Year
- Issues event: 1
- Watch event: 4
- Fork event: 2
Dependencies
- debian bookworm-slim build