Recent Releases of ensemblrepeatdownload

ensemblrepeatdownload - v.2.0.1 - Shadowfax the Planerider (patch 1)

Enhancements & fixes

  • Update module versions
  • Remove reference to Anaconda repositories
  • Remove defaults from lib/Utils.groovy

Software dependencies

Because the pipeline uses Nextflow DSL2, each process runs in its own Biocontainer, so different processes may occasionally use different versions of the same tool. The overall software dependency changes since the last release are listed below for reference. Only Docker and Singularity containers are supported; conda is not.

| Dependency | Old version | New version |
| ---------- | ----------- | ----------- |
| Python     | 3.8.3,3.9.1 | 3.9.1       |
| samtools   | 1.17        | 1.21        |
| tabix      | 1.11        | 1.20        |

- Groovy
Published by tkchafin over 1 year ago

ensemblrepeatdownload - v2.0.0 – Shadowfax the Planerider

This version supports the new structure of the Ensembl FTP.

Enhancements & fixes

  • Support for the updated directory structure of the Ensembl FTP
  • Relative paths in the sample-sheet are now evaluated from the --outdir parameter
  • Memory usage rules for samtools dict
  • Appropriate use of tabix's TBI and CSI indexing, depending on the sequence lengths
  • New command-line parameter (--annotation_method): required for accessing the files on the Ensembl FTP
  • --outdir is a mandatory parameter
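
The choice between TBI and CSI indexing mentioned above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function names are hypothetical, and the threshold assumes the commonly documented TBI limit of 2^29 bases (512 Mbp) per sequence, applied conservatively.

```python
# Assumption: a TBI index cannot address coordinates at or beyond 2**29,
# so any sequence that long needs a CSI index instead.
TBI_MAX_COORD = 2**29  # 536,870,912 bp


def choose_index_format(sequence_lengths):
    """Return "tbi" if every sequence fits in a TBI index, else "csi"."""
    longest = max(sequence_lengths, default=0)
    return "tbi" if longest < TBI_MAX_COORD else "csi"


def tabix_command(bed_gz, sequence_lengths):
    """Build a tabix command line for a bgzip-compressed BED file."""
    cmd = ["tabix", "-p", "bed"]
    if choose_index_format(sequence_lengths) == "csi":
        cmd.append("--csi")  # request a CSI index rather than the default TBI
    cmd.append(bed_gz)
    return cmd
```

The sequence lengths could come from a `.fai` index or a sequence dictionary; how the pipeline obtains them is not stated here.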

Parameters

| Old parameter | New parameter       |
| ------------- | ------------------- |
|               | --annotation_method |

In the samplesheet

| Old parameter | New parameter     |
| ------------- | ----------------- |
| species_dir   | outdir            |
|               | annotation_method |
| assembly_name |                   |

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn't present.
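
As an illustration of how a relative `outdir` value in the samplesheet might be evaluated against the `--outdir` parameter, here is a small sketch. The function name is hypothetical, and the assumption that absolute paths are left untouched is ours, not something the release notes state.

```python
from pathlib import Path


def resolve_sample_outdir(sample_outdir, pipeline_outdir):
    """Resolve a samplesheet outdir value against the pipeline --outdir."""
    path = Path(sample_outdir)
    if path.is_absolute():
        # Assumption: absolute paths are used as given.
        return path
    # Relative paths are evaluated from the --outdir parameter.
    return Path(pipeline_outdir) / path
```

For example, `resolve_sample_outdir("species/asm1", "/results")` would give `/results/species/asm1` on a POSIX system.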

Software dependencies

Because the pipeline uses Nextflow DSL2, each process runs in its own Biocontainer, so different processes may occasionally use different versions of the same tool. The overall software dependency changes since the last release are listed below for reference. Only Docker and Singularity containers are supported; conda is not.

| Dependency | Old version | New version |
| ---------- | ----------- | ----------- |
| multiqc    | 1.13        | 1.14        |

- Groovy
Published by muffato almost 2 years ago

ensemblrepeatdownload - sanger-tol/ensemblrepeatdownload v1.0.0 - Gwaihir the Windlord

Overview

The pipeline takes a CSV file that contains assembly accession numbers, Ensembl species names (which may differ from the Tree of Life ones), and output directories. Assembly accession numbers are optional: if one is missing, the pipeline assumes it can be retrieved from a file named ACCESSION in the standard location on disk. The pipeline downloads the repeat annotation as a masked Fasta file and a BED file. All files are compressed with bgzip and indexed with samtools faidx or tabix.

Steps involved:

  • Download the masked fasta file from Ensembl.
  • Extract the coordinates of the masked regions into a BED file.
  • Compress and index the BED file with bgzip and tabix.
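
The second step above can be sketched in Python. This is only an illustration under stated assumptions, not the pipeline's own script: it assumes the Fasta file is soft-masked (masked bases in lowercase), and all function names are hypothetical. Coordinates follow the BED convention (0-based, half-open).

```python
import re


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def masked_intervals(sequence):
    """Yield (start, end) for each run of lowercase (soft-masked) bases."""
    for match in re.finditer(r"[a-z]+", sequence):
        yield match.start(), match.end()


def masked_bed_lines(fasta_lines):
    """Yield tab-separated BED lines covering the soft-masked regions."""
    for name, sequence in read_fasta(fasta_lines):
        for start, end in masked_intervals(sequence):
            yield f"{name}\t{start}\t{end}"
```

The resulting lines would then be written to a file, compressed with bgzip, and indexed with tabix, as in the third step.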

Dependencies

All dependencies are automatically fetched by Singularity.

  • bgzip
  • samtools
  • tabix
  • python3
  • wget
  • awk
  • gzip

- Groovy
Published by muffato over 3 years ago