Recent Releases of metagenome-atlas

metagenome-atlas - GTDB V9 R220, Spades 4

  • feat: GTDB v9 Refseq 220 by @SilasK in https://github.com/metagenome-atlas/atlas/pull/730
  • Spades v4

- Python
Published by SilasK almost 2 years ago

metagenome-atlas - v2.18.2

2.18.2 (2024-06-28)

Bug Fixes

- Python
Published by github-actions[bot] almost 2 years ago

metagenome-atlas - All in the sample table

What's Changed

  • QC reads and assemblies are now written to the sample.tsv from the start. This should fix errors caused by partial writing to the sample.tsv https://github.com/metagenome-atlas/atlas/issues/695
  • It also allows you to add external assemblies.
  • Singleton reads are no longer used throughout the pipeline.
  • This changes the default paths for raw reads and assemblies. Assemblies are now in Assembly/fasta/{sample}.fasta and reads in QC/reads/{sample}_{fraction}.fastq.gz

Seamless update: If you update atlas and continue an old project, your old files will be copied, or the paths defined in the sample.tsv will be used.
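As a rough illustration of the new layout, the two path templates above can be filled in like this. This is only a sketch: the helper name `default_paths` and the fraction names `R1`/`R2` are assumptions made for the example, not taken from atlas.

```python
def default_paths(sample, fractions=("R1", "R2")):
    """Fill in the default path templates from the release note for one sample.

    The fraction names are an assumption for illustration (paired-end reads).
    """
    assembly = f"Assembly/fasta/{sample}.fasta"
    reads = [f"QC/reads/{sample}_{fraction}.fastq.gz" for fraction in fractions]
    return assembly, reads


assembly, reads = default_paths("S1")
print(assembly)
for path in reads:
    print(path)
```

If your sample.tsv already lists other locations, those take precedence, as described above.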

- Python
Published by SilasK over 2 years ago

metagenome-atlas - Co-binning

Co-binning with sub-groups

https://github.com/metagenome-atlas/atlas/pull/683

In this new version, Atlas uses binning with co-abundance by default. Binning each sample individually is faster, but co-abundance binning quantifies the coverage of contigs across multiple samples, which provides valuable information about contig co-variation.

See also my blog post

Starting with version 2.18, atlas places every sample in a single BinGroup and defaults to vamb as the binner unless there are very few samples. For fewer than 8 samples, metabat is the default binner.

The defaults are fine unless you have many samples (>150), in which case atlas warns that you should put your samples in more than one bin group.
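The defaults described above can be summarized in a few lines. This is a sketch of the stated rules only, not the code atlas uses; the function and constant names are hypothetical.

```python
MIN_SAMPLES_FOR_VAMB = 8      # release note: fewer than 8 samples -> metabat
MAX_SAMPLES_PER_GROUP = 150   # release note: above this, atlas gives a warning


def default_binner(n_samples):
    """Return the default binner name for a project with n_samples samples."""
    return "metabat" if n_samples < MIN_SAMPLES_FOR_VAMB else "vamb"


def needs_more_bin_groups(n_samples):
    """True if a single BinGroup would exceed the size that triggers the warning."""
    return n_samples > MAX_SAMPLES_PER_GROUP


for n in (5, 40, 200):
    print(n, default_binner(n), needs_more_bin_groups(n))
```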

Note

Previously each sample was put in its own BinGroup optimized for single-sample binning. Running vamb in those versions would consider all samples, regardless of their BinGroup. Hence updating to v2.18 might cause errors if using a sample.tsv file from an older Atlas version. You can resolve this by assigning a unique BinGroup to each sample.
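One way to apply that fix is to rewrite the BinGroup column so each sample forms its own group. The sketch below uses only the Python standard library and assumes, hypothetically, that the first column of the sample.tsv holds the sample name and that a `BinGroup` column exists. Check your own file's layout and keep a backup before overwriting anything.

```python
import csv
import io


def assign_unique_bin_groups(tsv_text, group_column="BinGroup"):
    """Return the table with each row's BinGroup set to its first-column value."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    header, records = rows[0], rows[1:]
    col = header.index(group_column)
    for record in records:
        record[col] = record[0]  # assumption: first column is the sample name
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows([header] + records)
    return out.getvalue()


example = "Sample\tBinGroup\nS1\tAll\nS2\tAll\n"
print(assign_unique_bin_groups(example))
```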

Link to documentation

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.17.2...v2.18.0

- Python
Published by SilasK almost 3 years ago

metagenome-atlas - v2.17.2

Fixes

  • Ignore certificate for gtdb_v08 by @mladen5000 in https://github.com/metagenome-atlas/atlas/pull/674
  • Fixed pandas dependency for instrain by @mladen5000 in https://github.com/metagenome-atlas/atlas/pull/678
  • Convert mem_mb value from gb to mb for select rules by @LLansing in https://github.com/metagenome-atlas/atlas/pull/681
  • ci: use micromamba by @SilasK in https://github.com/metagenome-atlas/atlas/pull/682

- Python
Published by SilasK almost 3 years ago

metagenome-atlas - Use skani for genome clustering

Skani

The tool Skani claims to be better and faster than the combination of mash + FastANI as used by dRep. I implemented skani for species clustering. We now do the species clustering in the atlas run binning step, so you get information about the number of dereplicated species in the binning report. This allows you to run different binners before choosing the one to use for the genome annotation. Also, the file storage was improved: all important files are in Binning/{binner}/

My custom species clustering does the following steps:

  1. Pre-cluster genomes with single-linkage at 92.5% ANI.
  2. Re-calibrate checkm2 results.
    • If a minority of genomes in a pre-cluster use a different translation table, they are removed.
    • If some genomes of a pre-cluster don't use the specialized completeness model, we re-calibrate completeness to the minimum value. This ensures that a poor genome evaluated on the general model is not preferred over a better genome evaluated on the specific model. See also https://silask.github.io/post/better_genomes/ Section 2.
  3. Drop genomes that don't meet the filter criteria after re-calibration.
  4. Cluster genomes at an ANI threshold (default 95%).
  5. Select the best genome as representative based on the quality score Completeness - 5x Contamination.
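To make steps 1 and 5 concrete, here is a minimal sketch of single-linkage grouping over pairwise ANI values, followed by picking a representative by quality score. It is not the implementation used in atlas: the function names and the ANI and quality numbers are invented for illustration, and the checkm2 re-calibration and filtering (steps 2 and 3) are left out.

```python
from collections import defaultdict


def single_linkage(genomes, ani_pairs, threshold):
    """Group genomes linked by any pair with ANI >= threshold (union-find)."""
    parent = {g: g for g in genomes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving keeps trees shallow
            g = parent[g]
        return g

    for (a, b), ani in ani_pairs.items():
        if ani >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for g in genomes:
        clusters[find(g)].append(g)
    return sorted(sorted(members) for members in clusters.values())


def quality_score(completeness, contamination):
    """Quality score from the release note: Completeness - 5 x Contamination."""
    return completeness - 5 * contamination


def pick_representative(cluster, quality):
    """Return the genome in the cluster with the highest quality score."""
    return max(cluster, key=lambda g: quality_score(*quality[g]))


# Made-up example values: pairwise ANI in percent, (completeness, contamination).
genomes = ["A", "B", "C", "D"]
ani = {("A", "B"): 96.0, ("B", "C"): 93.0, ("A", "C"): 91.0, ("C", "D"): 80.0}
quality = {"A": (90, 5), "B": (85, 1), "C": (99, 8), "D": (70, 2)}

pre_clusters = single_linkage(genomes, ani, threshold=92.5)
species = single_linkage(genomes, ani, threshold=95.0)
print(pre_clusters, species)
print([pick_representative(c, quality) for c in species])
```

Single linkage joins A and C through B even though their direct ANI is below 92.5%, which is why the pre-clustering is deliberately permissive before the stricter 95% step.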

New Contributors

  • @jotech made their first contribution in https://github.com/metagenome-atlas/atlas/pull/667

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.16.3...v2.17.0

- Python
Published by SilasK almost 3 years ago

metagenome-atlas - GTDB v8

Save GTDB v8 in the download folder for GTDB v8. Thanks to @strejcem

- Python
Published by SilasK about 3 years ago

metagenome-atlas - V2.16

What's Changed

  • fix gene_info.parquet by @SilasK in https://github.com/metagenome-atlas/atlas/pull/642
  • docs: update gene catalog by @SilasK in https://github.com/metagenome-atlas/atlas/pull/643
  • add minimum mapping quality in pileup by @johnne in https://github.com/metagenome-atlas/atlas/pull/647
  • gtdb v8 by @SilasK in https://github.com/metagenome-atlas/atlas/pull/648

New Contributors

  • @johnne made their first contribution in https://github.com/metagenome-atlas/atlas/pull/647

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.15.2...v2.16.1

- Python
Published by SilasK about 3 years ago

metagenome-atlas - v2.15.2

What's Changed

  • Annotate gene catalog with Kegg, CAZy using DRAM
  • You can turn off GUNC

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.15.1...v2.15.2

- Python
Published by SilasK about 3 years ago

metagenome-atlas - GUNC'n'More

What's Changed

  • Use Gunc
  • New Folder organisation: Main output files for Binning are in the new folder Binning
  • Use hdf-format for gene catalogs. This allows efficient storage of, and selective access to, large count and coverage matrices from the gene catalog. (See docs for how to load them) https://github.com/metagenome-atlas/atlas/pull/621
  • Semibin v. 1.5 by @SilasK in https://github.com/metagenome-atlas/atlas/pull/622

- Python
Published by SilasK about 3 years ago

metagenome-atlas - Use checkM2

What's Changed

  • Support for checkm2 by @SilasK in https://github.com/metagenome-atlas/atlas/pull/607

Thank you @trickovicmatija for your help.

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.13.1...v2.14.0

- Python
Published by SilasK over 3 years ago

metagenome-atlas - V2.13

What's Changed

  • use minimap for contigs, genecatalog and genomes in https://github.com/metagenome-atlas/atlas/pull/569 https://github.com/metagenome-atlas/atlas/pull/577
  • filter genomes myself in https://github.com/metagenome-atlas/atlas/pull/568
    The filter function is defined in the config file:
    genome_filter_criteria: "(Completeness-5*Contamination >50 ) & (Length_scaffolds >=50000) & (Ambigious_bases <1e6) & (N50 > 5*1e3) & (N_scaffolds < 1e3)"
    The genome filtering is similar to that of other publications in the field, e.g. GTDB. What is maybe a bit different is that genomes with completeness around 50% and contamination around 10% are excluded, whereas dRep with default parameters would include them.

  • use Drep again in https://github.com/metagenome-atlas/atlas/pull/579 We saw better performance using drep. This now also scales to ~1K samples

  • Use new Dram version 1.4 in https://github.com/metagenome-atlas/atlas/pull/564
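To see what the genome_filter_criteria expression from the list above does to an individual genome, the sketch below evaluates the same string against a dictionary of per-genome statistics. How atlas evaluates the expression internally is not shown in these notes; the helper `passes_filter` and the example values are hypothetical, and `eval` is used here only because the string is a trusted config value.

```python
GENOME_FILTER_CRITERIA = (
    "(Completeness-5*Contamination >50 ) & (Length_scaffolds >=50000) & "
    "(Ambigious_bases <1e6) & (N50 > 5*1e3) & (N_scaffolds < 1e3)"
)


def passes_filter(genome, criteria=GENOME_FILTER_CRITERIA):
    """Evaluate the criteria string with the genome's statistics as variables."""
    # Builtins are disabled; never eval an expression from an untrusted source.
    return bool(eval(criteria, {"__builtins__": {}}, dict(genome)))


# Made-up statistics for two genomes.
good = {"Completeness": 95, "Contamination": 2, "Length_scaffolds": 3_000_000,
        "Ambigious_bases": 0, "N50": 20_000, "N_scaffolds": 150}
borderline = dict(good, Completeness=50, Contamination=10)

print(passes_filter(good))
print(passes_filter(borderline))
```

The second genome illustrates the point made above: 50% completeness with 10% contamination gives a score of 0, far below the cutoff of 50.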

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.12.0...v2.13.0

- Python
Published by SilasK over 3 years ago

metagenome-atlas - v2.12.0

What's Changed

  • GTDB-tk requires rule extract_gtdb to run first by @Waschina in https://github.com/metagenome-atlas/atlas/pull/551
  • use Galah instead of Drep
  • use bbsplit for mapping to genomes (maybe move to minimap in future)
  • faster gene catalogs quantification using minimap.
  • Compatible with snakemake v7.15

New Contributors

  • @Waschina made their first contribution in https://github.com/metagenome-atlas/atlas/pull/551

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.11.1...v2.12.0

- Python
Published by SilasK over 3 years ago

metagenome-atlas - Fix Enormous gene catalog

Due to a bug in v2.11, the gene catalog was created based on all genes, not only the representatives.

If you have an oversized gene catalog, rerun:

atlas run genecatalog -R generate_orf_info

Small change in Dram environment to fix #547

- Python
Published by SilasK over 3 years ago

metagenome-atlas - Use parquet and pyfastx to handle large gene catalogs

What's Changed

  • Make atlas handle large gene catalogs using parquet and pyfastx (Fix #515)

parquet files can be opened in python with:

```
import pandas as pd

# Load the median coverage table and index it by gene number
coverage = pd.read_parquet("working_dir/Genecatalog/counts/median_coverage.parquet")
coverage.set_index("GeneNr", inplace=True)
```

and in R it should be something like:

```
arrow::read_parquet("working_dir/Genecatalog/counts/median_coverage.parquet")
```

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.10.0...v2.11.0

- Python
Published by SilasK almost 4 years ago

metagenome-atlas - GTDB v 207 low memory profiling

New Features

  • GTDB version 207
  • Low memory taxonomic annotation

Minor changes

  • Fix Typos by @roshni-b in https://github.com/metagenome-atlas/atlas/pull/520
  • Speed up DRAM annotations by @jmtsuji in https://github.com/metagenome-atlas/atlas/pull/534

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.9.1...v2.10.0

- Python
Published by SilasK almost 4 years ago

metagenome-atlas - Go Public

What's Changed

  • ✨ Start an atlas project from public data in SRA Docs
  • Make atlas ready for python 3.10 https://github.com/metagenome-atlas/atlas/pull/498
  • Add strain profiling using inStrain. You can run it with atlas run genomes strains

New Contributors

  • @alienzj made their first contribution to fix config when run DRAM annotate in https://github.com/metagenome-atlas/atlas/pull/495

Full Changelog: https://github.com/metagenome-atlas/atlas/compare/v2.8.2...v2.9.0

- Python
Published by SilasK about 4 years ago

metagenome-atlas - V2.8 - Toiminnot

This is a major update of metagenome-atlas. It was developed for the 3-day course in Finland, which is also why it has a Finnish release name.

What is new?

New binners

It integrates the bleeding-edge binners Vamb and SemiBin, which use co-binning based on co-abundance. Thank you @yanhui09 and @psj1997 for helping with this. First results suggest that these binners perform better than the default.

See more

Pathway annotations

The command atlas run genomes produces genome-level functional annotation and Kegg pathways and modules. It uses DRAM from @shafferm with a hack to produce all available Kegg modules.

See more

Genecatalog

The command atlas run genecatalog now directly produces the abundance of the different genes. See more in #276

In the future, this part of the pipeline will include protein assembly to better tackle complicated metagenomes.

Minor updates

Reports are back

See for example the QC report

Update of all underlying tools

All tools used in atlas are now up to date, from the assembler to GTDB. The one exception is BBmap, which contains a bug and ignores the minidentity parameter.

Atlas init

Atlas init correctly parses fastq files even if they are in subfolders and if paired-ends are named simply Sample1/Sample2. @Sofie8 will be happy about this. Atlas log uses nice colors.

Default clustering of Subspecies

The default ANI threshold for genome-dereplication was set to 97.5% to include more sub-species diversity.

See more

- Python
Published by SilasK over 4 years ago

metagenome-atlas - Python 3.8, new ruamel

Fix #437 #423

- Python
Published by SilasK over 4 years ago

metagenome-atlas - Bug fixes Drep

- Python
Published by SilasK almost 5 years ago

metagenome-atlas - new eggNOG

V 2.6a2 should solve the eggNOG version problem, as we now use eggNOG mapper >2.1.2

Also, Singularity support is coming #359

Many of you have noticed unbearably long download times when installing packages, e.g. #390. With the update of snakemake to >6.1, mamba is used instead of conda. This makes installing packages faster and more robust.

Even though this implied big changes in the back end, the API and the dependency graph don't change. This means you can continue your previous project with the updated version of atlas; at least this should be possible.

I nevertheless marked this version as alpha because I had a breaking change in the HTML reports. I don't think fixing this is a priority right now, as almost all the stats can be looked at as tables. Once I have fixed this, you will be able to rerun the reports without impact on the other results.

- Python
Published by SilasK about 5 years ago

metagenome-atlas - Mambo jambo

My PhD thesis is written, so I now have time to work a bit on atlas.

Many of you have noticed unbearably long download times when installing packages, e.g. #390. With the update of snakemake to >6.1, mamba is used instead of conda. This makes installing packages faster and more robust.

Even though this implied big changes in the back end, the API and the dependency graph don't change. This means you can continue your previous project with the updated version of atlas; at least this should be possible.

I nevertheless marked this version as alpha because I had a breaking change in the HTML reports. I don't think fixing this is a priority right now, as almost all the stats can be looked at as tables. Once I have fixed this, you will be able to rerun the reports without impact on the other results.

- Python
Published by SilasK about 5 years ago

metagenome-atlas - Update to GTDB release 6

The new GTDB data will be stored in databasefolder/GTDBV6. You can delete the folder for V5 if you have used atlas before.

- Python
Published by SilasK about 5 years ago

metagenome-atlas - Fix eggNOGmapper

- Python
Published by SilasK over 5 years ago

metagenome-atlas - Allow parameters in init qc

- Python
Published by SilasK over 5 years ago

metagenome-atlas -

corrects code for normal execution of eggNOG mapper

as in 2.4.2

Makes eggNOG annotation faster if virtual memory can be used:

If you have a virtual disk, e.g. /dev/shm, you can let atlas copy the eggNOG db to this virtual drive, which accelerates the eggNOG annotation considerably.

More info in the eggNOG docs

You need to set: eggNOG_use_virtual_disk: true

- Python
Published by SilasK over 5 years ago

metagenome-atlas - Faster annotation with eggNOG mapper

Makes eggNOG annotation faster if virtual memory can be used:

If you have a virtual disk, e.g. /dev/shm, you can let atlas copy the eggNOG db to this virtual drive, which accelerates the eggNOG annotation considerably.

More info in the eggNOG docs

You need to set: eggNOG_use_virtual_disk: true
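Before enabling the option, it may be worth checking that the virtual disk exists and has enough free space for the database. The helper below is a sketch for that check only: it is not part of atlas, and the required size is left as a parameter because the database size depends on your eggNOG version.

```python
import os
import shutil


def virtual_disk_has_room(path, required_bytes):
    """True if `path` is an existing directory with at least required_bytes free."""
    if not os.path.isdir(path):
        return False
    return shutil.disk_usage(path).free >= required_bytes


# Example: ask whether /dev/shm could hold a database of a given size.
# The 50 GB figure is a placeholder, not the actual eggNOG database size.
print(virtual_disk_has_room("/dev/shm", 50 * 1024**3))
```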

Fix #259

Fix #313

- Python
Published by SilasK over 5 years ago

metagenome-atlas - Small bug in genes2genomes.tsv

full compatibility with Atlas analyze

- Python
Published by SilasK almost 6 years ago

metagenome-atlas - With new GTDB v05

- Python
Published by SilasK almost 6 years ago

metagenome-atlas - set better default parameters

See #303

- Python
Published by SilasK almost 6 years ago

metagenome-atlas - 2.3.4

This release fixes some bugs that were found by @MaxRubinBlum, @amojarro and @baptisteavot.

- Python
Published by SilasK almost 6 years ago

metagenome-atlas -

This release fixes #296 by constraining pandas version

- Python
Published by SilasK almost 6 years ago

metagenome-atlas - Keep using snakemake 5.9

#295

- Python
Published by SilasK almost 6 years ago

metagenome-atlas - Functional annotation with eggNOG mapper V2

- Python
Published by SilasK over 6 years ago

metagenome-atlas - V2.2 GTDB

This version includes new taxonomic classification using GTDB

It also can generate Phylogenetic trees using this database or the checkM markers. The trees are properly rooted by the midpoints.

- Python
Published by SilasK over 6 years ago

metagenome-atlas - Workshop

bug fixes in dastool env

- Python
Published by SilasK almost 7 years ago

metagenome-atlas - bug fixes for snakemake checkpoints

compatible with snakemake v5.4.5 you can start with interleaved reads

- Python
Published by SilasK almost 7 years ago

metagenome-atlas - Support for hybrid assembly

- Python
Published by SilasK about 7 years ago

metagenome-atlas - v2.0.6

  • Fixes De-Replication of Bins
  • Fixes Bin-Report
  • Uses scripts directive for all reports

- Python
Published by SilasK about 7 years ago

metagenome-atlas - v2.0.5

  • get unique bin names
  • Fixes #177
  • Fixes #174
  • Fixes download workflow

- Python
Published by SilasK over 7 years ago

metagenome-atlas - v2.0.4

Fixes #175

- Python
Published by SilasK over 7 years ago

metagenome-atlas - v2.0.3

bug fixes with taxonomy thanks to @jmtsuji

- Python
Published by SilasK over 7 years ago

metagenome-atlas - Small bug fixes

- Python
Published by SilasK over 7 years ago

metagenome-atlas - v2.0.1

Changelog:

  • New API
  • on the fly install of conda environments and databases
  • Updated docs
  • Metagenomic binning with metabat, maxbin, DAS Tool
  • Use scaffolds when using spades
  • Build gene catalog of genes from MAGs with linclust
  • Annotate with eggNOG mapper
  • Taxonomic classification with CAT

scheme of workflow

Road map:

  • Use genome properties to get pathways
  • Combine unbinned contigs and add genes to gene catalog
  • Replace dynamic snakemake arguments with checkpoints

- Python
Published by brwnj over 7 years ago

metagenome-atlas - v1.0.35

- Python
Published by SilasK over 7 years ago

metagenome-atlas - v1.0.34

  • added support for spades k
  • corrected number of mapping reads to an assembly

- Python
Published by SilasK almost 8 years ago

metagenome-atlas - v1.0.33

Compatible with snakemake v5, spades v3.12

- Python
Published by SilasK almost 8 years ago

metagenome-atlas -

  • adds reports to python package

- Python
Published by brwnj about 8 years ago

metagenome-atlas - Bug fixes in annotate and assembly protocols

  • fixes bug in command line for featureCounts
  • fixes bug when using atlas annotate without fastqs

- Python
Published by brwnj over 8 years ago

metagenome-atlas -

  • bug fixes where assembly was only using SE reads
  • updates to checkm where databases are now downloaded separately

- Python
Published by brwnj over 8 years ago

metagenome-atlas -

  • bug fixes
  • adds read length stats
  • makes deduplication (via clumpify.sh) default

- Python
Published by brwnj over 8 years ago

metagenome-atlas -

  • adds new pre-processing protocols

- Python
Published by brwnj over 8 years ago

metagenome-atlas -

  • updates environment to pinned versions

- Python
Published by brwnj almost 9 years ago

metagenome-atlas -

  • updates annotation protocol
  • adds checkm bin annotation

- Python
Published by brwnj almost 9 years ago

metagenome-atlas -

  • fixes file path report when checking config file to improve "config file is invalid" message

- Python
Published by brwnj about 9 years ago

metagenome-atlas -

  • fixes megahit assembly method
  • fixes method related to splitting fastas in coassembly protocol

- Python
Published by brwnj about 9 years ago

metagenome-atlas -

  • updates setup.py to remove pypi package install

- Python
Published by brwnj about 9 years ago

metagenome-atlas -

  • fixes single-end call to megahit

- Python
Published by brwnj about 9 years ago

metagenome-atlas -

- Python
Published by brwnj over 9 years ago