Recent Releases of ncbi-genome-download

ncbi-genome-download - Release 0.3.3

This is release 0.3.3 of ncbi-genome-download.

This release has no new features on top of 0.3.2 but adds some information on how to cite the software.

Detailed changes:

Kai Blin (5):
      README: Add citation information
      CITATION: Add a citation metadata file
      CITATION: Second attempt at generating a valid CITATION.rff file
      CITATION: use the correct file name for the citation metadata file
      Bump version to 0.3.3

- Python
Published by kblin over 2 years ago

ncbi-genome-download - Re-release 0.3.2 for Zenodo

This is a re-release of version 0.3.2 of ncbi-genome-download, made to get a Zenodo DOI generated. It is functionally identical to the existing 0.3.2 release.

Major changes of this release are:

  • Add support for the new format assembly info headers, fixing downloads.
  • Support for the translated-cds format (thanks @SwiftSeal)
  • Allow fuzzy searches for accessions
  • Cache the MD5SUMS files for a day as well, to make re-starting the download easier.

Detailed changes:

```
Kai Blin (9):
      core: Actually expose the --fuzzy-accessions logic built back in 2019 on the command line
      core: Improve the --refseq-categories tooltip
      core: Only re-download MD5SUMS if they're more than a day old
      core: Show progress bar both for downloading MD5SUMS and data files
      chore: update the python versions for the CI workflow
      chore: Remove old drone CI integration
      chore: Reformat README.md to fix markdownlint errors
      summary: Support the new format summary files
      Bump version to 0.3.2

Moray Smith (1):
      Update config.py
```

- Python
Published by kblin over 2 years ago

ncbi-genome-download - Release 0.3.2

This is release 0.3.2 of ncbi-genome-download.

Major changes of this release are:

  • Add support for the new format assembly info headers, fixing downloads.
  • Support for the translated-cds format (thanks @SwiftSeal)
  • Allow fuzzy searches for accessions
  • Cache the MD5SUMS files for a day as well, to make re-starting the download easier.

Thanks also to @chasemc and @twelvesummer for submitting patches for the header format.
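The one-day caching of MD5SUMS files mentioned above amounts to a freshness check on the cached file. The following is a minimal sketch of that idea, not the tool's actual implementation; the helper name `is_cache_fresh` and the constant `ONE_DAY_SECONDS` are hypothetical.

```python
import os
import time

# Assumed cache lifetime, taken from "a day" in the release notes.
ONE_DAY_SECONDS = 24 * 60 * 60


def is_cache_fresh(path, max_age=ONE_DAY_SECONDS, now=None):
    """Return True if `path` exists and was modified less than `max_age` seconds ago.

    Hypothetical helper for illustration; a caller would re-download the file
    only when this returns False.
    """
    if not os.path.exists(path):
        return False
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(path)) < max_age
```

A restarted download could then skip fetching any checksum file for which `is_cache_fresh(cached_path)` is true.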

Detailed changes:

```
Kai Blin (9):
      core: Actually expose the --fuzzy-accessions logic built back in 2019 on the command line
      core: Improve the --refseq-categories tooltip
      core: Only re-download MD5SUMS if they're more than a day old
      core: Show progress bar both for downloading MD5SUMS and data files
      chore: update the python versions for the CI workflow
      chore: Remove old drone CI integration
      chore: Reformat README.md to fix markdownlint errors
      summary: Support the new format summary files
      Bump version to 0.3.2

Moray Smith (1):
      Update config.py
```

- Python
Published by kblin over 2 years ago

ncbi-genome-download - Release 0.3.1

This is release 0.3.1 of ncbi-genome-download.

Main features of this release are:

  • support for progress bars (thanks to @444thLiao)
  • various bug fixes (thanks @peterjc and @jrjhealey)

Detailed changes:

```
Joe Healey (1):
      remove unused function and update email

Kai Blin (8):
      core: Change the progress bar shorthand to -P, default to no progress bar
      chore: Use pytest.fixture instead of deprecated pytest.yield_fixture
      core: Don't attempt to download metagenome info from refseq even if group is all
      chore: Ignore more IDE files
      chore: Fix all linter errors reported by flake8
      Makefile: switch linting to flake8 to match CI setup
      chore: Update mailmap
      Bump version number to 0.3.1

Peter Cock (1):
      Fixed typo in command line API help

Tianhua Liao (2):
      add progress bar
      fix repeat bars
```

- Python
Published by kblin over 4 years ago

ncbi-genome-download - Release 0.3.0

This is release 0.3.0 of ncbi-genome-download.

This release breaks backwards compatibility in a small way, hence the new minor release number. If you only use the command line tool, everything should still work, but note that some of the options have changed to their plural forms. If you use the API, you need to update your code to use the new plural forms of the option names.
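For API users, the migration is essentially a renaming of keyword arguments. The sketch below shows one way to translate old-style keyword arguments before passing them on. Apart from the refseq category option, which the detailed changes mention in both forms, the name pairs in `SINGULAR_TO_PLURAL` are illustrative assumptions, and `pluralise_kwargs` is a hypothetical helper; check the actual function signature of the version you have installed.

```python
# Hypothetical mapping from pre-0.3.0 singular option names to plural names.
# These pairs are assumptions for illustration, not an authoritative list.
SINGULAR_TO_PLURAL = {
    "group": "groups",
    "file_format": "file_formats",
    "assembly_level": "assembly_levels",
    "refseq_category": "refseq_categories",
}


def pluralise_kwargs(kwargs, mapping=SINGULAR_TO_PLURAL):
    """Return a copy of `kwargs` with old singular keys renamed to plural ones."""
    updated = {}
    for key, value in kwargs.items():
        new_key = mapping.get(key, key)
        if new_key in updated:
            # Refuse to guess when both spellings of one option were given.
            raise ValueError("option %r given more than once" % new_key)
        updated[new_key] = value
    return updated
```

Keys that are not in the mapping pass through unchanged, so the helper can be applied to a full set of arguments.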

This version also no longer supports Python 2.7.

In addition, this version contains some contributed features and bugfixes:

  • gimme_taxa.py is now installable, thanks Istvan (@ialbert)
  • We no longer break on FTP entries without an FTP path, thanks Paul (@openpaul)
  • We now raise an error if you try to download metagenomes from RefSeq. Thanks again, Paul (@openpaul)
  • Updated Chinese README file, thanks James (@jamesyangget)
  • We no longer leak pool workers when running parallel downloads, thanks Gerrit (@Wrzlprmft)
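The pool worker fix relies on using the pool as a context manager, so that workers are cleaned up when the block exits, even on an error. A minimal sketch of the pattern follows. It uses `ThreadPool` only so that the example runs anywhere; `multiprocessing.Pool` offers the same context-manager interface. The functions `fake_download` and `run_jobs` are hypothetical stand-ins, not part of the tool.

```python
from multiprocessing.pool import ThreadPool


def fake_download(name):
    """Stand-in for a download job; simply returns the length of the name."""
    return len(name)


def run_jobs(names, workers=2):
    """Run one job per name in a pool that is cleaned up automatically."""
    # Leaving the with-block terminates the pool's workers, so none are
    # left behind if a job raises. map() blocks until all results are in.
    with ThreadPool(processes=workers) as pool:
        return pool.map(fake_download, names)
```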

Detailed changes:

```
Gerrit Ansmann (2):
      Using context manager for pool. This should at least partially fix Issue #120.
      Restructuring to avoid excessively long and complicated line.

Istvan Albert (1):
      made gimme_taxa.py an installable script

James Yang (3):
      fix and update translation
      Update README-CN.md
      Update README-CN.md

Kai Blin (18):
      core: Make get_name_and_checksum not skip the wrong files
      config: Init section before group
      main: Print nicer error messages on invalid arguments
      config: Add tests for new 'no metagenomes in refseq' check
      chore: Update contributor map
      chore: Drop python2 compatibility code
      chore: Style pass to make flake8 happy
      chore: Update supported Python versions in README
      chore: set up GitHub workflow for testing and publishing
      chore: Add vim config directory to gitignore
      core: Make acceptable --refseq-categories a list
      chore: Use py3 to test in drone as well
      core: Fix default for new list-based --refseq-category parameter
      core: Split strain parsing from strain label generation
      core: Also allow to filter by strain
      core: Also show strain in the dry-run listing
      core: Break the API so all list types now use the plural form.
      Bump version number to 0.3.0

Paul Saary (2):
      move na check into filter function
      raise warning if refseq metagenome is requested, as there is no such thing at the moment
```

- Python
Published by kblin almost 6 years ago

ncbi-genome-download - Release 0.2.12

This is release 0.2.12 of ncbi-genome-download.

Highlights of this release are:

  • Parallel downloads of checksum files (Thanks to Adelme Bazin (@axbazin))
  • New --flat-output option to dump all downloaded files into a single directory
  • We now have a Chinese translation of the README (Thanks James Yang (@jamesyangget))
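The checksum step that this release parallelises can be sketched as follows. This is a Python 3 illustration under stated assumptions, not the tool's code: it assumes checksum files contain lines of the form `checksum  ./filename`, uses `ThreadPool` as a stand-in for the process pool, and all three function names are hypothetical.

```python
import hashlib
from multiprocessing.pool import ThreadPool


def parse_md5sums(text):
    """Parse 'checksum  ./filename' lines into a {filename: checksum} dict."""
    checksums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        checksum, filename = parts
        if filename.startswith("./"):
            filename = filename[2:]
        checksums[filename] = checksum
    return checksums


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_all(expected_by_path, workers=2):
    """Check every path against its expected MD5 in parallel.

    Returns a dict mapping each path to True (match) or False (mismatch).
    """
    paths = sorted(expected_by_path)

    def matches(path):
        return md5_of_file(path) == expected_by_path[path]

    with ThreadPool(processes=workers) as pool:
        results = pool.map(matches, paths)
    return dict(zip(paths, results))
```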

Detailed changes:

```
Adelme Bazin (5):
      core : Checking MD5SUM in parallel when more than one process is allowed
      add integration test to check if metadata table was filled properly. Expected failure when using multiprocessing.
      fill metadata table in config_download instead of download_file_job to avoid problems with multiprocessing
      modify the metadatafill test functions so that they follow the same logic than the current code
      fix the call to Pool with the 'with' statement for python2

Kai Blin (13):
      core: Allow keeping the downloaded files in a flat hierarchy
      chore: Break long help text lines
      core: Fix the --flat-output description
      README: Update install documentation
      core: Add tests for type material downloads
      chore: Add docstring and coverage skip to downloadjob_creator_caller
      config: Test is_compatible_assembly_accession() in fuzzyaccession mode
      core: Add a docstring to new fillmetadata function
      core: Acquire the metadata table object outside of the download loops
      config: Also support downloading metagenomes
      core: Check for exact match on genus name before trying to capitalise it
      chore: Update README to note that 0.2.12 is the last version to support Python 2
      Bump version number to 0.2.12

James Yang (1):
      translate README.md into Chinese (#97)
```

- Python
Published by kblin about 6 years ago

ncbi-genome-download - Release 0.2.11

This is release 0.2.11 of ncbi-genome-download which fixes two logging issues.

Thanks to David Morgan (@Cptmorgan27) for providing a patch.

Detailed changes:

```
David Morgan (1):
      core: remove print statement for type material

Kai Blin (4):
      chore: Use a named logger instead of the root logger
      README: Make it clearer that more than just bacteria and viral groups are available
      chore: Remove landscape.io link, as that service seems dead
      Bump version number to 0.2.11
```

- Python
Published by kblin over 6 years ago

ncbi-genome-download - Release 0.2.10

This is a bugfix release of ncbi-genome-download that also adds two convenience features.

Major changes are:

  • Use relative instead of absolute symlinks for human-readable output (thanks @chrisgulvik)
  • No longer crash on abnormal organism names (thanks to @andrewsanchez for the initial pull request)
  • Allow for fuzzy matching of both organism name and accessions
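A relative symlink stays valid when the whole output directory is moved, which an absolute one does not. The following is a minimal sketch of how such a link can be created with the standard library; `create_relative_symlink` is a hypothetical helper and not the tool's own `create_symlink` function.

```python
import os


def create_relative_symlink(target, link_path):
    """Create `link_path` as a symlink pointing at `target` via a relative path.

    Returns the relative path that was stored in the link.
    """
    # The link is resolved relative to the directory that contains it,
    # so compute the target path starting from that directory.
    link_dir = os.path.dirname(os.path.abspath(link_path))
    relative_target = os.path.relpath(os.path.abspath(target), start=link_dir)
    os.symlink(relative_target, link_path)
    return relative_target
```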

Detailed changes:

```
Chris Gulvik (1):
      create_symlink func modified to create relative rather than absolute symbolic links; resolves #62

Kai Blin (5):
      core: Allow for fuzzy matching for specified organism names
      core: Allow for fuzzy matching of specified accessions
      chore: Fix two whitespace issues
      core: Deal with organism names that don't contain a species part
      Bump version number to 0.2.10
```

- Python
Published by kblin almost 7 years ago

ncbi-genome-download - Release 0.2.9

This release adds the "relation to type material filter" contributed by Jason Davis-Cooke. Thanks for that.

Detailed changes:

```
Jason Davis-Cooke (1):
      feat(core): add 'relation to type material' as as filtering option (#82)

Kai Blin (2):
      README: Document the type material filter option
      Bump version number to 0.2.9
```

- Python
Published by kblin about 7 years ago

ncbi-genome-download - Release 0.2.8

This is mainly a bugfix release fixing a UnicodeEncodeError when writing to a --metadata-table file with non-ASCII entries like in record GCF_000234725.1.

Thanks to @danudwary and @jananiravi for the error reports.

Also thanks to Tessa Pierce and Joe Healey for their contributions.
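The UnicodeEncodeError described above typically appears when a text file is opened without an explicit encoding and the default encoding cannot represent a character. A minimal sketch of the general remedy, opening the table file as UTF-8, is shown below; `write_metadata_rows` is a hypothetical helper and not the tool's own code.

```python
import io


def write_metadata_rows(path, rows):
    """Write rows as tab-separated lines to `path`, always encoded as UTF-8."""
    # An explicit encoding makes the result independent of the locale.
    with io.open(path, "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(u"\t".join(row) + u"\n")
```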

Detailed changes:

```
Joe Healey (1):
      update readme with conda install

Kai Blin (3):
      config: Change a tab indent to spaces
      core: Open metatable file with utf-8 encoding
      Bump version number to 0.2.8

Tessa Pierce (1):
      add support for rm (repeat masked) eukaryotic genomes
```

- Python
Published by kblin over 7 years ago

ncbi-genome-download - Release 0.2.7

This is release 0.2.7 of ncbi-genome-download. Highlights of this version include:

  • Input options that supported a comma-separated list can now also read from files.
  • Support for downloading files in RNA FASTA format (thanks, @bluegenes).
  • Contributed script to get taxids for all children of a parent taxon (thanks @jrjhealey and @nick-youngblut).
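An option that accepts either a comma-separated list or a file can be handled by checking whether the value names an existing file. The sketch below illustrates the idea; it assumes the file holds one entry per line, and `parse_list_option` is a hypothetical helper rather than the tool's own function.

```python
import os


def parse_list_option(value):
    """Return a list from a comma-separated string or from a file of entries.

    Assumes a file-based list has one entry per line; empty entries are dropped.
    """
    if os.path.isfile(value):
        with open(value, "r") as handle:
            items = [line.strip() for line in handle]
    else:
        items = [item.strip() for item in value.split(",")]
    return [item for item in items if item]
```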

Detailed changes:

Joe Healey (2):
      added contrib dir and script for queriying NCBI Taxa names/nums
      Added gimme_taxa to README

Kai Blin (20):
      core: Add a dry-run option
      core: Add a cache for the assembly summary files
      core: Move configuration/input validation to a config object
      core: Fix caching code for python 2.7
      core: Remove unused SUPPORTED_TAXONOMIC_GROUPS import
      tests: Remove a useless print call
      core: Add print_function import for python 2.7 compatibility
      config: refactor command line list parameter handling
      core: Add a test for os.makedirs error handling
      config: Add the option of reading list from a file for _createlist
      config: Allow passing file-based lists for genus, taxid, and species taxid parameters
      core: Split out the entry filtering logic
      core: refactor iterating over taxonomic groups into _download function
      core: Refactor to split _download into two separate, distinct functions
      tests: Remove outdated comment
      core: Allow multiple assembly levels
      core: Allow filtering by assembly accessions
      chore: Add Joe to .mailmap for correct attribution in the release notes
      chore: setup.py finally understands markdown natively
      Bump version number to 0.2.7

Nicholas Youngblut (3):
      major alterations to gimme_taxa.py
      removed --name parameter; now name can be used in taxid list
      updated README section on gimme_taxa; fixed gimme_taxa header bug

Tessa Pierce (2):
      add support for *rna.fna.gz file download
      add test for download rna-fna

- Python
Published by kblin over 7 years ago

ncbi-genome-download - Release 0.2.6

This is release 0.2.6 of ncbi-genome-download. Highlights of this version include:

  • Multiple formats, taxids, species, etc. can now be downloaded at once; see the README for how to use this. Thanks to Ruben (@rhpvorderman).
  • You can now save information on downloaded files in a tab-separated table similar to the assembly_summary.txt file NCBI provides. Thanks to Ryan (@rrwick).

Detailed changes:

```
Kai Blin (22):
      chore: Add slack notifications to drone build
      core: Always return a boolean in worker()
      core: Make table an optional parameter for download_entry
      core: Make table an optional argument for download_file_job() and move to back
      chore: Add a 'lint' target to run pylint
      chore: Configure CI to run slack notifications only on push, but regardless of success/failure
      core: Disable a couple of pylint warnings
      jobs: Spin out the DownloadJob to be a proper object
      config: Move all configuration into an extra module
      jobs: get full test coverage
      core: Detect if no downloads matched the specified filter
      core: Don't use multiprocessing if only a single thread should be used
      core: Don't cover function to generate the argument parser
      core: Metadata table code needs refactoring, snooze coverage warnings
      chore: Make pycodestyle happy and update outdated docstrings
      chore: No pycodestyle warnings anymore.
      chore: Ignore ropeproject output for git
      core: Rename download_entry to create_downloadjob, as it doesn't download anything anymore
      metadata: Split out handling of the metadata table to a separate module
      README: Python 3.3 has reached end-of-life, drop support
      chore: Add mailmap to fix Ruben's name for git-shortlog
      Bump version number to 0.2.6

Ruben Vorderman (25):
      use args object
      update docstring
      allow multiple formats to be downloaded
      remove redundant is not none. Add description
      enabled multiple download of much things
      improve efficiency by relocating the loops
      minor spell mistakes
      separated argumentparser from main
      fixed quite a few tests
      fixed some other tests
      fixed all tests
      updated readme
      backwards compatibility
      eliminated version problem
      restore old test for backwards compatibility
      speed up algorithm immensely
      cleanup
      revert test to original
      revert removal in method call
      section typo
      review style additions
      use pylint to check for style errors
      add test to check all logic
      add test to check all logic
      remove 'unknown' from 'SUPPORTED_TAXONOMIC_GROUPS'

Ryan Wick (2):
      Runner script for convenience
      New option for saving metadata table
```

- Python
Published by kblin almost 8 years ago

ncbi-genome-download - Release 0.2.5

This is release 0.2.5 of ncbi-genome-download.

Highlights of this version include:

  • Enable specifically downloading reference and representative genomes.
  • New 'ngd' command alias saves you from having to type 'ncbi-genome-download' all the time.

Detailed changes:

Kai Blin (3):
      core: Allow filtering by RefSeq category
      setup: Also add short alias 'ngd' for the CLI script
      Bump version to 0.2.5

- Python
Published by kblin over 8 years ago

ncbi-genome-download - Release 0.2.4

This is release 0.2.4 of ncbi-genome-download.

Highlights of this version include:

  • Enable using ncbi-genome-download as API from your own scripts, thanks to Marc Bourqui (@mbourqui).
  • Also allow downloading the assembly_report.txt and assembly_stats.txt files, thanks to Peter Cock (@peterjc).
  • Better handle interrupting with Ctrl-C while downloading in multiple threads.

Detailed changes:

```
Kai Blin (6):
      core: More gracefully abort on Ctrl-C
      core: Silence some pylint style warnings I disagree with
      chore: Update drone CI config for drone 0.7
      README: Switch CI links to HTTPS.
      publish.sh: Switch to using twine and build universal binary wheel
      Bump version to 0.2.4

Marc Bourqui (27):
      Update gitignore for PyCharm
      Replace args with kwargs in download signature
      Merge download and download
      Update tests to previous commits
      Bug fixes in Enums, some renamings, more doc
      Update packaging related stuff * Version number * Requirements * README
      Move tests folder (were not finding package files otherwise)
      Fix AttributeError
      Fix Enum related issues
      Fix issues and tests
      Update imports
      Fix tests * Bring back _download() for testing purposes
      WIP preserve enum order
      Update EMap to avoid usage of dict which does not preserve order
      Move test folder for drone testing
      Minor performance improvement
      Fix README
      Remove __init__.py in tests/ and revert testparsechecksums() auto line breaks
      Fix unreachable code
      Unbump version number
      Fix undesired new line
      Rename var to _downloadmock when dealing with _download()
      Fix SystemError: Parent module '' not loaded, cannot perform relative import
      Fix TypeError and AssertionErrors
      Fix Enum ordering for Pyhton 2
      Move comment line
      Fix AttributeError

Peter Cock (3):
      Download XXX_assembly_report.txt and XXX_assembly_stats.txt
      Use strict bash mode in publish.sh
      Include PyPI version badge in README
```

- Python
Published by kblin almost 9 years ago

ncbi-genome-download - Release 0.2.3

This is release 0.2.3 of ncbi-genome-download.

Highlights of this version include:

  • Properly deal with existing human-readable symlinks when re-running a download, thanks to An Phung (@anphung).
  • Support creating human-readable symlinks without reloading the genome data
  • Work around formatting issues in NCBI's viral assembly_summary.txt files

Detailed changes:

```
An Phung (1):
      core: Check symlink existence, fix #20

Kai Blin (6):
      create_readable_dir: Special case viral entries, because they are special snowflakes
      core: Make sure invalid checksum lines are skipped.
      core: Allow creating human-readable links for already downloaded files
      summary: Skip invalid assembly_summary.txt lines
      summary: Try to fix invalid viral assembly_summary.txt lines instead of skipping them
      Bump version to 0.2.3
```

- Python
Published by kblin about 9 years ago