Recent Releases of funannotate

funannotate - funannotate v1.8.17

Bug fix release.

  • fix FASTA splitter in iprscan local 83ceee5e4119b296aa6c2c33e796747d592dea54
  • pin docker and pip installs to biopython<1.80 to avoid deprecated or breaking code in newer releases
  • update to support EVM v1 or v2; back-port training data from funannotate2 (for testing)
  • run signalp6 if in PATH
  • fix pandas compatibility >v2
  • add config-window parameter to protein2genome script

- Python
Published by nextgenusfs about 2 years ago

funannotate - funannotate v1.8.15

Bug fix release

  • Trying hard to support augustus v3.5
  • fix sorting by length #896
  • update dbCAN Cazyme filtering #890
  • fixes for parsing eggnog mapper #868 #892
  • signalp6 fixes #822
  • and many others

- Python
Published by nextgenusfs almost 3 years ago

funannotate - funannotate v1.8.13

Bug fix release

  • fix to internal BUSCO code to support Augustus v3.4 #742 @IanDMedeiros
  • several fixes to funannotate update #727 #740
  • several patches to Docker build

- Python
Published by nextgenusfs over 3 years ago

funannotate - funannotate v1.8.11

Bug fix release

  • support for signalp v6 parsing #650 #716
  • improve protein2genome performance #652 #654
  • add --tmpdir option to most scripts (defaults to /tmp), e.g. point it at an SSD if you have access to one to improve speed
  • upgrade eggnog-mapper to newer version, use in memory if enough is available to improve speed
  • fix version check #655
  • fix for abnormal contig identifiers #672
  • fix for EVM contig memory parsing #646
  • add -m,--mito-pass-thru to funannotate annotate to add back mitochondrial genome, mitochondrial contigs should be removed prior to running funannotate as uses different genetic code
  • antismash6 support #692
  • update download urls to https from ftp
  • fix for EVM if last line of GFF3 is blank #709
  • simplify runSubprocess functions #707 #710 @mglubber
  • move temporary files for check for soft masking to output folder #722
  • conda augustus is apparently broken (fixed in docker, #722 #724), so augustus needs to be installed a different way (e.g. apt-get on Debian)
  • add missing help menu options #714
  • fix for funannotate update compare annotations function #727
  • add --ml_model to funannotate compare to control ML model selection in iqtree, default is to run modelfinder which is very slow.

- Python
Published by nextgenusfs over 3 years ago

funannotate - funannotate v1.8.9

  • fix for EvidenceModeler partitioning -- on rare cases genes/contigs were not getting partitioned correctly
  • fix logic for PASA training of glimmer/snap #591
  • fix logic for pre-trained datasets in funannotate predict #597
  • re-organize tbl2asn and provide single threaded backup if failure #599 and fix log error #621
  • remove perl script generating AGP file --> now done in python, which removes bioperl as a dependency #617
  • fix cds-transcripts coordinates, which were incorrect on partial transcripts (these coordinates are not used for any other processing) #614
  • fix bug in EVM filtering if no genes to remove #620
  • update eggnog parser to support versions >2.1.2 #566
  • fix tmpdir error in funannotate clean #594
  • python2 is no longer supported.

- Python
Published by nextgenusfs over 4 years ago

funannotate - funannotate v1.8.7

Bug fix release

  • bug fix for eggnog mapper v2 parsing, occasionally some records have more than 1 OG in the "best_OG" field -- this is probably a bug in emapper.py, but funannotate will now ignore those entries as there is no way to tell which is actually the best OG.
  • make fasta2agp accept more IUPAC characters #532
  • fix for diamond >v2.0.8 (yet another change in the database info format resulted in error).
  • remove support for python 2

- Python
Published by nextgenusfs almost 5 years ago

funannotate - funannotate v1.8.5

Bug fix release, recommended for all users.

  • use uuid unique identifier for tmp file names (previously used process id)
  • fix diamond command in mapping proteins to genome #529
  • add some contig name checking for inputs to predict -- some users erroneously pass GFF3 files that reference transcripts or some other assembly, warn user #528
  • add support for antiSMASH v6 #531 #539
  • check genome for IUPAC errors #532
  • re-write long reads mapping to trinity transcripts #539
  • allow for fewer busco models for training #556
  • expose some EVM parameters to predict #558
  • modify EVM partitioning -- some gene models close to partition were not being called correctly #558.
  • support newly released eggnog-mapper >= v2.0.5 #566, note that v2.x - 2.0.4 will not be supported as the output is broken in the sense that it is impossible to parse the proper EggNog reference ID. If you are using any eggnog results from this code, strongly advised to re-run with new release.
  • add _pasa to PASA database names #398. Note -- this change could potentially cause pre-existing databases run with older funannotate code to not be recognized properly after v1.8.5.
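
The switch from process-ID to uuid temporary file names in the first item above can be sketched as follows. This is a minimal illustration and not funannotate's actual code: the helper name `unique_tmp_path` and the `p2g` file-name pattern are hypothetical.

```python
import os
import tempfile
import uuid


def unique_tmp_path(prefix, suffix="", tmpdir=None):
    """Return a temp file path that is unique per call (hypothetical helper)."""
    # A process ID is identical for every call made within one process, so
    # PID-based names can collide; uuid4 gives a fresh identifier each time.
    tmpdir = tmpdir or tempfile.gettempdir()
    return os.path.join(tmpdir, "{}_{}{}".format(prefix, uuid.uuid4(), suffix))


# PID-based naming: the same name is produced on every call in this process.
pid_based = "p2g_{}.fasta".format(os.getpid())

# uuid-based naming: two calls give two different paths.
first = unique_tmp_path("p2g", ".fasta")
second = unique_tmp_path("p2g", ".fasta")
```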

- Python
Published by nextgenusfs almost 5 years ago

funannotate - funannotate v1.8.3

Bug fix release

  • remove auto fix for RNA-seq data if it does not have Trinity compatible headers -- warn user to fix instead #485
  • fix antiSMASH v5 parsing #490, update secmet GBK format of clusters
  • add funannotate util gff-rename script
  • use natsort for sorting contigs/names so gene names are correctly auto incremented #501
  • fix for signalp v5 #494
  • support fasta headers with spaces for clean #504
  • strip trailing asterisks, i.e. rstrip('*'), from proteins if GFF/FASTA passed #508
  • expose --p2g_prefilter to use tblastn for mapping protein evidence #495
  • add --anysymbol to mafft calls #514
  • use frameshift diamond if >2.0.5 #503
  • improve glimmerhmm parsing error report #519
  • add --no-progress option to clean up terminal output (ie log files from cluster)
  • use sorted bedtools intersection to reduce memory usage #522
  • add --trnascan option to predict #523
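
Why natural sorting matters for auto-incremented gene names (#501 above) can be shown with a small sketch. The release uses the natsort package; the stdlib-only `natural_key` function below is a hypothetical stand-in that mimics the idea, and the contig names are made up.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so numbers compare numerically."""
    # re.split with a capturing group alternates text and digit chunks,
    # so strings are only ever compared with strings and ints with ints.
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", name)]


contigs = ["scaffold_10", "scaffold_2", "scaffold_1"]

# Plain string sorting puts scaffold_10 before scaffold_2.
plain_order = sorted(contigs)

# Natural sorting keeps the numeric order a reader would expect.
natural_order = sorted(contigs, key=natural_key)
```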

- Python
Published by nextgenusfs about 5 years ago

funannotate - funannotate v1.8.1

  • highly recommend all users upgrade
  • Code now python 2/3 compatible (only 9 months late....)
  • EVM now uses interlap for slicing input files, significant speed increase for larger genomes
  • if --organism other is passed, codingquarry is now disabled by default -- you can turn it on by specifying a valid weight, ie --weights codingquarry:2
  • support for signalp version 5
  • move resources to JSON download via GitHub, will allow for updating/changing resources without re-installing a new version of funannotate
  • several fixes for more robust GFF3 parsing
  • fix for cazyme assignments where results were duplicated; also upgraded to database version 8
  • fix bug where PASA results not properly passed to EVM if funannotate predict was re-run
  • many other bug fixes

- Python
Published by nextgenusfs over 5 years ago

funannotate - funannotate v1.7.4

Okay, so apparently I lied about bug fixes before py3 release. Here are some quick ones.

Also -- some users have reported that the conda recipe fails to find a solution; with the help of @reslp, this appears to be caused by the ete3 package dependency. Since ete3 is only used in funannotate for the dN/dS calculation in funannotate compare, I'm going to remove it from the conda recipe as a dependency. Users that would like to continue using the dN/dS function in compare will need to manually install ete3 (e.g. via conda, if that works on your system, or pip).

  • update the log file copy/remove in funannotate annotate which causes a problem in singularity container
  • encoding bug fixes for some PFAM results
  • fix for funannotate iprscan which was not correctly combining XML files with newer versions of Interproscan.

- Python
Published by nextgenusfs about 6 years ago

funannotate - funannotate v1.7.3

  • bug fix release. This will be the last release before updating the code for py3
  • bug fix for remote #379
  • improve GFF3 parsing in funannotate annotate allow for non-funannotate identifiers (hopefully this catches most of them)
  • --aligners option was missing from help menu in train/update
  • add --debug flag to funannotate iprscan -- there is potentially still an issue here combining results from the newest release of Interproscan -- I need some intermediate results to see what the problem is
  • fix GO annotations parsing for more versions of goatools #363
  • fix for bug in the "gene start or end in a gap" function
  • make sure funannotate annotate re-runs tbl file generation if script is re-run
  • fix in PASA function logic where PASA db not found if using --gff and --fasta input.

- Python
Published by nextgenusfs about 6 years ago

funannotate - funannotate v1.7.2

  • busco internal run now uses the local augustus config path, so hopefully write access to $AUGUSTUS_CONFIG_PATH is no longer required
  • long reads are now processed to remove forward slashes from names if present -- this was causing problems with PASA #326 #250
  • added augustus hints file generation when using RNA-seq data. Also added a check for --transcript_evidence if found in --transcript_alignments, ie so extra transcripts can be added instead of only those used to generate alignments in funannotate train. #360
  • fix typo in funannotate check
  • fix for proteinortho issue in funannotate compare #350

- Python
Published by nextgenusfs about 6 years ago

funannotate - funannotate v1.7.1

A few updates to support bioconda integration -- namely either $TRINITYHOME or $TRINITY_HOME can be used. Also don't trust trinity packaged trimmomatic, use the conda version.
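
Accepting either variable name could look like the sketch below. The function name `find_trinity_home` and the order in which the two names are checked are assumptions for illustration; the release note only says that either variable can be used.

```python
import os


def find_trinity_home(environ=None):
    """Return the Trinity install folder from either variable name, or None."""
    environ = os.environ if environ is None else environ
    # Check both spellings; the first one that is set and non-empty wins.
    # The precedence order here is an assumption, not documented behavior.
    for name in ("TRINITY_HOME", "TRINITYHOME"):
        value = environ.get(name)
        if value:
            return value
    return None


# Example with an explicit mapping instead of the real environment.
example = find_trinity_home({"TRINITYHOME": "/opt/trinity"})
```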

- Python
Published by nextgenusfs over 6 years ago

funannotate - funannotate v1.7.0

Code has been repackaged to conform to a "proper" python package -- which means it now also resides on PyPI and a Bioconda package can be built. Along with the repackaging there are many improvements/fixes.

  • funannotate now keeps track of "trained species" for all of the ab-initio gene predictors (Augustus, GeneMark (optional), snap, GlimmerHMM, codingquarry). This requires all users to update their database, i.e. run the funannotate setup command. After running funannotate predict the software will output a JSON file containing the paths to the trained parameter files -- this can be used again for a different genome via the funannotate predict --parameters option. This parameter file can also be added to the database with the funannotate species -s genus_species -a parameters.json command. Running the funannotate species command will output a table in the command line of which species have training data. Addressed #320
  • antiSMASH remote script fixed and parser updated for v5 output.
  • added filtering for gene models that start/end in a gap that can sometimes show up after running funannotate update
  • added a check that the diamond version used to build the database matches the currently installed copy -- a mismatch results in many hidden errors for users, i.e. diamond databases created with an older/incompatible version than the one currently running.
  • updated Augustus functional check
  • removed RepeatModeler/RepeatMasker as strict dependencies. Due to RepBase change in usage license, repeatmasker/modeler are not available to most users. The funannotate mask command can still run this routine if you have the necessary dependencies installed, however, the current default is simply to run tantan masking. This is probably not sufficient for most genomes, thus happy to integrate a robust solution once one exists for repeat masking.
  • augustus parameter training now done in the local output folder, so no longer need write access to $AUGUSTUS_CONFIG_PATH

- Python
Published by nextgenusfs over 6 years ago

funannotate - funannotate v1.6.0

  • support for antiSMASH v5.0 output #292 #299
  • add snap and glimmerhmm, as requested in #240. BUSCO is now run by default in funannotate predict if there is no PASA data -- BUSCO results are used to train glimmerhmm and snap
  • improved Phobius results parsing #259
  • multi-threaded funannotate clean, thanks to @bogemad
  • fix database links #300
  • multi-threaded hisat2-build #303
  • write all output files directly from tbl format -- fix bug associated with multi-transcript parsing from GenBank files
  • bug fix for protein2genome exonerate mapping
  • bug fixes for funannotate train and funannotate update
  • added --min_coverage option to trinity workflow and set default to 5
  • bug fixes for codingquarry predictions (RNA-seq only)
  • improved error message for repeatmodeler/masker #298
  • bug fix for remote searches
  • bug fix for parsing input folders in annotate #302
  • updated conda install docs --> which seems futile to keep this current....

- Python
Published by nextgenusfs over 6 years ago

funannotate - funannotate v1.5.3

  • updated the CAZYme dbCAN link
  • fix string formatting in GeneMark-ET function

- Python
Published by nextgenusfs almost 7 years ago

funannotate - funannotate v1.5.2

  • restructure augustus accessory scripts calls so that they don't have to be in same location as the exe, i.e. this used to be a bug if you used conda installed augustus as it puts the scripts in a different location than the default augustus release folder
  • added a check to busco routine to default to a single thread if tblastn version is found that might have multithread issues
  • allow a min number of gene models to use for Augustus training, --min_training_models in funannotate predict
  • limit genemark to 64 cpus -- it will die if you try to give it more
  • fix shortBAM declaration in funannotate update
  • for genemark-ET set score of introns to 500, otherwise seems to be dying.
  • updated weights for different gene models -- would be nice to have this be a customizable option....

- Python
Published by nextgenusfs about 7 years ago

funannotate - funannotate v1.5.1

  • updated dbCAN links
  • important bug fix for RNA-seq analysis. The bam2gff3 function was not outputting the proper coordinates for crick stranded alignments for PASA, resulting in valid minimap2 alignments being thrown out.
  • several other minor bug fixes

- Python
Published by nextgenusfs over 7 years ago

funannotate - funannotate v1.5.0

  • add funannotate test as a unit test script to validate installation #212
  • add CodingQuarry and StringTie integration into funannotate. Note these are "silent" dependencies, meaning if not installed this method will be skipped. If both tools are installed and RNA-seq data is used they will be run automatically #200
  • fix contig number reporting in funannotate clean #210
  • fix bug in a few of the accessory tools funannotate util
  • update database to MiBIG v1.4

- Python
Published by nextgenusfs over 7 years ago

funannotate - funannotate v1.4.2

  • fix bug in train and update where script would die due to symlink error #189
  • for predict the --protein_alignments option now takes GFF3 input (not exonerate output). This is to make consistent with the --transcript_alignments. Scripts now write hints file for augustus from the GFF3 file.
  • similar to above, added funannotate util prot2genome which will run the diamond/exonerate mapping of proteins to the genome --> output is GFF3 file compatible with EVM. These data can then be passed to --protein_alignments

- Python
Published by nextgenusfs over 7 years ago

funannotate - funannotate v1.4.1

  • bug fix for funannotate predict during parsing the soft masked genome -- for large genomes this was slow and used too much memory. it is now multithreaded and has lower memory footprint. #197
  • bug fix: ncRNA models are now listed as full length, should no longer cause NCBI errors #195
  • support multiple inputs to --other_gff #191
  • make augustus use --softmasking=1 option
  • default value for --soft_mask is now set to 2 kb (2000)
  • output fasta files are now wrapped at 80 characters
  • tbl2asn is now multithreaded on large genomes or those with more than 10000 contigs
  • several updates to parsing of GenBank files to deal with unexpected formatting #196
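
Wrapping FASTA output at 80 characters, as mentioned in the list above, amounts to slicing each sequence into fixed-width lines. The sketch below is illustrative only; the function name `wrap_fasta_record` and the example record are hypothetical.

```python
def wrap_fasta_record(header, sequence, width=80):
    """Format one FASTA record with sequence lines of at most `width` characters."""
    lines = [">" + header]
    # Slice the sequence into consecutive chunks of `width` characters.
    lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


# A 200 bp sequence becomes lines of 80, 80 and 40 characters.
record = wrap_fasta_record("contig_1", "ACGT" * 50)
```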

- Python
Published by nextgenusfs over 7 years ago

funannotate - funannotate v1.4.0

  • support for long-read RNA-seq data: funannotate train and funannotate update can take PacBio isoSeq (--pacbio_isoseq), Nanopore cDNA reads (--nanopore_cdna), and Nanopore direct mRNA (--nanopore_mrna).
  • fix for important bug in transcript alignments in funannotate predict -- bug in previous versions related to multi-exon crick alignments not getting correctly parsed into GFF3 alignments
  • soft masking is now decoupled from funannotate predict, this is now done with funannotate mask. Reason for this switch is to allow more flexibility in how the assembly is soft masked -- this can be done externally with another program. This change will allow users that don't have access to RepBase to use an alternative to RepeatMasker/RepeatModeler. One alternative is RED -- I wrote a wrapper for it called RedMask
  • funannotate predict can now run without GeneMark being installed -- again to accommodate users that may be unable to use GeneMark due to licensing. Note you can pass gene predictions from any external program to --other_gff and they will be handed off to Evidence Modeler.
  • spaces in either strain or isolate name will be stripped #180
  • default program for funannotate clean changed to minimap2 #176
  • fix errors in partial gene models derived from using EVM script to generate proteins, this is now done internally using exact coordinates #184
  • added --soft_mask option to funannotate predict which will control the option with same name in GeneMark, i.e. default is --soft_mask 5000 which means that repeat regions less than 5 kb will be ignored for GeneMark prediction, those greater than 5 kb will be fed to Genemark. #185
  • bug fixes for tbl file generation. all tRNA models will be partial #184
  • improvement to how data from funannotate train is used in prediction steps
  • Slight changes for clarity to funannotate predict flags for evidence alignments:
      --protein_evidence       Proteins to map to genome (prot1.fa prot2.fa uniprot.fa). Default: uniprot.fa
      --protein_alignments     Pre-computed exonerate protein alignments (see docs for format)
      --transcript_evidence    mRNA/ESTs to align to genome (trans1.fa ests.fa trinity.fa). Default: none
      --transcript_alignments  Pre-computed transcript alignments in GFF3 format
  • added funannotate util bam2gff3 script to convert coordsorted RNA-seq BAM alignments to GFF3 compatible alignment file.
  • fix bug for input of files+weight in funannotate predict -- script would get hung up if you passed --other_gff snap_alignments.gff3:5 #191
  • allow for non-standard LocusTags - will now split on last underscore #191
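
Splitting on the last underscore, as described for non-standard locus tags in the final item above, can be sketched like this. The function name `split_locus_tag` and the example identifiers are hypothetical, not taken from the funannotate code.

```python
def split_locus_tag(gene_id):
    """Split 'PREFIX_NUMBER' on the last underscore only."""
    # rpartition keeps any earlier underscores inside the prefix.
    prefix, _, number = gene_id.rpartition("_")
    return prefix, number


standard = split_locus_tag("FUN_000123")
non_standard = split_locus_tag("ABC_v2_000123")
```

Splitting from the right means a prefix that itself contains underscores stays intact, which is what allows non-standard locus tags to work.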

- Python
Published by nextgenusfs over 7 years ago

funannotate - funannotate v1.3.4

  • bug fixes for sec met cluster output files and corresponding MiBIG cluster mapping
  • add tRNAscan-SE to funannotate check and predict
  • update menu with some params that were missing

- Python
Published by nextgenusfs over 7 years ago

funannotate - funannotate v1.3.3

  • bug fix for funannotate compare where GO enrichment not being run in parallel from last update
  • use diamond blastp search for ortholog detection --> speed increased.
  • don't run seqclean if file present
  • update docker release to newest version of funannotate as well as newest version of Trinity, PASA

- Python
Published by nextgenusfs almost 8 years ago

funannotate - funannotate v1.3.2

  • added several utility scripts --> accessible by funannotate util submenu. This includes funannotate util compare which will compare multiple annotations to a reference.

```
$ funannotate util

Usage:       funannotate util
version:     1.3.2

Commands:
    compare       Compare annotations to reference (GFF3 or GBK annotations)
    tbl2gbk       Convert TBL format to GenBank format
    gbk2parts     Convert GBK file to individual components
    gff2proteins  Convert GFF3 + FASTA files to protein FASTA
    gff2tbl       Convert GFF3 format to NCBI annotation table (tbl)
```

  • bug fix for funannotate remote moving logfile
  • bug fix for mapping proteins to genome where tmp folder wasn't being properly removed
  • run GO enrichment in parallel in funannotate compare
  • update colors in some graphs from funannotate compare to 24-pack Crayola colors
  • add option to use iqtree to draw ML phylogeny in funannotate compare
  • bug fix for funannotate database command where it was not displaying table correctly.

- Python
Published by nextgenusfs almost 8 years ago

funannotate - funannotate v1.3.1

  • bug fix for funannotate setup added missing shutil library import

- Python
Published by nextgenusfs almost 8 years ago

funannotate - funannotate v1.3.0

  • bug fix for weights being set for Augustus HiQ models in funannotate predict
  • bug fix for download_buscos function
  • bug fix for funannotate annotate where tbl file was occasionally not being parsed correctly --> re-write of parsing function
  • fix bug in antiSMASH/MiBIG parsing
  • add method to try to recover from failed GeneMark run
  • several bug fixes for funannotate update related to UTRs and multiple transcripts per locus.
  • added missing dependencies to funannotate check
  • updated code to work with PASA > v2.3 - this is an important PASA update that allows SQLite usage instead of MySQL
  • improved terminal log output to tell user which files (with locations) are being re-used if they are found.

- Python
Published by nextgenusfs almost 8 years ago

funannotate - funannotate v1.2.0

  • v1.2.0 now supports multiple transcripts per gene locus. The funannotate pipeline will only generate multiple transcripts per locus if given evidence in the form of RNA-seq data, this is done in the funannotate update command. It should also now support input with multiple transcripts as well.
  • move installation of busco models to funannotate setup
  • added annotation edit distance (AED) to funannotate update to record the changes in annotation. The PASA annotation update text file has also been changed to incorporate these changes
  • accessory script util/compare2annotations.py can compare multiple annotations in either GFF3 or GBK format to a reference, generating summary stats as well as individual gene stats (AED per mRNA and CDS)
  • added a --drop option to funannotate fix so that you can remove unwanted gene model annotations; to use it, pass a file containing locus_tag identifiers (1 per line) to the --drop parameter
  • fix bug in finding high-quality Augustus predictions (HiQ) models in funannotate predict
  • funannotate predict will now detect if a training folder exists in output directory, if it does it will find the correct PASA, BAM, and Trinity output and use automatically during the prediction step.

- Python
Published by nextgenusfs about 8 years ago

funannotate - funannotate v1.1.1

  • fix for braker to work on docker. For some reason (I don't know why) the symlinks that braker tries to create cause an error when run on docker. The error essentially references too many levels of symlinks. To circumvent this, I modified braker.pl code to copy instead of symlink. Also fixed the braker.pl --version option which was broken in the most recent release.
  • Note for a "normal" system, v1.1.0 should work fine. The updated braker code was run on both docker and Mac native and runs fine on those, hopefully also working well on linux.

- Python
Published by nextgenusfs about 8 years ago

funannotate - funannotate v1.1.0

  • bumping version to 1.1.0 to highlight that v1.0.X versions have a bug in the tbl annotation file and will not pass GenBank specs. This stemmed from dropping GAG from funannotate: I had the tbl spec wrong for adding transcript_id and protein_id to both CDS and mRNA features.
  • fixes for funannotate update and properly filtering overlapping genes
  • fix for funannotate annotate that was switching the 5' and 3' partial gene designations on crick orientated gene models, causing them to look correct after predict step and then become errors after annotate step
  • added Braker 2.0.3 to funannotate.... this was necessary as braker.pl --version doesn't display the version number so I can't enforce a version requirement. The larger issue has to do with how the different versions of braker save the output data, there are at least 3 different behaviors in the last 4 or 5 versions which makes it impossible for funannotate to determine where output will be.

- Python
Published by nextgenusfs about 8 years ago

funannotate - funannotate v1.0.2

  • update to GFF to TBL parser to catch some "common" errors in GFF files
  • added funannotate iprscan which will run Docker InterProScan searches or also local searches. It will split the job into chunks and run those in parallel which seems to be a faster way to run InterProScan. By default it will chunk the proteins into 1000 protein bins and then run 4 cpus each up to as many cpus as you give the script.
  • fix to docker build (hopefully)
  • bug fixes for parsing the ncbi error report, properly outputting which genes are causing errors
  • fix for antiSMASH parsing of plantismash data
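
The chunking scheme described for funannotate iprscan above (bins of 1000 proteins, 4 cpus per job) can be sketched as below. This is an illustration under assumptions: the helper names are hypothetical and the real scheduling logic in funannotate may differ.

```python
def chunk_records(records, chunk_size=1000):
    """Yield successive lists of at most `chunk_size` records."""
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


def parallel_jobs(total_cpus, cpus_per_job=4):
    """Number of chunks that can run at once, never fewer than one."""
    return max(1, total_cpus // cpus_per_job)


# 2500 placeholder protein IDs split into bins of 1000, 1000 and 500.
proteins = ["protein_{}".format(i) for i in range(2500)]
chunks = list(chunk_records(proteins))

# With 16 cpus and 4 cpus per job, four chunks could run side by side.
jobs = parallel_jobs(16)
```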

- Python
Published by nextgenusfs about 8 years ago

funannotate - funannotate v1.0.1

  • Wrote a new GFF to TBL parser to accommodate running funannotate annotate on a fasta + GFF file.
  • Added COGs output to funannotate compare, these annotations are parsed from eggnog-mapper data
  • several minor bug fixes

- Python
Published by nextgenusfs about 8 years ago

funannotate - funannotate v1.0.0

Major update to funannotate with new RNA-seq modules, new database download and management, new gene name/product definition module, many bug fixes.

RNA-seq modules:

  1. funannotate train: Module will run RNA-seq mediated methods for training of GeneMark/Augustus in gene prediction. It will take single or PE RNA-seq FASTQ files, run Trimmomatic quality trimming, run Trinity-mediated read normalization, run Trinity genome-guided RNAseq assembly, run PASA alignment methods. Output is BAM file, trinity transcripts, and PASA GFF3 for use in funannotate predict.
  2. funannotate update: Module will run PASA mediated gene model updates. It can be run after running train --> predict --> update, which will add UTR models and refine gene models. The script can also be run on a pre-existing GenBank assembly where it will run the funannotate train methods (quality trimming, normalization, Trinity, PASA) and then followed by the update specific methods to add UTRs, refine models, etc.

funannotate predict enhancements:

  1. Dropped use of GAG to write the NCBI tbl file, as it was making mistakes on some partial gene models, and wrote functions to do this natively in funannotate.
  2. Simplified NCBI tbl generation and gene model filtering --> only running tbl2asn a single time now as bad gene models are properly filtered (previously a regex search was not working perfectly, resulting in some gene models being removed arbitrarily).
  3. tRNA gene length filter is now in compliance with NCBI rules (you can safely ignore tbl2asn tRNA gene length warnings --> they will eventually update tbl2asn source code).
  4. Numbers of gene models for each "source" are now printed to terminal prior to running Evidence Modeler.
  5. Script parses the NCBI error reports and shows the user which gene models need to be manually fixed; after the tbl file is updated, the GBK output files can be regenerated with the new funannotate fix command.

funannotate annotate enhancements:

  1. Diamond search has replaced Blast wherever possible, which results in a large increase in speed.
  2. HMMer searches are now split across multiple CPUs, which results in increased speed.
  3. Gene names and product definitions are now parsed from UniProtKb/SwissProt results and EggNog-Mapper results. The product definitions are cross-referenced to a community resource called gene2product which will serve as a database of curated gene product definitions.
  4. Native NCBI tbl generation results in proper annotation of partial gene models.
  5. Script will parse tbl2asn errors and alert user of gene models that need to be fixed.

New Database Management modules:

  1. Environmental variable addition: FUNANNOTATE_DB allows user to install databases locally, i.e. in a user's home directory on an HPC.
  2. funannotate setup script has been re-written from scratch to control the databases, keep track of versions, and allow user to update database.
  3. funannotate database is a new command that shows you currently installed databases.
  4. Databases have been trimmed down, occupy ~ 4 GB of space.

I would recommend that all users upgrade. After upgrading, you will need to re-download the databases from scratch. As always, many bugs have been fixed and likely some new ones introduced. Please let me know if you encounter errors.

Docs/Manual/Tutorials will be available soon at http://funannotate.readthedocs.io

- Python
Published by nextgenusfs about 8 years ago

funannotate - funannotate v0.7.2

  • fix bug in funannotate compare, string conversion to int failed on a check for number of genes
  • added better error message for duplicate locus_tag ids in funannotate compare

- Python
Published by nextgenusfs over 8 years ago

funannotate - funannotate v0.7.1

  • fix menu in funannotate annotate that still had --email as an option -> it is no longer an option; all remote searches moved to funannotate remote
  • fix eggnog parsing issue where COG and Description are blank -> this happens if you run diamond search with eggnog-mapper. You should run HMM search with the appropriate EggNog database, i.e. for fungi that is the fuNOG database.

- Python
Published by nextgenusfs over 8 years ago

funannotate - funannotate v0.7.0

Release v0.7.0 notes:

funannotate predict

  • unified genbank conversion method
  • added support for repeatmasker_species option
  • added support for strain flag for genbank conversion
  • improved filtering of problematic gene models

funannotate annotate

  • removed all remote searches from script (now funannotate remote see below)
  • dropped EggNog search, instead --eggnog option will parse the results from eggnog-mapper. Eggnog-mapper does a more comprehensive search and provides some more functional annotation information than the simple HMMer search of EggNog 4.5 database
  • now outputs a tsv annotation file into the annotate_results output folder
  • improved functional annotation for Gene and Product names
  • added support for strain flag for genbank conversion

funannotate compare

  • increased speed of parsing GBK files
  • remove EggNog description mapping
  • fix links to MEROPS database in html output

funannotate remote

  • new sub command that will run remote searches
  • currently support Phobius, antiSMASH, and InterProScan
  • Note: these searches are a free service, don't abuse them. If you can install these software locally it will significantly decrease your run time. They are included here as some are Linux only and/or setup is very difficult.

funannotate setup

  • Eggnog 4.5 database no longer required

- Python
Published by nextgenusfs over 8 years ago

funannotate - funannotate v0.6.2

  • added support to funannotate predict for an --other_gff option that will pass annotation directly to EVM. You can control the weight for EVM, like this --other_gff my_predictions.gff3:10, which would give the gene models a weight of 10 in EVM
  • better support for --pasa_gff passed to funannotate predict where now input is not hardcoded to have transdecoder in column 2 of the gff file. You can also control the EVM weight like this: --pasa_gff my_pasa.gff:10 to give it a weight of 10
  • BRAKER1 method now pulls out high quality Augustus models (HiQ) that have >90% exon supported by evidence, these are given a weight of 5 in EVM
  • Added a few stats for repeat masking genome as well as number of transcripts mapped
  • updated funannotate so it is compatible with new version of GAG v2.01.
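
The path:weight syntax used by --other_gff and --pasa_gff above can be parsed with a split on the last colon. The sketch below is hypothetical: the function name `parse_weighted_path` and the default weight of 1 are assumptions for illustration, not funannotate's documented behavior.

```python
def parse_weighted_path(argument, default_weight=1):
    """Split 'path:weight' into (path, weight); the weight part is optional."""
    # Split on the last colon so colons earlier in the path are preserved.
    path, sep, weight = argument.rpartition(":")
    if sep and weight.isdigit():
        return path, int(weight)
    # No usable ':weight' suffix, so fall back to the assumed default weight.
    return argument, default_weight


with_weight = parse_weighted_path("my_predictions.gff3:10")
without_weight = parse_weighted_path("my_pasa.gff")
```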

- Python
Published by nextgenusfs almost 9 years ago

funannotate - funannotate v0.6.1

  • Numerous bug fixes

    • Strip asterisks from protein fasta files to avoid problems with InterProScan
    • logfiles folder was not being created if --genbank was passed to funannotate annotate
    • Linux bug where last step of funannotate predict was terminating prematurely resulting in partial output files
  • Re-write of the InterProScan parsing scripts. The script now parses IPR Domains and GO terms directly from the XML file, instead of splitting the XML file and then parsing the pieces one by one.

  • Great update by John Longinotto on his pybam native BAM parser which is integrated into funannotate predict to quickly check BAM headers to make sure they match FASTA headers for input into Braker
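As an illustration of the asterisk fix listed above, a minimal sketch of stripping stop-codon asterisks from protein FASTA text might look like this. The function name is hypothetical and this is not the code in funannotate.

```python
def strip_stop_asterisks(fasta_text):
    """Remove '*' characters from sequence lines, leaving header lines untouched."""
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            cleaned.append(line)  # header line: keep as-is
        else:
            cleaned.append(line.replace("*", ""))
    return "\n".join(cleaned) + "\n"


print(strip_stop_asterisks(">protein1\nMKVLA*\n"), end="")
```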

- Python
Published by nextgenusfs almost 9 years ago

funannotate - funannotate v0.6.0

  • fix tRNA gene model filtering to deal with the tbl2asn >150 error
  • improve XML parsing in funannotate compare
  • add diamond alternative for exonerate pre-filtering in funannotate predict
  • make funannotate docker compatible and create docker image
  • EggNog and BUSCO2 databases are no longer downloaded in the initial setup, but you can manage EggNog databases with funannotate eggnog. This was due to problems building the docker image when downloading the large databases. The scripts will download on the fly if the default database is not available.
  • added some external dependency versions in funannotate check

- Python
Published by nextgenusfs almost 9 years ago

funannotate - funannotate v0.5.7

  • bug fixes for logging
  • bug fix when multiple protein evidence files are passed
  • add phobius to funannotate annotate to predict secreted proteins in combination with signalp
  • add test data genome4.fasta that can be used to test the BUSCO2 augustus training method
  • added a check that BAM reference sequence headers match the genome FASTA headers; this only happens if a BAM file is passed to --rna_bam
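The header check described above boils down to comparing two sets of names. A rough sketch, assuming the BAM reference names have already been read (for example by a BAM parser) and that an exact match of the two sets is required; both function names are hypothetical:

```python
def fasta_ids(fasta_text):
    """Return the first whitespace-delimited token of each FASTA header."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">") and line[1:].strip():
            ids.append(line[1:].split()[0])
    return ids


def headers_match(bam_refs, genome_ids):
    """True if the BAM reference names and genome FASTA IDs are the same set.

    Requiring an exact match is an assumption; a subset test may be
    what a given pipeline actually needs.
    """
    return set(bam_refs) == set(genome_ids)


genome = ">scaffold_1 len=4\nACGT\n>scaffold_2\nGG\n"
print(headers_match(["scaffold_1", "scaffold_2"], fasta_ids(genome)))
```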

- Python
Published by nextgenusfs about 9 years ago

funannotate - funannotate v0.5.5

  • typo fixes for log file names
  • typo fix for fuNOG annotations in secondary metabolism module, this is now fixed to use the proper --eggnog_db option
  • test for dN/dS ratio test to assert that the tree that was drawn by Phyml has the correct number of proteins
  • new feature for BUSCO models: if --ploidy is greater than 1 in funannotate predict, duplicated BUSCO models are also parsed and the one with the highest score is picked
  • Support for bypassing RepeatModeler/RepeatMasker, you can now enter --masked_genome and --repeatmasker_gff3 options to skip that step. Note that both options are required.

- Python
Published by nextgenusfs about 9 years ago

funannotate - funannotate v0.5.4

  • update to tblastn/exonerate protein mapping for better speed and more thorough searches
  • added --ploidy option to funannotate predict which controls the max number of hits for the tblastn filter to pass to exonerate, which is set at 2 x ploidy. You should likely only increase this if your assembly is more than haploid, so perhaps newer assemblies with nanopore/pacbio may be able to resolve diploid chromosomes. Increasing --ploidy shouldn't have negative consequences, but it will increase run time for protein mapping.
  • happy holidays...

- Python
Published by nextgenusfs about 9 years ago

funannotate - funannotate v0.5.3

  • re-organize output so that temporary folders are created in the "final" resting place and not in the current directory
  • modification of the multiprocessing function to include a simple progress percentage output
  • bug fixes in funannotate compare and the orthology dN/dS output hanging when the dN/dS calculation failed
  • modification to the logging to capture STDERR/STDOUT from many external tools into the log file, hopefully this will result in catching more errors than piping them to os.devnull
  • bug fix in funannotate annotate where output folders were not being created if the input was GFF, proteins, and fasta.
  • made it a requirement to pass --species argument to funannotate annotate if you do not pass in a GenBank file. This is to prevent downstream problems in funannotate compare with how the scripts name the genome/isolates.
  • made a FAQ section in the docs that includes how to manually adjust gene models using the included tools
  • all internal tests passed, however guaranteed there are more bugs. please let me know when you find them.

- Python
Published by nextgenusfs about 9 years ago

funannotate - funannotate v0.5.2

  • update to dN/dS function to have two options: 1) --rundnds estimate (which runs the M0 model only), and 2) --rundnds full (which runs M0, M1, M2, M7, M8 and calculates the LRT, i.e. the likelihood ratio test, of M1/M2 and M7/M8).
  • update to the multiprocessing progress function - a simple progress meter is used on most multiprocessing functions to let user know how many processes have finished
  • bug fix where transcripts and proteins were getting written to same file in funannotate predict
  • minor bug fix in funannotate clean where input number of scaffolds was not printed out correctly
  • change the default location of DB as per requested by some users, now defaults to $HOME/funannotate. Note you can set this to whatever directory you want.
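For the full dN/dS option above, the M1/M2 and M7/M8 comparisons are likelihood ratio tests on nested models. A minimal sketch of that calculation, assuming the conventional 2 degrees of freedom for these comparisons and hypothetical log-likelihood values (this is not funannotate's own code):

```python
import math


def lrt_df2(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models with 2 degrees of freedom.

    Returns (statistic, p_value). For a chi-square distribution with
    df=2 the survival function is exactly exp(-x / 2), so no external
    statistics library is needed.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.exp(-stat / 2.0)


# Hypothetical log-likelihoods for a null (e.g. M7) and alternative (e.g. M8) model
stat, p = lrt_df2(-1000.0, -995.0)
print(stat, p)
```

A small p-value suggests the model that allows positive selection fits significantly better than the null model.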

- Python
Published by nextgenusfs about 9 years ago

funannotate - funannotate 0.5.1

  • bug fixes for funannotate compare
  • bug fix for funannotate predict: during gene model filtering of large genomes, occasional parent:child features would get missed
  • added feature of calculating dN/dS ratios in funannotate compare

- Python
Published by nextgenusfs about 9 years ago

funannotate - funannotate v0.4.0

  • integration of BUSCO2 script and models. Can see the BUSCO2 distribution here, funannotate uses a slightly modified version to be compatible with the BUSCO->EVM workflow.
  • BUSCO2 models have changed a bit, there are now a lot more options for various taxonomic groups. Something to keep in mind though is that the model names for dikarya are not the same as pezizomycotina so if you use an --outgroup option be sure that the outgroup was generated with same BUSCO DB
  • The funannotate setup script will remove previous BUSCO DB models and download the new ones because of the extensive change in the BUSCO2 structure.
  • addition of funannotate outgroups to help you manage the outgroups available to funannotate compare
  • the scripts will download and format any of the available BUSCO2 eukaryote models, to see a list in a taxonomic tree format you can type funannotate outgroups --show_buscos

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.14

  • funannotate annotate will now support a single XML file for InterProScan5; you can pass either a folder of XML files (one per protein) or a single XML file containing all of the annotations to the --iprscan option
  • fix in funannotate annotate so that it alerts the user that --email is required if using the remote IPR5 search; this is the default setting

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.12

  • bug fix for path issue when running EVM; discovered on new install on Mac - not sure why it wasn't found earlier, but resulted in failed EVM run
  • some improved logging for EVM module

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.11

  • bug fix for funannotate compare where the genome stats was not printing for all genomes
  • goatools changed their headers on the output of the GO enrichment (again), so re-wrote how the data is parsed, hopefully this fix applies to all versions.

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.10

  • explicitly run rmblast/ncbi engine for RepeatMasker to avoid problems if user has default setup as something else, i.e. DFAM. Note you still need to install RepBase Libraries, e.g.

```
wget --user name --password pass http://www.girinst.org/server/RepBase/protected/repeatmaskerlibraries/repeatmaskerlibraries-20150807.tar.gz
tar zxvf repeatmaskerlibraries-20150807.tar.gz -C #{HOMEBREW_PREFIX}/opt/repeatmasker/libexec

cd #{HOMEBREW_PREFIX}/opt/repeatmasker/libexec
./configure <config.txt
```

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.9

  • bug fix to funannotate compare that was not pulling orthology groups for the final annotation table
  • added transcription factors to output of all annotation table.

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.8

  • bug fix in funannotate compare that was calculating MEROPS summary stats incorrectly
  • added --minlen option to funannotate sort to discard short contigs
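A length filter like the one --minlen provides can be sketched as below. Whether a contig exactly at the cutoff is kept is an assumption here, and the function name is hypothetical.

```python
def filter_by_minlen(records, minlen):
    """Keep (name, sequence) pairs whose sequence is at least minlen long.

    Using >= (keeping contigs exactly at the cutoff) is an assumption.
    """
    return [(name, seq) for name, seq in records if len(seq) >= minlen]


contigs = [("contig_1", "ACGTACGT"), ("contig_2", "ACG")]
print(filter_by_minlen(contigs, 5))
```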

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.7

  • move install to the funannotate wrapper, funannotate setup
  • fix bug with custom input folders in funannotate annotate
  • output proteins/transcript files for both funannotate predict and funannotate annotate

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.6

  • bug fix in funannotate predict when using BUSCO the EVM input was pulling entire gene models instead of sliced models
  • bug fix where BUSCO models were within 100 bp of the start or end of contig resulting in a bedtools range slicing error
  • remove slicing of hints file for parallel AUGUSTUS method, as splitting the hints file was slow; it is faster to just pass the entire hints file to each contig chunk and let AUGUSTUS filter it.

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.5

  • update to the way that funannotate predict parses maker2 results, now using maker models directly as opposed to pulling out annotation from each predictor.
  • bug fix if running funannotate compare with a single species

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.4

  • fix to the braker1 method where augustus output was not properly found
  • minor update to --optimize_augustus training to align with method used in braker1

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.3

  • fix issue with parallel augustus where very large scaffolds would cause large memory usage, script now chunks the data into 500 kb sections with 10 kb overlaps on each side, runs in parallel, and then combines the results.
  • re-ordered transcript evidence in funannotate predict to address providing hints to augustus
  • some minor bug fixes
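The chunking scheme described above (500 kb sections with 10 kb overlaps on each side) can be illustrated with a small coordinate calculation. This is a sketch under the assumption of 0-based, half-open coordinates; it is not the actual funannotate implementation, and the function name is hypothetical.

```python
def chunk_windows(length, chunk=500_000, overlap=10_000):
    """Return (start, end) windows covering a scaffold of the given length.

    Each core chunk is extended by `overlap` bases on both sides and
    clipped to the scaffold boundaries. Coordinates are 0-based, half-open.
    """
    windows = []
    start = 0
    while start < length:
        end = min(start + chunk, length)
        windows.append((max(0, start - overlap), min(length, end + overlap)))
        start = end
    return windows


print(chunk_windows(1_200_000))
```

Each window can then be processed in parallel, with predictions in the overlaps reconciled when the results are combined.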

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.2

  • build a check for augustus version and test if it will function with busco and braker1
  • revamped busco mediated training of augustus to run busco quickly, filter evidence data corresponding to busco models, filter genemark-ES data, run evidence modeler to get high quality gene sets, filter EVM output with busco to build a final augustus training dataset, and finally train augustus
  • improved system info reporting
  • due to problems with installing augustus on different operating systems, augustus is not installed default via brew install funannotate. However, running funannotate predict without a version of augustus installed will give you some hints on how to install it for your system.

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.1

  • important bug fix for augustus: previous versions were running augustus with the stop codon inside the prediction, which caused the gene models to fail validation in evidence modeler, thus this update is recommended for all users
  • added high quality augustus models to be pulled out of annotation if they are represented by evidence using the --hintsfile, these models are passed to EVM with additional weight
  • fixed genemark bug where a single contig resulted in an error, thus funannotate predict can now handle a single contig as input correctly.

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.3.0

  • improvements to the gene model filtering in funannotate predict, ability to keep gene models without proper stop codons if desired, --keep_no_stops
  • augustus is now multi-threaded
  • upgrade of packaged BUSCO to v1.2, slightly faster runtime and simplified code
  • non-fungal options are now included in funannotate, however use with non fungal genomes has not been extensively tested. Options of note are --organism, --busco_db, --eggnog_db
  • secondary metabolism enzymes added to funannotate compare if genomes were annotated with antiSMASH

- Python
Published by nextgenusfs over 9 years ago

funannotate - funannotate v0.2.11

  • fix bug where tRNA predictions lacking 'product' annotation would cause funannotate annotate to die
  • build in check for fasta header length for funannotate predict, max is 16 characters for GenBank format
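A header length check of this kind might look like the sketch below. It assumes only the sequence ID (the first token after ">") counts toward the 16-character limit, which may differ from the actual check; the function name is hypothetical.

```python
def long_headers(fasta_text, max_len=16):
    """Return FASTA sequence IDs longer than max_len characters."""
    too_long = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            if len(name) > max_len:
                too_long.append(name)
    return too_long


genome = ">scaffold_1\nACGT\n>a_very_long_contig_name_001\nGG\n"
print(long_headers(genome))
```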

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.2.10

  • bug fix for funannotate predict and an empty --transcript_evidence option
  • bug fix for NCBI error parsing in funannotate predict

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.2.8

  • fix bug that crashed checking input files for --protein_evidence or --transcript_evidence if passing multiple files separated by a ,
  • Add backbone secondary metabolism summary to funannotate compare

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.2.7

  • added summary of transcription factors to funannotate compare based off of InterProScan domain hits.
  • a few minor bug fixes

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.2.6

  • bug fix on some NCBI genbank files where protein ID/locus ID aren't entered in the way that funannotate expects them
  • bug fix for funannotate predict when generating augustus hints files

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.2.5

  • fix bug in processing antismash results, a typo on file handle
  • increase signalp chunks to 40 to further reduce memory consumption
  • fix naming of signalp graph and orientation of x-axis names

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.2.4

  • fix memory problem of running whole genome through signalp, now splits into 20 chunks, should result in < 200 proteins per run
  • work around for BUSCO/Augustus training problem if augustus species set to generic
  • fix typo in v0.2.3
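Splitting a protein set into a fixed number of chunks, as done above for signalp, can be sketched like this. The function name is hypothetical and this is not the code used in funannotate.

```python
def split_into_chunks(items, n_chunks=20):
    """Split a list into at most n_chunks contiguous, nearly equal parts."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional item
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


proteins = [f"protein_{i}" for i in range(45)]
print([len(chunk) for chunk in split_into_chunks(proteins)])
```

Each chunk can then be written to its own file and run separately, which keeps the memory used per run bounded.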

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.2.2

  • yet another fix for goatools changing format. goatools v0.6.4 is now required, previous versions are not supported
  • Added signalp search to funannotate annotate and funannotate compare. Since signalp has to be manually installed and configured, it will only be run if the program is detected and won't be listed as a dependency.

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.2.1

  • fix another bug due to goatools format changes

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.9

  • bug fix for GO enrichment using goatools as the developers changed the input parameters for the enrichment script
  • funannotate predict now uses hints for augustus prediction. hints are generated from BLAT transcript alignments from the --transcript_evidence option and protein hints are generated from exonerate alignments from the --protein_evidence option. The weights file from evidence modeler was adjusted slightly to account for better predictions from augustus.

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.8

  • fix bug in funannotate compare where the script would crash if gene models had the same base name
  • added funannotate check, a little script to tell you if Python modules are up to date

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.7

  • fix setup.sh install script so that on a first home-brew installation it defaults to asking the user for the DB install path
  • updated some docs to reflect some installation changes that were not properly documented in last release

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.6

update to setup.sh script.

  • If funannotate is updated through home-brew, the script will automatically detect the previous version's database and set up a symlink. This avoids having to set up the DB every time there is an update.
  • on first install, allows user to specify custom path for installation directory

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.5

Bug fixes:

  • fixed variable naming resulting in centOS error
  • allow user to specify database install folder in the setup.sh script as some systems don't have access to /usr/local/share
  • remove bundled proteinortho5 as some users don't have sudo privileges, now ProteinOrtho installs using HomeBrew or LinuxBrew

Enhancements:

  • added support for --maker_gff for funannotate predict that will parse a MAKER2 GFF file to EVM inputs. Providing a MAKER2 GFF file will bypass the evidence mapping and predictions steps of funannotate and use the data in the MAKER2 file.

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.4

  • improved logging and error reporting to try to catch user input errors and/or funannotate setup

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.3

  • minor bug fixes and update of documentation

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.2

A few minor bug fixes.

  • moved the DB folder to install in /usr/local/share/funannotate to avoid having to reinstall all databases if funannotate is upgraded through brew

- Python
Published by nextgenusfs almost 10 years ago

funannotate - funannotate v0.1.1

Update to setup.sh script to fix an error during installation.

- Python
Published by nextgenusfs almost 10 years ago