Recent Releases of funannotate
funannotate - funannotate v1.8.17
Bug fix release.
- fix FASTA splitter in iprscan local 83ceee5e4119b296aa6c2c33e796747d592dea54
- pin docker and pip installs to biopython<1.80; so we try to avoid any of the other deprecated or breaking code
- update to support EVM v1 or v2; back-port training data from funannotate2 (for testing)
- run signalp6 if in PATH
- fix pandas compatibility >v2
- add config-window parameter to protein2genome script
- Python
Published by nextgenusfs about 2 years ago
funannotate - funannotate v1.8.15
Bug fix release
- Trying hard to support `augustus` v3.5
- fix sorting by length #896
- update dbCAN Cazyme filtering #890
- fixes for parsing eggnog mapper #868 #892
- signalp6 fixes #822
- and many others
Published by nextgenusfs almost 3 years ago
funannotate - funannotate v1.8.13
Bug fix release
- fix to internal BUSCO code to support Augustus v3.4 #742 @IanDMedeiros
- several fixes to `funannotate update` #727 #740
- several patches to Docker build
Published by nextgenusfs over 3 years ago
funannotate - funannotate v1.8.11
Bug fix release
- support for signalp v6 parsing #650 #716
- improve protein2genome performance #652 #654
- add `--tmpdir` options to most scripts (defaults to `/tmp`), ie use if have access to SSD to improve speed
- upgrade `eggnog-mapper` to newer version, use in memory if enough is available to improve speed
- fix version check #655
- fix for abnormal contig identifiers #672
- fix for EVM contig memory parsing #646
- add `-m,--mito-pass-thru` to `funannotate annotate` to add back mitochondrial genome, mitochondrial contigs should be removed prior to running funannotate as uses different genetic code
- antismash6 support #692
- update download urls to https from ftp
- fix for EVM if last line of GFF3 is blank #709
- simplify runSubprocess functions #707 #710 @mglubber
- move temporary files for check for soft masking to output folder #722
- conda augustus is apparently broken, fix in docker #722 #724, so need to install augustus a different way (ie apt-get on Debian)
- add missing help menu options #714
- fix for `funannotate update` compare annotations function #727
- add `--ml_model` to `funannotate compare` to control ML model selection in `iqtree`, default is to run modelfinder which is very slow.
Published by nextgenusfs over 3 years ago
funannotate - funannotate v1.8.9
- fix for EvidenceModeler partitioning -- on rare cases genes/contigs were not getting partitioned correctly
- fix logic for PASA training of glimmer/snap #591
- fix logic for pre-trained datasets in `funannotate predict` #597
- re-organize tbl2asn and provide single threaded backup if failure #599 and fix log error #621
- remove perl script generating AGP file --> now python. which removes bioperl as dependency #617
- fix cds-transcripts coordinates (not used for any other processing) but on partial transcripts they were incorrect #614
- fix bug in EVM filtering if no genes to remove #620
- update eggnog parser to support versions >2.1.2 #566
- fix tmpdir error in `funannotate clean` #594
- python2 is no longer supported.
Published by nextgenusfs over 4 years ago
funannotate - funannotate v1.8.7
Bug fix release
- bug fix for eggnog mapper v2 parsing, occasionally some records have more than 1 OG in the "best_OG" field -- this is probably a bug in emapper.py, but funannotate will now ignore those entries as there is no way to tell which is actually the best OG.
- make fasta2agp accept more IUPAC characters #532
- fix for diamond >v2.0.8 (yet another change in the database info format resulted in error).
- remove support for python 2
Published by nextgenusfs almost 5 years ago
funannotate - funannotate v1.8.5
Bug fix release, recommended for all users.
- use `uuid` unique identifier for tmp file names (previous uses process id)
- fix `diamond` command in mapping proteins to genome #529
- add some contig name checking for inputs to predict -- some users erroneously pass GFF3 files that reference transcripts or some other assembly, warn user #528
- add support for antiSMASH v6 #531 #539
- check genome for IUPAC errors #532
- re-write long reads mapping to trinity transcripts #539
- allow for fewer busco models for training #556
- expose some EVM parameters to predict #558
- modify EVM partitioning -- some gene models close to partition were not being called correctly #558.
- support newly released eggnog-mapper >= v2.0.5 #566, note that v2.x - 2.0.4 will not be supported as the output is broken in the sense that it is impossible to parse the proper EggNog reference ID. If you are using any eggnog results from this code, strongly advised to re-run with new release.
- add `_pasa` to PASA database names #398. Note -- this change could potentially cause pre-existing databases run with older funannotate code to not be recognized properly after v1.8.5.
Published by nextgenusfs almost 5 years ago
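The v1.8.5 switch from process ids to `uuid` identifiers for temporary file names can be illustrated with a minimal sketch. The helper name `unique_tmp_name`, its prefix, and its suffix are hypothetical and are not taken from the funannotate code base; only the standard-library `uuid` module is assumed.

```python
import os
import uuid


def unique_tmp_name(prefix="funannotate", suffix=".tmp", tmpdir="/tmp"):
    """Build a temp-file path that embeds a random UUID rather than the process id.

    Hypothetical helper for illustration; funannotate's own naming may differ.
    """
    return os.path.join(tmpdir, f"{prefix}_{uuid.uuid4()}{suffix}")


first = unique_tmp_name()
second = unique_tmp_name()
print(first)
print(first != second)
```

Process ids can repeat across nodes of a cluster or between runs, while a random UUID makes a name collision between concurrent jobs extremely unlikely.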
funannotate - funannotate v1.8.3
Bug fix release
- remove auto fix for RNA-seq data if it does not have Trinity compatible headers -- warn user to fix instead #485
- fix antiSMASH v5 parsing #490, update secmet GBK format of clusters
- add `funannotate util gff-rename` script
- use `natsort` for sorting contigs/names so gene names are correctly auto incremented #501
- fix for signalp v5 #494
- support fasta headers with spaces for clean #504
- rstrip(*) from proteins if GFF/FASTA passed; #508
- expose --p2g_prefilter to use tblastn for mapping protein evidence #495
- add --anysymbol to mafft calls #514
- use frameshift diamond if >2.0.5 #503
- improve glimmerhmm parsing error report #519
- add `--no-progress` option to clean up terminal output (ie log files from cluster)
- use sorted bedtools intersection to reduce memory usage #522
- add --trnascan option to predict #523
Published by nextgenusfs about 5 years ago
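The v1.8.3 notes mention natural sorting of contig names so that gene numbering follows the expected order. The release uses the `natsort` package; the sketch below is a standard-library stand-in for the same idea, with a hypothetical helper name `natural_key` and made-up contig names.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so that 2 sorts before 10.

    Standard-library stand-in for natsort; not the funannotate implementation.
    """
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


contigs = ["scaffold_10", "scaffold_2", "scaffold_1"]
plain = sorted(contigs)                       # lexicographic order
natural = sorted(contigs, key=natural_key)    # numeric-aware order
print(plain)
print(natural)
```

With plain string sorting `scaffold_10` lands before `scaffold_2`, which would make auto-incremented gene names jump around the assembly.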
funannotate - funannotate v1.8.1
- highly recommend all users upgrade
- Code now python 2/3 compatible (only 9 months late....)
- EVM now uses `interlap` for slicing input files, significant speed increase for larger genomes
- if `--organism other` is passed, `codingquarry` is now disabled by default -- you can turn it on by specifying a valid weight, ie `--weights codingquarry:2`
- support for `signalp` version 5
- move resources to JSON download via GitHub, will allow for updating/changing resources without re-installing a new version of `funannotate`
- several fixes for more robust GFF3 parsing
- fix for cazyme assignments where results were duplicated; also upgraded to database version 8
- fix bug where PASA results not properly passed to EVM if funannotate predict was re-run
- many other bug fixes
Published by nextgenusfs over 5 years ago
funannotate - funannotate v1.7.4
Okay, so apparently I lied about bug fixes before py3 release. Here are some quick ones.
Also -- some users have said that conda recipe fails to find a solution, with the help of @reslp this is perhaps caused by ete3 package as a dependency. Since ete3 is only used in funannotate for the dN/dS calculation in funannotate compare, I'm going to remove it from conda recipe as a dependency. Users that would like to continue using the dN/dS function in compare will need to manually install ete3 (ie conda could work on your system or pip, etc).
- update the log file copy/remove in `funannotate annotate` which causes a problem in singularity container
- encoding bug fixes for some PFAM results
- fix for `funannotate iprscan` which was not correctly combining XML files with newer versions of Interproscan.
Published by nextgenusfs about 6 years ago
funannotate - funannotate v1.7.3
- bug fix release. This will be the last release before updating the code for py3
- bug fix for remote #379
- improve GFF3 parsing in `funannotate annotate`, allow for non-funannotate identifiers (hopefully this catches most of them)
- `--aligners` option was missing from help menu in `train`/`update`
- add `--debug` flag to `funannotate iprscan` -- there is potentially still an issue here combining results from the newest release of Interproscan -- I need some intermediate results to see what the problem is
- fix GO annotations parsing for more versions of goatools #363
- fix for bug in the "gene start or end in a gap" function
- make sure `funannotate annotate` re-runs tbl file generation if script is re-run
- fix in PASA function logic where PASA db not found if using --gff and --fasta input.
Published by nextgenusfs about 6 years ago
funannotate - funannotate v1.7.2
- busco internal run now uses the local augustus config path, so hopefully write access to `$AUGUSTUS_CONFIG_PATH` is no longer required
- long reads are now processed to remove forward slashes from names if present -- this was causing problems with PASA #326 #250
- added augustus hints file generation when using RNA-seq data. Also added a check for `--transcript_evidence` if found in `--transcript_alignments`, ie so extra transcripts can be added instead of only those used to generate alignments in `funannotate train`. #360
- fix typo in `funannotate check`
- fix for proteinortho issue in `funannotate compare` #350
Published by nextgenusfs about 6 years ago
funannotate - funannotate v1.7.1
A few updates to support bioconda integration -- namely either $TRINITYHOME or $TRINITY_HOME can be used. Also don't trust trinity packaged trimmomatic, use the conda version.
Published by nextgenusfs over 6 years ago
funannotate - funannotate v1.7.0
Code has been repackaged to conform to a "proper" python package -- which means it now also resides on PyPi and a Bioconda package can be built. Along with the repackaging there are many improvements/fixes.
- funannotate now keeps track of "trained species" for all of the ab-initio gene predictors (Augustus, gene mark (optional), snap, GlimmerHMM, codingquarry). This requires all users to update their database, ie `funannotate setup` command. After running `funannotate predict` the software will output a JSON file containing the paths to the trained parameter files -- this can be used again for a different genome via the `funannotate predict --parameters` options. This parameter file can also be added to the database with the `funannotate species -s genus_species -a parameters.json` command. Running the `funannotate species` command will output a table in the command line of which species have training data. Addressed #320
- antiSMASH remote script fixed and parser updated for v5 output.
- added filtering for gene models that start/end in a gap that can sometimes show up after running `funannotate update`
- added a check for `diamond` version of the database and current copy -- this results in many hidden errors by users, ie diamond databases were created with an older/incompatible version than what is running currently.
- updated `Augustus` functional check
- removed `RepeatModeler/RepeatMasker` as strict dependencies. Due to RepBase change in usage license, repeatmasker/modeler are not available to most users. The `funannotate mask` command can still run this routine if you have the necessary dependencies installed, however, the current default is simply to run `tantan` masking. This is probably not sufficient for most genomes, thus happy to integrate a robust solution once one exists for repeat masking.
- augustus parameter training now done in the local output folder, so no longer need write access to `$AUGUSTUS_CONFIG_PATH`
Published by nextgenusfs over 6 years ago
funannotate - funannotate v1.6.0
- support for antiSMASH v5.0 output #292 #299
- add `snap` and `glimmerhmm` request from #240. BUSCO is now run by default in `funannotate predict` if there is no PASA data -- BUSCO results used to train `glimmerhmm` and `snap`
- improved Phobius results parsing #259
- multi-threaded `funannotate clean`, thanks to @bogemad
- fix database links #300
- multi-threaded `hisat2-build` #303
- write all output files directly from `tbl` format -- fix bug associated with multi-transcript parsing from GenBank files
- bug fix for protein2genome exonerate mapping
- bug fixes for `funannotate train` and `funannotate update`
- added `--min_coverage` option to trinity workflow and set default to 5
- bug fixes for `codingquarry` predictions (RNA-seq only)
- improved error message for repeatmodeler/masker #298
- bug fix for remote searches
- bug fix for parsing input folders in annotate #302
- updated conda install docs --> which seems futile to keep this current....
Published by nextgenusfs over 6 years ago
funannotate - funannotate v1.5.3
- updated the CAZYme dbCAN link
- fix string formatting in GeneMark-ET function
Published by nextgenusfs almost 7 years ago
funannotate - funannotate v1.5.2
- restructure `augustus` accessory scripts calls so that they don't have to be in same location as the exe, i.e. this used to be a bug if you used conda installed `augustus` as it puts the scripts in a different location than the default augustus release folder
- added a check to `busco` routine to default to a single thread if `tblastn` version is found that might have multithread issues
- allow a min number of gene models to use for Augustus training, `--min_training_models` in `funannotate predict`
- limit genemark to 64 cpus -- it will die if you try to give it more
- fix shortBAM declaration in `funannotate update`
- for `genemark-ET` set score of introns to 500, otherwise seems to be dying.
- updated weights for different gene models -- would be nice to have this be a customizable option....
Published by nextgenusfs about 7 years ago
funannotate - funannotate v1.5.1
- updated dbCAN links
- important bug fix for RNA-seq analysis. The bam2gff3 function was not outputting the proper coordinates for crick stranded alignments for PASA, resulting in valid minimap2 alignments being thrown out.
- several other minor bug fixes
Published by nextgenusfs over 7 years ago
funannotate - funannotate v1.5.0
- add `funannotate test` as a unit test script to validate installation #212
- add `CodingQuarry` and `StringTie` integration into funannotate. Note these are "silent" dependencies, meaning if not installed this method will be skipped. If both tools are installed and RNA-seq data is used they will be run automatically #200
- fix contig number reporting in `funannotate clean` #210
- fix bug in a few of the accessory tools `funannotate util`
- update database to MiBIG v1.4
Published by nextgenusfs over 7 years ago
funannotate - funannotate v1.4.2
- fix bug in `train` and `update` where script would die due to symlink error #189
- for `predict` the `--protein_alignments` option now takes GFF3 input (not exonerate output). This is to make consistent with the `--transcript_alignments`. Scripts now write hints file for augustus from the GFF3 file.
- similar to above, added `funannotate util prot2genome` which will run the diamond/exonerate mapping of proteins to the genome --> output is GFF3 file compatible with EVM. These data can then be passed to `--protein_alignments`
Published by nextgenusfs over 7 years ago
funannotate - funannotate v1.4.1
- bug fix for `funannotate predict` during parsing the soft masked genome -- for large genomes this was slow and used too much memory. it is now multithreaded and has lower memory footprint. #197
- bug fix for `ncRNA` models are now listed as full length, should no longer cause NCBI errors #195
- support multiple inputs to `--other_gff` #191
- make `augustus` use `--softmasking=1` option
- default value for `--soft_mask` is now set to 2 kb `2000`
- output fasta files are now wrapped at 80 characters
- `tbl2asn` is now multithreaded on large genomes or those with more than 10000 contigs
- several updates to parsing of GenBank files to deal with unexpected formatting #196
Published by nextgenusfs over 7 years ago
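The v1.4.1 note that output FASTA files are wrapped at 80 characters can be sketched as follows. The function name `wrap_fasta_record` and the example record are hypothetical; this is an illustration of fixed-width wrapping, not the code funannotate ships.

```python
def wrap_fasta_record(header, sequence, width=80):
    """Return a FASTA record whose sequence lines are at most `width` characters.

    Hypothetical helper for illustration only.
    """
    lines = [f">{header}"]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines)


record = wrap_fasta_record("contig_1", "ACGT" * 50)  # 200 bases
line_lengths = [len(line) for line in record.split("\n")[1:]]
print(record)
print(line_lengths)
```

A 200-base sequence is written as two full 80-character lines followed by a 40-character remainder.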
funannotate - funannotate v1.4.0
- support for long-read RNA-seq data: `funannotate train` and `funannotate update` can take PacBio isoSeq (`--pacbio_isoseq`), Nanopore cDNA reads (`--nanopore_cdna`), and Nanopore direct mRNA (`--nanopore_mrna`).
- fix for important bug in transcript alignments in `funannotate predict` -- bug in previous versions related to multi-exon crick alignments not getting correctly parsed into GFF3 alignments
- soft masking is now decoupled from `funannotate predict`, this is now done with `funannotate mask`. Reason for this switch is to allow more flexibility in how the assembly is soft masked -- this can be done externally with another program. This change will allow users that don't have access to RepBase to use an alternative to RepeatMasker/RepeatModeler. One alternative is RED -- I wrote a wrapper for it called RedMask
- `funannotate predict` can now run without GeneMark being installed -- again to accommodate users that may be unable to use GeneMark due to licensing. Note you can pass gene predictions from any external program to `--other_gff` and they will be handed off to Evidence Modeler.
- spaces in either strain or isolate name will be stripped #180
- default program for `funannotate clean` changed to minimap2 #176
- fix errors in partial gene models derived from using EVM script to generate proteins, this is now done internally using exact coordinates #184
- added `--soft_mask` option to `funannotate predict` which will control the option with same name in GeneMark, i.e. default is `--soft_mask 5000` which means that repeat regions less than 5 kb will be ignored for GeneMark prediction, those greater than 5 kb will be fed to Genemark. #185
- bug fixes for `tbl` file generation. all tRNA models will be partial #184
- improvement to how data from `funannotate train` is used in prediction steps
- Slight changes for clarity to `funannotate predict` flags for evidence alignments:

      --protein_evidence       Proteins to map to genome (prot1.fa prot2.fa uniprot.fa). Default: uniprot.fa
      --protein_alignments     Pre-computed exonerate protein alignments (see docs for format)
      --transcript_evidence    mRNA/ESTs to align to genome (trans1.fa ests.fa trinity.fa). Default: none
      --transcript_alignments  Pre-computed transcript alignments in GFF3 format

- added `funannotate util bam2gff3` script to convert coordsorted RNA-seq BAM alignments to GFF3 compatible alignment file.
- fix bug for input of files+weight in `funannotate predict` -- script would get hung up if you passed `--other_gff snap_alignemnts.gff3:5` #191
- allow for non-standard LocusTags - will now split on last underscore #191
Published by nextgenusfs over 7 years ago
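The v1.4.0 change to split locus tags on the last underscore can be sketched with `str.rpartition`. The helper name `split_locus_tag` and the example identifiers are hypothetical; the point is only that a prefix may itself contain underscores.

```python
def split_locus_tag(gene_id):
    """Split a gene identifier into (locus_tag_prefix, number) at the LAST underscore.

    Hypothetical helper; identifiers without an underscore are rejected.
    """
    prefix, sep, number = gene_id.rpartition("_")
    if not sep:
        raise ValueError(f"no underscore found in {gene_id!r}")
    return prefix, number


standard = split_locus_tag("FUN_000123")
nonstandard = split_locus_tag("ABC_v2_000123")
print(standard)
print(nonstandard)
```

Splitting on the first underscore would turn `ABC_v2_000123` into the prefix `ABC`, which is why the last underscore is the safer delimiter for non-standard tags.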
funannotate - funannotate v1.3.4
- bug fixes for sec met cluster output files and corresponding MiBIG cluster mapping
- add `tRNAscan-SE` to `funannotate check` and `predict`
- update menu with some params that were missing
Published by nextgenusfs over 7 years ago
funannotate - funannotate v1.3.3
- bug fix for `funannotate compare` where GO enrichment not being run in parallel from last update
- use diamond blastp search for ortholog detection --> speed increased.
- don't run seqclean if file present
- update docker release to newest version of funannotate as well as newest version of Trinity, PASA
Published by nextgenusfs almost 8 years ago
funannotate - funannotate v1.3.2
- added several utility scripts --> accessible by `funannotate util` submenu. This includes `funannotate util compare` which will compare multiple annotations to a reference.

```
$ funannotate util
Usage: funannotate util
Commands: compare       Compare annotations to reference (GFF3 or GBK annotations)
          tbl2gbk       Convert TBL format to GenBank format
          gbk2parts     Convert GBK file to individual components
          gff2proteins  Convert GFF3 + FASTA files to protein FASTA
          gff2tbl       Convert GFF3 format to NCBI annotation table (tbl)
```

- bug fix for `funannotate remote` moving logfile
- bug fix for mapping proteins to genome where tmp folder wasn't being properly removed
- run GO enrichment in parallel in `funannotate compare`
- update colors in some graphs from `funannotate compare` to 24-pack Crayola colors
- add option to use `iqtree` to draw ML phylogeny in `funannotate compare`
- bug fix for `funannotate database` command where it was not displaying table correctly.
Published by nextgenusfs almost 8 years ago
funannotate - funannotate v1.3.1
- bug fix for `funannotate setup`, added missing shutil library import
Published by nextgenusfs almost 8 years ago
funannotate - funannotate v1.3.0
- bug fix for weights being set for Augustus HiQ models in `funannotate predict`
- bug fix for download_buscos function
- bug fix for `funannotate annotate` where tbl file was occasionally not being parsed correctly --> re-write of parsing function
- fix bug in antiSMASH/MiBIG parsing
- add method to try to recover from failed GeneMark run
- several bug fixes for `funannotate update` related to UTRs and multiple transcripts per locus.
- added missing dependencies to `funannotate check`
- updated code to work with PASA > v2.3 - this is important PASA update that allows SQLite usage instead of MySQL
- improved terminal log output to tell user which files (with locations) are being re-used if they are found.
Published by nextgenusfs almost 8 years ago
funannotate - funannotate v1.2.0
- v1.2.0 now supports multiple transcripts per gene locus. The funannotate pipeline will only generate multiple transcripts per locus if given evidence in the form of RNA-seq data, this is done in the `funannotate update` command. It should also now support input with multiple transcripts as well.
- move installation of busco models to `funannotate setup`
- added annotation edit distance (AED) to `funannotate update` to record the changes in annotation. As well the PASA annotation update text file is changed to incorporate these changes as well
- accessory script `util/compare2annotations.py` can compare multiple annotations in either GFF3 or GBK format to a reference, generating summary stats as well as individual gene stats (AED per mRNA and CDS)
- added a `--drop` option to `funannotate fix` that you can remove unwanted gene model annotations, to use pass a file containing locus_tag (1 per line) to the `--drop` parameter
- fix bug in finding high-quality Augustus predictions (HiQ) models in `funannotate predict`
- `funannotate predict` will now detect if a `training` folder exists in output directory, if it does it will find the correct PASA, BAM, and Trinity output and use automatically during the prediction step.
Published by nextgenusfs about 8 years ago
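The annotation edit distance (AED) added in v1.2.0 is commonly defined as 1 - (SN + SP) / 2, where SN is the fraction of reference bases shared with the query model and SP is the fraction of query bases shared with the reference. The sketch below follows that common definition; whether funannotate computes AED in exactly this way is an assumption, and the function names and coordinates are hypothetical.

```python
def covered_bases(exons):
    """Return the set of 1-based positions covered by (start, end) inclusive intervals."""
    bases = set()
    for start, end in exons:
        bases.update(range(start, end + 1))
    return bases


def annotation_edit_distance(reference_exons, query_exons):
    """AED = 1 - (SN + SP) / 2; 0.0 means identical models, 1.0 means no overlap.

    Follows the commonly used definition; assumed, not confirmed, for funannotate.
    """
    ref = covered_bases(reference_exons)
    qry = covered_bases(query_exons)
    if not ref or not qry:
        return 1.0
    shared = len(ref & qry)
    sensitivity = shared / len(ref)
    specificity = shared / len(qry)
    return 1.0 - (sensitivity + specificity) / 2.0


identical = annotation_edit_distance([(1, 100)], [(1, 100)])
truncated = annotation_edit_distance([(1, 100)], [(1, 50)])
disjoint = annotation_edit_distance([(1, 100)], [(201, 300)])
print(identical, truncated, disjoint)
```

Building position sets is simple but memory-hungry for long genes; an interval-based overlap would scale better and give the same value.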
funannotate - funannotate v1.1.1
- fix for braker to work on docker. For some reason (I don't know why) the symlinks that braker tries to create cause an error when run on docker. The error references too many levels of symlinks essentially. To circumvent this, I modified `braker.pl` code to copy instead of symlink. Also fixed the `braker.pl --version` option which was broken in most recent release.
- Note for a "normal" system, v1.1.0 should work fine. The updated braker code was run on both docker and Mac native and runs fine on those, hopefully also working well on linux.
Published by nextgenusfs about 8 years ago
funannotate - funannotate v1.1.0
- bumping version to 1.1.0 to highlight that v1.0.X versions have a bug in the tbl annotation file and will not pass GenBank specs. This was derived from dropping GAG from funannotate; I had the tbl spec wrong for adding transcript_id and protein_id to both CDS and mRNA features.
- fixes for `funannotate update` and properly filtering overlapping genes
- fix for `funannotate annotate` that was switching the 5' and 3' partial gene designations on crick orientated gene models, causing them to look correct after predict step and then become errors after annotate step
- added Braker 2.0.3 to funannotate.... this was necessary as `braker.pl --version` doesn't display the version number so I can't enforce a version requirement. The larger issue has to do with how the different versions of braker save the output data, there are at least 3 different behaviors in the last 4 or 5 versions which makes impossible for funannotate to determine where output will be.
Published by nextgenusfs about 8 years ago
funannotate - funannotate v1.0.2
- update to GFF to TBL parser to catch some "common" errors in GFF files
- added `funannotate iprscan` which will run Docker InterProScan searches or also local searches. It will split the job into chunks and run those in parallel which seems to be a faster way to run InterProScan. By default it will chunk the proteins into 1000 protein bins and then run 4 cpus each up to as many cpus as you give the script.
- fix to docker build (hopefully)
- bug fixes for parsing the ncbi error report, properly outputting which genes are causing errors
- fix for antiSMASH parsing of plantismash data
Published by nextgenusfs about 8 years ago
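The chunking scheme described for `funannotate iprscan` in v1.0.2 (bins of 1000 proteins, 4 cpus per job, as many parallel jobs as the cpu budget allows) can be sketched as below. The function names and the rule of always allowing at least one job are assumptions made for illustration, not the script's actual logic.

```python
def chunk(items, size=1000):
    """Split a list into consecutive bins of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def concurrent_jobs(total_cpus, cpus_per_job=4):
    """Number of jobs that fit in the cpu budget (assumed to be at least one)."""
    return max(1, total_cpus // cpus_per_job)


proteins = [f"prot_{i}" for i in range(2500)]  # hypothetical protein ids
bins = chunk(proteins)
print([len(b) for b in bins])
print(concurrent_jobs(16))
```

With 2500 proteins and 16 cpus this yields three bins and room for four jobs at a time, so all bins would run concurrently.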
funannotate - funannotate v1.0.1
- Wrote a new GFF to TBL parser to accommodate running `funannotate annotate` on a fasta + GFF file.
- Added COGs output to `funannotate compare`, these annotations are parsed from eggnog-mapper data
- several minor bug fixes
Published by nextgenusfs about 8 years ago
funannotate - funannotate v1.0.0
Major update to funannotate with new RNA-seq modules, new database download and management, new gene name/product definition module, many bug fixes.
RNA-seq modules:
1. funannotate train: Module will run RNA-seq mediated methods for training of GeneMark/Augustus in gene prediction. It will take single or PE RNA-seq FASTQ files, run Trimmomatic quality trimming, run Trinity-mediated read normalization, run Trinity genome-guided RNAseq assembly, run PASA alignment methods. Output is BAM file, trinity transcripts, and PASA GFF3 for use in funannotate predict.
2. funannotate update: Module will run PASA mediated gene model updates. It can be run after running train --> predict --> update, which will add UTR models and refine gene models. The script can also be run on a pre-existing GenBank assembly where it will run the funannotate train methods (quality trimming, normalization, Trinity, PASA) and then followed by the update specific methods to add UTRs, refine models, etc.
funannotate predict enhancements:
1. Dropped use of GAG to write NCBI tbl file and wrote functions to do this natively in funannotate --> which was making mistakes on some partial gene models.
2. Simplified NCBI tbl generation and gene model filtering --> only running tbl2asn a single time now as bad gene models are properly filtered (previously a regex search was not working perfectly resulting in some gene models being removed arbitrarily)
3. tRNA gene length filter is now in compliance with NCBI rules (you can safely ignore tbl2asn tRNA gene length warnings --> they will eventually update tbl2asn source code)
4. Numbers of gene models for each "source" are now printed to terminal prior to running Evidence Modeler.
5. Script parses the NCBI error reports and shows the user which gene models need to be manually fixed. After the tbl file is updated, the GBK output files can be regenerated with the new funannotate fix command.
funannotate annotate enhancements:
1. Diamond search has replaced Blast wherever possible, results in large increase in speed.
2. HMMer searches are now split across multiple CPUs, which results in increased speed.
3. Gene names and product definitions are now parsed from UniProtKb/SwissProt results and EggNog-Mapper results. The product definitions are cross references to a community resource called gene2product which will serve as a database of curated gene product definitions.
4. Native NCBI tbl generation results in proper annotation of partial gene models.
5. Script will parse tbl2asn errors and alert user of gene models that need to be fixed.
New Database Management modules:
1. Environmental variable addition: FUNANNOTATE_DB allows user to install databases locally, i.e. in a user's home directory on an HPC.
2. funannotate setup script has been re-written from scratch to control the databases, keep track of versions, and allow user to update database.
3. funannotate database is a new command that shows you currently installed databases.
4. Databases have been trimmed down, occupy ~ 4 GB of space.
I would recommend that all users upgrade. After upgrading, you will need to re-download the databases from scratch. As always, many bugs have been fixed and likely some new ones introduced. Please let me know if you encounter errors.
Docs/Manual/Tutorials will be available soon at http://funannotate.readthedocs.io
Published by nextgenusfs about 8 years ago
funannotate - funannotate v0.7.2
- fix bug in `funannotate compare`, string conversion to int failed on a check for number of genes
- added better error message for duplicate locus_tag ids in `funannotate compare`
Published by nextgenusfs over 8 years ago
funannotate - funannotate v0.7.1
- fix menu in `funannotate annotate` that still had `--email` as an option -> it is no longer an option, all remote searches moved to `funannotate remote`
- fix eggnog parsing issue where COG and Description are blank -> this happens if you run `diamond` search with eggnog-mapper. You should run HMM search with the appropriate EggNog database, i.e. for fungi that is the fuNOG database.
Published by nextgenusfs over 8 years ago
funannotate - funannotate v0.7.0
Release v0.7.0 notes:
funannotate predict
- unified genbank conversion method
- added support for `repeatmasker_species` option
- added support for strain flag for genbank conversion
- improved filtering of problematic gene models
funannotate annotate
- removed all remote searches from script (now `funannotate remote` see below)
- dropped EggNog search, instead `--eggnog` option will parse the results from eggnog-mapper. Eggnog-mapper does a more comprehensive search and provides some more functional annotation information than the simple HMMer search of EggNog 4.5 database
- now outputs a tsv annotation file into the `annotate_results` output folder
- improved functional annotation for Gene and Product names
- added support for strain flag for genbank conversion
funannotate compare
- increased speed of parsing GBK files
- remove EggNog description mapping
- fix links to MEROPS database in html output
funannotate remote
- new sub command that will run remote searches
- currently support Phobius, antiSMASH, and InterProScan
- Note: these searches are a free service, don't abuse them. If you can install these software locally it will significantly decrease your run time. They are included here as some are Linux only and/or setup is very difficult.
funannotate setup
- Eggnog 4.5 database no longer required
Published by nextgenusfs over 8 years ago
funannotate - funannotate v0.6.2
- added support to `funannotate predict` for an `--other_gff` option that will pass annotation directly to EVM. You can control the weight for EVM, like this `--other_gff my_predictions.gff3:10`, which would give the gene models a weight of 10 in EVM
- better support for `--pasa_gff` passed to `funannotate predict` where now input is not hardcoded to have `transdecoder` in column 2 of the gff file. You can also control the EVM weight like this: `--pasa_gff my_pasa.gff:10` to give it a weight of 10
- BRAKER1 method now pulls out high quality Augustus models (HiQ) that have >90% exon supported by evidence, these are given a weight of 5 in EVM
- Added a few stats for repeat masking genome as well as number of transcripts mapped
- updated funannotate so it is compatible with new version of GAG v2.01.
Published by nextgenusfs almost 9 years ago
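The `file:weight` syntax described for `--other_gff` and `--pasa_gff` in v0.6.2 can be parsed along these lines. The helper name `parse_gff_weight` and the fallback weight of 1 when no weight is given are assumptions for illustration; the release notes do not state the default.

```python
def parse_gff_weight(arg, default_weight=1):
    """Split 'path:weight' into (path, int(weight)).

    If there is no trailing ':<integer>', return the argument unchanged with
    `default_weight` (an assumed default, not documented in the notes above).
    """
    path, sep, weight = arg.rpartition(":")
    if sep and weight.isdigit():
        return path, int(weight)
    return arg, default_weight


weighted = parse_gff_weight("my_predictions.gff3:10")
unweighted = parse_gff_weight("my_pasa.gff")
print(weighted)
print(unweighted)
```

Using `rpartition` and checking that the suffix is numeric keeps paths that contain a colon for other reasons from being misread as a weight.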
funannotate - funannotate v0.6.1
Numerous bug fixes
- Strip asterisks from protein fasta files to avoid problems with InterProScan
- logfiles folder was not being created if `--genbank` was passed to `funannotate annotate`
- Linux bug where last step of `funannotate predict` was terminating prematurely resulting in partial output files
Re-write of the InterProscan parsing scripts. Now script will parse IPR Domains and GO terms directly from XML file, instead of splitting XML file and then parsing 1 by 1.
Great update by John Longinotto on his pybam native BAM parser which is integrated into `funannotate predict` to quickly check BAM headers to make sure they match FASTA headers for input into Braker
Published by nextgenusfs almost 9 years ago
funannotate - funannotate v0.6.0
- fix tRNA gene model filtering to deal with the `tbl2asn` >150 error
- improve XML parsing in `funannotate compare`
- add `diamond` alternative for `exonerate` pre-filtering in `funannotate predict`
- make `funannotate` docker compatible and create docker image
- EggNog and BUSCO2 databases are no longer downloaded in the initial setup, but you can manage EggNog databases with `funannotate eggnog`. This was due to problems building the docker image when it downloaded the large databases. The scripts will download on the fly if the default database is not available.
- added some external dependency versions in `funannotate check`
- Python
Published by nextgenusfs almost 9 years ago
funannotate - funannotate v0.5.7
- bug fixes for logging
- bug fix when multiple protein evidence files are passed
- add phobius to funannotate annotate to predict secreted proteins in combination with signalp
- add test data `genome4.fasta` that can be used to test the BUSCO2 augustus training method
- added support for checking whether BAM reference sequence headers match the genome FASTA headers, this only happens if a BAM file is passed to `--rna_bam`
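The BAM/FASTA header check can be sketched with the standard library alone, assuming the BAM header is already available as SAM header text (for example from `samtools view -H`). The function names are hypothetical and this is not the pybam-based code funannotate uses.

```python
def sam_reference_names(header_text):
    """Collect reference names from the SN: field of @SQ header lines."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def fasta_names(fasta_text):
    """Collect sequence IDs (first token after '>') from FASTA text."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def missing_references(header_text, fasta_text):
    """Return BAM reference names that are absent from the FASTA."""
    return sam_reference_names(header_text) - fasta_names(fasta_text)
```

An empty result means every reference named in the BAM header has a matching FASTA record.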
- Python
Published by nextgenusfs about 9 years ago
funannotate - funannotate v0.5.5
- typo fixes for log file names
- typo fix for fuNOG annotations in secondary metabolism module, this was now fixed to use the proper `--eggnog_db` option
- test for dN/dS ratio test to assert that the tree that was drawn by Phyml has the correct number of proteins
- new feature for BUSCO models if `--ploidy` is greater than 1 in `funannotate predict` that duplicated BUSCO models are also parsed, the one that is picked has the highest score
- Support for bypassing RepeatModeler/RepeatMasker, you can now enter `--masked_genome` and `--repeatmasker_gff3` options to skip that step. Note that both options are required.
- Python
Published by nextgenusfs about 9 years ago
funannotate - funannotate v0.5.4
- update to tblastn/exonerate protein mapping for better speed and more thorough searches
- added `--ploidy` option to `funannotate predict` which controls the max number of hits for tblastn filter to pass to exonerate, which is set at 2 x ploidy. You should likely only increase this if your assembly is more than haploid - so perhaps newer assemblies with nanopore/pacbio may be able to resolve diploid chromosomes. It shouldn't have negative consequences to increase `--ploidy`, but it will increase run time for protein mapping.
- happy holidays...
- Python
Published by nextgenusfs about 9 years ago
funannotate - funannotate v0.5.3
- re-organize output so that temporary folders are created in the "final" resting place and not in the current directory
- modification of the multiprocessing function to include a simple progress percentage output
- bug fixes in `funannotate compare` and the orthology dN/dS output hanging when the dN/dS calculation failed
- modification to the logging to capture STDERR/STDOUT from many external tools into the log file, hopefully this will result in catching more errors than piping them to os.devnull
- bug fix in `funannotate annotate` where output folders were not being created if the input was GFF, proteins, and fasta.
- made it a requirement to pass --species argument to `funannotate annotate` if you do not pass in a GenBank file. This is to prevent downstream problems in `funannotate compare` with how the scripts name the genome/isolates.
- made a FAQ section in the docs that includes how to manually adjust gene models using the included tools
- all internal tests passed, however guaranteed there are more bugs. please let me know when you find them.
- Python
Published by nextgenusfs about 9 years ago
funannotate - funannotate v0.5.2
- update to dN/dS function to have two options: 1) --rundnds estimate (which runs the M0 model only), and 2) --rundnds full (which runs M0, M1, M2, M7, M8 and calculates the likelihood ratio test (LRT) of M1/M2 and M7/M8).
- update to the multiprocessing progress function - a simple progress meter is used on most multiprocessing functions to let user know how many processes have finished
- bug fix where transcripts and proteins were getting written to same file in `funannotate predict`
- minor bug fix in `funannotate clean` where input number of scaffolds was not printed out correctly
- change the default location of DB as requested by some users, now defaults to $HOME/funannotate. Note you can set this to whatever directory you want.
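For the M1/M2 and M7/M8 comparisons run by `--rundnds full`, a likelihood ratio test can be computed from the two log-likelihoods. The sketch below assumes 2 degrees of freedom, which is the conventional choice for both comparisons, and uses the closed-form chi-square tail for that case. The function name is hypothetical and this is not funannotate's own implementation.

```python
import math


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models, assuming 2 degrees of freedom.

    lnl_null: log-likelihood of the simpler model (e.g. M1 or M7)
    lnl_alt:  log-likelihood of the richer model (e.g. M2 or M8)
    Returns (test statistic, p-value).
    """
    # The statistic is twice the log-likelihood difference, floored at zero
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # For a chi-square with 2 df the survival function is exp(-x / 2)
    return stat, math.exp(-stat / 2.0)
```

A small p-value suggests the model that allows positively selected sites fits significantly better than the null model.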
- Python
Published by nextgenusfs about 9 years ago
funannotate - funannotate 0.5.1
- bug fixes for `funannotate compare`
- bug fix for `funannotate predict` where, during gene model filtering of large genomes, occasional parent:child features would get missed
- added feature of calculating dN/dS ratios in `funannotate compare`
- Python
Published by nextgenusfs about 9 years ago
funannotate - funannotate v0.4.0
- integration of BUSCO2 script and models. Can see the BUSCO2 distribution here, funannotate uses a slightly modified version to be compatible with the BUSCO->EVM workflow.
- BUSCO2 models have changed a bit, there are now a lot more options for various taxonomic groups. Something to keep in mind though is that the model names for `dikarya` are not the same as `pezizomycotina` so if you use an `--outgroup` option be sure that the outgroup was generated with same BUSCO DB
- The `funannotate setup` script will remove previous BUSCO DB models and download the new ones because of the extensive change in the BUSCO2 structure.
- addition of `funannotate outgroups` to help you manage the outgroups available to `funannotate compare` - the scripts will download and format any of the available BUSCO2 eukaryote models, to see a list in a taxonomic tree format you can type `funannotate outgroups --show_buscos`
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.14
- `funannotate annotate` will now support a single XML file for InterProScan5, you either pass a folder of single XML files 1 per protein, or a single XML file containing all of the annotations to the `--iprscan` option
- fix in `funannotate annotate` that did alert user that `--email` is required if using remote IPR5 search; this is default setting
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.12
- bug fix for path issue when running EVM; discovered on new install on Mac - not sure why it wasn't found earlier, but resulted in failed EVM run
- some improved logging for EVM module
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.11
- bug fix for `funannotate compare` where the genome stats was not printing for all genomes
- goatools changed their headers on the output of the GO enrichment (again), so re-wrote how the data is parsed, hopefully this fix applies to all versions.
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.10
- explicitly run `rmblast/ncbi` engine for RepeatMasker to avoid problems if user has default setup as something else, i.e. DFAM. Note you still need to install RepBase Libraries, e.g.
```
# download the RepBase libraries (substitute your own user name and password)
wget --user name --password pass http://www.girinst.org/server/RepBase/protected/repeatmaskerlibraries/repeatmaskerlibraries-20150807.tar.gz
# unpack into the Homebrew RepeatMasker folder, then re-run its configuration
tar zxvf repeatmaskerlibraries-20150807.tar.gz -C "$(brew --prefix)/opt/repeatmasker/libexec"
cd "$(brew --prefix)/opt/repeatmasker/libexec"
./configure <config.txt
```
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.9
- bug fix to `funannotate compare` that was not pulling orthology groups for the final annotation table
- added `transcription factors` to output of all annotation table.
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.8
- bug fix in `funannotate compare` that was calculating MEROPS summary stats incorrectly
- added `--minlen` option to `funannotate sort` to discard short contigs
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.7
- move install to the funannotate wrapper, `funannotate setup`
- fix bug with custom input folders in `funannotate annotate`
- output proteins/transcript files for both `funannotate predict` and `funannotate annotate`
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.6
- bug fix in `funannotate predict` when using `BUSCO` the EVM input was pulling entire gene models instead of sliced models
- bug fix where `BUSCO` models were within 100 bp of the start or end of contig resulting in a `bedtools` range slicing error
- remove slicing of hints file for parallel AUGUSTUS method, as splitting the hints file was slow; it is faster to just pass the entire hints file to each contig chunk and let AUGUSTUS filter it.
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.5
- update to the way that `funannotate predict` parses `maker2` results, now using maker models directly as opposed to pulling out annotation from each predictor.
- bug fix if running `funannotate compare` with a single species
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.4
- fix to the `braker1` method where augustus output was not properly found
- minor update to `--optimize_augustus` training to align with method used in `braker1`
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.3
- fix issue with parallel `augustus` where very large scaffolds would cause large memory usage, script now chunks the data into 500 kb sections with 10 kb overlaps on each side, runs in parallel, and then combines the results.
- re-ordered transcript evidence in `funannotate predict` to address providing hints to `augustus`
- some minor bug fixes
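The chunking scheme described for parallel augustus (500 kb sections with 10 kb overlaps on each side) can be sketched as follows. This is a minimal illustration under assumed conventions (0-based, half-open coordinates and a hypothetical function name), not the code funannotate uses.

```python
def chunk_scaffold(length, chunk_size=500_000, overlap=10_000):
    """Return (start, end) windows covering a scaffold of the given length.

    Each core window of chunk_size bases is extended by `overlap` bases on
    both sides and clipped to the scaffold boundaries. Coordinates are
    0-based and half-open (an assumption of this sketch).
    """
    chunks = []
    for core_start in range(0, length, chunk_size):
        core_end = min(core_start + chunk_size, length)
        start = max(core_start - overlap, 0)
        end = min(core_end + overlap, length)
        chunks.append((start, end))
    return chunks


print(chunk_scaffold(1_200_000))
```

Gene models predicted in the overlapping regions would still need to be de-duplicated when the per-chunk results are combined; that step is not shown here.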
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.2
- build a check for `augustus` version and test if it will function with `busco` and `braker1`
- revamped `busco` mediated training of augustus to run `busco` quickly, filter evidence data corresponding to busco models, filter genemark-ES data, run `evidence modeler` to get high quality gene sets, filter EVM output with busco to build a final augustus training dataset, and finally train `augustus`
- improved system info reporting
- due to problems with installing `augustus` on different operating systems, `augustus` is not installed by default via `brew install funannotate`. However, running `funannotate predict` without a version of `augustus` installed will give you some hints on how to install it for your system.
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.1
- important bug fix for `augustus`, previous versions were running `augustus` with the stop codon inside the prediction, which causes the gene models to fail validation in `evidence modeler`, thus this update is recommended for all users
- added high quality `augustus` models to be pulled out of annotation if they are represented by evidence using the `--hints` file, these models are passed to EVM with additional weight
- fixed `genemark` bug where a single contig resulted in an error, thus `funannotate predict` can now handle a single contig as input correctly.
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.3.0
- improvements to the gene model filtering in `funannotate predict`, ability to keep gene models without proper stop codons if desired, `--keep_no_stops`
- `augustus` is now multi-threaded
- upgrade of packaged `BUSCO` to v1.2, slightly faster runtime and simplified code
- non-fungal options are now included in `funannotate`, however use with non fungal genomes has not been extensively tested. Options of note are `--organism`, `--busco_db`, `--eggnog_db`
- secondary metabolism enzymes added to `funannotate compare` if genomes were annotated with `antiSMASH`
- Python
Published by nextgenusfs over 9 years ago
funannotate - funannotate v0.2.11
- fix bug where tRNA predictions lacking 'product' annotation would cause `funannotate annotate` to die
- build in check for fasta header length for `funannotate predict`, max is 16 characters for GenBank format
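A header-length check of this kind can be sketched in a few lines. The sketch assumes the sequence ID is the first whitespace-delimited token after `>`; the function name is hypothetical and this is not funannotate's own check.

```python
def long_headers(fasta_lines, max_len=16):
    """Return sequence IDs longer than max_len characters.

    The ID is taken as the first token after '>' (an assumption of this
    sketch); max_len=16 follows the limit stated in the release note.
    """
    bad = []
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            seq_id = tokens[0] if tokens else ""
            if len(seq_id) > max_len:
                bad.append(seq_id)
    return bad


example = [">scaffold_1 len=100", "ACGT", ">a_very_long_contig_name_0001", "ACGT"]
print(long_headers(example))
```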
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.2.10
- bug fix for `funannotate predict` and an empty `--transcript_evidence` option
- bug fix for NCBI error parsing in `funannotate predict`
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.2.8
- fix bug that crashed checking input files for `--protein_evidence` or `--transcript_evidence` if passing multiple files separated by a `,`
- Add backbone secondary metabolism summary to `funannotate compare`
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.2.7
- added summary of transcription factors to `funannotate compare` based off of InterProScan domain hits.
- a few minor bug fixes
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.2.6
- bug fix on some NCBI genbank files where protein ID/locus ID aren't entered in the way that funannotate expects them
- bug fix for `funannotate predict` when generating augustus hints files
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.2.5
- fix bug in processing antismash results, a typo on file handle
- increase `signalp` chunks to 40 to further reduce memory consumption
- fix naming of signalp graph and orientation of x-axis names
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.2.4
- fix memory problem of running whole genome through `signalp`, now splits into 20 chunks, should result in < 200 proteins per run
- work around for BUSCO/Augustus training problem if augustus species set to generic
- fix typo in v0.2.3
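Splitting a protein set into a fixed number of chunks before running signalp, as described above, can be sketched like this. The function name is hypothetical and the sketch only shows the partitioning, not how funannotate calls signalp.

```python
def split_into_chunks(items, n_chunks=20):
    """Split a list into at most n_chunks contiguous, near-equal parts."""
    if not items:
        return []
    # Never create more chunks than there are items
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks
```

Raising `n_chunks` reduces the number of proteins per signalp run, which is the memory lever the notes describe.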
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.2.2
- yet another fix for `goatools` changing format. `goatools v0.6.4` is now required, previous versions are not supported
- Added `signalp` search to `funannotate annotate` and `funannotate compare`. Since `signalp` has to be manually installed and configured, it will only be run if the program is detected and won't be listed as a dependency.
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.2.1
- fix another bug due to `goatools` format changes
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.9
- bug fix for GO enrichment using `goatools` as the developers changed the input parameters for the enrichment script
- `funannotate predict` now uses hints for `augustus` prediction. Hints are generated from `BLAT` transcript alignments from the `--transcript_evidence` option and protein hints are generated from `exonerate` alignments from the `--protein_evidence` option. The `weights` file from `evidence modeler` was adjusted slightly to account for better predictions from `augustus`.
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.8
- fix bug in `funannotate compare` where script would crash if gene models had the same base name
- added `funannotate check`, a little script to tell you if Python modules are up to date
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.7
- fix install `setup.sh` script so that on a first home-brew installation it defaults to asking the user for the DB install path
- updated some docs to reflect some installation changes that were not properly documented in last release
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.6
update to setup.sh script.
- If you update funannotate through home-brew, the script will automatically detect the previous version's database and set up a symlink. This avoids having to set up the DB every time there is an update.
- on first install, allows user to specify custom path for installation directory
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.5
Bug fixes:
- fixed variable naming resulting in centOS error
- allow user to specify database install folder in the setup.sh script as some systems don't have access to /usr/local/share
- remove bundled proteinortho5 as some users don't have sudo privileges, now ProteinOrtho installs using HomeBrew or LinuxBrew
Enhancements
- added support for --maker_gff for funannotate predict that will parse a MAKER2 GFF file to EVM inputs. Providing a MAKER2 GFF file will bypass the evidence mapping and predictions steps of funannotate and use the data in the MAKER2 file.
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.4
- improved logging and error reporting to try to catch user input errors and/or funannotate setup
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.3
- minor bug fixes and update of documentation
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.2
A few minor bug fixes.
- moved the DB folder to install in /usr/local/share/funannotate to avoid having to reinstall all databases if upgraded funannotate through brew
- Python
Published by nextgenusfs almost 10 years ago
funannotate - funannotate v0.1.1
Update to setup.sh script to fix an error during installation.
- Python
Published by nextgenusfs almost 10 years ago