Recent Releases of vak
vak - 1.0.4.post1
Full Changelog: https://github.com/vocalpy/vak/compare/1.0.4...1.0.4.post1
- Python
Published by NickleDave 11 months ago
vak - 1.0.4
What's Changed
- docs: add @henricombrink as a contributor for bug by @allcontributors[bot] in https://github.com/vocalpy/vak/pull/782
- docs: add @milaXT as a contributor for bug by @allcontributors[bot] in https://github.com/vocalpy/vak/pull/786
- Added a configuration file for improved model performance along with supplemental parameter documentation by @Tingyan-Guo in https://github.com/vocalpy/vak/pull/788
- docs: add milaXT as a contributor for doc by @allcontributors[bot] in https://github.com/vocalpy/vak/pull/789
- DEV: Require Python >= 3.11, bump lower bounds on deps by @NickleDave in https://github.com/vocalpy/vak/pull/804
New Contributors
- @Tingyan-Guo made their first contribution in https://github.com/vocalpy/vak/pull/788
Full Changelog: https://github.com/vocalpy/vak/compare/1.0.3...1.0.4
- Python
Published by NickleDave 11 months ago
vak - 1.0.2
What's Changed
- DOC: Update autoannotate tutorial, fix #768 by @NickleDave in https://github.com/vocalpy/vak/pull/769
- docs: add meriablue as a contributor for doc by @allcontributors in https://github.com/vocalpy/vak/pull/770
- ENH: fix frame classification model to work with BioSoundSegBench by @NickleDave in https://github.com/vocalpy/vak/pull/774
Full Changelog: https://github.com/vocalpy/vak/compare/1.0.1...1.0.2
- Python
Published by NickleDave over 1 year ago
vak - 1.0.1
What's Changed
- DEV: Require Python>=3.10, update other dependencies by @NickleDave in https://github.com/vocalpy/vak/pull/765
- BUG: minor fixes to run with biosoundsegbench dataset by @NickleDave in https://github.com/vocalpy/vak/pull/766
Full Changelog: https://github.com/vocalpy/vak/compare/1.0.0...1.0.1
- Python
Published by NickleDave almost 2 years ago
vak - 1.0.0
What's Changed
- Version 1.0 by @NickleDave in https://github.com/vocalpy/vak/pull/639
- DOC: Add version 1.0 announcement [skip ci] by @NickleDave in https://github.com/vocalpy/vak/pull/640
- DOC: Add version 1.0 announcement to README [skip ci] by @NickleDave in https://github.com/vocalpy/vak/pull/642
- DOC: Add API reference, fix #441 by @NickleDave in https://github.com/vocalpy/vak/pull/644
- DOC: Revise reference/about.md [skip ci] by @NickleDave in https://github.com/vocalpy/vak/pull/646
- CLN: Refactor window dataset, add tests by @NickleDave in https://github.com/vocalpy/vak/pull/654
- ENH: Use crowsetta 5.0, fixes #526 by @NickleDave in https://github.com/vocalpy/vak/pull/657
- Prep datasets as directories, fixes #649 #650 #651 by @NickleDave in https://github.com/vocalpy/vak/pull/658
- BUG: Fix usage of crowsetta 5.0 by @NickleDave in https://github.com/vocalpy/vak/pull/659
- BUG/CLN: Fixup prepare dataset as directory by @NickleDave in https://github.com/vocalpy/vak/pull/660
- replace union to future syntax by @Ja-sonYun in https://github.com/vocalpy/vak/pull/661
- docs: add Ja-sonYun as a contributor for code by @allcontributors in https://github.com/vocalpy/vak/pull/662
- BUG: Change labelmap for validation step of FrameWindowClassificationModel only, fix #664 by @NickleDave in https://github.com/vocalpy/vak/pull/665
- Refactor api, fixes #663 by @NickleDave in https://github.com/vocalpy/vak/pull/666
- Refactor frame classification, add audio datasets + DAS model, fixes #630 #652 #667 by @NickleDave in https://github.com/vocalpy/vak/pull/670
- Remove initial implementation of DAS model by @NickleDave in https://github.com/vocalpy/vak/pull/675
- Add decorators to register models and model families, fix #623 by @NickleDave in https://github.com/vocalpy/vak/pull/676
- ENH: Add ED-TCN model by @NickleDave in https://github.com/vocalpy/vak/pull/677
- ENH: Determine network kwargs dynamically in `vak.models.get` by @NickleDave in https://github.com/vocalpy/vak/pull/680
- BUG/CLN: Add/fix log statements in FrameClassificationModel by @NickleDave in https://github.com/vocalpy/vak/pull/681
- Add parametric UMAP model family by @NickleDave in https://github.com/vocalpy/vak/pull/688
- Fix and run linting by @NickleDave in https://github.com/vocalpy/vak/pull/690
- TST/CLN: Fix unit tests by @NickleDave in https://github.com/vocalpy/vak/pull/693
- docs: add JacquelineGoe as a contributor for bug by @allcontributors in https://github.com/vocalpy/vak/pull/695
- BUG: Make distance metrics return tensors, fix #700 #701 by @NickleDave in https://github.com/vocalpy/vak/pull/702
- docs: add VenetianRed as a contributor for bug by @allcontributors in https://github.com/vocalpy/vak/pull/703
- docs: add zhileiz1992 as a contributor for bug by @allcontributors in https://github.com/vocalpy/vak/pull/705
- DOC, Make minor doc fixes, fixes #704 by @NickleDave in https://github.com/vocalpy/vak/pull/706
- docs: add zhileiz1992 as a contributor for code by @allcontributors in https://github.com/vocalpy/vak/pull/711
- docs: add marisbasha as a contributor for ideas by @allcontributors in https://github.com/vocalpy/vak/pull/712
- docs: add marisbasha as a contributor for code by @allcontributors in https://github.com/vocalpy/vak/pull/713
- DOC: Add eval to autoannotate, fix #460 by @NickleDave in https://github.com/vocalpy/vak/pull/715
- docs: add vivinastase as a contributor for ideas by @allcontributors in https://github.com/vocalpy/vak/pull/716
- ENH: Minimize frame classification dataset size, fix #717 by @NickleDave in https://github.com/vocalpy/vak/pull/718
- BUG: Fix models to log train loss on step, fixes #720 by @NickleDave in https://github.com/vocalpy/vak/pull/722
- CLN: Rename "segment error rate" to "character error rate", fix #721 by @NickleDave in https://github.com/vocalpy/vak/pull/723
- docs: add danielmk as a contributor for doc by @allcontributors in https://github.com/vocalpy/vak/pull/732
- DOC: fix autoannotate tutorial configs, fixes #734 by @NickleDave in https://github.com/vocalpy/vak/pull/735
- TST: Fix test in `common.labels` by @NickleDave in https://github.com/vocalpy/vak/pull/747
- ENH: Switch to version 1.0 of config file format, fix #685 #345 #748 by @NickleDave in https://github.com/vocalpy/vak/pull/750
- ENH: Add lightning.Trainer config, fix #691 #687 #742 #745 by @NickleDave in https://github.com/vocalpy/vak/pull/752
- CLN: Refactor model abstraction, fix #737 #726 by @NickleDave in https://github.com/vocalpy/vak/pull/753
- CLN/ENH: Rename and refactor datapipes, add datasets; fix #574 #724 #754 by @NickleDave in https://github.com/vocalpy/vak/pull/755
New Contributors
- @Ja-sonYun made their first contribution in https://github.com/vocalpy/vak/pull/661
Full Changelog: https://github.com/vocalpy/vak/compare/0.8.2...1.0.0
- Python
Published by NickleDave about 2 years ago
vak - 0.8.2
What's Changed
- BUG: fix default for `post_tfm_kwargs`, fixes Inconsistent syllable error by @zhileiz1992 in https://github.com/vocalpy/vak/pull/710
New Contributors
- @zhileiz1992 made their first contribution in https://github.com/vocalpy/vak/pull/710
Full Changelog: https://github.com/vocalpy/vak/compare/0.8.1...0.8.2
- Python
Published by NickleDave over 2 years ago
vak - 0.8.1
What's Changed
- BUG: Have `to_segments` return all Nones for no segments, fix #634 by @NickleDave in https://github.com/vocalpy/vak/pull/636
- docs: add nhoglen as a contributor for bug by @allcontributors in https://github.com/vocalpy/vak/pull/637
Full Changelog: https://github.com/vocalpy/vak/compare/0.8.0...0.8.1
- Python
Published by NickleDave over 2 years ago
vak - 1.0.0a3
What's Changed
- BUG: Make distance metrics return tensors, fix #700 #701 by @NickleDave in https://github.com/vocalpy/vak/pull/702
- docs: add VenetianRed as a contributor for bug by @allcontributors in https://github.com/vocalpy/vak/pull/703
Full Changelog: https://github.com/vocalpy/vak/compare/1.0.0a2...1.0.0a3
- Python
Published by NickleDave over 2 years ago
vak - 1.0.0a2
What's Changed
- BUG: Have `to_segments` return all Nones for no segments, fix #634 by @NickleDave in https://github.com/vocalpy/vak/pull/636
- docs: add nhoglen as a contributor for bug by @allcontributors in https://github.com/vocalpy/vak/pull/637
- Version 1.0 by @NickleDave in https://github.com/vocalpy/vak/pull/639
- DOC: Add version 1.0 announcement [skip ci] by @NickleDave in https://github.com/vocalpy/vak/pull/640
- DOC: Add version 1.0 announcement to README [skip ci] by @NickleDave in https://github.com/vocalpy/vak/pull/642
- DOC: Add API reference, fix #441 by @NickleDave in https://github.com/vocalpy/vak/pull/644
- DOC: Revise reference/about.md [skip ci] by @NickleDave in https://github.com/vocalpy/vak/pull/646
- CLN: Refactor window dataset, add tests by @NickleDave in https://github.com/vocalpy/vak/pull/654
- ENH: Use crowsetta 5.0, fixes #526 by @NickleDave in https://github.com/vocalpy/vak/pull/657
- Prep datasets as directories, fixes #649 #650 #651 by @NickleDave in https://github.com/vocalpy/vak/pull/658
- BUG: Fix usage of crowsetta 5.0 by @NickleDave in https://github.com/vocalpy/vak/pull/659
- BUG/CLN: Fixup prepare dataset as directory by @NickleDave in https://github.com/vocalpy/vak/pull/660
- replace union to future syntax by @Ja-sonYun in https://github.com/vocalpy/vak/pull/661
- docs: add Ja-sonYun as a contributor for code by @allcontributors in https://github.com/vocalpy/vak/pull/662
- BUG: Change labelmap for validation step of FrameWindowClassificationModel only, fix #664 by @NickleDave in https://github.com/vocalpy/vak/pull/665
- Refactor api, fixes #663 by @NickleDave in https://github.com/vocalpy/vak/pull/666
- Refactor frame classification, add audio datasets + DAS model, fixes #630 #652 #667 by @NickleDave in https://github.com/vocalpy/vak/pull/670
- Remove initial implementation of DAS model by @NickleDave in https://github.com/vocalpy/vak/pull/675
- Add decorators to register models and model families, fix #623 by @NickleDave in https://github.com/vocalpy/vak/pull/676
- ENH: Add ED-TCN model by @NickleDave in https://github.com/vocalpy/vak/pull/677
- ENH: Determine network kwargs dynamically in `vak.models.get` by @NickleDave in https://github.com/vocalpy/vak/pull/680
- BUG/CLN: Add/fix log statements in FrameClassificationModel by @NickleDave in https://github.com/vocalpy/vak/pull/681
- Add parametric UMAP model family by @NickleDave in https://github.com/vocalpy/vak/pull/688
- Fix and run linting by @NickleDave in https://github.com/vocalpy/vak/pull/690
- TST/CLN: Fix unit tests by @NickleDave in https://github.com/vocalpy/vak/pull/693
- docs: add JacquelineGoe as a contributor for bug by @allcontributors in https://github.com/vocalpy/vak/pull/695
New Contributors
- @Ja-sonYun made their first contribution in https://github.com/vocalpy/vak/pull/661
Full Changelog: https://github.com/vocalpy/vak/compare/0.8.0...1.0.0a2
- Python
Published by NickleDave over 2 years ago
vak - 1.0.0a1
What's Changed
- BUG: Have `to_segments` return all Nones for no segments, fix #634 by @NickleDave in https://github.com/vocalpy/vak/pull/636
- docs: add nhoglen as a contributor for bug by @allcontributors in https://github.com/vocalpy/vak/pull/637
- Version 1.0 by @NickleDave in https://github.com/vocalpy/vak/pull/639
Full Changelog: https://github.com/vocalpy/vak/compare/0.8.0...1.0.0a1
- Python
Published by NickleDave almost 3 years ago
vak -
0.8.0 release notes
2023-02-09
Added
- Add options for how `audio.to_spect` calls `dask.bag`, to help with memory issues when processing large files #611. Fixes #580.
- Add ability to run evaluation of models with and without post-processing transforms. This is done by specifying an option `post_tfm_kwargs` in the `[EVAL]` or `[LEARNCURVE]` sections of a .toml configuration file. If the option is not specified, then models are evaluated as they were previously, by converting the predicted label for each time bin to a label for each continuous segment, represented as a string. If the option is specified, then the post-processing is applied to the model predictions before converting to strings. Metrics are computed for outputs with and without post-processing, to be able to compare the two. #621. Fixes #472.
- `vak.core.eval` now logs computed evaluation metrics so they can be quickly inspected in the terminal or log files before full analysis #621. Fixes #471.
Changed
- Rewrite post-processing transforms applied to network outputs as transforms, with functional and class implementations, to make it possible to compose these transforms, and more easily evaluate model performance with and without them #621. Fixes #537.
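The idea behind these entries can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not vak's implementation: the names `frames_to_string`, `drop_short_runs`, `DropShortRuns`, and `Compose`, and the use of `"-"` as the background label, are all hypothetical. It shows a transform written both as a function and as a callable class, composed into a pipeline, and applied before per-time-bin labels are collapsed to a string.

```python
from itertools import groupby


def frames_to_string(frame_labels, background="-"):
    """Collapse per-time-bin labels into one label per continuous segment."""
    return "".join(
        label for label, _ in groupby(frame_labels) if label != background
    )


def drop_short_runs(frame_labels, min_len=2, background="-"):
    """Functional form: relabel runs shorter than min_len as background."""
    out = []
    for label, run in groupby(frame_labels):
        run = list(run)
        if label != background and len(run) < min_len:
            out.extend([background] * len(run))
        else:
            out.extend(run)
    return out


class DropShortRuns:
    """Class form: stores parameters, then delegates to the function."""

    def __init__(self, min_len=2, background="-"):
        self.min_len = min_len
        self.background = background

    def __call__(self, frame_labels):
        return drop_short_runs(frame_labels, self.min_len, self.background)


class Compose:
    """Apply several transforms in order (hypothetical helper)."""

    def __init__(self, *transforms):
        self.transforms = transforms

    def __call__(self, x):
        for transform in self.transforms:
            x = transform(x)
        return x


predicted = list("aaabaaa--cc")  # one predicted label per time bin
post_tfm = Compose(DropShortRuns(min_len=2))

without_post = frames_to_string(predicted)
with_post = frames_to_string(post_tfm(predicted))
```

Computing a metric on both `without_post` and `with_post` against the same reference string is what makes the two conditions directly comparable.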
- Python
Published by NickleDave over 3 years ago
vak -
vak 0.7.0 release notes
vak 0.7.0 is a maintenance release, but it does include some new features and bug fixes. Highlights:
- For annotation formats that have one annotation file per annotated file, vak can now recognize when the annotation files are named by removing the annotated file extension (e.g., .wav or .npz) and replacing it with the annotation format extension, e.g. .txt or .csv. (Other ways of relating annotations and annotated files are still valid, e.g. by including the original source audio file in both filenames.)
- The transform that normalizes spectrograms is now fit only to the training set; previously no split was specified and in some cases the entire dataset was used, which could potentially reduce the error on the test set because of dataset leakage (the model "knows" about the distribution of the test set because the parameters used to normalize the spectrograms take it into account). For training sets large enough to achieve good performance with current models, there is probably not a big enough difference between their distribution and that of the test set for this to seriously impact evaluation, but we have not tested this extensively.
- Several other clean ups, additional unit tests, and minor bug fixes that should not have impacted performance but do make the library more efficient and robust.
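The highlight about fitting the normalization transform only to the training set can be illustrated with a minimal sketch. It is not the library's `StandardizeSpect` code: the `Standardizer` class, the `fit_on_split` helper, and the row layout are assumptions made for illustration, and single float values stand in for spectrogram bins.

```python
from statistics import mean, pstdev


class Standardizer:
    """Hypothetical stand-in for a transform that standardizes values."""

    def fit(self, values):
        self.mean_ = mean(values)
        self.std_ = pstdev(values)
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.std_ for v in values]


def fit_on_split(rows, split="train"):
    """Fit using only rows from one split, so other splits cannot leak in."""
    values = [row["value"] for row in rows if row["split"] == split]
    if not values:
        raise ValueError(f"no rows found for split: {split!r}")
    return Standardizer().fit(values)


rows = [
    {"split": "train", "value": 1.0},
    {"split": "train", "value": 3.0},
    {"split": "test", "value": 100.0},  # must not influence mean_ or std_
]

scaler = fit_on_split(rows, split="train")
test_scaled = scaler.transform([r["value"] for r in rows if r["split"] == "test"])
```

Here the test value is scaled with parameters it played no part in estimating, which is the property the fix is meant to guarantee.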
Added
- Add unit tests for `csv.has_unlabeled` #541. Fixes #102.
- Add unit tests for `__main__` #542. Fixes #337.
- Add validation of `labels` argument to `vak.split.algorithms.brute_force`, to prevent conditions where algorithm can fail to converge because of bad input #562. Fixes #288.
- Add a "Frequently Asked Questions" page to the documentation, and a page to the "Reference" section on file naming conventions #564. Fixes #524 and #424.
- Add a new way for vak to map annotation files to annotated files when preparing datasets, e.g. for training models. For annotation formats that have one annotation file per annotated file, vak can now recognize when the annotation files are named by removing the annotated file extension (e.g., .wav or .npz) and replacing it with the annotation format extension, e.g. .txt or .csv. (Other ways of relating annotations and annotated files are still valid, e.g. by including the original source audio file in both filenames.) #572. Fixes #563.
- Have runs from command-line interface log version to logfile #587. Fixes #216.
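The extension-replacement naming convention described in the list above can be sketched with `pathlib`. This is a rough illustration, not the functions in `vak.annotation`; `annot_path_for` and `map_annotated_to_annot` are hypothetical names.

```python
from pathlib import Path


def annot_path_for(annotated_path, annot_ext=".csv"):
    """Swap the annotated file's extension for the annotation extension."""
    return Path(annotated_path).with_suffix(annot_ext)


def map_annotated_to_annot(annotated_paths, annot_paths, annot_ext=".csv"):
    """Pair each annotated file with the annotation file named after it."""
    available = {Path(p).name: Path(p) for p in annot_paths}
    mapping = {}
    for annotated in annotated_paths:
        expected = annot_path_for(annotated, annot_ext).name
        if expected not in available:
            raise ValueError(
                f"no annotation file named {expected!r} for {annotated!r}"
            )
        mapping[Path(annotated)] = available[expected]
    return mapping


pairs = map_annotated_to_annot(
    ["data/bird1 song.wav", "data/bird2.wav"],
    ["annot/bird2.csv", "annot/bird1 song.csv"],
)
```

Matching on the file name alone means the audio and annotation files may live in different directories, and spaces in names need no special handling.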
Changed
- Rewrite unit tests in `tests/test_cli/` to use mocks for `vak.core` functions #544. Fixes #543.
- It is now possible to load configuration files and work with them programmatically even if the paths they point to do not exist. The `core` functions handle validation instead. E.g., the `PrepConfig` class does not check whether `output_dir` exists and is a directory, but `vak.core.prep` does. #550. Fixes #459.
- Refactor and speed up logic for determining whether a dataset with sequence annotations has unlabeled segments that should be assigned a "background" label #559. Fixes #243.
  - Adds a new sub-sub-package, `datasets.seq` with a `validators` module, which is where the re-written `has_unlabeled` function now lives. Replaces the `vak.csv` module which was not well named.
  - Also adds a `has_unlabeled` function to `vak.annotation` that is used by `vak.datasets.seq.validators.has_unlabeled`; this function handles edge cases outlined in #243.
- Rename and refactor functions in `vak.annotation` that map annotations to the files that they annotate, so that the purpose of the functions is clearer, and add clearer error messages with links to documentation about file naming conventions #566. Fixes #525.
- Revise "autoannotate" tutorial to use .wav audio and .csv annotation files from new release of Bengalese Finch Song Repository, and to suggest that Windows users unpack archives with tar, not other programs such as WinZip #578. Fixes #560 and #576.
- Change `vak.files.find_fname` and `vak.files.spect.find_audio_fname` so they work when spaces are in filename and/or path #594. Fixes #589.
Fixed
- Fix how `vak.core.prep` handles `labelset` parameter. Add pre-condition that raises a ValueError when `labelset` is `None` but the .toml config is one of {'train', 'learncurve', 'eval'} #545. Avoids running computationally expensive step of generating and validating spectrograms before crashing when trying to split the dataset using `labelset`. Also avoids silent failures for datasets that do not require splitting, e.g., an 'eval' set that could contain labels not in the training set. Fixes #468.
- Fix how `cli` and `core` functions that have the `csv_path` parameter handle it. The parameter points to a dataset .csv generated by `vak prep` that other `core`/`cli` functions use: `train`, `learncurve`, `eval`, `predict`. They now validate that it exists, and if it doesn't, the `cli` functions politely suggest running `vak prep` first; the `core` functions raise a FileNotFoundError. #546. Fixes #469.
- Fix bug where `labelmap_path` parameter was ignored by `core.train`. Change function so that either `labelmap_path` or `labelset` must be passed in, but passing in both will raise an error. Also change `cli.train` to only pass in one of those and set the other to `None`. #552. Fixes #547.
- Fix `vak.annotation.has_unlabeled` to handle the edge case where an annotation file has no annotated segments #583. Fixes #378.
- Fix `StandardizeSpect` method `fit_df` so that it computes parameters for standardization from a specific split of the dataset--the training split, by default--instead of using the entire dataset, which could technically give rise to data leakage #584. Fixes #575.
- Fix error message in `vak.core.eval` #589. Fixes #588.
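The argument checks described in the `csv_path` and `labelmap_path` entries above follow a common "fail early" pattern, sketched below. The function name `check_train_args` and the error messages are hypothetical; only the parameter names and exception types come from the notes.

```python
from pathlib import Path


def check_train_args(csv_path, labelset=None, labelmap_path=None):
    """Validate arguments up front, before any expensive work starts."""
    csv_path = Path(csv_path)
    if not csv_path.exists():
        raise FileNotFoundError(f"csv_path not found: {csv_path}")
    if labelset is None and labelmap_path is None:
        raise ValueError("must provide either labelset or labelmap_path")
    if labelset is not None and labelmap_path is not None:
        raise ValueError("provide only one of labelset or labelmap_path")
    return csv_path
```

A command-line wrapper could catch the `FileNotFoundError` and print a hint to run `vak prep` first, while library callers see the exception directly.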
- Python
Published by NickleDave over 3 years ago
vak -
0.4.0 -- 2021-12-29
Added
- add a CITATION.cff file #407.
- add an all-contributors table to README, using their bot to adopt the spec. E.g., #395. Fixes #387.
- add description of command-line interface to reference section of documentation. #417. Fixes #270.
- add how-to on using an annotation format that's not built in #421. Fixes #397.
- add how-to on using custom spectrograms #421. Fixes #413.
Changed
- updated the .toml configuration files in the tutorial to match what was used for TweetyNet paper. #416. Fixes #414.
- move tutorial into "getting started" section of docs, and revise landing page of docs #419.
- revise the documentation for the configuration file format. Show valid options for each section by including docstrings from the classes that represents the different sections #428. Fixes #271.
Fixed
- make further fixes + add unit tests for handling predictions where all timebins are the background "unlabeled" class #409. Fixes bug in `remove_short_segments` #403. Related to #393 and #386.
- fix docs so entries appear in navbar #427. Fixes #426.
- Python
Published by NickleDave over 3 years ago
vak -
0.6.0 -- 2022-07-07
Added
- better document `conda` install #528. Fixes #527.
- Add tests for console script, i.e., the command-line interface #533. Fixes #369.
Changed
- switch from using `make` to `nox` for running tasks #532. Fixes #440.
- Refactor logging so that it can be configured by `cli` functions when running `vak` through command-line interface, and by users that are working with the API directly #535.
Fixed
- Fix bug that prevented creating spectrogram files with non-default keys (e.g. 'spect' instead of the default 's'). Needed to pass keys from `spect_params` into `spect.to_dataframe` inside `vak.io.dataframe.from_files`. #531. Fixes #412.
- Fix logging so a single message is not logged multiple times. #535. Fixes #258.
- Python
Published by NickleDave almost 4 years ago
vak - sparrows-gathering
Added
- add helper function to TestLearncurve that multiple unit tests can use to assert all outputs were generated. Now being used to make sure bug fixed in 0.1.0a8 stays fixed.
- error checking in cli that raises ValueError when cli command is `learncurve` and the option 'results_dir_made_by_main_script' is already defined in [OUTPUT] section, since running 'learncurve' would overwrite it.
- `dataset` subpackage that houses `VocalizationDataset` and related classes that facilitate creating data sets for training neural networks from heterogeneous data: audio files, files of arrays containing spectrograms, different annotation types, etc.
  - also includes modules for handling each data source
  - e.g. `audio.to_spect` creates spectrograms from audio files
  - `spect.from_files` creates a `VocalizationDataset` from spectrogram files
- `core` sub-package that contains / will contain functions that do heavy lifting: `learning_curve`, `train`, `predict`
  - `learning_curve` is a sub-sub-module that does both `train` and `test` of models, instead of having a separate `learncurve` and `summary` function (i.e. train and test). Still will confuse some ML/AI people that this "learning curve" has a test data step but whatevs
- `cli` sub-package calls / will call these functions and handle any command-line-interface specific logic (e.g. making changes to `config.ini` files)
Changed
- change name of `vak.cli.make_data` to `vak.cli.prep`
- structure of `config.ini` file
  - now specify either `audio_format` or `spect_format` in `[DATA]` section
  - and `annot_format` for annotations
- refactor `utils` sub-package
  - move several functions from `data` and `general` into a `labels` module
Removed
- remove unused options from command-line interface: `--glob`, `--txt`, `--dataset`
- `skip_files_with_labels_not_in_labelset` option
  - now happens whenever `labelset` is specified; if no `labelset` is given then no filtering is done
- `summary` command-line option, since `learncurve` now trains models and also tests them on a separate data set
- `silent_label_gap` option, because `VocalizationDataset` class determines if a label for unlabeled segments between other segments is needed, and if so automatically assigns this a label of 0 when mapping user labels to consecutive integers
  - this way user does not have to think about it
  - and program doesn't have to keep track of a `labels_mapping` file that saves what user specified
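The automatic label mapping that replaces `silent_label_gap` can be pictured with the sketch below. It is an assumption-laden illustration rather than the `VocalizationDataset` logic: the function names, the `"unlabeled"` key, and the onset/offset representation are all hypothetical. Only the rule "unlabeled gets 0, user labels become consecutive integers" comes from the notes.

```python
def has_unlabeled_gaps(onsets, offsets, total_dur):
    """Return True if some time in [0, total_dur] is not inside a segment."""
    if not onsets:
        return total_dur > 0
    if onsets[0] > 0 or offsets[-1] < total_dur:
        return True
    # a gap exists when a segment starts after the previous one ended
    return any(on > off for off, on in zip(offsets[:-1], onsets[1:]))


def to_labelmap(labelset, map_unlabeled=True):
    """Map user labels to consecutive integers, reserving 0 if needed."""
    labels = sorted(labelset)
    if map_unlabeled:
        labels = ["unlabeled"] + labels
    return {label: index for index, label in enumerate(labels)}


needs_background = has_unlabeled_gaps(
    onsets=[0.0, 1.5], offsets=[1.0, 2.0], total_dur=2.0
)
labelmap = to_labelmap({"b", "a"}, map_unlabeled=needs_background)
```

Because the mapping is derived from the data and the label set, there is nothing extra for the user to configure or for the program to persist.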
- Python
Published by NickleDave almost 7 years ago
vak -
Fixed
- Fix how main loop in `learncurve` re-loads indices for grabbing subsets of training data after generating them, and do so in a way that still allows for re-using subsets from previous runs
- Python
Published by NickleDave about 7 years ago
vak -
Added
- `vak.cli.summary` has `save_transformed_data` parameter and `vak.cli` passes value from `config.data.save_transformed_data` as the argument when calling `vak.cli.summary`
Changed
- `vak.cli.summary` only saves transformed train/test data if `save_transformed_data` is `True`
- move a test from tests/unit_tests/test_utils.py into tests/unit_tests/test_utils/test_data.py
Removed
- `vak.cli.summary` no longer saves copy of test data in results directory
- Python
Published by NickleDave about 7 years ago
vak -
Added
- add test for `utils.data.get_inds_for_dur`
Changed
- learncurve gets indices for all train data subsets before starting training
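Generating every subset's indices up front, and keeping them so a later run can re-use them, might look roughly like the sketch below. It is a simplification under stated assumptions: subsets are sized by sample count rather than by duration, and the function names, the key format, and the JSON file are hypothetical.

```python
import json
import random
from pathlib import Path


def make_subset_indices(n_samples, subset_sizes, n_replicates, seed=0):
    """Draw indices for every (size, replicate) pair before training starts."""
    rng = random.Random(seed)
    return {
        f"size{size}_rep{rep}": sorted(rng.sample(range(n_samples), size))
        for size in subset_sizes
        for rep in range(n_replicates)
    }


def load_or_make_indices(path, n_samples, subset_sizes, n_replicates, seed=0):
    """Re-use indices saved by a previous run, else generate and save them."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text())
    indices = make_subset_indices(n_samples, subset_sizes, n_replicates, seed)
    path.write_text(json.dumps(indices))
    return indices
```

Drawing all subsets first means a failure in index generation surfaces before any model is trained, and an interrupted run can resume with identical subsets.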
- Python
Published by NickleDave about 7 years ago
vak -
Fixed
- add missing 'save_transformed_data' option to Data config parsing
- Python
Published by NickleDave about 7 years ago
vak -
Added
- Use `attrs`-based classes to represent sections of config.ini files
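Representing each config section as a small class could look something like the sketch below. To stay within the standard library it swaps in `dataclasses` for `attrs`, so it illustrates the pattern rather than the library's code. The class names, defaults, and `parse_config` helper are hypothetical; the option names `save_transformed_data` and `train_data_path` are taken from these release notes.

```python
import configparser
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataSectionSketch:
    save_transformed_data: bool = False


@dataclass(frozen=True)
class TrainSectionSketch:
    # Optional: the training data may not exist yet when the config is read
    train_data_path: Optional[str] = None


@dataclass(frozen=True)
class ConfigSketch:
    data: DataSectionSketch
    train: TrainSectionSketch


def parse_config(ini_text):
    """Parse .ini text into one typed object per section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    data, train = parser["DATA"], parser["TRAIN"]
    return ConfigSketch(
        data=DataSectionSketch(
            save_transformed_data=data.getboolean(
                "save_transformed_data", fallback=False
            )
        ),
        train=TrainSectionSketch(train_data_path=train.get("train_data_path")),
    )


config = parse_config("[DATA]\nsave_transformed_data = yes\n\n[TRAIN]\n")
```

Typed section objects give one place to declare defaults and optional fields, instead of scattering string lookups through the command-line code.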
Changed
- rewrite `vak.cli` so it can deal with state of config.ini files
  - e.g. doesn't throw an error if `train_data_path` not declared as an option in [TRAIN] when running `vak prep` (since training data won't exist yet, doesn't make sense to throw an error).
Removed
- remove code about `freq_bins` in a couple of places, since the number of frequency bins in spectrograms is now just determined programmatically
  - `vak.config.data` no longer has `freq_bins` field in DataConfig namedtuple
  - `make_data` no longer adds `freq_bins` option to [DATA] section after making data sets
- Python
Published by NickleDave about 7 years ago
vak -
Changed
- checkpoints saved in individual directories by `learncurve` so they are more cleanly segregated, e.g. if user wants to point to a specific checkpoint when calling `predict`
- calling `vak prep config.ini` will run `vak.cli.make_data` function
  - so to generate a learning curve, the three steps now are: `vak prep config.ini`, then `vak learncurve config.ini`, then `vak summary config.ini`
Fixed
- `vak.cli.train` runs all the way through, passes basic "does not crash" test
- `vak.cli.predict` runs all the way through, passes basic "does not crash" test
- Python
Published by NickleDave about 7 years ago