Recent Releases of Jury
Jury - v2.3.1
What's Changed
- Update CI actions versions. by @devrimcavusoglu in https://github.com/obss/jury/pull/134
- Update dev installation to allow for e.g. Zsh by @KennethEnevoldsen in https://github.com/obss/jury/pull/136
- Update README.md by @devrimcavusoglu in https://github.com/obss/jury/pull/137
New Contributors
- @KennethEnevoldsen made their first contribution in https://github.com/obss/jury/pull/136
Full Changelog: https://github.com/obss/jury/compare/2.3...2.3.1
Scientific Software - Peer-reviewed
- Python
Published by devrimcavusoglu over 1 year ago
Jury - v2.3
What's Changed
- Comet version updated; corresponding changes have been made. by @devrimcavusoglu in https://github.com/obss/jury/pull/129
- Update README.md by @eltociear in https://github.com/obss/jury/pull/130
- Drop py3.7 support, change CI. by @devrimcavusoglu in https://github.com/obss/jury/pull/132
- README.md updated. Jury paper added. by @devrimcavusoglu in https://github.com/obss/jury/pull/133
New Contributors
- @eltociear made their first contribution in https://github.com/obss/jury/pull/130
Full Changelog: https://github.com/obss/jury/compare/2.2.4...2.3
Published by devrimcavusoglu about 2 years ago
Jury - v2.2.4
What's Changed
- datasets dependency added with constraint. by @devrimcavusoglu in https://github.com/obss/jury/pull/126
- Add try/catch block across ZeroDivisionError for `AccuracyForLanguageGeneration._compute_single_pred_single_ref` by @NISH1001 in https://github.com/obss/jury/pull/123
- Package `evaluate` updated to 0.4 (from <0.3). by @devrimcavusoglu in https://github.com/obss/jury/pull/128
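The ZeroDivisionError guard mentioned in this release can be pictured with a minimal sketch. The function name `token_accuracy` and the token-overlap definition below are assumptions made for illustration; they are not taken from Jury's source.

```python
from collections import Counter


def token_accuracy(prediction: str, reference: str) -> float:
    """Hypothetical token-overlap accuracy; not Jury's actual implementation."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # Count tokens shared by prediction and reference (with multiplicity).
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    try:
        return overlap / len(ref_tokens)
    except ZeroDivisionError:
        # An empty reference has no tokens to match, so fall back to 0.0.
        return 0.0
```

Returning 0.0 for an empty reference is one possible convention; a real metric might instead skip such pairs or report them separately.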
New Contributors
- @NISH1001 made their first contribution in https://github.com/obss/jury/pull/123
Full Changelog: https://github.com/obss/jury/compare/2.2.3...2.2.4
Published by devrimcavusoglu over 2 years ago
Jury - v2.2.3
What's Changed
- `flake8` error on python3.7 by @devrimcavusoglu in https://github.com/obss/jury/pull/118
- Seqeval typo fix by @devrimcavusoglu in https://github.com/obss/jury/pull/117
- Refactored requirements (sklearn). by @devrimcavusoglu in https://github.com/obss/jury/pull/121
Full Changelog: https://github.com/obss/jury/compare/2.2.2...2.2.3
Published by devrimcavusoglu about 3 years ago
Jury - v2.2.1
What's Changed
- Fixed warning message in BLEURT default initialization by @zafercavdar in https://github.com/obss/jury/pull/110
- `ZeroDivisionError` on precision and recall values. by @devrimcavusoglu in https://github.com/obss/jury/pull/112
- validators added to the requirements. by @devrimcavusoglu in https://github.com/obss/jury/pull/113
- Intermediate patch, fixes, updates. by @devrimcavusoglu in https://github.com/obss/jury/pull/114
New Contributors
- @zafercavdar made their first contribution in https://github.com/obss/jury/pull/110
Full Changelog: https://github.com/obss/jury/compare/2.2...2.2.1
Published by devrimcavusoglu over 3 years ago
Jury - v2.2
What's Changed
- Fix Reference Structure for Basic BLEU calculation by @Sophylax in https://github.com/obss/jury/pull/74
- Added BLEURT. by @devrimcavusoglu in https://github.com/obss/jury/pull/78
- README.md updated with doi badge and citation information. by @devrimcavusoglu in https://github.com/obss/jury/pull/81
- Add VSCode Folder to Gitignore by @Sophylax in https://github.com/obss/jury/pull/82
- Change one BERTScore test Device to CPU by @Sophylax in https://github.com/obss/jury/pull/84
- Add Prism metric by @devrimcavusoglu in https://github.com/obss/jury/pull/79
- Update issue templates by @devrimcavusoglu in https://github.com/obss/jury/pull/85
- Dl manager rework by @devrimcavusoglu in https://github.com/obss/jury/pull/86
- Nltk upgrade by @devrimcavusoglu in https://github.com/obss/jury/pull/88
- CER metric implementation. by @devrimcavusoglu in https://github.com/obss/jury/pull/90
- Prism checkpoint URL updated. by @devrimcavusoglu in https://github.com/obss/jury/pull/92
- Test cases refactored. by @devrimcavusoglu in https://github.com/obss/jury/pull/96
- Added BARTScore by @Sophylax in https://github.com/obss/jury/pull/89
- License information added for prism and bleurt. by @devrimcavusoglu in https://github.com/obss/jury/pull/97
- Remove Unused Imports by @Sophylax in https://github.com/obss/jury/pull/98
- Added WER metric. by @devrimcavusoglu in https://github.com/obss/jury/pull/103
- Add TER metric by @devrimcavusoglu in https://github.com/obss/jury/pull/104
- CHRF metric added. by @devrimcavusoglu in https://github.com/obss/jury/pull/105
- Add comet by @devrimcavusoglu in https://github.com/obss/jury/pull/107
- Doc refactor by @devrimcavusoglu in https://github.com/obss/jury/pull/108
- Pypi fix by @devrimcavusoglu in https://github.com/obss/jury/pull/109
New Contributors
- @Sophylax made their first contribution in https://github.com/obss/jury/pull/74
Full Changelog: https://github.com/obss/jury/compare/2.1.5...2.2
Published by devrimcavusoglu over 3 years ago
Jury - v2.1.5
What's Changed
- Bug fix: Typo corrected in `_remove_empty()` in core.py. by @devrimcavusoglu in https://github.com/obss/jury/pull/67
- Metric name path bug fix. by @devrimcavusoglu in https://github.com/obss/jury/pull/69
Full Changelog: https://github.com/obss/jury/compare/2.1.4...2.1.5
Published by devrimcavusoglu about 4 years ago
Jury - v2.1.4
What's Changed
- Handle for empty predictions & references on Jury (skipping empty). by @devrimcavusoglu in https://github.com/obss/jury/pull/65
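Skipping empty inputs, as described in this release, could look roughly like the sketch below. The helper name `drop_empty_pairs` and the rule that a pair is dropped when either side is blank are assumptions for illustration, not Jury's actual behavior.

```python
from typing import List, Tuple


def drop_empty_pairs(
    predictions: List[str], references: List[str]
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: keep only pairs where both sides are non-empty."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    # Keep a pair only if both strings contain non-whitespace characters.
    kept = [(p, r) for p, r in zip(predictions, references) if p.strip() and r.strip()]
    if not kept:
        return [], []
    preds, refs = zip(*kept)
    return list(preds), list(refs)
```

Filtering pairs together, rather than each list separately, keeps predictions aligned with their references.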
Full Changelog: https://github.com/obss/jury/compare/2.1.3...2.1.4
Published by devrimcavusoglu about 4 years ago
Jury - v2.1.2
What's Changed
- Bug fix: bleu returning same score with different max_order is fixed. by @devrimcavusoglu in https://github.com/obss/jury/pull/59
- nltk version upgraded as >=3.6.4 (from >=3.6.2). by @devrimcavusoglu in https://github.com/obss/jury/pull/61
Full Changelog: https://github.com/obss/jury/compare/2.1.1...2.1.2
Published by devrimcavusoglu about 4 years ago
Jury - v2.1.1
What's Changed
- Seqeval: json normalization added. by @devrimcavusoglu in https://github.com/obss/jury/pull/55
- Read support from folders by @devrimcavusoglu in https://github.com/obss/jury/pull/57
Full Changelog: https://github.com/obss/jury/compare/2.1.0...2.1.1
Published by devrimcavusoglu about 4 years ago
Jury - v2.1.0
What's New
Tasks
We added a task-based metric system that lets Jury evaluate different types of input. The old system could only evaluate strings (generated text), and only for language generation tasks. Jury can therefore now support a broader set of metrics that work with different input types.
With this change, the jury.Jury API checks that the given metrics are consistent in terms of task. Jury raises an error if any pair of metrics is inconsistent in task (evaluation input).
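The task consistency check can be sketched as follows. `MetricSpec` and `check_task_consistency` are hypothetical names introduced only for this illustration, and the error type is an assumption; Jury's own classes and error messages may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MetricSpec:
    """Hypothetical stand-in for a loaded metric: a name plus its task."""

    name: str
    task: str


def check_task_consistency(metrics: List[MetricSpec]) -> str:
    """Return the task shared by all metrics, or raise if they disagree."""
    tasks = {m.task for m in metrics}
    if not tasks:
        raise ValueError("No metrics were given")
    if len(tasks) > 1:
        raise ValueError(f"Inconsistent tasks among metrics: {sorted(tasks)}")
    return tasks.pop()
```

For example, combining a `language-generation` metric with a `sequence-classification` metric would raise, because one expects strings and the other expects class labels.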
AutoMetric
- AutoMetric is introduced as the main factory class for automatically loading metrics. As a side note, `load_metric` is still available for backward compatibility and is preferred (it uses AutoMetric under the hood).
- Tasks are now distinguished within metrics. For example, precision can be used for the `language-generation` or the `sequence-classification` task, where the former evaluates from strings (generated text) and the latter evaluates from integers (class labels).
- In the configuration file, metrics can now be stated with the initialization parameters of HuggingFace's datasets metrics. The keyword arguments that metrics use during computation are now separated under the `"compute_kwargs"` key.
Full Changelog: https://github.com/obss/jury/compare/2.0.0...2.1.0
Published by devrimcavusoglu about 4 years ago
Jury - v2.0.0
Jury 2.0.0 is out
New Metric System
- The datasets package's Metric implementation is adopted (and extended) to provide high performance and a more unified interface.
- Custom metric implementation changed accordingly (it now requires 3 abstract methods to be implemented).
- The Jury class is now callable (it implements the `__call__()` method), though the evaluate() method is still available for backward compatibility.
- When calling evaluate on Jury, the `predictions` and `references` parameters must be passed as keyword arguments to prevent confusion and wrong computations (as with datasets' metrics).
- MetricCollator is removed; the methods for metrics are attached directly to the Jury class. Metric addition and removal can now be performed directly from a Jury instance.
- Jury now supports reading metrics from strings, lists and dictionaries, so it is more flexible about the input type of the metrics given along with their parameters.
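The callable interface with keyword-only arguments can be illustrated with a small self-contained sketch. The `Scorer` class, its `add_metric`/`remove_metric` methods, and the `exact_match` function are hypothetical stand-ins written for this example; they do not reproduce Jury's actual class.

```python
from typing import Callable, Dict, List

MetricFn = Callable[[List[str], List[str]], float]


class Scorer:
    """Hypothetical callable scorer; a sketch, not Jury's actual class."""

    def __init__(self, metrics: Dict[str, MetricFn]):
        self.metrics = dict(metrics)

    def __call__(self, *, predictions: List[str], references: List[str]) -> Dict[str, float]:
        # The bare `*` makes both parameters keyword-only,
        # so they cannot be swapped by position.
        return {name: fn(predictions, references) for name, fn in self.metrics.items()}

    def evaluate(self, *, predictions: List[str], references: List[str]) -> Dict[str, float]:
        # Thin alias kept for backward compatibility.
        return self(predictions=predictions, references=references)

    def add_metric(self, name: str, fn: MetricFn) -> None:
        self.metrics[name] = fn

    def remove_metric(self, name: str) -> None:
        del self.metrics[name]


def exact_match(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions identical to their reference."""
    return sum(p == r for p, r in zip(predictions, references)) / len(references)
```

With this design, `scorer(predictions=..., references=...)` and `scorer.evaluate(predictions=..., references=...)` return the same result, while a positional call raises a `TypeError`.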
New metrics
- Accuracy, F1, Precision, Recall are added to Jury metrics.
- All metrics in the datasets package are still available in jury through `jury.load_metric()`.
Development
- Test cases are improved with fixtures, and the test structure is enhanced.
- Expected outputs are now required for tests as a json with proper name.
Published by devrimcavusoglu about 4 years ago
Jury - v1.0.0
Release Notes
- New metric structure is completed.
  - Custom metric support is improved; custom metrics are no longer required to extend `datasets.Metric` and instead use `jury.metrics.Metric`.
  - Metric usage is unified with the `compute`, `preprocess` and `postprocess` functions, of which the only required implementation for a custom metric is `compute`.
  - Both string and `Metric` objects can now be passed to `Jury(metrics=metrics)` in a mixed fashion.
  - The `load_metric` function was rearranged to capture end score results, and several metrics were added accordingly (e.g. `load_metric("squad_f1")` will load the squad metric, which returns F1-score).
- An example notebook has been added to example.
  - MT and QA tasks are illustrated.
  - Custom metric creation is added as an example.
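The compute, preprocess and postprocess structure described in these notes can be pictured as a template-method pattern. The sketch below uses a hypothetical `BaseMetric` class and a made-up `LengthRatio` metric; it illustrates the idea only and does not reproduce `jury.metrics.Metric`.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class BaseMetric(ABC):
    """Hypothetical base class; a stand-in for the idea, not jury.metrics.Metric."""

    def preprocess(self, predictions: List[str], references: List[str]) -> Tuple[List[str], List[str]]:
        # Default: pass inputs through unchanged.
        return predictions, references

    @abstractmethod
    def compute(self, predictions: List[str], references: List[str]) -> Dict[str, float]:
        """The only method a custom metric must implement."""

    def postprocess(self, result: Dict[str, float]) -> Dict[str, float]:
        # Default: return the raw result unchanged.
        return result

    def score(self, predictions: List[str], references: List[str]) -> Dict[str, float]:
        predictions, references = self.preprocess(predictions, references)
        return self.postprocess(self.compute(predictions, references))


class LengthRatio(BaseMetric):
    """Made-up custom metric: total predicted tokens over total reference tokens."""

    def compute(self, predictions: List[str], references: List[str]) -> Dict[str, float]:
        pred_len = sum(len(p.split()) for p in predictions)
        ref_len = sum(len(r.split()) for r in references)
        return {"length_ratio": pred_len / ref_len}
```

A custom metric here overrides only `compute`; `preprocess` and `postprocess` can be overridden when normalization or rounding is needed.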
Acknowledgments
@fcakyon @cemilcengiz @devrimcavusoglu
Published by devrimcavusoglu over 4 years ago