https://github.com/argilla-io/argilla-plugins

🔌 Open-source plugins with practical features for Argilla using listeners.

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (12.7%) to scientific vocabulary

Last synced: 10 months ago · JSON representation

Repository


Basic Info

Host: GitHub
Owner: argilla-io
License: apache-2.0
Language: Python
Default Branch: main
Size: 456 KB

Statistics

Stars: 6
Watchers: 4
Forks: 3
Open Issues: 28
Releases: 0

Created over 3 years ago · Last pushed about 3 years ago

Metadata Files

Readme License

Argilla Plugins

🔌 Open-source plugins for extra features and workflows

Why? Argilla is intentionally programmable by design: developers can build complex workflows for reading and updating datasets. However, certain workflows and features are shared across different use cases and could be simpler from a developer-experience perspective. To facilitate the reuse of key workflows and empower the community, Argilla Plugins provides a collection of extensions to supercharge your Argilla use cases. Some of these pluggable methods could eventually be integrated into the core of Argilla.

Quickstart

```bash
pip install argilla-plugins
```

```python
from argilla_plugins.datasets import end_of_life

plugin = end_of_life(
    name="plugin-test",
    end_of_life_in_seconds=100,
    execution_interval_in_seconds=5,
    discard_only=False,
)
plugin.start()
```

How to develop a plugin

1. Pick a cool plugin from the list of topics or our issue overview.
2. Think about an abstraction for the plugin as shown below.
3. Refer to the solution in the issue.
   1. Fork the repo.
   2. Commit your code.
   3. Open a PR.
4. Keep it simple.
5. Have fun.

Development requirements

Function

We want to keep the plugins as abstract as possible, so each one must be usable within 3 lines of code.

```python
from argilla_plugins.topic import plugin
plugin = plugin(name="dataset_name", ws="workspace", query="query", interval=1.0)
plugin.start()
```
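To illustrate the kind of abstraction this asks for, here is a minimal sketch of an object that is configured once and then runs an action at a fixed interval after `start()`. The class `IntervalPlugin` and its `action` callback are hypothetical names for illustration only; this is not the argilla-plugins implementation, which builds on Argilla listeners.

```python
import threading
from typing import Callable, Optional


class IntervalPlugin:
    """Hypothetical stand-in for a plugin object; not the argilla-plugins API."""

    def __init__(self, name: str, action: Callable[[str], None], interval: float = 1.0):
        self.name = name          # dataset name the action operates on
        self.action = action      # callback invoked with the dataset name
        self.interval = interval  # seconds between two invocations
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() was called.
        while not self._stop.wait(self.interval):
            self.action(self.name)

    def start(self) -> None:
        # Run in a daemon thread so the caller is not blocked.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

With such a design, the user only has to construct the object and call `start()`, which keeps usage within the three-line budget.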

Variables

The variables name, ws, and query should be reused across all plugins wherever possible. Some functions may need adapted variants of these names, such as name_from or query_from.

Oh, and don't forget to have fun! 🤓

Topics

Reporting

What is it? Create interactive reports about dataset activity, dataset features, annotation tasks, model predictions, and more.

Plugins:
- [ ] automated reporting plugin using datapane. issue
- [ ] automated reporting plugin for great-expectations. issue

Datasets

What is it? Everything that involves operations on a dataset level, like dividing work, syncing datasets, and deduplicating records.

Plugins:
- [ ] sync data between datasets.
  - [ ] directional A->B. issue
  - [ ] bi-directional A <-> B. issue
- [ ] remove duplicate records. issue
- [ ] create train test splits. issue
- [ ] set limits to records in datasets
  - [X] end of life time. issue
  - [ ] max # of records. issue
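As a rough idea of what a dataset-level operation such as "remove duplicate records" could involve, the sketch below deduplicates plain dictionaries by normalized text. The function name `remove_duplicates` and the `text` field are assumptions made for this example; the plugin itself is still an open item in the list above and may work differently.

```python
from typing import Dict, Iterable, List


def remove_duplicates(records: Iterable[Dict[str, str]], key: str = "text") -> List[Dict[str, str]]:
    """Keep the first record for each distinct normalized text value."""
    seen = set()
    unique = []
    for record in records:
        # Normalize case and whitespace so trivial variants count as duplicates.
        normalized = " ".join(record[key].lower().split())
        if normalized not in seen:
            seen.add(normalized)
            unique.append(record)
    return unique
```

Exact-match deduplication after normalization is the simplest choice; near-duplicate detection would need a similarity measure instead.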

End of Life

Automatically delete or discard records after x seconds.

```python
from argilla_plugins.datasets import end_of_life

plugin = end_of_life(
    name="plugin-test",
    end_of_life_in_seconds=100,
    execution_interval_in_seconds=5,
    discard_only=False,
)
plugin.start()
```
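The selection step behind an end-of-life rule can be sketched without Argilla at all: given a timestamp per record, pick the records older than the configured lifetime. The helper `expired_record_ids` and its input mapping are hypothetical; which timestamp the plugin actually compares against is not stated here.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def expired_record_ids(
    event_timestamps: Dict[str, datetime],
    end_of_life_in_seconds: float,
    now: datetime,
) -> List[str]:
    """Return ids of records whose timestamp is older than the lifetime."""
    cutoff = now - timedelta(seconds=end_of_life_in_seconds)
    return [record_id for record_id, ts in event_timestamps.items() if ts < cutoff]
```

A periodic job would then delete or discard the returned ids, depending on a setting like `discard_only`.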

Programmatic Labelling

What is it? Automatically update the annotation and prediction labels of records based on heuristics.

Plugins:
- [X] annotated spans as gazetteer for labelling. issue
- [ ] vector search queries and similarity threshold. issue
- [ ] use gazetteer for labelling. issue
- [ ] materialize annotations/predictions from rules using Snorkel or a MajorityVoter. issue

Token Copycat

If we annotate spans in texts for tasks like NER, we can be relatively certain that the same spans should be annotated the same way throughout the entire dataset. We can use this assumption to start annotating or predicting on previously unseen data.

```python
from argilla_plugins import token_copycat

plugin = token_copycat(
    name="plugin-test",
    query=None,
    copy_predictions=True,
    word_dict_kb_predictions={"key": {"label": "label", "score": 0}},
    copy_annotations=True,
    word_dict_kb_annotations={"key": {"label": "label", "score": 0}},
    included_labels=["label"],
    case_sensitive=True,
    execution_interval_in_seconds=1,
)
plugin.start()
```
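The underlying idea, reusing already annotated spans as a gazetteer for unseen text, can be sketched in plain Python. The functions `build_gazetteer` and `copy_spans` and the `(start, end, label)` span format are assumptions for illustration and do not reflect how `token_copycat` is implemented.

```python
import re
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label), end exclusive


def build_gazetteer(annotated: List[Tuple[str, List[Span]]], case_sensitive: bool = True) -> Dict[str, str]:
    """Map each annotated surface form to its label."""
    gazetteer: Dict[str, str] = {}
    for text, spans in annotated:
        for start, end, label in spans:
            surface = text[start:end]
            gazetteer[surface if case_sensitive else surface.lower()] = label
    return gazetteer


def copy_spans(text: str, gazetteer: Dict[str, str], case_sensitive: bool = True) -> List[Span]:
    """Find whole-word occurrences of known surface forms in unseen text."""
    flags = 0 if case_sensitive else re.IGNORECASE
    found: List[Span] = []
    for surface, label in gazetteer.items():
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text, flags):
            found.append((match.start(), match.end(), label))
    return sorted(found)
```

A surface form that was annotated with different labels keeps only the last label seen in this sketch; a real plugin would need a policy for such conflicts.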

Active learning

What is it? A process during which a learning algorithm can interactively query a user (or some other information source) to label new data points.

Plugins:
- [ ] active learning for TextClassification.
  - [X] classy-classification. issue
  - [ ] small-text. issue
- [ ] active learning for TokenClassification. issue

```python
from argilla_plugins import classy_learner

plugin = classy_learner(
    name="plugin-test",
    query=None,
    model="all-MiniLM-L6-v2",
    classy_config=None,
    certainty_threshold=0,
    overwrite_predictions=True,
    sample_strategy="fifo",
    min_n_samples=6,
    max_n_samples=20,
    batch_size=1000,
    execution_interval_in_seconds=5,
)
plugin.start()
```
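The querying step of active learning is often implemented as least-confidence sampling: ask the user to label the records the model is least sure about. The sketch below shows that generic technique; the function `least_confident` is a hypothetical helper, and this is not necessarily the strategy `classy_learner` uses (its example above is configured with a `fifo` sample strategy).

```python
from typing import Dict, List


def least_confident(probabilities: Dict[str, List[float]], n_samples: int) -> List[str]:
    """Return the ids of the n records with the lowest top-class probability."""
    # Confidence of a record is the probability of its most likely class.
    confidence = {record_id: max(probs) for record_id, probs in probabilities.items()}
    ranked = sorted(confidence, key=confidence.get)
    return ranked[:n_samples]
```

The selected ids would be presented for annotation first, and the model would be refit once enough new labels have arrived.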

Inference endpoints

What is it? Automatically add predictions to records as they are logged into Argilla. This makes it easy to pre-annotate a dataset with an existing model or service.

- [ ] inference with un-authenticated endpoint. issue
- [ ] embed incoming records in the background. issue

Training endpoints

What is it? Automatically train a model based on dataset annotations.

- [ ] TBD

Suggestions

Do you have any suggestions? Please open an issue 🤓

Owner

Name: Argilla
Login: argilla-io
Kind: organization
Email: contact@argilla.io

Website: https://argilla.io
Twitter: argilla_io
Repositories: 12
Profile: https://github.com/argilla-io

Building the open-source tool for data-centric NLP

GitHub Events

Total

Fork event: 1

Last Year

Fork event: 1

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 30
Total pull requests: 13
Average time to close issues: about 1 month
Average time to close pull requests: 3 months
Total issue authors: 3
Total pull request authors: 5
Average comments per issue: 0.67
Average comments per pull request: 1.92
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 7

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

davidberenstein1957 (27)
bengsoon (1)

Pull Request Authors

dependabot[bot] (4)
frascuchon (2)
FaridChouchane (1)
Gnonpi (1)
dvsrepo (1)

Top Labels

Issue Labels

datasets (7) programmatic-labelling (6) inference (6) structure (6) help wanted (5) active-learning (3) reporting (2)

Pull Request Labels

dependencies (4)

Packages

Total packages: 1
Total downloads:
- pypi 43 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 4
Total maintainers: 1

pypi.org: argilla-plugins

🔌 Open-source plugins with practical features for Argilla using listeners.

Documentation: https://argilla-plugins.readthedocs.io/
License: Apache 2.0
Latest release: 0.1.3
published over 3 years ago

Versions: 4
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 43 Last month

Rankings

Dependent packages count: 6.6%

Downloads: 13.6%

Average: 17.0%

Dependent repos count: 30.6%

Maintainers (1)

David.Berenstein

Last synced: 10 months ago

Dependencies

poetry.lock pypi

154 dependencies

pyproject.toml pypi

argilla ^1.1.1
datapane ^0.15.5
great-expectations ^0.15
python >=3.8,<3.11.0
rich ^13.0.0
typer ^0.7.0
