https://github.com/acdh-oeaw/acdh-prodigy-utils
custom loaders for spaCy's prodigy
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 2 of 2 committers (100.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.5%) to scientific vocabulary
Keywords
prodigy
spacy
Last synced: 6 months ago
Repository
Statistics
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Archived
Created over 6 years ago
· Last pushed almost 4 years ago
https://github.com/acdh-oeaw/acdh-prodigy-utils/blob/master/
# prodigy_utils
A bunch of custom loaders for prodigy:
* dsebaseapp
* transkribus
* sketch-engine
* django-rest-framework based APIs
# install
* clone the repo
* build the package (in your virtual environment): `python setup.py develop`
* add the needed API credentials to your `prodigy.json` config file, for example:
```json
{
  "api_keys": {
    "ske_user": "someusername",
    "ske_pw": "somepassword",
    "transkribus_user": "someusername",
    "transkribus_pw": "somepassword"
  }
}
```
Also install `lxml` and `requests`, e.g. with `pip install lxml requests`.
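To check that the credentials are in place, the `api_keys` entry can be read back from the config file. This is a minimal sketch, not the package's own code: the helper name `load_api_keys` is hypothetical, and the path to `prodigy.json` has to be passed in explicitly.

```python
import json
from pathlib import Path


def load_api_keys(config_path):
    """Return the "api_keys" mapping from a prodigy.json file.

    Hypothetical helper for illustration; returns an empty dict
    if the config has no "api_keys" entry.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return config.get("api_keys", {})
```

Usage: `load_api_keys("/path/to/prodigy.json").get("ske_user")` returns the configured sketch-engine user name, or `None` if it is missing.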
## example dsebaseapp
Annotate TEI documents stored in a dsebaseapp instance.
### create dataset
`python -m prodigy dataset asbw "ASBW-Retro for gold annotations"`
### Make NER-Gold-Data
`python -m prodigy ner.make-gold asbw de_core_news_sm https://asbw-retro.acdh-dev.oeaw.ac.at::asbw-retro::editions --loader from_dsebaseapp --label PER,ORG,LOC -U`
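In the commands in this README the source argument packs several values into one string, separated by `::`. How the loaders parse it internally is not shown here; the following is only a sketch of the splitting step, and the helper name `split_source` is hypothetical.

```python
def split_source(source, sep="::"):
    """Split a loader source string into its non-empty parts.

    Hypothetical helper; "://" inside a URL is left intact because
    only the double-colon separator is split on.
    """
    return [part for part in source.split(sep) if part]


parts = split_source("https://asbw-retro.acdh-dev.oeaw.ac.at::asbw-retro::editions")
print(parts)
# → ['https://asbw-retro.acdh-dev.oeaw.ac.at', 'asbw-retro', 'editions']
```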
## example django-rest-framework
`python -m prodigy ner.make-gold drf de_core_news_sm https://annotator.acdh-dev.oeaw.ac.at/api/nersampletodo/?format=json::text::50 --loader from_drf --label PER,ORG,LOC,MISC -U`
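A loader for such an API could look roughly like the sketch below. It is not the code of `from_drf`. It assumes that the endpoint uses Django REST framework pagination (responses with `results` and `next` keys) and that the three parts of the source string are the URL, the name of the text field and a maximum number of items. The function names `fetch_json` and `tasks_from_drf` are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def tasks_from_drf(url, text_field, limit, fetch=fetch_json):
    """Yield Prodigy-style task dicts from a paginated DRF list endpoint.

    Assumption: each page looks like {"results": [...], "next": url-or-None}.
    Items without text in `text_field` are skipped.
    """
    count = 0
    while url and count < limit:
        page = fetch(url)
        for item in page.get("results", []):
            if count >= limit:
                break
            text = item.get(text_field)
            if not text:
                continue
            yield {"text": text, "meta": {"source": url}}
            count += 1
        url = page.get("next")
```

When the limit comes from the command line it arrives as a string, so it would need converting first, e.g. `tasks_from_drf(url, "text", int("50"))`. The `fetch` parameter is only there so that the function can be tried out without network access.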
## example transkribus
### Make NER-Gold-Data
`python -m prodigy ner.make-gold asbw de_core_news_sm 44688::181839 --loader from_transkribus --label PER,ORG,LOC,MISC -U`
### text classifier
#### make a dataset
`python -m prodigy dataset mpr_retro_ungarn_textcat "MPR-Ungarn for text classification"`
#### start prodigy
`python -m prodigy textcat.manual mpr_retro_ungarn_textcat de_core_news_sm 45410::187485 --loader from_transkribus_regions --label PB,P,REGEST,NOTE,MINUTEH,OTHER`
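Transkribus stores page transcriptions as PAGE XML, so a region-based loader has to turn each text region into one task. The sketch below shows that step only, under assumptions: it is not the code of `from_transkribus_regions`, the function name `tasks_from_page_xml` is hypothetical, the region text is taken from the region-level `TextEquiv/Unicode` element, and the `{*}` namespace wildcard needs Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET


def tasks_from_page_xml(xml_string):
    """Yield one task per TextRegion that carries region-level text.

    Hypothetical helper; "{*}" matches any XML namespace, so the exact
    PAGE schema version does not have to be known.
    """
    root = ET.fromstring(xml_string)
    for region in root.findall(".//{*}TextRegion"):
        # Only the TextEquiv that is a direct child of the region,
        # not the ones nested inside its TextLine elements.
        node = region.find("{*}TextEquiv/{*}Unicode")
        text = (node.text or "").strip() if node is not None else ""
        if text:
            yield {"text": text, "meta": {"region_id": region.get("id")}}
```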
## example sketch-engine
### text classifier
#### make a dataset
`python -m prodigy dataset ske-amc "AMC for text classification"`
#### start prodigy
`python -m prodigy textcat.manual ske-amc de_core_news_sm amc_3.1 --loader from_ske_docs --label SPORT,CHRONIK,SONST`
### NER
`python -m prodigy ner.make-gold ske-amc de_core_news_sm amc3_demo --loader from_ske_docs`
### stand-alones
The folder `example_prodigy_standalones` contains additional examples of prodigy usage in which all the configuration is done within Python code itself. More info is in the README of that subfolder.
Owner
- Name: Austrian Centre for Digital Humanities & Cultural Heritage
- Login: acdh-oeaw
- Kind: organization
- Email: acdh@oeaw.ac.at
- Location: Vienna, Austria
- Website: https://www.oeaw.ac.at/acdh
- Repositories: 476
- Profile: https://github.com/acdh-oeaw
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Peter Andorfer | P****r@o****t | 22 |
| steff-vm | s****h@o****t | 5 |
Committer Domains (Top 20 + Academic)
oeaw.ac.at: 2
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
setup.py
pypi
- lxml >=4.6.1