https://github.com/ccoreilly/spacy-catala-generator
Training and dataset used for the catalan spacy model
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
○DOI references
-
○Academic publication links
-
○Committers with academic emails
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (4.8%) to scientific vocabulary
Keywords
Repository
Training and dataset used for the catalan spacy model
Basic Info
Statistics
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Training script and dataset for spacy-catala
Note: this repository uses Git LFS for the files under train directory
$ ./train.sh
Usage: train.sh <model_name> <model_version> <vectors_path> <train_dir> <dev_dir> --nvectors [vector_size]
E.g.: train.sh ca_fasttext_md 1.2.0 cc.ca.300.vec.gz train dev --nvectors 50000
Train a spacy model.
vectors_path expects gzipped text vectors. These are not included, you can download them with:
$ curl -x -o cc.ca.300.vec.gz https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.ca.300.vec.gz
In order to recreate the large model in spacy-catala run:
$ ./train.sh ca_fasttext_wiki_lg 1.0.0 cc.ca.300.vec.gz train dev
The medium sized model has been pruned to the most common 20000 vectors using:
$ ./train.sh ca_fasttext_wiki_md 1.0.0 cc.ca.300.vec.gz train dev --nvectors 20000
Owner
- Name: Ciaran O'Reilly
- Login: ccoreilly
- Kind: user
- Location: Berlin
- Company: @parloa
- Website: https://oreilly.cat
- Repositories: 51
- Profile: https://github.com/ccoreilly
GitHub Events
Total
Last Year
Committers
Last synced: 11 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Ciaran O'Reilly | c****n@f****m | 5 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0