https://github.com/ccoreilly/spacy-catala
spaCy NLP model for the Catalan language
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.7%) to scientific vocabulary
Repository
spaCy NLP model for the Catalan language
Basic Info
- Host: GitHub
- Owner: ccoreilly
- License: agpl-3.0
- Language: Python
- Default Branch: master
- Size: 13.7 KB
Statistics
- Stars: 16
- Watchers: 5
- Forks: 0
- Open Issues: 0
- Releases: 5
Metadata Files
README.md
[CA] Model pel processament del llenguatge natural en Català per a spaCy
Model per a spaCy de la llengua catalana generat a partir de:
- Vectors de paraules de fastText
- Gramàtica, morfologia i sintaxi fent servir dades del corpus d'AnCora
- Anotacions per a l'extracció d'entitats derivades de la Wikipedia (Cross-lingual Name Tagging and Linking for 282 Languages)
Els models es poden descarregar a la secció Publicacions (Releases).
Instal·lació i ús
Podeu escollir entre dos models. El model gran és més precís però com que spaCy carrega tot el model a memòria assegureu-vos de tenir-ne suficient.
| Dada | Mitjà | Gran |
|---|---|---|
| Nom | ca_fasttext_wiki_md | ca_fasttext_wiki_lg |
| Versió | 1.0.0 | 1.0.0 |
| spaCy | >=2.3.2,<2.4.0| >=2.3.2,<2.4.0|
| Mida | 62 MB| 1,16 GB |
| Pipeline | tagger, parser, ner | tagger, parser, ner |
| Vectors | 20.000 | 2.000.000 |
| Llicència | AGPL-3.0 |AGPL-3.0 |
| Autor | Ciaran O'Reilly |Ciaran O'Reilly |
Podeu instal·lar el model i fer-lo servir amb spaCy executant les següents ordres a la interfície de línia d'ordres:
```sh
# Per instal·lar el model mitjà
pip install https://github.com/ccoreilly/spacy-catala/releases/download/ca_fasttext_wiki_md-1.0.0/ca_fasttext_wiki_md-1.0.0-py3-none-any.whl
python -m spacy link ca_fasttext_wiki_md ca

# Per instal·lar el model gran
pip install https://github.com/ccoreilly/spacy-catala/releases/download/ca_fasttext_wiki_lg-1.0.0/ca_fasttext_wiki_lg-1.0.0-py3-none-any.whl
python -m spacy link ca_fasttext_wiki_lg ca
```
[EN] spaCy NLP Model for the Catalan language
spaCy NLP model for the Catalan language generated from:
- fastText word vectors
- The AnCora corpus for parts of speech, morphological features, and syntactic dependencies
- Wikipedia annotations for named entity extraction (Cross-lingual Name Tagging and Linking for 282 Languages)
Models can be found in the releases section of the repository.
Installing and using the model
You can choose between two models. The larger one is more accurate, but spaCy loads the whole model into memory, so make sure you have enough available.
| Field | Medium | Large |
|---|---|---|
| Name | ca_fasttext_wiki_md | ca_fasttext_wiki_lg |
| Version | 1.0.0 | 1.0.0 |
| spaCy | >=2.3.2,<2.4.0 | >=2.3.2,<2.4.0 |
| Size | 62 MB | 1.16 GB |
| Pipeline | tagger, parser, ner | tagger, parser, ner |
| Vectors | 20,000 | 2,000,000 |
| License | AGPL-3.0 | AGPL-3.0 |
| Author | Ciaran O'Reilly | Ciaran O'Reilly |
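To get a feel for the memory the vector table alone might need, the vector counts in the table can be turned into a rough estimate. This is a back-of-the-envelope sketch, not a measurement: it assumes 300-dimensional vectors (the size of the published fastText vectors; the README does not state the dimension used here) stored as 32-bit floats, and it ignores the tagger, parser, and NER weights. The helper name `vector_table_bytes` is hypothetical.

```python
def vector_table_bytes(n_vectors, dim=300, bytes_per_value=4):
    """Rough size of a dense vector table: rows * columns * bytes per value.

    dim=300 and bytes_per_value=4 (float32) are assumptions, not values
    taken from the model's metadata.
    """
    return n_vectors * dim * bytes_per_value


# Vector counts as listed in the table above.
for name, n_vectors in [
    ("ca_fasttext_wiki_md", 20_000),
    ("ca_fasttext_wiki_lg", 2_000_000),
]:
    size_gb = vector_table_bytes(n_vectors) / 1e9
    print(f"{name}: ~{size_gb:.3f} GB for the vector table alone")
```

Under these assumptions the medium model's vectors take about 0.024 GB and the large model's about 2.4 GB, which is why the large model calls for noticeably more free memory than its download size suggests.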
You can install the model and use it with spaCy by running the following commands on the command line:
```sh
# To install the medium-sized model
pip install https://github.com/ccoreilly/spacy-catala/releases/download/ca_fasttext_wiki_md-1.0.0/ca_fasttext_wiki_md-1.0.0-py3-none-any.whl
python -m spacy link ca_fasttext_wiki_md ca

# To install the larger model
pip install https://github.com/ccoreilly/spacy-catala/releases/download/ca_fasttext_wiki_lg-1.0.0/ca_fasttext_wiki_lg-1.0.0-py3-none-any.whl
python -m spacy link ca_fasttext_wiki_lg ca
```
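Once a model is installed and linked, it can be loaded from Python through the `ca` shortcut. The snippet below is a minimal sketch of what that might look like with the spaCy 2.3 API; the helper names `load_catalan_model` and `summarize` are hypothetical and not part of this repository, and the output depends on the model actually installed.

```python
def load_catalan_model(name="ca"):
    """Load the linked model; "ca" is the shortcut created by `spacy link`."""
    import spacy  # imported lazily so `summarize` can be used without spaCy

    return spacy.load(name)


def summarize(nlp, text):
    """Run the pipeline and collect tagger, parser, and NER output."""
    doc = nlp(text)
    # Part-of-speech tag and dependency label for every token.
    tokens = [(token.text, token.pos_, token.dep_) for token in doc]
    # Named entities with their labels.
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities
```

For instance, `tokens, entities = summarize(load_catalan_model(), "La Maria viu a Barcelona.")` would return the tagged tokens and any entities the pipeline recognises in that sentence.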
Owner
- Name: Ciaran O'Reilly
- Login: ccoreilly
- Kind: user
- Location: Berlin
- Company: @parloa
- Website: https://oreilly.cat
- Repositories: 51
- Profile: https://github.com/ccoreilly
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Ciaran O'Reilly | c****n@f****m | 3 |
| Ciaran O'Reilly | c****n@o****t | 1 |
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 4
- Total pull requests: 0
- Average time to close issues: 5 months
- Average time to close pull requests: N/A
- Total issue authors: 4
- Total pull request authors: 0
- Average comments per issue: 2.5
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- josejuanmartinez (1)
- jaumemir (1)
- juleur (1)
- jmsabate (1)