Science Score: 62.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 5 committers (20.0%) from academic institutions
- ✓ Institutional organization owner: Organization ucrel has institutional domain (ucrel.lancs.ac.uk)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.4%) to scientific vocabulary
Repository
Python Multilingual Ucrel Semantic Analysis System
Basic Info
- Host: GitHub
- Owner: UCREL
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://ucrel.github.io/pymusas/
- Size: 3.06 MB
Statistics
- Stars: 31
- Watchers: 8
- Forks: 14
- Open Issues: 20
- Releases: 3
Metadata Files
README.md
PyMUSAS
PyMUSAS, the Python Multilingual Ucrel Semantic Analysis System, is a rule-based token and Multi Word Expression semantic tagger. The tagger can support any semantic tagset; however, the tagset we have concentrated on, and released pre-configured spaCy components for, is the Ucrel Semantic Analysis System (USAS).
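To make "rule-based token and Multi Word Expression tagging" concrete, here is a toy sketch of lexicon lookup that prefers the longest matching multi word expression. It is not PyMUSAS's implementation or API: the function `tag_tokens`, both lexicons, and their entries are hypothetical, and the tags are only USAS-style labels chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicons; real USAS lexicons are far larger and richer.
TOKEN_LEXICON: Dict[str, str] = {"river": "W3", "bank": "I1", "the": "Z5"}
MWE_LEXICON: Dict[Tuple[str, ...], str] = {("river", "bank"): "W3"}
UNMATCHED_TAG = "Z99"  # USAS-style label for tokens with no lexicon entry


def tag_tokens(tokens: List[str]) -> List[Tuple[str, str]]:
    """Tag tokens, preferring the longest matching multi word expression."""
    longest_mwe = max((len(mwe) for mwe in MWE_LEXICON), default=1)
    tagged: List[Tuple[str, str]] = []
    index = 0
    while index < len(tokens):
        matched = False
        # Try the longest candidate span first, down to two tokens.
        for span in range(min(longest_mwe, len(tokens) - index), 1, -1):
            candidate = tuple(t.lower() for t in tokens[index:index + span])
            if candidate in MWE_LEXICON:
                tag = MWE_LEXICON[candidate]
                tagged.extend((t, tag) for t in tokens[index:index + span])
                index += span
                matched = True
                break
        if not matched:
            # Fall back to a single-token lookup.
            token = tokens[index]
            tagged.append((token, TOKEN_LEXICON.get(token.lower(), UNMATCHED_TAG)))
            index += 1
    return tagged


if __name__ == "__main__":
    print(tag_tokens(["The", "river", "bank", "flooded"]))
```

In this sketch "bank" receives the tag of the expression "river bank" rather than its single-token tag, which is the kind of disambiguation MWE rules provide.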
Documentation
- 📚 Usage Guides - What the package is, tutorials, how to guides, and explanations.
- 🔎 API Reference - The docstrings of the library, with minimum working examples.
- 🚀 Roadmap
Language support
PyMUSAS currently supports 10 languages with pre-configured spaCy components that can be downloaded; each language has its own guide on how to tag text using PyMUSAS. The table below lists the supported languages, whether the model for each language supports Multi Word Expression (MWE) identification and tagging (all languages support token-level tagging by default), and the size of the model:
| Language (BCP 47 language code) | MWE Support | Size |
| --- | --- | --- |
| Mandarin Chinese (cmn) | :heavy_check_mark: | 1.28MB |
| Welsh (cy) | :heavy_check_mark: | 1.09MB |
| Spanish, Castilian (es) | :heavy_check_mark: | 0.20MB |
| Finnish (fi) | :x: | 0.63MB |
| French (fr) | :x: | 0.08MB |
| Indonesian (id) | :x: | 0.24MB |
| Italian (it) | :heavy_check_mark: | 0.50MB |
| Dutch, Flemish (nl) | :x: | 0.15MB |
| Portuguese (pt) | :heavy_check_mark: | 0.27MB |
| English (en) | :heavy_check_mark: | 0.88MB |
Install PyMUSAS
PyMUSAS can be installed on all operating systems and supports Python >= 3.7. To install it, run:
```bash
pip install pymusas
```
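As a quick pre-install check, the sketch below tests whether the running interpreter meets the Python >= 3.7 requirement stated above. The helper name `meets_minimum_python` is hypothetical and not part of PyMUSAS.

```python
import sys
from typing import Sequence

MINIMUM_PYTHON = (3, 7)  # Minimum version stated in this README


def meets_minimum_python(version: Sequence[int] = sys.version_info) -> bool:
    """Return True if `version` (major, minor, ...) is at least MINIMUM_PYTHON."""
    return tuple(version[:2]) >= MINIMUM_PYTHON


if __name__ == "__main__":
    if meets_minimum_python():
        print("Python version is new enough for `pip install pymusas`.")
    else:
        print("Python >= 3.7 is required.")
```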
Development
When developing on the project, install the Python package locally in editable mode with all the extra requirements:
```bash
pip install -e .[tests]
```
In zsh, the default shell on recent Macs, you need to escape the brackets with `\`:
```zsh
pip install -e .\[tests\]
```
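The escaping is needed because zsh treats square brackets as glob pattern characters. One way to sidestep shell quoting altogether is to build the command as an argument list in Python. The following is a minimal sketch; the helper names `editable_install_command` and `as_shell_string` are hypothetical.

```python
import shlex
import sys
from typing import List


def editable_install_command(extras: str = "tests") -> List[str]:
    """Build the argument list for an editable install with the given extras."""
    return [sys.executable, "-m", "pip", "install", "-e", f".[{extras}]"]


def as_shell_string(args: List[str]) -> str:
    """Render an argument list as a string that is safe to paste into a POSIX shell."""
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Printing only; pass the list to subprocess.run(...) to actually install.
    print(as_shell_string(editable_install_command()))
```

Because `shlex.quote` wraps `.[tests]` in single quotes, the printed command works unchanged in both bash and zsh.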
Running linters and tests
This code base uses isort, flake8, and mypy to ensure that the format of the code is consistent and that it contains type hints. The flake8 settings can be found in ./setup.cfg and the mypy settings in ./pyproject.toml. To run these linters:
```bash
isort pymusas tests scripts
flake8
mypy
```
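If you prefer a single entry point, the three linter commands can be chained so that the first failure stops the run. This is a sketch, not a script shipped with the repository: `run_checks` and `LINT_COMMANDS` are hypothetical names, and the `runner` parameter exists only so the function can be exercised without the tools installed.

```python
import subprocess
from typing import Callable, List, Sequence

# The same linter commands as in this README, expressed as argument lists.
LINT_COMMANDS: List[List[str]] = [
    ["isort", "pymusas", "tests", "scripts"],
    ["flake8"],
    ["mypy"],
]


def run_checks(
    commands: Sequence[Sequence[str]] = LINT_COMMANDS,
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> int:
    """Run each command in order; return the first non-zero exit code, else 0."""
    for command in commands:
        exit_code = runner(command)
        if exit_code != 0:
            return exit_code
    return 0
```

Calling `run_checks()` from the repository root, with the development requirements installed, runs the linters in order and returns an exit code suitable for `sys.exit`.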
To run the tests with code coverage (note: this is the code coverage that the Continuous Integration (CI) reports at the top of this README; the doc tests are not part of this report):
```bash
coverage run # Runs the tests (uses pytest)
coverage report # Produces a report on the test coverage
```
To run the doc tests, which ensure that the examples within the documentation run as expected:
```bash
coverage run -m pytest --doctest-modules pymusas/ # Runs the doc tests
coverage report # Produces a report on the doc tests coverage
```
Team
PyMUSAS is an open-source project that has been created and funded by the University Centre for Computer Corpus Research on Language (UCREL) at Lancaster University. For more information on who has contributed to this code base see the contributions page.
Owner
- Name: UCREL
- Login: UCREL
- Kind: organization
- Email: ucrel@lancaster.ac.uk
- Location: Lancaster, UK
- Website: https://ucrel.lancs.ac.uk/
- Repositories: 48
- Profile: https://github.com/UCREL
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
  PyMUSAS: Python Multilingual Ucrel Semantic
  Analysis System
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Andrew
    family-names: Moore
    email: a.moore@lancaster.ac.uk
    affiliation: Lancaster University
    orcid: 'https://orcid.org/0000-0002-3395-0841'
  - given-names: Paul
    family-names: Rayson
    orcid: 'https://orcid.org/0000-0002-1257-2191'
    email: p.rayson@lancaster.ac.uk
    affiliation: Lancaster University
repository-code: 'https://github.com/ucrel/pymusas'
url: 'https://ucrel.github.io/pymusas/'
license: Apache-2.0
version: 0.3.0
date-released: '2022-04-04'
GitHub Events
Total
- Watch event: 2
- Issue comment event: 1
- Pull request event: 1
- Fork event: 2
Last Year
- Watch event: 2
- Issue comment event: 1
- Pull request event: 1
- Fork event: 2
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Andrew Moore | a****4@g****m | 377 |
| Robin Long | r****1@h****k | 11 |
| Paul Rayson | p****n@l****k | 7 |
| Nathan Ellis Rasmussen | e****n | 1 |
| Daisy Lal | 1****1 | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 32
- Total pull requests: 11
- Average time to close issues: about 2 months
- Average time to close pull requests: about 20 hours
- Total issue authors: 5
- Total pull request authors: 3
- Average comments per issue: 1.09
- Average comments per pull request: 1.0
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 3.0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- apmoore1 (26)
- FahdCodes (3)
- jasp9559 (1)
- karmalet (1)
- MarcRoigVilamala (1)
Pull Request Authors
- apmoore1 (8)
- longr (3)
- eritain (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 272 last month
- Total docker downloads: 439
- Total dependent packages: 0
- Total dependent repositories: 3
- Total versions: 3
- Total maintainers: 1
pypi.org: pymusas
PYthon Multilingual Ucrel Semantic Analysis System
- Homepage: https://ucrel.github.io/pymusas/
- Documentation: https://pymusas.readthedocs.io/
- License: Apache License 2.0
- Latest release: 0.3.0 (published almost 4 years ago)
Dependencies
- @docusaurus/core 2.0.0-beta.14
- @docusaurus/preset-classic 2.0.0-beta.14
- @mdx-js/react ^1.6.21
- clsx ^1.1.1
- prism-react-renderer ^1.2.1
- react ^17.0.1
- react-dom ^17.0.1
- 1132 dependencies
- conllu ==4.4.1
- datasets ==1.18.3
- it_core_news_sm *
- coverage >=6.0.0 development
- flake8 >=3.8.0,<3.10.0 development
- isort >=5.5.4 development
- mypy ==0.910 development
- pydoc-markdown >=4.0.0,<4.6.0 development
- pytest >=6.0.0 development
- responses >=0.16.0 development
- types-requests * development
- click <8.1.0
- requests >=2.13.0,<3.0.0
- spacy >=3.0
- spacy >=3.1.4
- srsly >=2.4.1,<3.0.0
- tqdm >=4.50.0,<5.0.0
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/checkout v3 composite
- actions/setup-python v2 composite
- actions/upload-artifact v2.3.0 composite
- codecov/codecov-action v2 composite
- dieghernan/cff-validator main composite
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/checkout v3 composite
- actions/setup-node v2 composite
- actions/setup-node v3 composite
- actions/setup-python v2 composite
- peaceiris/actions-gh-pages v3 composite
- pip
- python 3.9.*