mahad

Maha is a text processing library specially developed to deal with Arabic text.

https://github.com/troboto/maha

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.8%) to scientific vocabulary

Keywords

arabic-cleaners arabic-nlp arabic-parsers arabic-text

Keywords from Contributors

kinetic-modeling
Last synced: 7 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: TRoboto
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 3.14 MB
Statistics
  • Stars: 208
  • Watchers: 9
  • Forks: 16
  • Open Issues: 13
  • Releases: 3
Topics
arabic-cleaners arabic-nlp arabic-parsers arabic-text
Created almost 5 years ago · Last pushed 8 months ago
Metadata Files
Readme, License, Citation

README.md




An Arabic text processing library intended for use in NLP applications


Maha is a text processing library developed specifically for Arabic text. The beta version can clean and parse text, files, and folders, with or without streaming.
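
To make "cleaning" and "streaming" concrete, here is a minimal standard-library sketch. It does not use Maha's API: the names `strip_tashkeel` and `clean_stream` are hypothetical, and the character range is only an assumption about what a diacritics cleaner would target. It removes the common Arabic diacritics (U+064B to U+0652) and the tatweel (U+0640), then applies the same cleaning lazily, one line at a time.

```python
import io
import re
from typing import Iterable, Iterator

# Common Arabic diacritics (tashkeel): fathatan (U+064B) through sukun (U+0652).
TASHKEEL = re.compile("[\u064B-\u0652]")
# Tatweel (kashida), a purely decorative elongation character.
TATWEEL = "\u0640"


def strip_tashkeel(text: str) -> str:
    """Return text without diacritics or tatweel (hypothetical helper)."""
    return TASHKEEL.sub("", text).replace(TATWEEL, "")


def clean_stream(lines: Iterable[str]) -> Iterator[str]:
    """Clean lines lazily so a large file is never fully loaded in memory."""
    for line in lines:
        yield strip_tashkeel(line)


# The word "marhaban" written with full diacritics, spelled out as escapes.
sample = "\u0645\u064E\u0631\u0652\u062D\u064E\u0628\u064B\u0627"
cleaned = strip_tashkeel(sample)  # the five bare letters remain

# Any iterable of lines works, including an open file handle.
fake_file = io.StringIO(sample + "\n" + sample + "\n")
cleaned_lines = list(clean_stream(fake_file))
```

Because `clean_stream` is a generator, passing it an open file handle processes the file line by line, which is the general idea behind streaming; how Maha implements this is described in its documentation.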

If you need help or want to discuss topics related to Maha, join our Discord server. To submit a bug report or feature request, please open an issue.

Installation

Simply run the following to install Maha:

    pip install mahad  # pronounced maha d

For source installation, check the documentation.
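
To confirm that the install succeeded, one option is to query the installed distribution's metadata. This is a sketch using only the standard library; the helper name `installed_version` is hypothetical. It assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library (on Python 3.7 the `importlib-metadata` backport would be needed instead).

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "mahad" is the distribution name used with pip above.
version = installed_version("mahad")
print(version if version is not None else "mahad is not installed")
```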

Overview

Check out the overview section in the documentation to get started with Maha.

Documentation

Documentation is hosted at ReadTheDocs.

Contributing

Maha welcomes contributions from everyone, and they are always appreciated. See the contribution guidelines in the documentation to get started.

License

Maha is licensed under the BSD 3-Clause license.

Owner

  • Name: Mohammad Al-Fetyani
  • Login: TRoboto
  • Kind: user
  • Location: Jordan

Machine Learning Engineer

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Al-Fetyani"
  given-names: "Mohammad"
  orcid: "https://orcid.org/0000-0002-7360-2403"
title: "Maha Processing Library"
version: v0.3.0
date-released: 2022-04-04
url: "https://github.com/TRoboto/Maha"

GitHub Events

Total
  • Watch event: 4
  • Push event: 3
  • Fork event: 1
Last Year
  • Watch event: 4
  • Push event: 3
  • Fork event: 1

Committers

Last synced: about 3 years ago

All Time
  • Total Commits: 666
  • Total Committers: 9
  • Avg Commits per committer: 74.0
  • Development Distribution Score (DDS): 0.072
Top Committers
Name                  Email           Commits
TRoboto               m****h@h****m   618
Mohammad Al-Fetyani   m****i@a****o   17
pre-commit-ci[bot]    6****]@u****m   11
Muhammad Al-Barham    m****8@g****m   11
github-actions[bot]   4****]@u****m   3
Saed SayedAhmed       3****1@u****m   3
Khaleel Jaber         9****l@u****m   1
0xRar                 3****r@u****m   1
dependabot[bot]       4****]@u****m   1
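
The summary figures under "All Time" can be reproduced from the committer table. The sketch below assumes that the Development Distribution Score is one minus the top committer's share of all commits. This page does not state that definition, but it matches the reported value of 0.072.

```python
# Commit counts copied from the Top Committers table.
commits = {
    "TRoboto": 618,
    "Mohammad Al-Fetyani": 17,
    "pre-commit-ci[bot]": 11,
    "Muhammad Al-Barham": 11,
    "github-actions[bot]": 3,
    "Saed SayedAhmed": 3,
    "Khaleel Jaber": 1,
    "0xRar": 1,
    "dependabot[bot]": 1,
}

total_commits = sum(commits.values())
avg_per_committer = total_commits / len(commits)

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - max(commits.values()) / total_commits

print(total_commits, avg_per_committer, round(dds, 3))  # 666 74.0 0.072
```

A score this close to zero means that a single account authored almost all of the commits.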

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 27
  • Total pull requests: 74
  • Average time to close issues: 3 months
  • Average time to close pull requests: 9 days
  • Total issue authors: 4
  • Total pull request authors: 9
  • Average comments per issue: 0.15
  • Average comments per pull request: 1.38
  • Merged pull requests: 57
  • Bot issues: 0
  • Bot pull requests: 26
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • PAIN-BARHAM (13)
  • TRoboto (12)
  • xaleel (1)
  • abzmkob (1)
Pull Request Authors
  • TRoboto (34)
  • github-actions[bot] (13)
  • pre-commit-ci[bot] (12)
  • PAIN-BARHAM (8)
  • saedx1 (3)
  • 0xRar (1)
  • dependabot[bot] (1)
  • xaleel (1)
  • mohamadmansourX (1)
Top Labels
Issue Labels
feature request (20), parsing (18), bug (6), good first issue (5), documentation (1), new feature (1)
Pull Request Labels
enhancement (9), new feature (7), bugfix (7), parsing (6), highlight (6), development (5), documentation (3), testing (1), maintenance (1), breaking changes (1), deprecation (1), dependencies (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 384 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 4
  • Total maintainers: 1
pypi.org: mahad

An Arabic text processing library intended for use in NLP applications.

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 384 Last month
Rankings
Stargazers count: 5.0%
Forks count: 9.3%
Dependent packages count: 10.1%
Average: 12.8%
Downloads: 17.9%
Dependent repos count: 21.5%
Maintainers (1)
Last synced: 7 months ago

Dependencies

docs/requirements.txt pypi
  • furo *
  • linuxdoc *
  • sphinx *
  • sphinx-autoapi *
  • sphinx-copybutton *
poetry.lock pypi
  • aiohttp 3.8.1 develop
  • aiosignal 1.2.0 develop
  • alabaster 0.7.12 develop
  • astroid 2.10.0 develop
  • async-timeout 4.0.2 develop
  • asynctest 0.13.0 develop
  • atomicwrites 1.4.0 develop
  • attrs 21.4.0 develop
  • babel 2.9.1 develop
  • beautifulsoup4 4.10.0 develop
  • black 21.12b0 develop
  • blacken-docs 1.12.1 develop
  • certifi 2021.10.8 develop
  • cffi 1.15.0 develop
  • cfgv 3.3.1 develop
  • charset-normalizer 2.0.12 develop
  • click 8.0.4 develop
  • coverage 6.3.2 develop
  • datasets 1.18.4 develop
  • deprecated 1.2.13 develop
  • dill 0.3.4 develop
  • distlib 0.3.4 develop
  • docutils 0.17.1 develop
  • filelock 3.6.0 develop
  • frozenlist 1.3.0 develop
  • fspath 20190323 develop
  • fsspec 2022.5.0 develop
  • furo 2021.11.23 develop
  • gitdb 4.0.9 develop
  • gitpython 3.1.27 develop
  • huggingface-hub 0.4.0 develop
  • identify 2.4.11 develop
  • idna 3.3 develop
  • imagesize 1.3.0 develop
  • importlib-metadata 4.11.2 develop
  • iniconfig 1.1.1 develop
  • isort 5.10.1 develop
  • jinja2 3.0.3 develop
  • lazy-object-proxy 1.7.1 develop
  • linuxdoc 20210324 develop
  • markupsafe 2.1.0 develop
  • multidict 6.0.2 develop
  • multiprocess 0.70.12.2 develop
  • mypy 0.910 develop
  • mypy-extensions 0.4.3 develop
  • nodeenv 1.6.0 develop
  • numpy 1.21.1 develop
  • packaging 21.3 develop
  • pandas 1.3.5 develop
  • pathspec 0.9.0 develop
  • platformdirs 2.5.1 develop
  • pluggy 1.0.0 develop
  • pre-commit 2.17.0 develop
  • py 1.11.0 develop
  • pyarrow 7.0.0 develop
  • pycparser 2.21 develop
  • pygithub 1.55 develop
  • pygments 2.11.2 develop
  • pyjwt 2.4.0 develop
  • pynacl 1.5.0 develop
  • pyparsing 3.0.7 develop
  • pytest 6.2.5 develop
  • pytest-cov 2.12.1 develop
  • pytz 2021.3 develop
  • pyyaml 6.0 develop
  • requests 2.27.1 develop
  • responses 0.18.0 develop
  • smmap 5.0.0 develop
  • snowballstemmer 2.2.0 develop
  • soupsieve 2.3.1 develop
  • sphinx 4.4.0 develop
  • sphinx-autoapi 1.8.4 develop
  • sphinx-copybutton 0.3.3 develop
  • sphinxcontrib-applehelp 1.0.2 develop
  • sphinxcontrib-devhelp 1.0.2 develop
  • sphinxcontrib-htmlhelp 2.0.0 develop
  • sphinxcontrib-jsmath 1.0.1 develop
  • sphinxcontrib-qthelp 1.0.3 develop
  • sphinxcontrib-serializinghtml 1.1.5 develop
  • toml 0.10.2 develop
  • tomli 1.2.3 develop
  • tox 3.24.5 develop
  • typed-ast 1.4.3 develop
  • types-python-dateutil 2.8.9 develop
  • unidecode 1.3.4 develop
  • urllib3 1.26.8 develop
  • virtualenv 20.13.3 develop
  • wrapt 1.13.3 develop
  • xxhash 3.0.0 develop
  • yarl 1.7.2 develop
  • zipp 3.7.0 develop
  • colorama 0.4.4
  • hijri-converter 2.2.3
  • python-dateutil 2.8.2
  • regex 2021.11.10
  • six 1.16.0
  • tqdm 4.63.0
  • typing-extensions 3.10.0.2
pyproject.toml pypi
  • GitPython ^3.1.24 develop
  • PyGithub ^1.55 develop
  • Sphinx ^4.1.2 develop
  • black ^21.5b1 develop
  • blacken-docs ^1.10.0 develop
  • datasets ^1.18.2 develop
  • furo ^2021.8.31 develop
  • isort ^5.8.0 develop
  • linuxdoc ^20210324 develop
  • mypy ^0.910 develop
  • pre-commit ^2.13.0 develop
  • pytest ^6.2.4 develop
  • pytest-cov ^2.12.1 develop
  • sphinx-autoapi ^1.8.4 develop
  • sphinx-copybutton ^0.3.1 develop
  • tox ^3.24.3 develop
  • types-python-dateutil ^2.8.0 develop
  • hijri-converter ^2.2.3
  • python ^3.7.1
  • python-dateutil ^2.8.2
  • regex ^2021.8.28
  • tqdm ^4.61.1
  • typing-extensions ^3.10.0
.github/workflows/ci.yml actions
  • Gr1N/setup-poetry v7 composite
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v1 composite
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/prepare_release.yml actions
  • Gr1N/setup-poetry v7 composite
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • peter-evans/create-or-update-comment v1 composite
  • peter-evans/create-pull-request v3 composite
.github/workflows/publish_pypi.yml actions
  • JRubics/poetry-publish v1.8 composite
  • actions/checkout v2 composite