https://github.com/camel-lab/camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    3 of 8 committers (37.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.6%) to scientific vocabulary

Keywords

arabic arabic-dialects dialect-identification morphological-analysis morphological-disambiguation morphological-generation morphological-reinflection named-entity-recognition nlp nlp-apis nlp-library pos-tagging sentiment-analysis stemming
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CAMeL-Lab
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 11.5 MB
Statistics
  • Stars: 482
  • Watchers: 18
  • Forks: 77
  • Open Issues: 27
  • Releases: 19
Created over 8 years ago · Last pushed 10 months ago
Metadata Files
Readme Contributing License

README.rst

CAMeL Tools
===========


.. image:: https://img.shields.io/pypi/v/camel-tools.svg
   :target: https://pypi.org/project/camel-tools
   :alt: PyPI Version

.. image:: https://img.shields.io/pypi/pyversions/camel-tools.svg
   :target: https://pypi.org/project/camel-tools
   :alt: PyPI Python Version

.. image:: https://readthedocs.org/projects/camel-tools/badge/?version=latest
   :target: https://camel-tools.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://img.shields.io/pypi/l/camel-tools.svg
   :target: https://opensource.org/licenses/MIT
   :alt: MIT License

|

.. image:: camel_tools_logo.png
   :target: camel_tools_logo.png
   :alt: CAMeL Tools Logo


Introduction
------------

CAMeL Tools is a suite of Arabic natural language processing tools developed by
the
`CAMeL Lab `_
at `New York University Abu Dhabi `_.

    **Please use** `GitHub Issues `_
    **to report a bug or if you need help using CAMeL Tools.**


Installation
------------

You will need Python 3.8 - 3.12 (64-bit) as well as
`the Rust compiler `_ installed.
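
As a quick sanity check before installing, the following minimal sketch tests whether the running interpreter falls in the range stated above (64-bit Python 3.8 through 3.12). The helper name ``is_supported_interpreter`` is hypothetical and is not part of CAMeL Tools; it does not check for the Rust compiler.

```python
import struct
import sys

# Supported range as stated in this README; adjust if the requirements change.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 12)


def is_supported_interpreter(version=None, pointer_bits=None):
    """Return True for a 64-bit interpreter with version 3.8 through 3.12.

    Hypothetical helper for illustration; not part of CAMeL Tools.
    """
    if version is None:
        version = sys.version_info[:2]
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit build, 32 on a 32-bit one.
        pointer_bits = struct.calcsize("P") * 8
    in_range = MIN_VERSION <= tuple(version[:2]) <= MAX_VERSION
    return in_range and pointer_bits == 64


# Report on the interpreter that runs this script.
print(is_supported_interpreter())
```

Only the major and minor version numbers are compared, so any patch release within the range is treated as supported.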

Linux/macOS
~~~~~~~~~~~

You will need to install some additional dependencies on Linux and macOS,
primarily CMake and Boost.

On Ubuntu/Debian you can install these dependencies by running:

.. code-block:: bash

   sudo apt-get install cmake libboost-all-dev

On macOS you can install them using Homebrew by running:

.. code-block:: bash

   brew install cmake boost

.. _linux-macos-install-pip:

Install using pip
^^^^^^^^^^^^^^^^^

.. code-block:: bash

   pip install camel-tools

   # or run the following if you already have camel_tools installed
   pip install camel-tools --upgrade

On Apple silicon Macs you may have to run the following instead:

.. code-block:: bash

   CMAKE_OSX_ARCHITECTURES=arm64 pip install camel-tools

   # or run the following if you already have camel_tools installed
   CMAKE_OSX_ARCHITECTURES=arm64 pip install camel-tools --upgrade

.. _linux-macos-install-source:

Install from source
^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   # Clone the repo
   git clone https://github.com/CAMeL-Lab/camel_tools.git
   cd camel_tools

   # Install from source
   pip install .

   # or run the following if you already have camel_tools installed
   pip install --upgrade .

.. _linux-macos-install-data:

Installing data
^^^^^^^^^^^^^^^

To install the datasets required by CAMeL Tools components run one of the
following:

.. code-block:: bash

   # To install all datasets
   camel_data -i all

   # or just the datasets for morphology and MLE disambiguation
   camel_data -i light

   # or just the default datasets for each component
   camel_data -i defaults

See `Available Packages `_
for a list of all available datasets.

By default, data is stored in ``~/.camel_tools``.
Alternatively, if you would like to install the data in a different location,
you need to set the :code:`CAMELTOOLS_DATA` environment variable to the desired
path.

To do so, add the following to your shell startup file (:code:`.bashrc`,
:code:`.zshrc`, :code:`.profile`, etc.):

.. code-block:: bash

   export CAMELTOOLS_DATA=/path/to/camel_tools_data
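
The lookup order described above (the ``CAMELTOOLS_DATA`` variable first, then ``~/.camel_tools``) can be sketched in Python as follows. This only mirrors the documented behavior and is not the actual CAMeL Tools implementation; the helper name ``resolve_data_dir`` is hypothetical, and treating an empty variable as unset is an assumption.

```python
import os
from pathlib import Path


def resolve_data_dir(environ=None, home=None):
    """Return the data directory implied by the documented rules.

    Hypothetical helper for illustration; not part of CAMeL Tools.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    override = environ.get("CAMELTOOLS_DATA")
    if override:  # assumption: an empty value counts as "not set"
        return Path(override).expanduser()
    # Documented default location on Linux/macOS.
    return home / ".camel_tools"


# Show which directory would be used in the current shell session.
print(resolve_data_dir())
```

Running this in a new shell after editing your startup file is one way to confirm that the exported variable is visible.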

Windows
~~~~~~~

**Note:** CAMeL Tools has been tested on Windows 10. The Dialect Identification
component is not available on Windows at this time.

.. _windows-install-pip:

Install using pip
^^^^^^^^^^^^^^^^^

.. code-block:: bash

   pip install camel-tools -f https://download.pytorch.org/whl/torch_stable.html

   # or run the following if you already have camel_tools installed
   pip install --upgrade -f https://download.pytorch.org/whl/torch_stable.html camel-tools

.. _windows-install-source:

Install from source
^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   # Clone the repo
   git clone https://github.com/CAMeL-Lab/camel_tools.git
   cd camel_tools

   # Install from source
   pip install -f https://download.pytorch.org/whl/torch_stable.html .

   # or run the following if you already have camel_tools installed
   pip install --upgrade -f https://download.pytorch.org/whl/torch_stable.html .

.. _windows-install-data:

Installing data
^^^^^^^^^^^^^^^

To install the data packages required by CAMeL Tools components, run one of the
following commands:

.. code-block:: bash

   # To install all datasets
   camel_data -i all

   # or just the datasets for morphology and MLE disambiguation
   camel_data -i light

   # or just the default datasets for each component
   camel_data -i defaults

See `Available Packages `_
for a list of all available datasets.

By default, data is stored in
``C:\Users\your_user_name\AppData\Roaming\camel_tools``.
Alternatively, if you would like to install the data in a different location,
you need to set the ``CAMELTOOLS_DATA`` environment variable to the desired
path. Below are the instructions to do so (on Windows 10):

* Press the **Windows** button and type ``env``.
* Click on **Edit the system environment variables (Control panel)**.
* Click on the **Environment Variables...** button.
* Click on the **New...** button under the **User variables** panel.
* Type ``CAMELTOOLS_DATA`` in the **Variable name** input box and the
  desired data path in **Variable value**. Alternatively, you can browse for the
  data directory by clicking on the **Browse Directory...** button.
* Click **OK** on all the opened windows.
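
To illustrate how the default path and the override relate on Windows, here is a minimal sketch. It assumes the documented default corresponds to ``%APPDATA%\camel_tools`` (``%APPDATA%`` normally points to ``AppData\Roaming``); the helper name ``windows_data_dir`` is hypothetical and this is not the CAMeL Tools implementation.

```python
from pathlib import PureWindowsPath


def windows_data_dir(environ):
    """Return the data directory implied by the documented rules on Windows.

    Hypothetical helper for illustration; not part of CAMeL Tools.
    """
    override = environ.get("CAMELTOOLS_DATA")
    if override:
        return PureWindowsPath(override)
    # Assumption: the documented default is the camel_tools folder under APPDATA.
    return PureWindowsPath(environ["APPDATA"]) / "camel_tools"


# Example environment using the placeholder user name from the text above.
example_env = {"APPDATA": r"C:\Users\your_user_name\AppData\Roaming"}
print(windows_data_dir(example_env))
```

Note that programs which were already running when the variable was created will not see it; open a new terminal before checking.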


Documentation
-------------

To get started, you can follow along with
`the Guided Tour `_
for a quick overview of the components provided by CAMeL Tools.

You can find the
`full online documentation here `_ for both
the command-line tools and the Python API.

Alternatively, you can build your own local copy of the documentation as
follows:

.. code-block:: bash

   # Install dependencies
   pip install sphinx myst-parser sphinx-rtd-theme

   # Go to docs subdirectory
   cd docs

   # Build HTML docs
   make html

This should compile all the HTML documentation into ``docs/build/html``.


Citation
--------

If you find CAMeL Tools useful in your research, please cite
`our paper `_:

.. code-block:: bibtex

   @inproceedings{obeid-etal-2020-camel,
      title = "{CAM}e{L} Tools: An Open Source Python Toolkit for {A}rabic Natural Language Processing",
      author = "Obeid, Ossama  and
         Zalmout, Nasser  and
         Khalifa, Salam  and
         Taji, Dima  and
         Oudah, Mai  and
         Alhafni, Bashar  and
         Inoue, Go  and
         Eryani, Fadhl  and
         Erdmann, Alexander  and
         Habash, Nizar",
      booktitle = "Proceedings of the 12th Language Resources and Evaluation Conference",
      month = may,
      year = "2020",
      address = "Marseille, France",
      publisher = "European Language Resources Association",
      url = "https://www.aclweb.org/anthology/2020.lrec-1.868",
      pages = "7022--7032",
      abstract = "We present CAMeL Tools, a collection of open-source tools for Arabic natural language processing in Python. CAMeL Tools currently provides utilities for pre-processing, morphological modeling, Dialect Identification, Named Entity Recognition and Sentiment Analysis. In this paper, we describe the design of CAMeL Tools and the functionalities it provides.",
      language = "English",
      ISBN = "979-10-95546-34-4",
   }


License
-------

CAMeL Tools is available under the MIT license.
See the `LICENSE file
`_
for more info.


Contribute
----------

If you would like to contribute to CAMeL Tools, please read the
`CONTRIBUTE.rst
`_
file.


Contributors
------------

* `Ossama Obeid `_
* `Go Inoue `_
* `Bashar Alhafni `_
* `Salam Khalifa `_
* `Dima Taji `_
* `Nasser Zalmout `_
* `Nizar Habash `_

Owner

  • Name: CAMeL Lab
  • Login: CAMeL-Lab
  • Kind: organization
  • Location: Abu Dhabi, UAE

The Computational Approaches to Modeling Language (CAMeL) Lab at New York University Abu Dhabi

GitHub Events

Total
  • Create event: 2
  • Release event: 2
  • Issues event: 14
  • Watch event: 69
  • Issue comment event: 10
  • Push event: 2
  • Pull request event: 1
  • Fork event: 7
Last Year
  • Create event: 2
  • Release event: 2
  • Issues event: 14
  • Watch event: 69
  • Issue comment event: 10
  • Push event: 2
  • Pull request event: 1
  • Fork event: 7

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 396
  • Total Committers: 8
  • Avg Commits per committer: 49.5
  • Development Distribution Score (DDS): 0.192
Top Committers
===============  =============  =======
Name             Email          Commits
===============  =============  =======
Ossama W. Obeid  o****o@o****m  320
Salam            s****a@g****m  28
Go Inoue         g****e@n****u  26
balhafni         a****i@u****u  13
balhafni         b****3@n****u  4
go-inoue         i****4@i****p  3
fadhleryani      3****i@u****m  1
Go Inoue         g****e@m****e  1
===============  =============  =======
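
The summary figures above can be reproduced from the per-committer commit counts. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption, although it is consistent with the numbers shown here. The function name is hypothetical.

```python
# Commit counts copied from the Top Committers list, one entry per identity.
commits = [320, 28, 26, 13, 4, 3, 1, 1]


def development_distribution_score(commit_counts):
    """Return 1 - (top committer's commits / total commits).

    Assumed definition for illustration, chosen because it matches the page.
    """
    total = sum(commit_counts)
    return 1 - max(commit_counts) / total


total_commits = sum(commits)
average = total_commits / len(commits)
dds = development_distribution_score(commits)

print(total_commits)   # 396
print(average)         # 49.5
print(round(dds, 3))   # 0.192
```

A score near 0 means one identity accounts for almost all commits; here a single committer made 320 of the 396 commits.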

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 104
  • Total pull requests: 35
  • Average time to close issues: 2 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 79
  • Total pull request authors: 11
  • Average comments per issue: 1.94
  • Average comments per pull request: 0.17
  • Merged pull requests: 24
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 14
  • Pull requests: 4
  • Average time to close issues: 10 days
  • Average time to close pull requests: 2 days
  • Issue authors: 13
  • Pull request authors: 2
  • Average comments per issue: 0.29
  • Average comments per pull request: 0.5
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • fadhleryani (5)
  • lancioni (5)
  • christios (3)
  • muhammed-abuodeh (3)
  • KarmelShehadeh (3)
  • omarabb315 (2)
  • ahmadabousetta (2)
  • israaexol (2)
  • JayR7 (2)
  • esulaiman (2)
  • Salah856 (2)
  • rrpelgrim (2)
  • 00SoCRaT00 (2)
  • awadomar34 (2)
  • mustafa0x (2)
Pull Request Authors
  • slkh (12)
  • go-inoue (8)
  • balhafni (4)
  • muhammed-abuodeh (2)
  • hadikhamoud (2)
  • taomoh (2)
  • AliAbedMohsen (1)
  • mustafa0x (1)
  • mansern (1)
  • AbdallahNasir (1)
  • fadhleryani (1)
Top Labels
Issue Labels
bug (44) question (36) enhancement (9) new feature (2)

Packages

  • Total packages: 3
  • Total downloads:
    • pypi 10,247 last-month
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 8
    (may contain duplicates)
  • Total versions: 53
  • Total maintainers: 1
pypi.org: camel-tools

  • Versions: 23
  • Dependent Packages: 2
  • Dependent Repositories: 8
  • Downloads: 10,247 Last month
  • Docker Downloads: 0
Rankings
Docker downloads count: 2.6%
Dependent packages count: 3.2%
Stargazers count: 3.4%
Average: 4.2%
Dependent repos count: 5.2%
Forks count: 5.2%
Downloads: 5.4%
Maintainers (1)
Last synced: 6 months ago
proxy.golang.org: github.com/camel-lab/camel_tools
  • Versions: 15
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 7.0%
Last synced: 6 months ago
proxy.golang.org: github.com/CAMeL-Lab/camel_tools
  • Versions: 15
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 7.0%
Last synced: 6 months ago

Dependencies

docs/requirements.txt pypi
pyproject.toml pypi
setup.py pypi