https://github.com/camel-lab/arabic-atb-closed-class-list

A Modern Standard Arabic Closed-Class Word List

Repository

Basic Info
  • Host: GitHub
  • Owner: CAMeL-Lab
  • Default Branch: main
  • Size: 423 KB
Statistics
  • Stars: 1
  • Watchers: 3
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 4 years ago · Last pushed over 4 years ago

https://github.com/CAMeL-Lab/Arabic-ATB-closed-class-list/blob/main/

# A Modern Standard Arabic Closed-Class Word List

This repository contains a list of Modern Standard Arabic closed-class words, 
which can be used as a stop list for a variety of natural language processing 
applications. The list contains 740 inflected words and clitics in the Arabic 
Treebank (ATB) tokenization scheme (Maamouri et al., 2004; Habash, 2010). 
The inflected words are based on 309 lemmas from the Standard Arabic Morphological 
Analyzer, SAMA (Graff et al., 2009). 
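
As a rough illustration of the stop-list use case, the following Python sketch filters closed-class entries out of a token sequence. It assumes the list is available as plain text with one entry per line, that entries are in Arabic script, and that the input has already been tokenized in the ATB scheme with clitics marked by `+`; none of this is confirmed here, so check the actual file format in the repository. The function names and the four sample entries are hypothetical and are not taken from the list.

```python
from typing import Iterable, List, Set


def load_stop_list(lines: Iterable[str]) -> Set[str]:
    """Build a stop set from lines, assuming one entry per line (hypothetical format)."""
    return {line.strip() for line in lines if line.strip()}


def remove_stop_words(tokens: Iterable[str], stop: Set[str]) -> List[str]:
    """Keep only the tokens that are not in the stop set, preserving order."""
    return [token for token in tokens if token not in stop]


# Illustrative entries only; the real list has 740 entries and may use a
# different script or clitic marking.
sample_entries = ["في", "من", "و+", "+ه"]
stop = load_stop_list(sample_entries)

# A sentence assumed to be pre-tokenized in the ATB scheme.
tokens = ["و+", "ذهب", "الولد", "في", "السيارة"]
content = remove_stop_words(tokens, stop)
```

In practice, `load_stop_list` could be given an open file handle instead of an in-memory list, since it only needs an iterable of lines.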

The list was created by Wael Salloum and Nizar Habash.
The repository contains a technical report detailing its design decisions.

If you use this resource, please cite:

* Wael Salloum and Nizar Habash. 2012. A Modern Standard Arabic Closed-Class Word List. [Columbia University's Center for Computational Learning Systems Tech Report #CCLS-12-03](https://academiccommons.columbia.edu/doi/10.7916/D8K93GSN).

## References 
1. D. Graff, M. Maamouri, B. Bouziri, S. Krouna, S. Kulick, and T. Buckwalter. Standard Arabic Morphological Analyzer (SAMA) Version 3.1, 2009. Linguistic Data Consortium LDC2009E73.
2. N. Habash. Introduction to Arabic Natural Language Processing. Morgan & Claypool Publishers, 2010.
3. M. Maamouri, A. Bies, T. Buckwalter, and W. Mekki. The Penn Arabic Treebank: Building a Large-Scale Annotated Arabic Corpus. In NEMLAR Conference on Arabic Language Resources and Tools, pages 102–109, Cairo, Egypt, 2004.

Owner

  • Name: CAMeL Lab
  • Login: CAMeL-Lab
  • Kind: organization
  • Location: Abu Dhabi, UAE

The Computational Approaches to Modeling Language (CAMeL) Lab at New York University Abu Dhabi
