https://github.com/camel-lab/arabic-atb-closed-class-list
A Modern Standard Arabic Closed-Class Word List
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (6.3%) to scientific vocabulary
Last synced: 10 months ago
Repository
A Modern Standard Arabic Closed-Class Word List
Basic Info
- Host: GitHub
- Owner: CAMeL-Lab
- Default Branch: main
- Size: 423 KB
Statistics
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 0
- Releases: 0
- Created: over 4 years ago
- Last pushed: over 4 years ago
https://github.com/CAMeL-Lab/Arabic-ATB-closed-class-list/blob/main/
# A Modern Standard Arabic Closed-Class Word List

This repository contains a list of Modern Standard Arabic closed-class words, which can be used as a stop list for a variety of natural language processing applications. The list contains 740 inflected words and clitics in the Arabic Treebank (ATB) tokenization scheme (Maamouri et al., 2004; Habash, 2010). The inflected words are based on 309 lemmas from the Standard Arabic Morphological Analyzer, SAMA (Graff et al., 2009).

The list was created by Wael Salloum and Nizar Habash. The repository contains a technical report detailing its design decisions.

If you use this resource, please cite:

* Wael Salloum and Nizar Habash. 2012. A Modern Standard Arabic Closed-Class Word List. [Columbia University's Center for Computational Learning Systems Tech Report #CCLS-12-03](https://academiccommons.columbia.edu/doi/10.7916/D8K93GSN).

## References

1. D. Graff, M. Maamouri, B. Bouziri, S. Krouna, S. Kulick, and T. Buckwalter. Standard Arabic Morphological Analyzer (SAMA) Version 3.1, 2009. Linguistic Data Consortium LDC2009E73.
2. N. Habash. Introduction to Arabic Natural Language Processing. Morgan & Claypool Publishers, 2010.
3. M. Maamouri, A. Bies, T. Buckwalter, and W. Mekki. The Penn Arabic Treebank: Building a Large-Scale Annotated Arabic Corpus. In NEMLAR Conference on Arabic Language Resources and Tools, pages 102-109, Cairo, Egypt, 2004.
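
The README says the list can serve as a stop list. The sketch below shows one way that might look in Python. It is not taken from the repository: the function names are hypothetical, and the input format (one entry per line, with the word in the first tab-separated field and `#` marking comments) is an assumption that should be checked against the actual file. The sample entries are illustrative only and are not copied from the list.

```python
from typing import Iterable, List, Set


def load_closed_class_words(lines: Iterable[str]) -> Set[str]:
    """Build a stop set from lines of text.

    Assumed format (not confirmed by the repository): one entry per line,
    the word in the first tab-separated field, '#' starting a comment line.
    """
    words: Set[str] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        words.add(line.split("\t")[0].strip())
    return words


def remove_closed_class(tokens: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Return the tokens that are not in the stop set, preserving order."""
    return [tok for tok in tokens if tok not in stop_words]


if __name__ == "__main__":
    # Illustrative entries only; in practice, pass an open file object
    # for the word list shipped in the repository.
    sample_lines = ["# sample entries", "في", "من\tprep", "", "إلى"]
    stop_words = load_closed_class_words(sample_lines)

    tokens = ["ذهب", "الولد", "إلى", "المدرسة", "في", "الصباح"]
    print(remove_closed_class(tokens, stop_words))
```

Matching is by exact string, so the input text would need to be tokenized in the same ATB scheme and written in the same script and orthographic normalization as the list entries for the filter to have any effect.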
Owner
- Name: CAMeL Lab
- Login: CAMeL-Lab
- Kind: organization
- Location: Abu Dhabi, UAE
- Website: http://camel-lab.com
- Repositories: 22
- Profile: https://github.com/CAMeL-Lab
The Computational Approaches to Modeling Language (CAMeL) Lab at New York University Abu Dhabi