https://github.com/linuxscout/naftawayh
Naftawayh: Arabic word tagger
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.2%) to scientific vocabulary
Repository
Naftawayh: Arabic word tagger
Basic Info
- Host: GitHub
- Owner: linuxscout
- License: gpl-3.0
- Language: Python
- Default Branch: master
- Size: 2.99 MB
Statistics
- Stars: 13
- Watchers: 3
- Forks: 3
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Naftawayh (نفطويه): Arabic Word Tagger
Naftawayh is a Python library and program for Arabic word tagging (word classification) into types (nouns, verbs, stopwords), which is useful in language processing, especially text mining. Naftawayh relies on the structure of the Arabic word and guesses the word class from certain signs. For example, a word that ends with Teh Marbuta is a noun, and a word containing Hamza Below Alef is classified as a noun. Many kinds of words can be identified by pattern, especially words made definite with Alef-Lam and some present-tense verb patterns.
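The surface signs described above can be sketched in a few lines of Python. This is a minimal illustration only, not Naftawayh's actual implementation: the function name `has_noun_sign` and the constant names are hypothetical, and the rules are heuristics that will misclassify some words (for example, particles that contain Hamza Below Alef).

```python
# Illustrative sketch of the noun signs described above.
# Hypothetical helper; this is NOT code from the naftawayh package.
TEH_MARBUTA = "\u0629"              # ة
ALEF_HAMZA_BELOW = "\u0625"         # إ
DEFINITE_ARTICLE = "\u0627\u0644"   # ال (Alef-Lam)


def has_noun_sign(word):
    """Return True if the word carries one of the noun signs listed above."""
    # Sign 1: the word ends with Teh Marbuta.
    if word.endswith(TEH_MARBUTA):
        return True
    # Sign 2: the word contains Hamza Below Alef.
    if ALEF_HAMZA_BELOW in word:
        return True
    # Sign 3: the word starts with the definite article and has more letters after it.
    if word.startswith(DEFINITE_ARTICLE) and len(word) > len(DEFINITE_ARTICLE):
        return True
    # No sign found: the class is undetermined by these rules alone.
    return False


for sample in ("مدرسة", "إسلام", "البرنامج", "بينما"):
    print(sample, has_noun_sign(sample))
```

A `False` result here only means that none of the three signs was found; it does not mean the word is a verb or a stopword.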
Developers: Taha Zerrouki: http://tahadz.com taha dot zerrouki at gmail dot com

| Features | Value |
|---|---|
| Authors | Taha Zerrouki: http://tahadz.com, taha dot zerrouki at gmail dot com |
| Release | 0.3 |
| License | GPL |
| Tracker | linuxscout/naftawayh/Issues |
| Website | https://pypi.python.org/pypi/naftawayh |
| Doc | package Documentation |
| Source | Github |
| Download | pypi.python.org |
| Feedbacks | Comments |
| Accounts | @Twitter @Sourceforge |
Citation
If you cite it in academic work, you can use this citation:
T. Zerrouki, Naftawayh, Arabic Word Tagger,
https://pypi.python.org/pypi/naftawayh/, 2010
or in BibTeX format:
@misc{zerrouki2012naftawayh,
  title={Naftawayh : Arabic Word Tagger},
  author={Zerrouki, Taha},
  url={https://pypi.python.org/pypi/naftawayh},
  year={2010}
}
Applications
- Text mining.
- Text summarization.
- Sentence identification.
- Grammar analysis.
- Morphological analysis acceleration.
- Extraction of n-grams (terms, idiomatic expressions, and collocations).
Who is Naftawayh

Demo
You can test it on the Mishkal site: choose Tools > Extraction > Classify.

Installation
pip install naftawayh
Usage
import naftawayh.wordtag as wordtag
- Test word list
```python
import naftawayh.wordtag

word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', u'انفجار',
             u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', u'التطرف', u'اقتصادي')
tagger = naftawayh.wordtag.WordTagger()

# test all words
list_tags = tagger.word_tagging(word_list)
for word, tag in zip(word_list, list_tags):
    print(word, tag)

# Output:
# بالبلاد n
# بينما vn3
# أو t
# انسحاب n
# انعدام n
# انفجار n
# البرنامج n
# بانفعالاتها n
# العربي n
# الصرفي n
# التطرف n
# اقتصادي n
```
- Test word by word
```python
import naftawayh.wordtag

word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', u'انفجار',
             u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', u'التطرف', u'اقتصادي')
tagger = naftawayh.wordtag.WordTagger()

# test word by word; a word may satisfy more than one check
for word in word_list:
    if tagger.is_noun(word):
        print(u'%s is noun' % word)
    if tagger.is_verb(word):
        print(u'%s is verb' % word)
    if tagger.is_stopword(word):
        print(u'%s is stopword' % word)

# Output:
# بالبلاد is noun
# بينما is noun
# بينما is verb
# أو is noun
# أو is verb
# أو is stopword
# انسحاب is noun
# انعدام is noun
# انفجار is noun
# البرنامج is noun
# بانفعالاتها is noun
# العربي is noun
# الصرفي is noun
# التطرف is noun
# اقتصادي is noun
```
- Test word in context
```python
import naftawayh.wordtag

word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', u'انفجار',
             u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', u'التطرف', u'اقتصادي')
tagger = naftawayh.wordtag.WordTagger()
previous_word = ""
print(" **** test words in context***")

# test words in context: each word is analysed together with the word before it
for word in word_list:
    tag = tagger.context_analyse(previous_word, word)
    print(u"%s from context is %s " % (word, tag))
    previous_word = word

# Output:
#  **** test words in context***
# بالبلاد from context is vn
# بينما from context is vn
# أو from context is vn
# انسحاب from context is vn
# انعدام from context is vn
# انفجار from context is vn
# البرنامج from context is vn
# بانفعالاتها from context is vn
# العربي from context is vn
# الصرفي from context is vn
# التطرف from context is vn
# اقتصادي from context is vn
```
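For applications such as text mining, it is often convenient to group words by the tag they received. The following is a minimal sketch that does not call the library: the helper `group_by_tag` is hypothetical, and the tags are copied from the first four lines of the word-list sample output above rather than computed.

```python
from collections import defaultdict


def group_by_tag(words, tags):
    """Group words by tag, preserving the original word order within each group."""
    groups = defaultdict(list)
    for word, tag in zip(words, tags):
        groups[tag].append(word)
    return dict(groups)


# Words and tags copied from the sample output above (not computed here).
words = ["بالبلاد", "بينما", "أو", "انسحاب"]
tags = ["n", "vn3", "t", "n"]

grouped = group_by_tag(words, tags)
for tag, members in grouped.items():
    print(tag, members)
```

In real use, `tags` would presumably be the list returned by the tagger for the same `words`.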
Owner
- Name: Taha Zerrouki (طه زروقي )
- Login: linuxscout
- Kind: user
- Location: Bouira, Algeria
- Company: Bouira University
- Website: tahadz.com
- Twitter: linuxscout
- Repositories: 22
- Profile: https://github.com/linuxscout
PhD, Computer Science Professor, Interest : Arabic Natural Language processing
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| linuxscout | t****i@h****m | 13 |
| Mohab Elsheikh | m****h@g****m | 1 |
| Taha Zerrouki (طه زروقي ) | t****i@g****m | 1 |
Issues and Pull Requests
Last synced: 5 months ago
All Time
- Total issues: 1
- Total pull requests: 1
- Average time to close issues: 1 day
- Average time to close pull requests: 25 minutes
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- linuxscout (1)
Pull Request Authors
- mohabmes (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 2,688 last month
- Total dependent packages: 1
- Total dependent repositories: 4
- Total versions: 4
- Total maintainers: 1
pypi.org: naftawayh
Naftawayh: Arabic word tagger
- Homepage: http://naftawayh.sourceforge.net/
- Documentation: https://naftawayh.readthedocs.io/
- License: GPL
- Latest release: 0.4 (published over 5 years ago)