https://github.com/linuxscout/naftawayh

Naftawayh: arabic word tagger

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.2%) to scientific vocabulary
Last synced: 3 months ago

Repository

Naftawayh: arabic word tagger

Basic Info
  • Host: GitHub
  • Owner: linuxscout
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Size: 2.99 MB
Statistics
  • Stars: 13
  • Watchers: 3
  • Forks: 3
  • Open Issues: 0
  • Releases: 0
Created over 7 years ago · Last pushed over 5 years ago
Metadata Files
Readme Funding License

README.md

Naftawayh: Arabic Word Tagger

Naftawayh is a Python program and library for Arabic word tagging: it classifies words into types (nouns, verbs, stopwords), which is useful in language processing, especially text mining. Naftawayh relies on the structure of the Arabic word and guesses the word class from certain signs. For example, a word that ends with Teh Marbuta is a noun, and a word that contains Hamza Below Alef is also classed as a noun. Many other kinds of words can be identified by patterns, notably words made definite with the article Alef-Lam (ال) and some present-tense verb patterns.
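The structural cues described above can be illustrated with a minimal sketch. This is not Naftawayh's implementation: the helper name `guess_is_noun`, the constants, and the minimum-length check on the definite article are assumptions made for illustration, and real Arabic has exceptions that these three rules do not cover.

```python
# Hypothetical illustration of the noun cues named in the text.
# None of these names come from the naftawayh package.
TEH_MARBUTA = "\u0629"             # ة
ALEF_HAMZA_BELOW = "\u0625"        # إ
DEFINITE_ARTICLE = "\u0627\u0644"  # ال


def guess_is_noun(word):
    """Return True if one of the simple noun cues applies to the word."""
    # Cue 1: the word ends with Teh Marbuta.
    if word.endswith(TEH_MARBUTA):
        return True
    # Cue 2: the word contains Hamza Below Alef.
    if ALEF_HAMZA_BELOW in word:
        return True
    # Cue 3: the word starts with the definite article Alef-Lam.
    # The length check is an assumption to skip very short words.
    if word.startswith(DEFINITE_ARTICLE) and len(word) > 3:
        return True
    return False


for sample in ("مدرسة", "إسلام", "البرنامج", "كتب"):
    print(sample, guess_is_noun(sample))
```

A word that matches none of the cues is simply left undecided here (the function returns False), which is where a real tagger would fall back on verb patterns, stopword lists, or context.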

Developers: Taha Zerrouki, http://tahadz.com, taha dot zerrouki at gmail dot com

Features | value
---------|---------------------------------------------------------------------------------
Authors | Taha Zerrouki: http://tahadz.com, taha dot zerrouki at gmail dot com
Release | 0.3
License | GPL
Tracker | linuxscout/naftawayh/Issues
Website | https://pypi.python.org/pypi/naftawayh
Doc | package Documentation
Source | Github
Download | pypi.python.org
Feedback | Comments
Accounts | @Twitter @Sourceforge

Citation

If you cite Naftawayh in academic work, you can use this citation: T. Zerrouki, Naftawayh, Arabic Word Tagger, https://pypi.python.org/pypi/naftawayh/, 2010, or in BibTeX format:

    @misc{zerrouki2012naftawayh,
      title={Naftawayh : Arabic Word Tagger},
      author={Zerrouki, Taha},
      url={https://pypi.python.org/pypi/naftawayh},
      year={2010}
    }

Applications

  • Text mining.
  • Text summarization.
  • Sentence identification.
  • Grammar analysis.
  • Morphological analysis acceleration.
  • Extraction of n-grams (terms, fixed expressions, and collocations).

Who is Naftawayh

Who is Naftawayh?

Demo

You can try it on the Mishkal site: choose Tools > Extraction > Classify.

Installation

pip install naftawayh

Usage

import naftawayh.wordtag as wordtag

* Test word list

```python
import naftawayh.wordtag

word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', u'انفجار',
             u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', u'التطرف', u'اقتصادي')
tagger = naftawayh.wordtag.WordTagger()

# test all words
list_tags = tagger.word_tagging(word_list)
for word, tag in zip(word_list, list_tags):
    print(word, tag)

# Output:
# بالبلاد n
# بينما vn3
# أو t
# انسحاب n
# انعدام n
# انفجار n
# البرنامج n
# بانفعالاتها n
# العربي n
# الصرفي n
# التطرف n
# اقتصادي n
```

* Test word by word

```python
import naftawayh.wordtag

word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', u'انفجار',
             u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', u'التطرف', u'اقتصادي')
tagger = naftawayh.wordtag.WordTagger()

# test word by word; a word may satisfy more than one check
for word in word_list:
    if tagger.is_noun(word):
        print(u'%s is noun' % word)
    if tagger.is_verb(word):
        print(u'%s is verb' % word)
    if tagger.is_stopword(word):
        print(u'%s is stopword' % word)

# Output:
# بالبلاد is noun
# بينما is noun
# بينما is verb
# أو is noun
# أو is verb
# أو is stopword
# انسحاب is noun
# انعدام is noun
# انفجار is noun
# البرنامج is noun
# بانفعالاتها is noun
# العربي is noun
# الصرفي is noun
# التطرف is noun
# اقتصادي is noun
```

* Test word in context

```python
import naftawayh.wordtag

word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', u'انفجار',
             u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', u'التطرف', u'اقتصادي')
tagger = naftawayh.wordtag.WordTagger()
previous_word = ""
print(" **** test words in context***")

# test words in context: each word is analysed together with the previous one
for word in word_list:
    tag = tagger.context_analyse(previous_word, word)
    print(u"%s from context is %s " % (word, tag))
    previous_word = word

# Output:
#  **** test words in context***
# بالبلاد from context is vn
# بينما from context is vn
# أو from context is vn
# انسحاب from context is vn
# انعدام from context is vn
# انفجار from context is vn
# البرنامج from context is vn
# بانفعالاتها from context is vn
# العربي from context is vn
# الصرفي from context is vn
# التطرف from context is vn
# اقتصادي from context is vn
```
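Since the tagger returns one tag per word, a common follow-up step is to group the words by tag. The sketch below does this with plain lists so that it runs without the package installed. The helper `group_by_tag` is hypothetical and not part of naftawayh; the sample words and tags are copied from the first example's output.

```python
from collections import defaultdict


def group_by_tag(words, tags):
    """Group words by tag (hypothetical helper, not a naftawayh function)."""
    groups = defaultdict(list)
    for word, tag in zip(words, tags):
        groups[tag].append(word)
    return dict(groups)


# Sample data taken from the word-list example output above.
sample_words = ["بالبلاد", "بينما", "أو", "انسحاب"]
sample_tags = ["n", "vn3", "t", "n"]

grouped = group_by_tag(sample_words, sample_tags)
for tag, members in grouped.items():
    print(tag, len(members), members)
```

With the real library, the two lists would presumably be the input word list and the list of tags returned for it.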

Owner

  • Name: Taha Zerrouki (طه زروقي )
  • Login: linuxscout
  • Kind: user
  • Location: Bouira, Algeria
  • Company: Bouira University

PhD, Computer Science Professor. Interest: Arabic natural language processing

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 15
  • Total Committers: 3
  • Avg Commits per committer: 5.0
  • Development Distribution Score (DDS): 0.133
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
linuxscout t****i@h****m 13
Mohab Elsheikh m****h@g****m 1
Taha Zerrouki (طه زروقي ) t****i@g****m 1

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 1
  • Total pull requests: 1
  • Average time to close issues: 1 day
  • Average time to close pull requests: 25 minutes
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • linuxscout (1)
Pull Request Authors
  • mohabmes (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 2,688 last-month
  • Total dependent packages: 1
  • Total dependent repositories: 4
  • Total versions: 4
  • Total maintainers: 1
pypi.org: naftawayh

Naftawayh: Arabic word tagger

  • Versions: 4
  • Dependent Packages: 1
  • Dependent Repositories: 4
  • Downloads: 2,688 Last month
Rankings
Dependent packages count: 4.8%
Average: 6.3%
Downloads: 6.7%
Dependent repos count: 7.5%
Maintainers (1)
Last synced: 4 months ago