arabicdataset

~7,000 Arabic sentences.

https://github.com/sssiiisssiii/arabicdataset

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.8%) to scientific vocabulary

Keywords

arabic arabic-nlp classification dataset ml sentiment-analysis
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: SssiiiSssiii
  • Default Branch: main
  • Homepage:
  • Size: 583 KB
Statistics
  • Stars: 14
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Topics
arabic arabic-nlp classification dataset ml sentiment-analysis
Created over 2 years ago · Last pushed about 2 years ago

README.md

ArabicDataset

About 7,000 Arabic sentences:

  • Economy
  • Sport
  • Culture
  • General
  • Political

Why is this dataset different from others?

  • Higher accuracy and clarity.
  • Multiple manual reviews.

License

  • ASTD (https://github.com/mahmoudnabil/ASTD)
  • ArSenTD (https://arxiv.org/abs/1906.01830v1) arXiv:1906.01830 [cs.CL]
  • Khaleej-2004 (https://metatext.io/datasets/khaleej-2004-corpus)
  • Abdulla, N. (2014). Twitter Data set for Arabic Sentiment Analysis. UCI Machine Learning Repository. https://doi.org/10.24432/C5F31B. (https://archive.ics.uci.edu/dataset/293/twitter+data+set+for+arabic+sentiment+analysis)
  • Three computer science students decided to generate more than a thousand sentences in peace with a cup of coffee ;)
    • MOHAMED AHMED ABDEL FATTAH
    • YOUSSEF HAMADA IBRAHIM
    • MOHAMED GHAREEB MOHAMED

Citation

  • If you use this dataset in your work, please cite it as follows: Abdel Fattah, M., Hamada, Y., & Ghareeb, M. (2023). Arabic Dataset [Computer software]. https://github.com/SssiiiSssiii/ArabicDataset

⚠️ CAUTION

  • This dataset includes sentences for NLP training, but it does not reflect any particular viewpoint. It is provided for research and educational purposes, and I am not responsible for the content of the sentences.

Owner

  • Name: Si Si
  • Login: SssiiiSssiii
  • Kind: user

Software engineer, problem solver, computer science student, NLP, ML

Citation (CITATION.cff)

cff-version: 1.1.0
message: "If you use this dataset, please cite it as below."
authors:
  - family-names: "Abdel Fattah"
    given-names: "Mohamed"
  - family-names: "Hamada"
    given-names: "Youssef"
  - family-names: "Ghareeb"
    given-names: "Mohamed" 
title: "Arabic Dataset"
date-released: 2023-09-27
url: "https://github.com/SssiiiSssiii/ArabicDataset"
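
The CITATION.cff fields above map fairly directly onto the citation string given in the README. As a minimal sketch, assuming the metadata has been copied by hand into a Python dict rather than parsed from the YAML file, the mapping could look like this. The helper names `format_author` and `format_citation` are illustrative and not part of the repository, and the output omits the "[Computer software]" tag used in the README.

```python
# Metadata copied by hand from the CITATION.cff shown above (not parsed from YAML).
cff = {
    "authors": [
        {"family-names": "Abdel Fattah", "given-names": "Mohamed"},
        {"family-names": "Hamada", "given-names": "Youssef"},
        {"family-names": "Ghareeb", "given-names": "Mohamed"},
    ],
    "title": "Arabic Dataset",
    "date-released": "2023-09-27",
    "url": "https://github.com/SssiiiSssiii/ArabicDataset",
}


def format_author(author):
    """Return 'Family, G.' for one CFF author entry (hypothetical helper)."""
    initials = " ".join(part[0] + "." for part in author["given-names"].split())
    return f"{author['family-names']}, {initials}"


def format_citation(meta):
    """Build an APA-like citation string from CFF-style metadata (hypothetical helper)."""
    names = [format_author(a) for a in meta["authors"]]
    if len(names) > 1:
        author_text = ", ".join(names[:-1]) + ", & " + names[-1]
    else:
        author_text = names[0]
    year = meta["date-released"][:4]  # 'YYYY-MM-DD' -> 'YYYY'
    return f"{author_text} ({year}). {meta['title']}. {meta['url']}"


citation = format_citation(cff)
print(citation)
```

To read the real file instead of a hand-copied dict, a YAML parser would be needed; that step is left out here to keep the sketch dependency-free.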

GitHub Events

Total
  • Watch event: 1
  • Fork event: 1
Last Year
  • Watch event: 1
  • Fork event: 1