Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (3.4%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: salmujaiwel
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 96 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 4 years ago · Last pushed over 1 year ago
Metadata Files

README.md

Talafeef

This project was built entirely from scratch. It began with collecting Arabic corpora and developing automatic POS tagging and lemmatization tools. Word-embedding models were then trained on the project's data to observe empirically how these models behave.

Information about the corpora

  • Standard--Academic: 10,512 tokens
  • Standard--Khutbah: 10,380 tokens
  • Standard--Newspapers: 10,408 tokens
  • Standard--Official-Issues: 10,093 tokens
  • Standard--Web: 10,097 tokens
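
As a quick check on how evenly the sub-corpora are balanced, the token counts above can be totalled and compared. This is only a small sketch built from the figures listed here; the names `corpus_tokens` and `shares` are illustrative and are not taken from the repository.

```python
# Token counts copied from the list above; variable names are illustrative.
corpus_tokens = {
    "Standard--Academic": 10_512,
    "Standard--Khutbah": 10_380,
    "Standard--Newspapers": 10_408,
    "Standard--Official-Issues": 10_093,
    "Standard--Web": 10_097,
}

total_tokens = sum(corpus_tokens.values())

# Percentage of the total contributed by each sub-corpus.
shares = {
    name: 100 * count / total_tokens
    for name, count in corpus_tokens.items()
}

for name, count in corpus_tokens.items():
    print(f"{name}: {count} tokens ({shares[name]:.1f}%)")
print(f"Total: {total_tokens} tokens")
```

The five sub-corpora add up to 51,490 tokens, with each one contributing roughly a fifth of the total.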

Arabic Language Models

  • CRF Model: (crf_model.sav)
  • LSTM (RNN) Model: (RNN-model.h5, RNNtag2index.pkl, RNNword2index)
  • Skip-Gram Model: (SkipGram_model.pt)
  • CBOW Model: (cbowmodel.h5, CBOWEmbeddings.npz)
  • araBERTv02: (tokenvecscatarray.pkl, tokenvecscatarray.npz, tokenized_text.pkl)
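
Before opening any of the notebooks, it may help to confirm that a local clone actually contains the model files listed above. The sketch below uses only the standard library and the file names exactly as they appear in this list. The helper `missing_model_files` is hypothetical, and searching every subdirectory is an assumption, since the README does not say where the files live.

```python
from pathlib import Path

# File names exactly as listed above; the grouping dict is illustrative.
MODEL_FILES = {
    "CRF": ["crf_model.sav"],
    "LSTM (RNN)": ["RNN-model.h5", "RNNtag2index.pkl", "RNNword2index"],
    "Skip-Gram": ["SkipGram_model.pt"],
    "CBOW": ["cbowmodel.h5", "CBOWEmbeddings.npz"],
    "araBERTv02": [
        "tokenvecscatarray.pkl",
        "tokenvecscatarray.npz",
        "tokenized_text.pkl",
    ],
}


def missing_model_files(repo_dir):
    """Return {model: [file names]} for listed files not found under repo_dir."""
    root = Path(repo_dir)
    # Assumption: the files may sit in any subdirectory, so match by name only.
    present = {p.name for p in root.rglob("*") if p.is_file()}
    missing = {}
    for model, names in MODEL_FILES.items():
        absent = [name for name in names if name not in present]
        if absent:
            missing[model] = absent
    return missing
```

Calling `missing_model_files` with the path to a local clone returns an empty dict when every listed file is present, and otherwise maps each model to the files that could not be found.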

Owner

  • Login: salmujaiwel
  • Kind: user
  • Company: King Saud University

Sultan Almujaiwel currently works as an associate professor of Computational Corpus Linguistics at King Saud University's Arabic Language Dept.

Citation (CITATION.cff)

Almujaiwel, S. (2023). Talafeef [GitHub repository]. https://github.com/salmujaiwel/talafeef

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1