areeg-an-open-access-arabic-inner-speech-eeg-dataset
This repository contains all the code needed to work with the ArEEG dataset.
https://github.com/eslam21/areeg-an-open-access-arabic-inner-speech-eeg-dataset
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 6 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.1%) to scientific vocabulary
Statistics
- Stars: 4
- Watchers: 2
- Forks: 2
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
🧠 ArEEG: An Open-Access Arabic Inner Speech EEG Dataset
Welcome to the official repository for ArEEG, the first open-access EEG dataset capturing inner speech in Arabic.
This dataset enables research in brain-computer interfaces (BCI), Arabic language processing, and neuro-linguistics.
The repository provides all the necessary code and scripts for:
- Data loading & preprocessing: dataloader.py
- Getting started quickly: areeg-starter.ipynb
- Reproducing the experiments
🚀 Key Features
- 👥 12 native Arabic-speaking participants (balanced gender distribution, aged 17–25)
- 🧩 5 inner speech commands: Up, Down, Left, Right, Select
- 🎧 8-channel EEG headset (Unicorn Hybrid Black+, 250 Hz sampling rate)
- 🧪 4650 trials in total, recorded over 15 sessions per subject (one subject has 21 sessions)
- 💻 Open-source preprocessing & ML pipelines (Python, NumPy, Pandas, Scikit-learn, MNE)
- 🌍 First-ever open Arabic inner speech dataset for BCI research
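The figures in the list above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes that 4650 is the dataset-wide trial count and that every session contains the same number of trials, neither of which is stated explicitly in this README. All names are hypothetical and are not part of dataloader.py.

```python
# Figures quoted in the Key Features list.
N_SUBJECTS = 12
SESSIONS_DEFAULT = 15     # sessions recorded by a typical subject
SESSIONS_EXTENDED = 21    # sessions recorded by the one exceptional subject
TOTAL_TRIALS = 4650       # ASSUMPTION: total over all subjects and sessions
N_COMMANDS = 5            # Up, Down, Left, Right, Select
SAMPLING_RATE_HZ = 250
N_CHANNELS = 8


def total_sessions() -> int:
    """Eleven subjects with 15 sessions plus one subject with 21."""
    return (N_SUBJECTS - 1) * SESSIONS_DEFAULT + SESSIONS_EXTENDED


def trials_per_session() -> float:
    """Average number of trials per session under the assumption above."""
    return TOTAL_TRIALS / total_sessions()


def values_per_second() -> int:
    """Samples recorded per second, summed over all EEG channels."""
    return SAMPLING_RATE_HZ * N_CHANNELS


if __name__ == "__main__":
    print("sessions:", total_sessions())
    print("trials per session:", trials_per_session())
    print("trials per command per session (if balanced):",
          trials_per_session() / N_COMMANDS)
    print("values per second:", values_per_second())
```

If the assumptions hold, the trial count divides evenly by the number of sessions, which suggests a fixed number of trials per session. Check the publication for the actual protocol before relying on it.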
📂 Dataset Access and Notebooks
The dataset is hosted publicly on:
- 📦 OpenNeuro
- 📊 Kaggle (data can be found in RecordedSessions folder)
To make it easy to get started, we provide a multi-version Kaggle notebook with preliminary results.
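Before running any pipeline, it can help to confirm what a local copy of the download actually contains. The sketch below uses only the standard library and assumes nothing about the file format: it counts files by extension under a folder. The local path `RecordedSessions` is an assumption about where the Kaggle download was unpacked, and `inventory` is a hypothetical helper, not a function from this repository.

```python
from collections import Counter
from pathlib import Path


def inventory(root: str) -> Counter:
    """Count files under `root` (recursively) by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "<none>".
            counts[path.suffix.lower() or "<none>"] += 1
    return counts


if __name__ == "__main__":
    # ASSUMPTION: the dataset was unpacked into ./RecordedSessions.
    root = Path("RecordedSessions")
    if root.is_dir():
        for ext, n in sorted(inventory(str(root)).items()):
            print(f"{ext}: {n}")
    else:
        print(f"{root} not found; download the dataset first")
```

For the actual loading and preprocessing logic, refer to dataloader.py and areeg-starter.ipynb.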
Publication
📄 Official publication in Scientific Data (Nature)
📜 Citation
If you use the ArEEG dataset in your work, please cite it as follows:
@article{Metwalli2025,
  title = {ArEEG: an Open-Access Arabic Inner Speech EEG Dataset},
  volume = {12},
  ISSN = {2052-4463},
  url = {http://dx.doi.org/10.1038/s41597-025-05387-w},
  DOI = {10.1038/s41597-025-05387-w},
  number = {1},
  journal = {Scientific Data},
  publisher = {Springer Science and Business Media LLC},
  author = {Metwalli, Donia and Kiroles, Antony E. and Radwan, Yousef A. and Mohamed, Eslam Ahmed and Barakat, Mariam and Ahmed, Anas and Omar, Amr M. and Selim, Sahar},
  year = {2025},
  month = aug
}
Owner
- Name: Eslam Mohamed
- Login: Eslam21
- Kind: user
- Location: Egypt
- Repositories: 3
- Profile: https://github.com/Eslam21
Just a nerd
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this dataset, please cite it using the following metadata."
title: "ArEEG: Arabic Inner Speech EEG dataset"
authors:
  - family-names: Metwalli
    given-names: Donia
  - family-names: Ahmed
    given-names: Eslam
  - family-names: Emil
    given-names: Antony
  - family-names: Radwan
    given-names: Yousef A.
  - family-names: Barakat
    given-names: Mariam
  - family-names: Ahmed
    given-names: Anas
  - family-names: Omar
    given-names: Amro
  - family-names: Selim
    given-names: Sahar
date-released: 2025-01-01
doi: 10.18112/openneuro.ds005262.v1.0.1
version: 1.0.1
repository-code: "https://openneuro.org/datasets/ds005262"
publisher: OpenNeuro
GitHub Events
Total
- Watch event: 3
- Push event: 11
- Fork event: 1
Last Year
- Watch event: 3
- Push event: 11
- Fork event: 1
Dependencies
- numpy ==1.26.4
- pandas ==2.1.4
- scikit-learn ==1.2.2
- scipy ==1.11.4
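Because the dependencies are pinned to exact versions, a quick check of the local environment can save debugging time. The following is a minimal sketch that uses the standard-library `importlib.metadata` module. The `PINS` dictionary is copied from the list above, and `check_pins` is a hypothetical helper that does not exist in the repository.

```python
from importlib import metadata

# Pins copied from the dependency list above.
PINS = {
    "numpy": "1.26.4",
    "pandas": "2.1.4",
    "scikit-learn": "1.2.2",
    "scipy": "1.11.4",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, installed)} for each pin that does not match.

    `installed` is None when the package is not installed at all.
    `get_version` is injectable so the check can be tested without
    touching the real environment.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINS)
    if not problems:
        print("All pinned versions match.")
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed}")
```

A mismatch does not necessarily break the code; it only signals that results may differ from those obtained with the pinned environment.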