voice-based-emotion-recognition-using-ai-and-speech-processing
A deep learning project that detects human emotions from audio using MFCC feature extraction and a neural network classifier. Built using Python and TensorFlow, this tool leverages speech signal processing to predict emotions like happy, angry, or sad, helping machines better understand human sentiment.
https://github.com/captaincodercool/voice-based-emotion-recognition-using-ai-and-speech-processing
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: CAPTAINCODERCOOL
- License: apache-2.0
- Language: Python
- Default Branch: master
- Size: 12.4 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
🎙️ Voice-Based Emotion Recognition Using AI
This project focuses on recognizing human emotions from speech using deep learning techniques and signal processing. It combines MFCC-based feature extraction with a powerful neural network model to classify emotions like happy, sad, angry, and neutral from voice recordings.
🚀 Features
- 🎧 Detects emotion from voice recordings
- 🔍 MFCC feature extraction for audio signal processing
- 🧠 Deep neural network for multi-class emotion classification
- 🗂 Real-time audio classification & dataset preprocessing
- 📊 Accuracy visualization using confusion matrices
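The MFCC extraction step is not shown in this README; a minimal sketch using librosa is below. The file path, sampling rate, and `n_mfcc=40` are illustrative assumptions, not values taken from the repository. Averaging the coefficients over time yields one fixed-length vector per clip, which is a common way to feed variable-length audio to a dense classifier.

```python
import numpy as np

def extract_mfcc(path, sr=22050, n_mfcc=40):
    """Load an audio file and return a fixed-length MFCC feature vector."""
    import librosa  # local import: heavy optional dependency

    signal, sr = librosa.load(path, sr=sr)
    # mfcc has shape (n_mfcc, n_frames); average over frames -> (n_mfcc,)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return np.mean(mfcc.T, axis=0)
```

Typical usage would be `features = extract_mfcc("audio/sample.wav")`, repeated over every file in the dataset to build the training matrix.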
🛠 Tech Stack
- Python 3
- TensorFlow / Keras
- NumPy, Pandas
- librosa
- scikit-learn
- Matplotlib, Seaborn
📦 Installation
- Clone this repository:
```bash
git clone https://github.com/CAPTAINCODERCOOL/emotion-recognition-from-speech.git
cd emotion-recognition-from-speech
```
- Install the required dependencies:
```bash
pip install -r requirements.txt
```
- (Optional) Use your own audio dataset, or rely on the pre-organized folders.
🎯 How It Works
- Loads WAV audio files from the datasets
- Extracts MFCCs (Mel-frequency cepstral coefficients)
- Trains a neural network to classify emotions into:
  - Happy 😊
  - Angry 😠
  - Sad 😢
  - Fear 😨
  - Neutral 😐
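The network architecture itself is not described in the README; a minimal sketch of a dense classifier over averaged MFCC vectors is shown below, assuming 40 MFCC coefficients and the five emotion labels listed above. The layer sizes and dropout rate are illustrative guesses, not the project's actual architecture.

```python
def build_model(n_mfcc=40, n_classes=5):
    # TensorFlow import kept local so the sketch reads without TF installed
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Dense(256, activation="relu", input_shape=(n_mfcc,)),
        layers.Dropout(0.3),
        layers.Dense(128, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training would then be `build_model().fit(X, y, epochs=50, validation_split=0.2)`, where `X` is the matrix of MFCC vectors and `y` holds integer emotion labels.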
📂 Project Structure
```
emotion-recognition/
├── audio/               # Sample audio files for testing
├── datasets/            # Training datasets (e.g., RAVDESS, SAVEE)
├── extract_features.py  # MFCC + label extraction
├── model.py             # Training and prediction model
├── predict.py           # Predict emotion from new audio
├── requirements.txt
└── README.md
```
🔬 Example Usage
```bash
python model.py
python predict.py --input audio/sample.wav
```
📊 Evaluation
- Confusion matrix
- Accuracy & loss plots
- Precision, recall, F1-score
- Testing on unseen audio files
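In practice these metrics would come from scikit-learn and Seaborn (both in the tech stack), but the arithmetic is simple enough to sketch in plain NumPy. The function names below are illustrative, not from the repository.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def precision_recall_f1(cm):
    """Per-class precision, recall, and F1 from a confusion matrix."""
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)  # column sums = predicted counts
    recall = tp / np.maximum(cm.sum(axis=1), 1)     # row sums = true counts
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return precision, recall, f1
```

Visualizing the matrix is then one call, e.g. `seaborn.heatmap(cm, annot=True)`.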
Future Enhancements
- Add real-time microphone input
- Expand to multilingual datasets
- Convert into a Streamlit web app
- Deploy via Flask for real-world applications
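The planned microphone input could build on PyAudio, which is already pinned in requirements.txt. Below is a hedged sketch only: the clip length, sample rate, and output file name are assumptions, and it needs a working audio device.

```python
def record_clip(path="mic_clip.wav", seconds=3, rate=22050, chunk=1024):
    """Record a short mono clip from the default microphone and save it as WAV."""
    import wave
    import pyaudio  # local import: requires a working audio device

    pa = pyaudio.PyAudio()
    width = pa.get_sample_size(pyaudio.paInt16)
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                     input=True, frames_per_buffer=chunk)
    frames = [stream.read(chunk) for _ in range(int(rate / chunk * seconds))]
    stream.stop_stream()
    stream.close()
    pa.terminate()

    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(width)
        wf.setframerate(rate)
        wf.writeframes(b"".join(frames))
    return path
```

The saved file could then be passed straight to the existing prediction script, e.g. `python predict.py --input mic_clip.wav`.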
Dataset Link
https://drive.google.com/drive/folders/1fTN2VkAcTycXCCXmQ71k5ioQmxTMoLm7?usp=sharing
📜 License
This repository follows the original MIT License.
🌐 Connect With Me
- GitHub: CAPTAINCODERCOOL
- LinkedIn: chiragpatil04
- Email: chiragpatilprofessional@gmail.com
Owner
- Login: CAPTAINCODERCOOL
- Kind: user
- Repositories: 1
- Profile: https://github.com/CAPTAINCODERCOOL
Citation (CITATION.cff)
cff-version: 1.2.0
message: If you use this software, please cite it as below.
authors:
- family-names: Abdeladim
given-names: Fadheli
title: Speech Emotion Recognition
version: 1.0.0
date-released: 2019-04-28
abstract: "This repository presents a comprehensive SER framework that employs various machine learning and deep learning techniques to accurately detect and classify human emotions from speech. The framework utilizes four datasets, including RAVDESS, TESS, EMO-DB, and a custom dataset, comprising a diverse range of emotions such as neutral, calm, happy, sad, angry, fear, disgust, pleasant surprise, and boredom. Feature extraction is performed using widely adopted audio features, including MFCC, Chromagram, MEL Spectrogram Frequency, Contrast, and Tonnetz. The repository also supports grid search for hyperparameter tuning and offers a range of classifiers and regressors such as SVC, RandomForest, GradientBoosting, KNeighbors, MLP, Bagging, and Recurrent Neural Networks. The developed SER system demonstrates promising accuracy in emotion classification, making it a valuable tool for researchers and practitioners in the field of affective computing and related domains."
repository-code: https://github.com/x4nth055/emotion-recognition-using-speech
license: MIT
GitHub Events
Total
- Push event: 3
- Create event: 3
Last Year
- Push event: 3
- Create event: 3
Dependencies
- librosa ==0.6.3
- matplotlib ==2.2.3
- numpy *
- pandas *
- pyaudio ==0.2.11
- scikit-learn ==0.24.2
- soundfile ==0.9.0
- tensorflow ==2.5.2
- tqdm ==4.28.1
- wave *