contextual-music-recommender
https://github.com/diegovalduran/contextual-music-recommender
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 4 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: diegovalduran
- License: MIT
- Language: Jupyter Notebook
- Default Branch: main
- Size: 4.61 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Emotion-Aware Music Recommendation System
This is an ML project that I implemented for CS229 - Machine Learning at Stanford. It predicts emotional changes in music listening sessions and provides personalized recommendations based on preferences learned from users and song lyrics.
The full paper I wrote for this project can be viewed here: CS2229FinalReport.pdf
Project Overview
There are three main components to this project:
1. Mood Context Awareness: Predicts how songs will affect users' emotional states (valence and arousal) using situational and psychological data
2. Lyrical Sentiment Analysis: Extracts semantic features from lyrics using a fine-tuned LLaMA model
3. User Preference Clustering: Groups users based on emotional responses and lyrical preferences derived from rating data
The system achieved 87.8% accuracy with GBDT in predicting mood shifts after music listening, and provides personalized recommendations through agglomerative clustering.
Technical Components
1. Mood Prediction Models
- Gradient Boosted Decision Trees (GBDT): Best performer with 87.8% accuracy
- Random Forest (RF): Strong performance with interpretable results
- Multi-Layer Perceptron (MLP): Neural network approach for complex patterns
- Ensemble Model: Stacking classifier combining multiple models (the above)
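The stacking ensemble described above can be sketched with scikit-learn. This is a minimal illustration, not the project's implementation: the synthetic data, hyperparameters, and the logistic-regression meta-learner are all assumptions.

```python
# Hedged sketch of stacking the four base learners; synthetic data stands
# in for the situational + emotional feature matrix.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the Obj. + Sub. features (3 mood-shift classes)
X, y = make_classification(n_samples=500, n_features=20, n_classes=3,
                           n_informative=8, random_state=101)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=101)

# Stack the base models behind a logistic-regression meta-learner
stack = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("gbdt", GradientBoostingClassifier(n_estimators=50)),
        ("rf", RandomForestClassifier(n_estimators=50)),
        ("mlp", MLPClassifier(max_iter=300)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_train, y_train)
print(f"held-out accuracy: {stack.score(X_test, y_test):.3f}")
```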
2. Lyrical Sentiment Analysis
A fine-tuned LLaMA 3.2-3B model extracts semantic features from scraped song lyrics:
- Narrative Complexity: Structural sophistication of lyrics
- Emotional Sophistication: Depth and subtlety of emotional content
- Thematic Elements: Identifies love, life, social themes, etc.
- Temporal Focus: Measures emphasis on past, present, or future tense
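The extraction step might look like the sketch below. The prompt wording and feature schema are assumptions, and a canned reply stands in for the actual call to the fine-tuned LLaMA 3.2-3B model:

```python
# Illustrative sketch only: prompt and schema are assumed, and the model
# call is replaced by a canned JSON reply.
import json

FEATURES = ["narrative_complexity", "emotional_sophistication",
            "thematic_elements", "temporal_focus"]

def build_prompt(lyrics: str) -> str:
    """Ask the model to score lyrics on the four semantic dimensions."""
    return (
        "Score the following lyrics on narrative_complexity, "
        "emotional_sophistication, thematic_elements, and temporal_focus. "
        f"Reply with JSON only.\n\nLyrics:\n{lyrics}"
    )

def parse_features(raw: str) -> dict:
    """Parse the model's JSON reply, keeping only the expected keys."""
    data = json.loads(raw)
    return {k: data[k] for k in FEATURES if k in data}

# Canned reply standing in for the real model output
reply = ('{"narrative_complexity": 0.7, "emotional_sophistication": 0.6, '
         '"thematic_elements": ["love"], "temporal_focus": "past"}')
features = parse_features(reply)
print(features["temporal_focus"])  # past
```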
3. User Preference Learning & Clustering
Hierarchical clustering groups users based on:
- Rating-Based Features: User interaction patterns with songs
- Emotional Preferences: Valence and arousal responses to music
- Lyrical Affinities: Preferences for narrative styles and themes
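A minimal sketch of this clustering step, assuming hypothetical per-user feature vectors in place of the real rating, emotional, and lyrical features:

```python
# Sketch of the user-clustering step; the feature construction here is a
# random stand-in for the real rating/valence/arousal/lyrical features.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(101)
# Hypothetical per-user feature vectors: [mean rating, valence response,
# arousal response, narrative-complexity affinity]
user_features = rng.normal(size=(40, 4))

# Standardize, then cut the hierarchy into four clusters (as in the report)
X = StandardScaler().fit_transform(user_features)
labels = AgglomerativeClustering(n_clusters=4).fit_predict(X)
print(sorted(set(labels)))  # [0, 1, 2, 3]
```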
Project Structure
.
├── src/
│ ├── models/
│ │ ├── LR.py # Logistic Regression implementation
│ │ ├── MLP.py # Multi-Layer Perceptron implementation
│ │ ├── RF.py # Random Forest implementation
│ │ └── GBDT.py # Gradient Boosting implementation
│ ├── recs/
│ │ └── learn_preference.py # User preference learning system
│ ├── main.py # Main training pipeline
│ └── ensemble.py # Ensemble model implementation
├── script/
│ ├── LR.sh # Logistic Regression experiments
│ ├── MLP.sh # MLP experiments
│ ├── RF.sh # Random Forest experiments
│ ├── GBDT.sh # Gradient Boosting experiments
│ └── ensemble.sh # Ensemble experiments
├── datasets/
│ └── Psychological_Datasets/ # Dataset directory
├── logs/ # Experiment logs
├── models/ # Saved model checkpoints
└── requirements.txt # Project dependencies
Dataset
This project uses the SiTunes dataset [1], which includes:
- Situational Data (Obj.): Environmental data (weather, location) and physiological features (heart rate, activity)
- Emotional State Data (Sub.): User-reported emotional annotations (valence, arousal)
- Song Metadata: Basic information about tracks
For the sentiment analysis, I also built a scraper using lyricsgenius and used the song metadata to extract the lyrics.
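A hedged sketch of such a scraper: the `GENIUS_TOKEN` environment variable, the example song, and the `clean_lyrics` helper are assumptions for illustration, not part of the repository.

```python
# Sketch of lyrics scraping with lyricsgenius; falls back to an offline
# demo when no API token is configured.
import os
import re

def clean_lyrics(raw: str) -> str:
    """Strip [Verse]/[Chorus]-style section headers and blank lines."""
    lines = [ln for ln in raw.splitlines()
             if ln.strip() and not re.match(r"^\[.*\]$", ln.strip())]
    return "\n".join(lines)

token = os.environ.get("GENIUS_TOKEN")
if token:
    import lyricsgenius
    genius = lyricsgenius.Genius(token)
    song = genius.search_song("Yesterday", "The Beatles")  # example query
    if song:
        print(clean_lyrics(song.lyrics)[:200])
else:
    # Offline demo on a canned snippet
    print(clean_lyrics("[Chorus]\nhello\n\nworld"))
```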
Results
Mood Prediction Performance
| Model | Feature Set | Accuracy | Macro F1 | Micro F1 |
|-------|-------------|----------|----------|----------|
| GBDT | Obj. + Sub. | 87.8% | 0.8646 | 0.8777 |
| Ensemble | Obj. + Sub. | 75.7% | 0.5855 | 0.7565 |
| RF | Obj. + Sub. | 65.1% | 0.5216 | 0.6511 |
| LR | Obj. + Sub. | 59.2% | 0.4593 | 0.5919 |
User Clustering
Four distinct user clusters were formed based on emotional responses and lyrical preferences:
- Cluster 0: High emotional variability, prefers complex narratives
- Cluster 1: Balanced feature profile, favors socially themed songs
- Cluster 2: Favors highly rated tracks with moderate emotional content
- Cluster 3: Greater valence range, higher emotional sophistication
Recommendation Quality
Recommendations achieved similarity scores ≥0.95, showing strong alignment with user preferences.
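One way to apply such a threshold, assuming cosine similarity between learned user and song feature vectors (the vectors below are synthetic stand-ins):

```python
# Sketch of the >= 0.95 similarity filter over candidate songs.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

user_profile = np.array([[0.8, 0.2, 0.6, 0.4]])
candidates = np.array([
    [0.82, 0.21, 0.58, 0.41],   # nearly parallel to the profile
    [0.10, 0.90, 0.10, 0.90],   # dissimilar
])

sims = cosine_similarity(user_profile, candidates)[0]
recommended = [i for i, s in enumerate(sims) if s >= 0.95]
print(recommended)  # [0]
```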
Transferability to Video Content
While this project is aimed at music, the core principles can also be applied to video content:
- Emotional Response Prediction: Predict viewer emotional responses to videos
- Content Analysis: Extract thematic elements from video metadata/transcripts
- User Preference Clustering: Group viewers based on emotional preferences
- Contextual Recommendations: Provide recommendations based on viewing situation
Setup Instructions
- Environment Setup:
  ```bash
  # Create and activate a virtual environment
  python -m venv env
  source env/bin/activate

  # Install dependencies
  pip install -r requirements.txt
  ```
- Data Preparation (optional):
  - Place your music response datasets in the `datasets/Psychological_Datasets/` directory
  - Make sure that the data follows the expected format (see the dataset directory for details)
- Configuration:
  - Each model's hyperparameters can be configured in their respective shell scripts
  - Logging settings can be adjusted in `main.py`
Running Experiments
Individual Models
Logistic Regression:
```bash
chmod +x script/LR.sh
./script/LR.sh
```
- Settings:
  - Setting2: l1 regularization, C=10
  - Setting3: l1 regularization, C=10
Multi-Layer Perceptron:
```bash
chmod +x script/MLP.sh
./script/MLP.sh
```
- Settings:
  - Setting2: lr=5e-3, batch_size=256
  - Setting3: lr=1e-3, batch_size=512
Random Forest:
```bash
chmod +x script/RF.sh
./script/RF.sh
```
- Settings:
  - Setting2: max_depth=3, n_estimators=300
  - Setting3: max_depth=5, n_estimators=200
Gradient Boosting:
```bash
chmod +x script/GBDT.sh
./script/GBDT.sh
```
- Settings:
  - Setting2: lr=0.1, n_estimators=200
  - Setting3: lr=0.05, n_estimators=100
Ensemble Model
```bash
chmod +x script/ensemble.sh
./script/ensemble.sh
```
- Uses a stacking classifier combining all individual models
Experiment Settings
Each experiment runs with:
- 10 different random seeds (101-110)
- 3 context groups (all, sub, obj)
- 2 settings (Setting2, Setting3)
- Total of 60 experiments per model
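The grid above can be enumerated in a few lines (the variable names are assumptions mirroring the shell scripts):

```python
# Enumerate the full experiment grid: 10 seeds x 3 context groups x 2
# settings = 60 experiments per model.
from itertools import product

seeds = range(101, 111)
context_groups = ["all", "sub", "obj"]
settings = ["Setting2", "Setting3"]

experiments = list(product(seeds, context_groups, settings))
print(len(experiments))  # 60
```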
Metrics
The following metrics are computed for each experiment:
- Accuracy
- Macro F1 Score
- Micro F1 Score
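These can be computed with scikit-learn; the labels below are a toy example, not project results:

```python
# Toy example of the three reported metrics on 3-class labels.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]

print(f"Accuracy: {accuracy_score(y_true, y_pred):.3f}")                 # 0.750
print(f"Macro F1: {f1_score(y_true, y_pred, average='macro'):.3f}")      # 0.778
print(f"Micro F1: {f1_score(y_true, y_pred, average='micro'):.3f}")      # 0.750
```

For multiclass problems, micro F1 equals plain accuracy, while macro F1 averages per-class F1 scores and so penalizes poor performance on rare classes.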
Output and Logging
- Results are saved in the `logs/` directory
- Model checkpoints are saved in the `models/` directory
- Final averages are computed across all runs
Troubleshooting
Common issues and solutions:
1. Permission Denied for Shell Scripts:
   ```bash
   chmod +x script/*.sh
   ```
2. Missing Dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Invalid Metrics:
- Check if the dataset is properly formatted
- Ensure all required columns are present
- Verify the context group specifications
License
MIT License
Author Contributions
This project was developed solely by Diego Valdez Duran as part of CS229 - Machine Learning at Stanford University. All code components were written by the author except for:
- Libraries: Standard ML libraries (scikit-learn, PyTorch) were used
- LLaMA Model: The base model is from Meta AI
- Dataset: SiTunes dataset [1]
Citation
This work uses the SiTunes dataset [1]:
```bibtex
@inproceedings{10.1145/3627508.3638343,
  author = {Grigorev, Vadim and Li, Jiayu and Ma, Weizhi and He, Zhiyu and Zhang, Min and Liu, Yiqun and Yan, Ming and Zhang, Ji},
  title = {SiTunes: A Situational Music Recommendation Dataset with Physiological and Psychological Signals},
  year = {2024},
  isbn = {9798400704345},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3627508.3638343},
  doi = {10.1145/3627508.3638343},
  booktitle = {Proceedings of the 2024 Conference on Human Information Interaction and Retrieval},
  pages = {417–421},
  numpages = {5},
  location = {Sheffield, United Kingdom},
  series = {CHIIR '24}
}
```
References
[1] Grigorev, V., Li, J., Ma, W., He, Z., Zhang, M., Liu, Y., Yan, M., & Zhang, J. (2024). SiTunes: A Situational Music Recommendation Dataset with Physiological and Psychological Signals. In Proceedings of the 2024 Conference on Human Information Interaction and Retrieval (CHIIR '24). ACM, New York, NY, USA.
Contact
diegoval@stanford.edu
Owner
- Name: Diego Valdez Duran
- Login: diegovalduran
- Kind: user
- Repositories: 1
- Profile: https://github.com/diegovalduran
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Valdez Duran"
    given-names: "Diego"
    email: "diegoval@stanford.edu"
title: "Emotion-Aware Music Recommendation System"
version: 1.0.0
date-released: 2024-01-01
url: "[Your repository URL]"
repository-code: "[Your repository URL]"
license: MIT
references:
  - authors:
      - family-names: "Grigorev"
        given-names: "Vadim"
      - family-names: "Li"
        given-names: "Jiayu"
      - family-names: "Ma"
        given-names: "Weizhi"
      - family-names: "He"
        given-names: "Zhiyu"
      - family-names: "Zhang"
        given-names: "Min"
      - family-names: "Liu"
        given-names: "Yiqun"
      - family-names: "Yan"
        given-names: "Ming"
      - family-names: "Zhang"
        given-names: "Ji"
    title: "SiTunes: A Situational Music Recommendation Dataset with Physiological and Psychological Signals"
    type: conference-paper
    doi: "10.1145/3627508.3638343"
    year: 2024
    conference:
      name: "Conference on Human Information Interaction and Retrieval"
      location: "Sheffield, United Kingdom"
    collection-title: "CHIIR '24"
    publisher:
      name: "Association for Computing Machinery"
      address: "New York, NY, USA"
    url: "https://doi.org/10.1145/3627508.3638343"
GitHub Events
Total
- Delete event: 1
- Push event: 3
- Create event: 3
Last Year
- Delete event: 1
- Push event: 3
- Create event: 3
Dependencies
- lyricsgenius >=3.0.0
- matplotlib >=3.7.0
- numpy >=1.24.0
- pandas >=2.0.0
- pyarrow >=12.0.0
- python-dotenv >=1.0.0
- scikit-learn >=1.2.0
- scipy >=1.10.0
- seaborn >=0.12.0
- tqdm >=4.65.0