Nkululeko 1.0: A Python package to predict speaker characteristics with a high-level interface

Nkululeko 1.0: A Python package to predict speaker characteristics with a high-level interface - Published in JOSS (2025)

https://github.com/felixbur/nkululeko

Keywords

machine-learning pytorch speech

Last synced: 7 months ago

Repository

Machine learning speaker characteristics

Basic Info

Host: GitHub
Owner: felixbur
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 46.3 MB

Statistics

Stars: 41
Watchers: 3
Forks: 10
Open Issues: 22
Releases: 24

Topics

machine-learning pytorch speech

Created almost 5 years ago · Last pushed 7 months ago

Metadata Files

Readme Changelog Contributing License Code of conduct

Nkululeko

Nkululeko is software for detecting speaker characteristics with machine learning experiments through a high-level interface. The idea is to have a framework (based on e.g. sklearn and torch) that can be used to rapidly and automatically analyse audio data and explore machine learning models based on that data.

Some of the abilities that Nkululeko provides: combining acoustic features and machine learning models (including feature selection and feature concatenation); data exploration, selection and visualization of the results; finetuning; ensemble learning; soft labeling (predicting labels with a pre-trained model); and running inference with a model on a test set.

Nkululeko orchestrates data loading, feature extraction, and model training, allowing you to specify your experiment in a configuration file. The framework handles the process from raw data to trained model and evaluation, making it easy to run machine learning experiments without directly coding in Python.

Who is this for?

Nkululeko is for speech processing learners, researchers and ML practitioners focused on speaker characteristics, e.g., emotion, age, gender, or disorder detection.

Installation

Nkululeko requires Python 3.9 or higher.

Create and activate a virtual Python environment and simply install Nkululeko:

```bash
# using python venv
python -m venv .env
source .env/bin/activate  # specify OS versions, add a separate line for Windows users
pip install nkululeko

# using uv in development mode
uv venv --python 3.12
source .venv/bin/activate
uv pip install -r requirements.txt

# or run directly using uv run after cloning
uv run python -m nkululeko.nkululeko --config examples/exppolishtree.ini
```

Optional Dependencies

Nkululeko supports optional dependencies through extras:

```bash
# Install with PyTorch support
pip install nkululeko[torch]

# Install with CPU-only PyTorch
pip install nkululeko[torch-cpu]

# Install with TensorFlow support
pip install nkululeko[tensorflow]

# Install all optional dependencies
pip install nkululeko[all]
```
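After installing one of the extras, it can be handy to confirm which optional backends are actually importable in the active environment. The following is a minimal sketch using only the standard library; the helper names and the mapping from extra name to import name are assumptions made for illustration and are not part of Nkululeko.

```python
import importlib.util

# Assumed mapping: name of the extra -> top-level module it should provide.
OPTIONAL_BACKENDS = {"torch": "torch", "tensorflow": "tensorflow"}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located (it is not imported)."""
    return importlib.util.find_spec(module_name) is not None


def report_backends(backends=None) -> dict:
    """Map each extra name to whether its module is available."""
    backends = OPTIONAL_BACKENDS if backends is None else backends
    return {extra: is_installed(module) for extra, module in backends.items()}


if __name__ == "__main__":
    for extra, available in report_backends().items():
        print(f"{extra}: {'installed' if available else 'missing'}")
```

A backend reported as missing can then be added with the matching `pip install nkululeko[...]` command from above.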

Manual Installation Options

You can also install dependencies manually:

PyTorch Installation

For CPU-only installation (recommended for most users):

```bash
pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 --index-url https://download.pytorch.org/whl/cpu
```

For GPU support (cuda 12.6):

```bash
pip install torch torchvision torchaudio
```

Some functionalities require extra packages to be installed, which we didn't include automatically:

For spotlight adapter:

```bash
pip install PyYAML  # Install PyYAML first to avoid dependency issues
pip install nkululeko[spotlight]
```

Some examples for ini-files (which you use to control nkululeko) are in the examples folder.

Documentation

The documentation, including more details on installation, usage, the INI file format, and examples, can be found at nkululeko.readthedocs.io.

Usage

ini-file values

Basically, you specify your experiment in an "ini" file (e.g. experiment.ini) and then call one of the Nkululeko interfaces to run the experiment like this:

```bash
python -m nkululeko.nkululeko --config experiment.ini
```

A basic configuration looks like this:

```ini
[EXP]
root = ./
name = exp_emodb
[DATA]
databases = ['emodb']
emodb = ./emodb/
emodb.split_strategy = speaker_split
target = emotion
labels = ['anger', 'boredom', 'disgust', 'fear']
[FEATS]
type = ['praat']
[MODEL]
type = svm
[EXPL]
model = tree
plot_tree = True
```

Read the Hello World example for initial usage with Emo-DB dataset.
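To check such a configuration before running an experiment, it can be read with Python's standard `configparser`. This is only a sketch of how the file could be inspected from the outside; it does not claim to mirror how Nkululeko parses its configuration internally, and the helper names `load_config` and `list_value` are hypothetical.

```python
import ast
import configparser

# The basic configuration from this section, embedded as a string.
BASIC_CONFIG = """
[EXP]
root = ./
name = exp_emodb
[DATA]
databases = ['emodb']
emodb = ./emodb/
emodb.split_strategy = speaker_split
target = emotion
labels = ['anger', 'boredom', 'disgust', 'fear']
[FEATS]
type = ['praat']
[MODEL]
type = svm
[EXPL]
model = tree
plot_tree = True
"""


def load_config(text: str) -> configparser.ConfigParser:
    """Parse INI text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def list_value(parser, section: str, option: str) -> list:
    """Evaluate a Python-style list literal such as ['anger', 'fear']."""
    return ast.literal_eval(parser.get(section, option))


config = load_config(BASIC_CONFIG)
labels = list_value(config, "DATA", "labels")
features = list_value(config, "FEATS", "type")
print(config["EXP"]["name"], config["MODEL"]["type"], labels, features)
```

Values such as `databases`, `labels` and `type` are written as Python list literals, which is why the sketch evaluates them with `ast.literal_eval` rather than splitting strings by hand.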

Here is an overview of the interfaces/modules:

All of them take --config as an argument.

nkululeko.nkululeko: do machine learning experiments combining features and learners (e.g. opensmile with SVM)
nkululeko.ensemble: combine several nkululeko experiments and report on late fusion results
nkululeko.multidb: do multiple experiments, comparing several databases both across and within each database
nkululeko.demo: demo the current best model on the command line
nkululeko.test: predict a given data set with the current best model
nkululeko.explore: perform data exploration
nkululeko.augment: augment the current training data
nkululeko.aug_train: augment the current training data and do a training including this data
nkululeko.predict: predict features like SNR, MOS, arousal/valence, age/gender, with DNN models
nkululeko.segment: segment a database based on VAD (voice activity detection)
nkululeko.resample: check on all sampling rates and change to 16kHz
nkululeko.optim: do meta parameter optimization (e.g. grid search for SVM C and gamma)
nkululeko.flags: a convenient module to conduct multiple experiments with different configuration parameters on the command line.
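Because the modules share the same `--config` calling convention, they are easy to script. The sketch below builds the command line for any module from the list above and optionally runs it in a subprocess. The function names `build_command` and `run_module` are hypothetical helpers, not part of Nkululeko, and individual modules may accept further arguments that are not covered here.

```python
import subprocess
import sys

# Module names taken from the overview above.
MODULES = {
    "nkululeko", "ensemble", "multidb", "demo", "test", "explore", "augment",
    "aug_train", "predict", "segment", "resample", "optim", "flags",
}


def build_command(module: str, config_path: str) -> list:
    """Return the argument list for `python -m nkululeko.<module> --config <file>`."""
    if module not in MODULES:
        raise ValueError(f"unknown Nkululeko module: {module}")
    return [sys.executable, "-m", f"nkululeko.{module}", "--config", config_path]


def run_module(module: str, config_path: str) -> int:
    """Run the module with the current interpreter and return its exit code."""
    return subprocess.run(build_command(module, config_path)).returncode


if __name__ == "__main__":
    # Only prints the command; call run_module(...) to actually execute it.
    print(" ".join(build_command("explore", "experiment.ini")))
```

Using `sys.executable` keeps the call inside the active virtual environment, which matters when Nkululeko is installed only there.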

Hello World example

NEW: Here's a Google colab that runs this example out-of-the-box, and here is the same with Kaggle
I made a video to show you how to do this on Windows
Set up Python on your computer, version >= 3.9
Open a terminal/command line/console window
Test Python by typing python; the interpreter should report a version of 3 or higher (NOT 2!). You can leave the Python interpreter by typing exit()
Create a folder on your computer for this example, let's call it nkulu_work
Get a copy of the Berlin emodb in audformat and unpack inside the folder you just created (nkulu_work)
Make sure the folder is called "emodb" and does contain the database files directly (not box-in-a-box)
Also, in the nkulu_work folder:
- Create a Python environment
- python -m venv venv
- Then, activate it:
- under Linux / mac
  - source venv/bin/activate
- under Windows
  - venv\Scripts\activate.bat
- if that worked, you should see a (venv) in front of your prompt
- Install the required packages in your environment
- pip install nkululeko
- Repeat until all error messages vanish (or fix them, or try to ignore them)...
Now you should have two folders in your nkulu_work folder:
- emodb and venv
Download a copy of the file exp_emodb.ini to the current working directory (nkulu_work)
Run the demo
- python -m nkululeko.nkululeko --config exp_emodb.ini
Find the results in the newly created folder exp_emodb
- Inspect exp_emodb/images/run_0/emodb_xgb_os_0_000_cnf.png
- This is the main result of your experiment: a confusion matrix for the emodb emotional categories
Inspect and play around with the demo configuration file that defined your experiment, then re-run.
There are many ways to experiment with different classifiers and acoustic feature sets, all described here
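Before running the demo, the folder layout described in the steps above can be verified with a few lines of Python. This is a minimal sketch; `missing_entries` is a hypothetical helper, and the expected names are simply those used in this walkthrough (`emodb`, `venv`, `exp_emodb.ini`).

```python
from pathlib import Path

# Entries the walkthrough expects inside the working folder.
EXPECTED = ("emodb", "venv", "exp_emodb.ini")


def missing_entries(work_dir, expected=EXPECTED) -> list:
    """Return the expected files or folders that do not exist in work_dir."""
    root = Path(work_dir)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries("nkulu_work")
    print("ready to run" if not missing else f"missing: {missing}")
```

An empty result means the database folder, the virtual environment and the configuration file are all in place.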

Features

The framework is targeted at the speech domain and supports experiments where different classifiers are combined with different feature extractors.

Classifiers: Naive Bayes, KNN, Tree, XGBoost, SVM, MLP
Feature extractors: Praat, Opensmile, openXBOW BoAW, TRILL embeddings, Wav2vec2 embeddings, audModel embeddings, ...
Feature scaling
Label encoding
Binning (continuous to categorical)
Online demo interface for trained models
Visualization: confusion matrix, feature importance, feature distribution, epoch progression, t-SNE plot, data distribution, bias checking, uncertainty estimation
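As an illustration of the binning item in the list above, a continuous target such as age can be mapped to categories with a handful of bin edges. The sketch below uses only the standard library; the edges, the labels and the function name `to_bin` are hypothetical, and it does not show how Nkululeko configures or implements binning.

```python
from bisect import bisect_right

# Hypothetical bin edges (lower bounds, inclusive) and matching labels.
EDGES = [20, 40, 60]
LABELS = ["under_20", "20_to_40", "40_to_60", "60_plus"]


def to_bin(value: float, edges=EDGES, labels=LABELS) -> str:
    """Return the label of the bin that contains value."""
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_right(edges, value)]


ages = [18, 20, 39.5, 60, 75]
print([to_bin(age) for age in ages])
```

With sorted edges, `bisect_right` places a value equal to an edge into the upper bin, so an age of exactly 20 falls into `20_to_40`.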

Here's a rough UML-like sketch of the framework (and here's the real one done with pyreverse).

Currently, the following classifiers and regressors are implemented (integrated from sklearn):
* SVM, SVR, XGB, XGR, Tree, Treeregressor, KNN, KNNregressor, NaiveBayes, GMM

and the following ANNs (artificial neural networks):
* MLP (multi-layer perceptron), CNN (convolutional neural network)

For visualization, besides confusion matrix, feature importance, feature distribution, t-SNE plot and data distribution (to name a few), Nkululeko can also be used for bias checking, uncertainty estimation, and epoch progression.

Bias checking

In some cases, you might wonder if there's bias in your data. You can try to detect this with automatically estimated speech properties by visualizing the correlation of target labels and predicted labels.
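The idea can be sketched without any plotting: count how often each target label co-occurs with an automatically predicted property and compare the proportions. The example below is a text-only stand-in for the visualization Nkululeko produces, not its implementation. The data, the predicted property (gender) and the helper names are all hypothetical.

```python
from collections import Counter

# Hypothetical data: the target label of each sample and a speech property
# predicted automatically for the same sample.
targets = ["anger", "anger", "fear", "fear", "fear", "anger"]
predicted_gender = ["female", "female", "male", "male", "female", "female"]


def contingency(rows, cols) -> dict:
    """Count co-occurrences of (row value, column value) pairs."""
    if len(rows) != len(cols):
        raise ValueError("both label lists need the same length")
    return dict(Counter(zip(rows, cols)))


def row_shares(table: dict) -> dict:
    """Convert counts to proportions within each row value."""
    totals = Counter()
    for (row, _), count in table.items():
        totals[row] += count
    return {key: count / totals[key[0]] for key, count in table.items()}


table = contingency(targets, predicted_gender)
for (target, gender), share in sorted(row_shares(table).items()):
    print(f"{target:>6} / {gender:<6}: {share:.2f}")
```

If one target label is dominated by a single value of the predicted property, as `anger` is in this toy data, that is a hint that a model might learn the property instead of the target.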

Uncertainty

Nkululeko estimates the uncertainty of model decisions (only for classifiers) with entropy over the class probabilities or logits per sample.
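A minimal sketch of entropy-based uncertainty for a single sample follows. It converts logits to probabilities with a softmax and computes the Shannon entropy. Dividing by log(n) to obtain a value between 0 and 1 is an assumption made here for readability, and the function names are hypothetical; this is not taken from Nkululeko's code.

```python
import math


def softmax(logits):
    """Turn logits into probabilities (shifted by the maximum for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def normalized_entropy(probs):
    """Entropy divided by its maximum log(n); requires at least two classes."""
    return entropy(probs) / math.log(len(probs))


confident = softmax([8.0, 0.5, 0.2, 0.1])
unsure = softmax([1.0, 1.0, 1.0, 1.0])
print(round(normalized_entropy(confident), 3), round(normalized_entropy(unsure), 3))
```

A decision with one dominant class yields a value close to 0, while equal probabilities for all classes yield the maximum of 1.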

Here's an animation that shows the progress of classification done with nkululeko.

News

There's Felix's [blog](http://blog.syntheticspeech.de/?s=nkululeko) with tutorials below:
* [Ensemble learning with Nkululeko](http://blog.syntheticspeech.de/2024/06/25/nkululeko-ensemble-classifiers-with-late-fusion/)
* [Finetune transformer-models with Nkululeko](http://blog.syntheticspeech.de/2024/05/29/nkululeko-how-to-finetune-a-transformer-model/)
* The [Hello World example for Nkululeko](#helloworld) should set you up quickly; it is also available on [Google Colab](https://colab.research.google.com/drive/1GYNBd5cdZQ1QC3Jm58qoeMaJg3UuPhjw?usp=sharing#scrollTo=4G_SjuF9xeQf), and [with Kaggle](https://www.kaggle.com/felixburk/nkululeko-hello-world-example)
* [Thanks to deepwiki, here's an analysis of the source code](https://deepwiki.com/felixbur/nkululeko)
* [Here's a blog post on how to set up nkululeko on your computer.](http://blog.syntheticspeech.de/2021/08/30/how-to-set-up-your-first-nkululeko-project/)
* [Here's a slide presentation about nkululeko](docs/nkululeko.pdf)
* [Here's a video presentation about nkululeko](https://www.youtube.com/playlist?list=PLRceVavtxLg0y2jiLmpnUfiMtfvkK912D)
* [Here's the 2022 LREC article on nkululeko](http://felix.syntheticspeech.de/publications/Nkululeko_LREC.pdf)
* [Introduction](http://blog.syntheticspeech.de/2021/08/04/machine-learning-experiment-framework/)
* [Nkululeko FAQ](http://blog.syntheticspeech.de/2022/07/07/nkululeko-faq/)
* [How to set up your first nkululeko project](http://blog.syntheticspeech.de/2021/08/30/how-to-set-up-your-first-nkululeko-project/)
* [Setting up a base nkululeko experiment](http://blog.syntheticspeech.de/2021/10/05/setting-up-a-base-nkululeko-experiment/)
* [How to import a database](http://blog.syntheticspeech.de/2022/01/27/nkululeko-how-to-import-a-database/)
* [Comparing classifiers and features](http://blog.syntheticspeech.de/2021/10/05/nkululeko-comparing-classifiers-and-features/)
* [Use Praat features](http://blog.syntheticspeech.de/2022/06/27/how-to-use-selected-features-from-praat-with-nkululeko/)
* [Combine feature sets](http://blog.syntheticspeech.de/2022/06/30/how-to-combine-feature-sets-with-nkululeko/)
* [Classifying continuous variables](http://blog.syntheticspeech.de/2022/01/26/nkululeko-classifying-continuous-variables/)
* [Try out / demo a trained model](http://blog.syntheticspeech.de/2022/01/24/nkululeko-try-out-demo-a-trained-model/)
* [Perform cross-database experiments](http://blog.syntheticspeech.de/2021/10/05/nkululeko-perform-cross-database-experiments/)
* [Meta parameter optimization](http://blog.syntheticspeech.de/2021/09/03/perform-optimization-with-nkululeko/)
* [How to set up wav2vec embedding](http://blog.syntheticspeech.de/2021/12/03/how-to-set-up-wav2vec-embedding-for-nkululeko/)
* [How to soft-label a database](http://blog.syntheticspeech.de/2022/01/24/how-to-soft-label-a-database-with-nkululeko/)
* [Re-generate the progressing confusion matrix animation with a different framerate](demos/plot_faster_anim.py)
* [How to limit/filter a dataset](http://blog.syntheticspeech.de/2022/02/22/how-to-limit-a-dataset-with-nkululeko/)
* [Specifying database disk location](http://blog.syntheticspeech.de/2022/02/21/specifying-database-disk-location-with-nkululeko/)
* [Add dropout with MLP models](http://blog.syntheticspeech.de/2022/02/25/adding-dropout-to-mlp-models-with-nkululeko/)
* [Do cross-validation](http://blog.syntheticspeech.de/2022/03/23/how-to-do-cross-validation-with-nkululeko/)
* [Combine predictions per speaker](http://blog.syntheticspeech.de/2022/03/24/how-to-combine-predictions-per-speaker-with-nkululeko/)
* [Run multiple experiments in one go](http://blog.syntheticspeech.de/2022/03/28/how-to-run-multiple-experiments-in-one-go-with-nkululeko/)
* [Compare several MLP layer layouts with each other](http://blog.syntheticspeech.de/2022/04/11/how-to-compare-several-mlp-layer-layouts-with-each-other/)
* [Import features from outside the software](http://blog.syntheticspeech.de/2022/10/18/how-to-import-features-from-outside-the-nkululeko-software/)
* [Export acoustic features](http://blog.syntheticspeech.de/2024/05/30/nkululeko-export-acoustic-features/)
* [Explore feature importance](http://blog.syntheticspeech.de/2023/02/20/nkululeko-show-feature-importance/)
* [Plot distributions for feature values](http://blog.syntheticspeech.de/2023/02/16/nkululeko-how-to-plot-distributions-of-feature-values/)
* [Show feature importance](http://blog.syntheticspeech.de/2023/02/20/nkululeko-show-feature-importance/)
* [Augment the training set](http://blog.syntheticspeech.de/2023/03/13/nkululeko-how-to-augment-the-training-set/)
* [Visualize clusters of acoustic features](http://blog.syntheticspeech.de/2023/04/20/nkululeko-visualize-clusters-of-your-acoustic-features/)
* [Visualize your data distribution](http://blog.syntheticspeech.de/2023/05/11/nkululeko-how-to-visualize-your-data-distribution/)
* [Check your dataset](http://blog.syntheticspeech.de/2023/07/11/nkululeko-check-your-dataset/)
* [Segmenting a database](http://blog.syntheticspeech.de/2023/07/14/nkululeko-segmenting-a-database/)
* [Predict new labels for your data from public models and check bias](http://blog.syntheticspeech.de/2023/08/16/nkululeko-how-to-predict-labels-for-your-data-from-existing-models-and-check-them/)
* [Resample](http://blog.syntheticspeech.de/2023/08/31/how-to-fix-different-sampling-rates-in-a-dataset-with-nkululeko/)
* [Get some statistics on correlation and effect-size](http://blog.syntheticspeech.de/2023/09/05/nkululeko-get-some-statistics-on-correlation-and-effect-size/)
* [Automatic generation of a latex/pdf report](http://blog.syntheticspeech.de/2023/09/26/nkululeko-generate-a-latex-pdf-report/)
* [Inspect your data with Spotlight](http://blog.syntheticspeech.de/2023/10/31/nkululeko-inspect-your-data-with-spotlight/)
* [Automatically stratify your split sets](http://blog.syntheticspeech.de/2023/11/07/nkululeko-automatically-stratify-your-split-sets/)
* [Re-name data column names](http://blog.syntheticspeech.de/2023/11/16/nkululeko-re-name-data-column-names/)
* [Oversample the training set](http://blog.syntheticspeech.de/2023/11/16/nkululeko-oversample-the-training-set/)
* [Compare several databases](http://blog.syntheticspeech.de/2024/01/02/nkululeko-compare-several-databases/)
* [Tweak the target variable for database comparison](http://blog.syntheticspeech.de/2024/03/13/nkululeko-how-to-tweak-the-target-variable-for-database-comparison/)
* [How to run multiple experiments in one go](http://blog.syntheticspeech.de/2022/03/28/how-to-run-multiple-experiments-in-one-go-with-nkululeko/)
* [How to finetune a transformer-model](http://blog.syntheticspeech.de/2024/05/29/nkululeko-how-to-finetune-a-transformer-model/)
* [Ensemble (combine) classifiers with late-fusion](http://blog.syntheticspeech.de/2024/06/25/nkululeko-ensemble-classifiers-with-late-fusion/)
* [Use train, dev and test splits](https://blog.syntheticspeech.de/2025/03/31/nkululeko-how-to-use-train-dev-test-splits/)

License

Nkululeko can be used under the MIT license.

Contributing

Contributions are welcome and encouraged. To learn more about how to contribute to nkululeko, please refer to the Contributing guidelines.

Citation

If you use Nkululeko, please cite the paper:

F. Burkhardt, Johannes Wagner, Hagen Wierstorf, Florian Eyben and Björn Schuller: Nkululeko: A Tool For Rapid Speaker Characteristics Detection, Proc. LREC, 2022

@inproceedings{Burkhardt:lrec2022,
  title = {Nkululeko: A Tool For Rapid Speaker Characteristics Detection},
  author = {Felix Burkhardt and Johannes Wagner and Hagen Wierstorf and Florian Eyben and Björn Schuller},
  isbn = {9791095546726},
  journal = {2022 Language Resources and Evaluation Conference, LREC 2022},
  keywords = {machine learning,speaker characteristics,tools},
  pages = {1925-1932},
  publisher = {European Language Resources Association (ELRA)},
  year = {2022},
}

JOSS Publication

Nkululeko 1.0: A Python package to predict speaker characteristics with a high-level interface

Published

November 03, 2025

DOI

10.21105/joss.08049

Volume 10, Issue 115, Page 8049

Authors

Felix Burkhardt

audEERING GmbH, Germany, TU Berlin, Germany

Bagus Tris Atmaja

Nara Institute of Science and Technology (NAIST), Japan

Editor

Fabian-Robert Stöter

GitHub Events

Total

Create event: 59
Commit comment event: 3
Release event: 2
Issues event: 54
Watch event: 8
Delete event: 4
Issue comment event: 95
Push event: 245
Pull request review comment event: 65
Pull request review event: 90
Pull request event: 110
Fork event: 3

Last Year

Create event: 57
Commit comment event: 1
Release event: 2
Issues event: 51
Watch event: 8
Delete event: 4
Issue comment event: 86
Push event: 233
Pull request review comment event: 65
Pull request review event: 90
Pull request event: 105
Fork event: 3

Committers

Last synced: 8 months ago

All Time

Total Commits: 1,750
Total Committers: 14
Avg Commits per committer: 125.0
Development Distribution Score (DDS): 0.499

Past Year

Commits: 501
Committers: 13
Avg Commits per committer: 38.538
Development Distribution Score (DDS): 0.762

Top Committers

Name	Email	Commits
FBurkhardt	f**t@a**m	877
Bagus Tris Atmaja	b**s@y**m	223
Felix Burkhardt	f**k@g**m	208
Bagus Tris Atmaja	b**s@o**m	159
felixbur	f**r@u**m	93
Bagus Tris Atmaja	b**a@g**m	72
b-atmaja	b**a@a**p	47
bagustris	b**s@u**m	31
Devin AI	1**]@u**m	22
Bagus Tris Atmaja	b**s@e**d	12
Bagus Tris Atmaja	b**s@n**p	3
Bagus Tris Atmaja	b**s@c**p	1
FBurkhardt	f**t@A**l	1
google-labs-jules[bot]	1**]@u**m	1

Committer Domains (Top 20 + Academic)

cc21dev1.naist.jp: 1
naist.ac.jp: 1
ep.its.ac.id: 1
aist.go.jp: 1
audeering.com: 1

Issues and Pull Requests

Last synced: 7 months ago

All Time

Total issues: 125
Total pull requests: 176
Average time to close issues: 3 months
Average time to close pull requests: 1 day
Total issue authors: 5
Total pull request authors: 6
Average comments per issue: 1.98
Average comments per pull request: 0.26
Merged pull requests: 154
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 31
Pull requests: 87
Average time to close issues: 13 days
Average time to close pull requests: 1 day
Issue authors: 4
Pull request authors: 6
Average comments per issue: 0.74
Average comments per pull request: 0.24
Merged pull requests: 71
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

felixbur (82)
bagustris (38)
Pascal-H (3)
fakufaku (1)
isdanni (1)

Pull Request Authors

bagustris (141)
felixbur (31)
Copilot (1)
febaCODE2025 (1)
kally1218 (1)
faroit (1)

Top Labels

Issue Labels

enhancement (5)
bug (2)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 1,744 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 243
Total maintainers: 1

pypi.org: nkululeko

Machine learning audio prediction experiments based on templates

Homepage: https://github.com/felixbur/nkululeko
Documentation: https://github.com/felixbur/nkululeko
License: MIT
Latest release: 1.0.1
published 7 months ago

Versions: 243
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 1,744 Last month

Rankings

Dependent packages count: 6.6%

Downloads: 8.1%

Average: 17.4%

Stargazers count: 18.6%

Forks count: 23.2%

Dependent repos count: 30.6%

Maintainers (1)

felixbur

Last synced: 7 months ago

Nkululeko 1.0: A Python package to predict speaker characteristics with a high-level interface

Science Score: 95.0%
