entrainment-using-dnn

Unsupervised Auditory and Semantic Entrainment Models with Deep Neural Networks

https://github.com/jaykejriwal/entrainment-using-dnn

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.1%) to scientific vocabulary

Keywords

acoustics dnn entrainment semantic

Repository


Basic Info
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
acoustics dnn entrainment semantic
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme · Citation

README.md

Training models for detecting entrainment using DNN

A Python program with DNN models for detecting entrainment at the auditory and semantic linguistic levels.

The program for extracting auditory features is adapted from https://github.com/nasir0md/unsupervised-learning-entrainment

Dataset

We used state-of-the-art DNN embeddings, namely BERT and TRIpLet Loss network (TRILL) vectors, to extract features for measuring the semantic and auditory similarity of turns within dialogues in three spoken corpora: the Columbia Games corpus, the Voice Assistant Conversation corpus, and the Fisher corpus.
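As an illustrative sketch (not the authors' code), the unit over which such similarity is measured is a pair of adjacent turns by different speakers; the speaker labels and texts below are hypothetical:

```python
# Pair each turn with the immediately preceding turn by the other speaker;
# embedding similarity is then computed over these cross-speaker pairs.
def pair_adjacent_turns(turns):
    """turns: list of (speaker, text) tuples in dialogue order."""
    pairs = []
    for prev, curr in zip(turns, turns[1:]):
        if prev[0] != curr[0]:  # keep only cross-speaker pairs
            pairs.append((prev[1], curr[1]))
    return pairs

dialogue = [
    ("A", "how was the game"),
    ("A", "did you watch it"),
    ("B", "yes it was great"),
    ("A", "I missed it"),
]
print(pair_adjacent_turns(dialogue))
# two cross-speaker pairs from four turns
```

In the actual pipeline, each text in a pair would be replaced by its BERT (semantic) or TRILL (auditory) embedding before a distance is computed.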

Required Software

ffmpeg (Download from https://www.ffmpeg.org/download.html)

sph2pipe (Download from https://www.openslr.org/3/)

opensmile (https://github.com/audeering/opensmile)

sentence-transformers (pip install sentence-transformers)

tensorflow (pip install tensorflow)

textgrid (Install textgrid from https://github.com/kylebgorman/textgrid)

TRILL vectors model (Download from https://tfhub.dev/google/nonsemantic-speech-benchmark/trill/3)

Execution Instructions

Models are first trained on the Fisher corpus. The programs must be executed in sequence.

First, low-level descriptor (LLD) features are extracted with the shell script 0featextractnopre.sh.

Next, 1create_h5data.py creates the embeddings in h5 data format.

Lastly, models can be trained with different distance measures, such as L1 and cosine, which are mentioned in the file.

For the CGC and VAC corpora, two Jupyter Notebook files are provided; these need to be executed first for feature extraction.
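The two distance measures mentioned above (L1 and cosine) can be sketched as follows; the vectors here stand in for turn embeddings:

```python
import numpy as np

def l1_distance(u, v):
    # L1 (Manhattan) distance between two embedding vectors
    return float(np.sum(np.abs(u - v)))

def cosine_distance(u, v):
    # 1 - cosine similarity; 0 when the vectors point in the same direction
    return float(1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 0.0, 1.0])
b = np.array([1.0, 1.0, 0.0])
print(l1_distance(a, b))      # 2.0
print(cosine_distance(a, b))  # 0.5
```

A smaller distance between the embeddings of adjacent turns indicates stronger entrainment along the corresponding (semantic or auditory) dimension.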

Citation

Kejriwal, J., Beňuš, Š., Rojas-Barahona, L.M. (2023) Unsupervised Auditory and Semantic Entrainment Models with Deep Neural Networks. Proc. INTERSPEECH 2023, 2628-2632, doi: 10.21437/Interspeech.2023-1929

Owner

  • Name: Jay Kejriwal
  • Login: jaykejriwal
  • Kind: user

Citation (CITATION.bib)

@inproceedings{kejriwal23_interspeech,
  author={Jay Kejriwal and Štefan Beňuš and Lina M. Rojas-Barahona},
  title={{Unsupervised Auditory and Semantic Entrainment Models with Deep Neural Networks}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={2628--2632},
  doi={10.21437/Interspeech.2023-1929},
  issn={2958-1796}
}

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1