Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: ieee.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.2%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: linda-XI
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Size: 224 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed about 2 years ago

README.md

Towards Efficient Laughter Detection with Convolutional Neural Networks

This repo is based on the laughter detection model by a previous student, Lasse Wolter, and retrains it on the ICSI Meeting corpus.

The data pipeline uses Lhotse, a new Python library for speech and audio data preparation.

This repository consists of three main parts: (1) Evaluation Pipeline, (2) Data Pipeline, and (3) Training Code.

The following list outlines which parts of the repository belong to each of them and classifies each part or file as one of three types: (1) from scratch: written entirely by myself; (2) adapted: code taken from Lasse Wolter and adapted; (3) unmodified: code taken from Lasse Wolter and used without modification.

  • Evaluation Pipeline (adapted):

    • analysis
      • transcript_parsing/parse.py + preprocess.py(adapted): parsing and preprocessing the ICSI transcripts
      • analyse.py(adapted): main function that parses and evaluates the predictions from the .TextGrid files output by the model
    • visualise.py(adapted): functions for visualising model performance (incl. precision-recall curve and confusion matrix)
    • flops.py(from scratch): functions to calculate the FLOPs of models
    • inference_time.py(from scratch): functions to calculate the inference time of models
    • rftPricision.py(from scratch): functions to plot diagrams of the accuracy and speed metrics
  • Data Pipeline (adapted)

    • compute_features(adapted): computes features for the whole ICSI corpus and for specific subsets of it
    • create_data_df.py(adapted): creates a dataframe representing the training, development and test sets
  • Training Code(adapted):

    • models.py(adapted) : defines the model architecture
    • model_utils.py(from scratch): defines model architecture
    • train.py(adapted) : main training code
    • segment_laughter.py(adapted) + laugh_segmenter.py(unmodified) : inference code to run laughter detection on audio files
    • datasets.py(unmodified) + load_data.py(adapted) : the new LAD (Laugh Activity Detection) Dataset + new inference Dataset and code for their creation
  • Misc:

    • config.py(adapted) : configurations for different parts of the pipeline
    • results.zip (N/A): contains the model predictions from experiments presented in my thesis
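
The efficiency metrics mentioned above (FLOPs and inference time) can be illustrated with a minimal sketch. This is not the code in flops.py or inference_time.py: the function names `conv2d_output_size`, `conv2d_flops` and `mean_inference_time` are hypothetical, the FLOP count assumes a plain 2D convolution with square kernels and no bias, and `predict` stands in for any callable model.

```python
import time
from statistics import mean


def conv2d_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_flops(in_ch, out_ch, kernel, in_h, in_w, stride=1, padding=0):
    """Approximate FLOPs of one 2D convolution layer (bias ignored).

    Assumption: one multiply and one add per weight per output position,
    i.e. 2 * in_ch * kernel * kernel * out_ch * out_h * out_w.
    """
    out_h = conv2d_output_size(in_h, kernel, stride, padding)
    out_w = conv2d_output_size(in_w, kernel, stride, padding)
    macs = in_ch * kernel * kernel * out_ch * out_h * out_w
    return 2 * macs


def mean_inference_time(predict, example, warmup=3, runs=10):
    """Average wall-clock seconds per call of predict(example)."""
    for _ in range(warmup):  # warm-up calls are not timed
        predict(example)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(example)
        timings.append(time.perf_counter() - start)
    return mean(timings)


if __name__ == "__main__":
    # Hypothetical layer: 1 input channel, 8 filters, 3x3 kernel, 10x10 input.
    print("FLOPs:", conv2d_flops(1, 8, 3, 10, 10))
    # Hypothetical stand-in for a model's forward pass.
    print("seconds per call:", mean_inference_time(lambda x: sum(x), list(range(1000))))
```

Summing `conv2d_flops` over all convolutional layers gives a rough per-input cost for a CNN, while the timing helper reports measured latency, which also depends on hardware and batch size.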

Owner

  • Login: linda-XI
  • Kind: user
