https://github.com/bagustris/detect-segment-cough

A python model to detect and segment coughs, forked from coughvid's repo

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 5 DOI reference(s) in README
○
Academic publication links
✓
Committers with academic emails
1 of 4 committers (25.0%) from academic institutions
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (12.1%) to scientific vocabulary

Keywords

automatic-speech-recognition cough-detection cough-sound covid-19 speech-segmentation

Last synced: 6 months ago · JSON representation

Repository

A python model to detect and segment coughs, forked from coughvid's repo

Basic Info

Host: GitHub
Owner: bagustris
Language: Jupyter Notebook
Default Branch: master
Homepage:
Size: 822 KB

Statistics

Stars: 10
Watchers: 1
Forks: 3
Open Issues: 1
Releases: 0

Topics

automatic-speech-recognition cough-detection cough-sound covid-19 speech-segmentation

Created about 4 years ago · Last pushed over 1 year ago

Metadata Files

Readme

Detect and Segment Cough

This repository hosts the codes and models to detect and segment cough sounds. As the names suggest, detecting a cough returns the probability of a given audio file containing a cough sound. Segment cough returns audio files containing a single segmented cough sound from given an audio file with many cough sounds (output of detect cough). These two methods (detect and segment cough) are the most important building blocks for developing cough-based diagnosis tools such as COVID-19.

Input-output

Input: audio files (.wav) to be predicted to have (multiple) cough sound
Output: Cough or non-cough (detect), new wav files containing segmented cough

Supported Python Version and Model (IMPORTANT!)

I tested that this model works on Python version >= 3.7.0.
Less than the version above (e.g., python3.6), the output probability will be "0". The xgboost package must be version 0.90.

Installation:

First, install the Python library dependencies in a virtual environment via pip.

pip install -r requirements.txt

API/Command Line Usage

```

Detect cough:

python3 detectcough.py -i inputfile.wav

Segment cough:

python3 segmentcough -i inputfile.wav ```

Python Usage

```

detect cough, see notebooks for segment cough

import sys sys.path.append('./src') from src.featureclass import features from src.DSP import classifycough from scipy.io import wavfile import pickle

inputfile = './samplerecordings/cough.wav' model = pickle.load(open('./models/coughclassifier', 'rb')) scaler = pickle.load(open('./models/coughclassification_scaler', 'rb'))

fs, x = wavfile.read(inputfile) prob = classifycough(x, fs, model, scaler) print(f"{input_file} has probability of cough: {prob}") ```

Example

```

detect cough:

bagus@L140MU:detect-cough$ /usr/bin/python3 detectcough.py -i samplerecordings/cough.wav /home/bagus/.local/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator LabelEncoder from version 0.22.1 when using version 1.0.2. This might lead to breaking code or invalid results. Use at your own risk. For more info, please refer to: https://scikit-learn.org/stable/modules/modelpersistence.html#security-maintainability-limitations warnings.warn( /home/bagus/.local/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator StandardScaler from version 0.22.1 when using version 1.0.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to: https://scikit-learn.org/stable/modules/modelpersistence.html#security-maintainability-limitations warnings.warn( sample_recordings/cough.wav has probability of cough: 0.988208472728729

segment cough

bagus@L140MU:detect-cough$ ./segmentcough.py -i samplerecordings/cough.wav Write to ./cough-0.wav Write to ./cough-1.wav ```

Overview:

In the wake of the COVID-19 pandemic, mass coronavirus testing has proven essential to governments in monitoring the spread of the disease, isolating infected individuals, and effectively “flattening the curve” of infections over time. However, this oropharyngeal swab test is physically invasive and must be performed by a trained clinician. This requires patients to travel to a laboratory facility to get tested, thereby potentially infecting others along the way. Ideally, testing would be performed noninvasively at no cost, and administered at the homes of potential patients to minimize contamination risk.

The World Health Organization (WHO) has reported that 67.7% of COVID-19 patients exhibit a “dry cough,” which may be audibly different from coughs caused by other pathologies. Such cough sounds analysis has proven successful in diagnosing respiratory conditions like pertussis, asthma, and pneumonia.

At the Embedded Systems Laboratory (ESL) at EPFL, we have developed the COUGHVID database, which is an extensive dataset of COVID-19 cough sounds from around the world, partially validated by expert pulmonologists. We contribute our data, signal preprocessing source code, cough classification algorithm, and feature extraction methods to assist the global research community in developing algorithms to automatically screen for COVID-19 based on cough sounds.

Data access

The COUGHVID dataset can be downloaded from the following Zenodo link: https://doi.org/10.5281/zenodo.4048311

Notebooks

The coughvid_classification_example.ipynb notebook illustrates the usage of the cough classifier model for removing unwanted recordings from a cough database.

The segmentation_and_SNR_example.ipynb notebook is an example of how to use the automatic cough segmentation and SNR estimation algorithm.

Source code

Convert files

A quick function to automatically convert all of the compressed .webm and .ogg files in the COUGHVID dataset to the more usable .wav format. Note: you must have FFMPEG installed for this to work.

DSP

This file contains all-digital signal processing functions, including filtering the recordings and classifying between cough sounds and non-cough sounds.

Features

This file contains all of the functions used for the computation of audio signal features commonly used in cough classification.

Segmentation

This file contains a function for segmenting a recording into individual cough signals and additional code to compute the SNR of the recording.

Models

The cough_classifier is an XGB model that can be loaded and used in the classify_cough function to classify whether or not a given recording contains cough sounds. The cough_classification_scaler is a feature scaler also used in this function.

Citation

When using this resource, please cite the following publication:

Orlandic, L., Teijeiro, T. & Atienza, D. The COUGHVID crowdsourcing dataset, a corpus for the study of large-scale cough analysis algorithms. Sci Data 8, 156 (2021). https://doi.org/10.1038/s41597-021-00937-4
B. T. Atmaja, Zanjabila, Suyanto, and A. Sasou, “Comparing hysteresis comparator and RMS threshold methods for automatic single cough segmentations,” Int. J. Inf. Technol., no. 0123456789, Dec. 2023, doi: 10.1007/s41870-023-01626-8.

Reference

https://c4science.ch/diffusion/10770/ (original repo forked from)

Changelog

21/03/2022: Submit paper to interspeech about cough segmentation, commit: f330c2fb90431c736ee495b668ac0b0e0994b0cf
08/02/2022: Rename repo from detect-cough to detect-segment-cough
03/02/2022: Initial version, forked from EPFL's original repo

Owner

Name: Bagus Tris Atmaja
Login: bagustris
Kind: user
Location: Tsukuba
Company: AIST

Website: http://www.bagustris.blogspot.com
Twitter: btatmaja
Repositories: 221
Profile: https://github.com/bagustris

Researcher @aistairc @VibrasticLab

GitHub Events

Total

Watch event: 4
Issue comment event: 1
Push event: 1
Fork event: 1

Last Year

Watch event: 4
Issue comment event: 1
Push event: 1
Fork event: 1

Committers

Last synced: 8 months ago

All Time

Total Commits: 23
Total Committers: 4
Avg Commits per committer: 5.75
Development Distribution Score (DDS): 0.391

Past Year

Commits: 1
Committers: 1
Avg Commits per committer: 1.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Bagus Tris Atmaja	b**s@y**m	14
Lara Orlandic	l**c@g**m	7
zanjabil2502	z**l@g**m	1
Tomas Teijeiro	t**o@e**h	1

Committer Domains (Top 20 + Academic)

epfl.ch: 1

Issues and Pull Requests

Last synced: 8 months ago

All Time

Total issues: 1
Total pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: about 5 hours
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 1.0
Average comments per pull request: 0.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

https://github.com/bagustris/detect-segment-cough

Science Score: 49.0%

Keywords

Repository

Basic Info

Statistics

Topics

Metadata Files

README.md

Detect and Segment Cough

Input-output

Supported Python Version and Model (IMPORTANT!)

Installation:

API/Command Line Usage

Detect cough:

Segment cough:

Python Usage

detect cough, see notebooks for segment cough

Example

detect cough:

segment cough

Overview:

Data access

Notebooks

Source code

Convert files

DSP

Features

Segmentation

Models

Citation

Reference

Changelog

Owner

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels