Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.4%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: AaltoRSE
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Size: 36.1 KB
Statistics
  • Stars: 0
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created over 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

Transcribe and Diarize

A script for transcribing and diarizing WAV-formatted audio files.

Installation

pip install git+https://github.com/AaltoRSE/Diarize.git

Usage

Transcription

List all parameters with:

transcribe_and_diarize --help

For example, to run for all files in a folder:

transcribe_and_diarize --input_folder=INPUT_FOLDER_NAME --output_folder=OUTPUT_FOLDER_NAME --hugging_face_token YOUR_TOKEN
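
If you want to start the same run from a Python script, the sketch below assembles the command shown above and passes it to `subprocess`. It uses only the flags that appear in this README. The helper names and the `HF_TOKEN` environment variable are assumptions made for illustration and are not part of the package.

```python
import os
import subprocess


def build_transcribe_command(input_folder, output_folder, token):
    """Return the argument list for the transcribe_and_diarize call shown in the README."""
    return [
        "transcribe_and_diarize",
        f"--input_folder={input_folder}",
        f"--output_folder={output_folder}",
        "--hugging_face_token",
        token,
    ]


def run_transcription(input_folder, output_folder):
    """Run the CLI, reading the token from the environment (hypothetical helper)."""
    # HF_TOKEN is a variable name chosen for this sketch, not one the tool reads itself.
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("Set the HF_TOKEN environment variable first")
    command = build_transcribe_command(input_folder, output_folder, token)
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(command, check=True)
```

Calling `run_transcription("recordings", "transcripts")` would then process every file in `recordings`, assuming the package is installed and the token is set. Reading the token from the environment keeps it out of scripts and shell history.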

Summarizing transcripts

First install the GPT4All client and use it to download a model. Make a note of the folder where the model files are stored. You will need a path to the model file to run summarization.

To summarize all diarized transcripts in a folder:

summarize_transcript --input_folder=INPUT_FOLDER_NAME --output_folder=OUTPUT_FOLDER_NAME --model PATH_TO_MODEL
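
Because summarization needs a path to the downloaded model file, it can help to check that path before starting a long run. The following is a minimal sketch that validates the path and builds the command shown above. The function name is hypothetical and only the flags listed in this README are used.

```python
from pathlib import Path


def build_summarize_command(input_folder, output_folder, model_path):
    """Return the argument list for summarize_transcript after checking the model path."""
    model = Path(model_path)
    # Fail early if the model file noted during the GPT4All download is missing.
    if not model.is_file():
        raise FileNotFoundError(f"Model file not found: {model}")
    return [
        "summarize_transcript",
        f"--input_folder={input_folder}",
        f"--output_folder={output_folder}",
        "--model",
        str(model),
    ]
```

The returned list can be passed to `subprocess.run(command, check=True)` once the package is installed.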

Owner

  • Name: AaltoRSE
  • Login: AaltoRSE
  • Kind: organization

Citation (citation.cff)

cff-version: 1.2.0
message: "Please cite this software as below."
authors:
- family-names: "Rantaharju"
  given-names: "Jarno"
  orcid: "https://orcid.org/0000-0002-0072-7707"
- family-names: "Ruokolainen"
  given-names: "Teemu"
- family-names: "Truong"
  given-names: "Nghiep Lucy"
  orcid: "https://orcid.org/0009-0009-9668-4403"
title: "Diarize"
version: 1.0.0
doi: 10.5281/zenodo.10201091
date-released: 2024-01-14
url: "https://github.com/AaltoRSE/Diarize"

Dependencies

requirements.txt pypi
  • click *
setup.py pypi