nlp_pipeline4researchers
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (4.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: mrhallonline
- License: gpl-3.0
- Language: Jupyter Notebook
- Default Branch: main
- Size: 26.9 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
NLPResearchPipelines (In Progress)
WhisperXTranscription4Researchers: This notebook transcribes and diarizes audio using the Whisper, WhisperX, and pyannote libraries, with GPU acceleration to speed up processing. It batch-transcribes every marked audio file found in the path folder or any of its subfolders, then aligns the transcript and diarizes the speakers. The resulting CSV and TXT files are written to the same folder as the source audio. The notebook also anonymizes specified names and prepares the files for use in the rest of the NLP pipeline. (Working)

Corpus Comparison Tool: (Working)

NLTK NLP Text Analytics: Peek under the hood of the data corpus (Working)
SpaCy + BERTopic Topic Modeling Pipeline: (In Progress)
Gensim Pipeline: (In Progress)
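
The batch behavior described for WhisperXTranscription4Researchers (recursive discovery of marked audio files, name anonymization, and CSV/TXT output next to each audio file) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the function names, the `AUDIO_EXTENSIONS` set, the filename-based `marker`, and the segment dictionary layout (`start`, `end`, `speaker`, `text`) are all assumptions, and the transcription and diarization step is left out entirely.

```python
import csv
import re
from pathlib import Path

# Assumed set of audio extensions; the notebook may accept others.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}


def find_audio_files(root, marker=""):
    """Return audio files under root (recursively) whose name contains marker.

    How files are "marked" in the notebook is not documented; a filename
    substring is a hypothetical stand-in.
    """
    return [
        path
        for path in sorted(Path(root).rglob("*"))
        if path.is_file()
        and path.suffix.lower() in AUDIO_EXTENSIONS
        and marker in path.name
    ]


def anonymize(text, replacements):
    """Replace each whole-word name (case-insensitive) with its placeholder."""
    for name, placeholder in replacements.items():
        pattern = rf"\b{re.escape(name)}\b"
        # A function replacement avoids backslash-escape surprises in placeholders.
        text = re.sub(pattern, lambda _m, p=placeholder: p, text, flags=re.IGNORECASE)
    return text


def write_outputs(audio_path, segments, replacements):
    """Write <stem>.csv and <stem>.txt into the folder that holds the audio file."""
    audio_path = Path(audio_path)
    csv_path = audio_path.with_suffix(".csv")
    txt_path = audio_path.with_suffix(".txt")
    rows = [
        {
            "start": seg["start"],
            "end": seg["end"],
            "speaker": seg["speaker"],
            "text": anonymize(seg["text"], replacements),
        }
        for seg in segments
    ]
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["start", "end", "speaker", "text"])
        writer.writeheader()
        writer.writerows(rows)
    lines = [f"{row['speaker']}: {row['text']}" for row in rows]
    txt_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return csv_path, txt_path
```

In a real run, `segments` would come from the transcription and diarization step for each file returned by `find_audio_files`; here it is simply any list of dictionaries with the assumed keys.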
Owner
- Name: blerdrage
- Login: mrhallonline
- Kind: user
- Website: kevinhall.info
- Repositories: 4
- Profile: https://github.com/mrhallonline
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: NLP_Pipeline4Researchers
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Kevin
    family-names: Hall
    name-particle:
    email: cyborgresearcher@gmail.com
    orcid: 'https://orcid.org/0009-0009-1615-7524'
repository-code: >-
  https://github.com/mrhallonline/NLP_Pipeline4Researchers
GitHub Events
Total
- Push event: 3
Last Year
- Push event: 3