https://github.com/amanvirparhar/chaplin
A real-time silent speech recognition tool.
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.8%) to scientific vocabulary
Keywords
auto-avsr
avsr
llm
ollama
speech-recognition
speech-to-text
vsr
Last synced: 9 months ago
Repository
Statistics
- Stars: 530
- Watchers: 7
- Forks: 40
- Open Issues: 3
- Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme
License
README.md
Chaplin
A visual speech recognition (VSR) tool that reads your lips in real time and types whatever you silently mouth. Runs fully locally.
Relies on a model trained on the Lip Reading Sentences 3 dataset as part of the Auto-AVSR project.
Watch a demo of Chaplin here.
Setup
- Clone the repository, and `cd` into it:
  ```bash
  git clone https://github.com/amanvirparhar/chaplin
  cd chaplin
  ```
- Download the required model components: LRS3_V_WER19.1 and lm_en_subword.
- Unzip both folders, and place them in their respective directories:
  ```
  chaplin/
  ├── benchmarks/
      ├── LRS3/
          ├── language_models/
              ├── lm_en_subword/
          ├── models/
              ├── LRS3_V_WER19.1/
  ├── ...
  ```
- Install and run `ollama`, and pull the `llama3.2` model.
- Install `uv`.
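Before the first run, it can help to confirm that the unzipped folders ended up where the unzip step expects them. The following is a minimal sketch, not part of Chaplin itself: the helper name `missing_components` is hypothetical, and the nested layout under `benchmarks/LRS3/` is an assumption read from the directory tree in the Setup steps.

```python
from pathlib import Path

# Assumed locations, relative to the repository root, read from the
# directory tree in the Setup steps. Adjust if your layout differs.
EXPECTED_DIRS = [
    Path("benchmarks/LRS3/language_models/lm_en_subword"),
    Path("benchmarks/LRS3/models/LRS3_V_WER19.1"),
]


def missing_components(repo_root: Path) -> list[Path]:
    """Return the expected model directories that do not exist under repo_root."""
    return [d for d in EXPECTED_DIRS if not (repo_root / d).is_dir()]


if __name__ == "__main__":
    # Run from inside the cloned chaplin/ directory.
    missing = missing_components(Path.cwd())
    if missing:
        for d in missing:
            print(f"missing: {d}")
    else:
        print("all model components found")
```

Running the script from the repository root lists any folder that still needs to be downloaded or moved.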
Usage
- Run the following command:
  ```bash
  sudo uv run --with-requirements requirements.txt --python 3.12 main.py config_filename=./configs/LRS3_V_WER19.1.ini detector=mediapipe
  ```
- Once the camera feed is displayed, you can start "recording" by pressing the `option` key (Mac) or the `alt` key (Windows/Linux), and start mouthing words.
- To stop recording, press the `option` key (Mac) or the `alt` key (Windows/Linux) again. You should see some text being typed out wherever your cursor is.
- To exit gracefully, focus on the window displaying the camera feed and press `q`.
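The press-to-start, press-again-to-stop behaviour described above amounts to a two-state toggle. The sketch below illustrates that idea only: `RecordingToggle` and its methods are hypothetical names, frames are stand-in strings, and Chaplin's actual hotkey handling and inference code are not shown in this README.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingToggle:
    """Tracks a press-to-start, press-again-to-stop recording state."""

    recording: bool = False
    frames: list = field(default_factory=list)

    def on_hotkey(self):
        """Flip the state; return the captured frames when recording stops."""
        if not self.recording:
            self.recording = True
            self.frames = []
            return None
        self.recording = False
        captured, self.frames = self.frames, []
        return captured

    def on_frame(self, frame) -> None:
        """Keep a frame only while recording is active."""
        if self.recording:
            self.frames.append(frame)


toggle = RecordingToggle()
toggle.on_frame("f0")       # ignored: not recording yet
toggle.on_hotkey()          # first press starts recording
toggle.on_frame("f1")
toggle.on_frame("f2")
clip = toggle.on_hotkey()   # second press stops and returns the clip
print(clip)                 # ['f1', 'f2']
```

In a real tool, the clip returned on the second press would be the input handed to the lip-reading model.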
Owner
- Name: Amanvir Parhar
- Login: amanvirparhar
- Kind: user
- Location: California
- Website: amanvir.com
- Twitter: amanvirparhar
- Repositories: 10
- Profile: https://github.com/amanvirparhar
builder, cs @ umd
GitHub Events
Total
- Issues event: 2
- Watch event: 458
- Issue comment event: 3
- Push event: 3
- Pull request event: 1
- Fork event: 40
- Create event: 2
Last Year
- Issues event: 2
- Watch event: 458
- Issue comment event: 3
- Push event: 3
- Pull request event: 1
- Fork event: 40
- Create event: 2
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Amanvir Parhar | a****r@g****m | 4 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 2
- Total pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 2
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- steve-the-crab (1)
- willwade (1)
Pull Request Authors
- synexo (2)