https://github.com/bagustris/speech-recognition-course
Material for learning speech recognition, based on Microsoft teaching material on EdX
Science Score: 39.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.4%) to scientific vocabulary
Repository
Material for learning speech recognition, based on Microsoft teaching material on EdX
Basic Info
- Host: GitHub
- Owner: bagustris
- Language: Roff
- Default Branch: master
- Homepage: https://bagustris.github.io/speech-recognition-course
- Size: 146 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Speech Recognition Course
Material for learning speech recognition, based on Microsoft teaching material on EdX (changed from CNTK to PyTorch). Learning and teaching materials are provided in each module directory. The materials cover signal processing, acoustic modeling, language modeling, and modern end-to-end approaches.
GitHub Pages: https://bagustris.github.io/speech-recognition-course
Repository: https://github.com/bagustris/speech-recognition-course
Modules
- Module 1: Introduction to Speech Recognition
- Module 2: Speech Signal Processing
- Module 3: Acoustic Modeling
- Module 4: Language Modeling
- Module 5: Decoding
- Module 6: End-to-End Models
To convert a module's readme from Markdown to PDF, run pandoc inside that module's directory:
```bash
pandoc readme.md -o readme.pdf
```
Then, you can inspect the generated PDFs.
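Running the same command by hand in every module can be tedious, so the following is a minimal sketch of how the conversion could be batched. It assumes each module is a direct subdirectory of the repository root and contains a file named exactly `readme.md`, as in the command above. The function names `pandoc_jobs` and `convert_all` are hypothetical helpers, not part of the course material. Note that pandoc needs a PDF engine (a LaTeX installation by default) to write PDF output.

```python
from pathlib import Path
import shutil
import subprocess


def pandoc_jobs(root="."):
    """List (module_dir, argv) pairs, one per subdirectory holding a readme.md.

    Assumption: modules are direct children of `root` and the file is
    named exactly `readme.md`.
    """
    jobs = []
    for readme in sorted(Path(root).glob("*/readme.md")):
        # Same command as in the README, to be run from inside the module.
        jobs.append((readme.parent, ["pandoc", readme.name, "-o", "readme.pdf"]))
    return jobs


def convert_all(root="."):
    """Run pandoc in every module directory found by pandoc_jobs()."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    for module_dir, argv in pandoc_jobs(root):
        # cwd=module_dir keeps relative image paths in the markdown valid.
        subprocess.run(argv, cwd=module_dir, check=True)
```

Calling `convert_all(".")` from the repository root would then write a `readme.pdf` next to each `readme.md`. Calling `pandoc_jobs(".")` on its own only lists what would be run, which is a cheap way to check the directory layout first.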
References:
- https://learning.edx.org/course/course-v1:Microsoft+DEV287x+1T2019a/home
- L. Gillick and S. J. Cox, "Some statistical issues in the comparison of speech recognition algorithms," in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 1989, vol. 1, pp. 532–535, doi: 10.1109/icassp.1989.266481.
- M. Mohri, F. Pereira, and M. Riley, "Speech recognition with weighted finite-state transducers," in Springer Handbook on Speech Processing and Speech Communication.
- D. S. Pallett, W. M. Fisher, and J. G. Fiscus, "Tools for the analysis of benchmark speech recognition tests," in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 1990, vol. 1, pp. 97–100, doi: 10.1109/icassp.1990.115546.
- T. Morioka, T. Iwata, T. Hori, and T. Kobayashi, "Multiscale recurrent neural network based language model," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2015, vol. 2015-Janua, pp. 2366–2370.
- M. Sundermeyer, R. Schlüter, and H. Ney, "LSTM neural networks for language modeling," in 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, 2012, vol. 1, pp. 194–197, Accessed: Aug. 08, 2020. [Online]. Available: http://www.isca-speech.org/archive.
- Y. Bengio, R. Ducharme, and P. Vincent, "A neural probabilistic language model," J. Mach. Learn. Res., vol. 3, pp. 1137–1155, 2003.
- P. F. Brown, P. V. DeSouza, R. L. Mercer, V. J. Della Pietra, and J. C. Lai, "Class-based n-gram models of natural language," Comput. Linguist., vol. 18, no. 4, pp. 467–480, 1992.
- M. Levit, S. Parthasarathy, S. Chang, A. Stolcke, and B. Dumoulin, "Word-phrase-entity language models: Getting more mileage out of n-grams," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2014, pp. 666–670.
- X. Shen, Y. Oualil, C. Greenberg, M. Singh, and D. Klakow, "Estimation of gap between current language models and human performance," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017, vol. 2017-Augus, pp. 553–557, doi: 10.21437/Interspeech.2017-729.
- A. Stolcke, "Entropy-based pruning of backoff language models," pp. 579–588, 2000, doi: 10.3115/1075218.107521.
Owner
- Name: Bagus Tris Atmaja
- Login: bagustris
- Kind: user
- Location: Tsukuba
- Company: AIST
- Website: http://www.bagustris.blogspot.com
- Twitter: btatmaja
- Repositories: 221
- Profile: https://github.com/bagustris
Researcher @aistairc @VibrasticLab
GitHub Events
Total
- Watch event: 1
- Push event: 172
- Pull request event: 1
Last Year
- Watch event: 1
- Push event: 172
- Pull request event: 1
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Bagus Tris Atmaja | b****s@y****m | 182 |
| Adrian Leven | a****n@m****m | 3 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: about 4 hours
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: about 4 hours
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- Copilot (1)
Dependencies
- actions/checkout v4 composite
- actions/configure-pages v5 composite
- actions/deploy-pages v4 composite
- actions/jekyll-build-pages v1 composite
- actions/upload-pages-artifact v3 composite