https://github.com/bagustris/speech-recognition-course

Material for learning speech recognition, based on Microsoft teaching material on EdX

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.4%) to scientific vocabulary

Keywords

speech-processing speech-recognition speech-to-text
Last synced: 5 months ago

Repository

Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 1
  • Open Issues: 1
  • Releases: 0
Topics
speech-processing speech-recognition speech-to-text
Created about 7 years ago · Last pushed 6 months ago
Metadata Files
Readme

README.md

Speech Recognition Course

Material for learning speech recognition, based on Microsoft teaching material on EdX (changed from CNTK to PyTorch). Learning and teaching materials are provided in each module directory. The materials cover signal processing, acoustic modeling, language modeling, and modern end-to-end approaches.

Github Pages: https://bagustris.github.io/speech-recognition-course
Repository: https://github.com/bagustris/speech-recognition-course

Modules

To convert a module's readme from Markdown to PDF, run pandoc inside that module's directory:

    pandoc readme.md -o readme.pdf

Then, you can inspect the generated PDFs.
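The per-module conversion can also be run in one pass. The following is a minimal sketch, not part of the repository: the function name `convert_modules` and the `PANDOC` override variable are hypothetical, and it assumes each module is an immediate subdirectory of the repository root containing a file named `readme.md`.

```bash
#!/usr/bin/env bash
# Sketch: convert readme.md to readme.pdf in every module directory.
# convert_modules and PANDOC are illustrative names, not part of the course repo.
convert_modules() {
    local root="${1:-.}"                  # repository root, defaults to the current directory
    local pandoc_cmd="${PANDOC:-pandoc}"  # allows substituting another binary or a stub
    local dir
    for dir in "$root"/*/; do
        # Skip directories without a readme.md.
        [ -f "${dir}readme.md" ] || continue
        # Run inside the module so that relative image paths resolve.
        ( cd "$dir" && "$pandoc_cmd" readme.md -o readme.pdf ) \
            || echo "conversion failed in $dir" >&2
    done
}
```

Calling `convert_modules .` from the repository root would write a `readme.pdf` next to each `readme.md`. Note that pandoc needs a PDF engine (by default a LaTeX installation) to produce PDF output.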

References:

  1. https://learning.edx.org/course/course-v1:Microsoft+DEV287x+1T2019a/home
  2. L. Gillick and S. J. Cox, "Some statistical issues in the comparison of speech recognition algorithms," in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 1989, vol. 1, pp. 532-535, doi: 10.1109/icassp.1989.266481.
  3. M. Mohri, F. Pereira, and M. Riley, "Speech Recognition with Weighted Finite-State Transducers," in Springer Handbook on Speech Processing and Speech Communication.
  4. D. S. Pallet, W. M. Fisher, and J. G. Fiscus, "Tools for the analysis of benchmark speech recognition tests," ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., vol. 1, pp. 97-100, 1990, doi: 10.1109/icassp.1990.115546.
  5. T. Morioka, T. Iwata, T. Hori, and T. Kobayashi, "Multiscale recurrent neural network based language model," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2015, vol. 2015-Janua, pp. 2366-2370.
  6. M. Sundermeyer, R. Schlüter, and H. Ney, "LSTM neural networks for language modeling," in 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, 2012, vol. 1, pp. 194-197. Accessed: Aug. 08, 2020. [Online]. Available: http://www.isca-speech.org/archive.
  7. Y. Bengio, R. Ducharme, and P. Vincent, "A neural probabilistic language model," J. Mach. Learn. Res., vol. 3, pp. 1137-1155, 2003.
  8. P. F. Brown, P. V. DeSouza, R. L. Mercer, V. J. Della Pietra, and J. C. Lai, "Class-Based n-gram Models of Natural Language," Comput. Linguist., vol. 18, no. 4, pp. 467-480, 1992.
  9. M. Levit, S. Parthasarathy, S. Chang, A. Stolcke, and B. Dumoulin, "Word-phrase-entity language models: Getting more mileage out of N-grams," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2014, pp. 666-670.
  10. X. Shen, Y. Oualil, C. Greenberg, M. Singh, and D. Klakow, "Estimation of gap between current language models and human performance," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017, vol. 2017-Augus, pp. 553-557, doi: 10.21437/Interspeech.2017-729.
  11. A. Stolcke, "Entropy-based Pruning of Backoff Language Models," pp. 579-588, 2000, doi: 10.3115/1075218.107521.

Owner

  • Name: Bagus Tris Atmaja
  • Login: bagustris
  • Kind: user
  • Location: Tsukuba
  • Company: AIST

Researcher @aistairc @VibrasticLab

GitHub Events

Total
  • Watch event: 1
  • Push event: 172
  • Pull request event: 1
Last Year
  • Watch event: 1
  • Push event: 172
  • Pull request event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 185
  • Total Committers: 2
  • Avg Commits per committer: 92.5
  • Development Distribution Score (DDS): 0.016
Past Year
  • Commits: 181
  • Committers: 1
  • Avg Commits per committer: 181.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Bagus Tris Atmaja b****s@y****m 182
Adrian Leven a****n@m****m 3

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: about 4 hours
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: about 4 hours
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • Copilot (1)

Dependencies

.github/workflows/jekyll-gh-pages.yml actions
  • actions/checkout v4 composite
  • actions/configure-pages v5 composite
  • actions/deploy-pages v4 composite
  • actions/jekyll-build-pages v1 composite
  • actions/upload-pages-artifact v3 composite