https://github.com/ccoreilly/generate-n-gram-lm
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (2.8%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: ccoreilly
- License: apache-2.0
- Language: Python
- Default Branch: master
- Size: 8.79 KB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
- Releases: 0
Created about 5 years ago · Last pushed almost 5 years ago
Metadata Files
Readme
License
README.md
Generate n-gram LM
Simple tooling for generating n-gram language models with KenLM.
A 4-gram LM for the Catalan language can be found here.
Building the Docker Image
```sh
docker build . -f Dockerfile -t kenlm
```
Building a language model
```sh
docker run -it --rm -v `pwd`:/io -w /io kenlm python generate_lm.py --input_txt catalan_textual_corpus.txt \
  --output_dir . --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" --binary_a_bits 255 --binary_q_bits 8 \
  --binary_type trie
```
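The README does not show what `generate_lm.py` does internally. As a rough sketch, assuming it wraps KenLM's `lmplz` and `build_binary` tools the way such wrapper scripts commonly do, the flags above would translate into two commands roughly as follows. The helper names and the `lm.arpa` / `lm.binary` file names are hypothetical and chosen only for illustration.

```python
import shlex


def lmplz_command(text_path, arpa_path, order=5, memory="85%", prune="0|0|1"):
    """Build the argument list for KenLM's `lmplz` (ARPA estimation).

    Assumption: --arpa_order, --max_arpa_memory and --arpa_prune are
    forwarded to lmplz's --order, --memory and --prune options.
    """
    return [
        "lmplz",
        "--order", str(order),
        "--memory", memory,
        "--text", text_path,
        "--arpa", arpa_path,
        # lmplz expects one pruning threshold per n-gram order,
        # given as separate arguments rather than a "|"-joined string.
        "--prune", *prune.split("|"),
    ]


def build_binary_command(arpa_path, binary_path, a_bits=255, q_bits=8,
                         binary_type="trie"):
    """Build the argument list for KenLM's `build_binary`.

    Assumption: --binary_a_bits, --binary_q_bits and --binary_type are
    forwarded to build_binary's -a, -q and data-structure type arguments.
    """
    return [
        "build_binary",
        "-a", str(a_bits),
        "-q", str(q_bits),
        binary_type,
        arpa_path,
        binary_path,
    ]


if __name__ == "__main__":
    # Only print the commands; running them requires the KenLM binaries.
    print(shlex.join(lmplz_command("catalan_textual_corpus.txt", "lm.arpa")))
    print(shlex.join(build_binary_command("lm.arpa", "lm.binary")))
```

The sketch only assembles the argument lists and never invokes KenLM, so it can be inspected outside the Docker image. To actually run the commands, pass each list to `subprocess.run(..., check=True)` inside the `kenlm` container.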
Owner
- Name: Ciaran O'Reilly
- Login: ccoreilly
- Kind: user
- Location: Berlin
- Company: @parloa
- Website: https://oreilly.cat
- Repositories: 51
- Profile: https://github.com/ccoreilly
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Ciaran O'Reilly | c****n@o****t | 3 |
Committer Domains (Top 20 + Academic)
oreilly.cat: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0