Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.0%) to scientific vocabulary
Keywords
hindi
hindi-english
transliteration
Last synced: 10 months ago
Repository
transliterate hindi to english
Basic Info
- Host: GitHub
- Owner: in-rolls
- License: mit
- Language: Jupyter Notebook
- Default Branch: master
- Homepage: https://indicate.readthedocs.io/en/latest/
- Size: 152 MB
Statistics
- Stars: 14
- Watchers: 5
- Forks: 2
- Open Issues: 0
- Releases: 0
Topics
hindi
hindi-english
transliteration
Created almost 6 years ago · Last pushed 10 months ago
Metadata Files
Readme
License
Citation
README.rst
==================================================
Indicate: Transliterate Indic Languages to English
==================================================
.. image:: https://notarypy.soodoku.workers.dev/badge/indicate/0.2.1/indicate-0.2.1-py3-none-any.whl
    :target: https://pypi.org/integrity/indicate/0.2.1/indicate-0.2.1-py3-none-any.whl/provenance
.. image:: https://img.shields.io/pypi/v/indicate.svg
    :target: https://pypi.python.org/pypi/indicate
.. image:: https://readthedocs.org/projects/indicate/badge/?version=latest
    :target: https://indicate.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status
.. image:: https://static.pepy.tech/badge/indicate
    :target: https://pepy.tech/project/indicate
Transliteration to and from Indian languages is still generally of low quality. One problem is access to data. Another is that there is no standard transliteration.
For Hindi--English, we built a novel dataset of names using ESPNcricinfo. For instance, see `here `__ for the Hindi version of the `English scorecard `__.
We also created a dataset from `election affidavits `__, and we use the `Google Dakshina dataset `__.
Because there is no single standard way to transliterate, we provide k-best transliterations.
Install
-------
We strongly recommend installing ``indicate`` inside a Python virtual environment
(see `venv documentation `__).

::

    pip install indicate
General API
-----------
1. ``transliterate.hindi2english`` takes Hindi text and transliterates it into English.
Examples
--------
::

    from indicate import transliterate

    # Transliterate a Hindi string into English (Latin) script
    english_transliterated = transliterate.hindi2english("हिंदी")
    print(english_transliterated)

Output::

    hindi
Functions
----------
We expose one function, which takes Hindi text and transliterates it into English.

- **transliterate.hindi2english(input)**

  - What it does:

    - Converts the given Hindi text into the English (Latin) alphabet.

  - Output:

    - Returns the text in English.
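
The following is a minimal sketch of applying the function to several strings at once. ``transliterate_many`` is a hypothetical helper and is not part of ``indicate``. It takes the transliteration function as an argument, so it assumes only what is stated above: that ``transliterate.hindi2english`` accepts a Hindi string and returns English text.

```python
from typing import Callable, Dict, Iterable


def transliterate_many(
    texts: Iterable[str],
    transliterate_fn: Callable[[str], str],
) -> Dict[str, str]:
    """Map each distinct, non-blank input string to its transliteration.

    Hypothetical helper (not part of indicate). Surrounding whitespace is
    stripped, blank strings are skipped, and duplicates are transliterated
    only once.
    """
    results: Dict[str, str] = {}
    for text in texts:
        cleaned = text.strip()
        if not cleaned or cleaned in results:
            continue
        results[cleaned] = transliterate_fn(cleaned)
    return results


# Usage with the package (requires `pip install indicate`):
#
#     from indicate import transliterate
#     names = ["हिंदी", "  हिंदी ", ""]
#     print(transliterate_many(names, transliterate.hindi2english))
```

Passing the function in, rather than importing it inside the helper, keeps the helper usable with a stub in tests, where loading the model would be slow.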
Data
----
The datasets used to train the model:
- `Indian Election affidavits `__
- `Google Dakshina dataset `__
- `ESPNcricinfo `__ for the Hindi version of the `English scorecard `__.
- `IIT Bombay English-Hindi Corpus `__
Evaluation
----------
The model was evaluated on the test split of the Google Dakshina dataset, where 73.64% of its predictions were exact matches.
By comparison, `Indic-trans `__ produced 63.12% exact matches on the Google Dakshina dataset.
The edit-distance metrics on the test dataset are shown below (0.0 means an exact match; the farther a value is from 0.0,
the more the predicted text differs from the actual text).
.. image:: https://github.com/in-rolls/indicate/raw/master/images/h2e_ed.png
    :width: 400
    :alt: Edit distance metrics of model on Google Dakshina test dataset
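
The two metrics mentioned above can be sketched as follows. This is an illustrative implementation under assumptions, not the authors' evaluation script: it assumes plain (unnormalized) Levenshtein distance and a case-sensitive exact match, and the function names ``edit_distance`` and ``exact_match_rate`` are hypothetical.

```python
from typing import Sequence


def edit_distance(predicted: str, actual: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning `predicted` into `actual`."""
    # previous[j] holds the distance between the processed prefix of
    # `predicted` and the first j characters of `actual`.
    previous = list(range(len(actual) + 1))
    for i, p_char in enumerate(predicted, start=1):
        current = [i]
        for j, a_char in enumerate(actual, start=1):
            cost = 0 if p_char == a_char else 1
            current.append(min(
                previous[j] + 1,         # delete p_char
                current[j - 1] + 1,      # insert a_char
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def exact_match_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Percentage of predictions that equal their reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    matches = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * matches / len(references)


# Made-up strings for illustration only; these are not model outputs.
predicted = ["hindi", "dilli"]
actual = ["hindi", "delhi"]
print(exact_match_rate(predicted, actual))
print([edit_distance(p, a) for p, a in zip(predicted, actual)])
```

An edit distance of 0 corresponds to an exact match, which is consistent with the reading of the plot given above.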
Authors
-------
Rajashekar Chintalapati and Gaurav Sood
Contributor Code of Conduct
---------------------------------
The project welcomes contributions from everyone! In fact, it depends on
it. To maintain this welcoming atmosphere, and to collaborate in a fun
and productive way, we expect contributors to the project to abide by
the `Contributor Code of
Conduct `__.
License
----------
The package is released under the `MIT
License `__.
Owner
- Name: Data, Analysis, and Tools for India
- Login: in-rolls
- Kind: organization
- Website: https://in-rolls.github.io
- Repositories: 33
- Profile: https://github.com/in-rolls
Citation (Citation.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Chintalapati"
    given-names: "Rajashekar"
  - family-names: "Sood"
    given-names: "Gaurav"
title: "Indicate: Transliterate Indic Languages to English"
version: 0.1.0
date-released: 2022-08-21
url: "https://github.com/in-rolls/indicate"
GitHub Events
Total
- Watch event: 1
- Delete event: 1
- Push event: 10
- Pull request event: 1
- Create event: 3
Last Year
- Watch event: 1
- Delete event: 1
- Push event: 10
- Pull request event: 1
- Create event: 3
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 2
- Total pull requests: 4
- Average time to close issues: 11 days
- Average time to close pull requests: 20 days
- Total issue authors: 2
- Total pull request authors: 2
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 2 minutes
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mahik2604 (1)
- LSYS (1)
Pull Request Authors
- rajashekar (2)
- soodoku (2)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 179 last month
- Total dependent packages: 0
- Total dependent repositories: 5
- Total versions: 14
- Total maintainers: 2
pypi.org: indicate
Transliterations to/from Indian languages
- Homepage: https://github.com/in-rolls/indicate
- Documentation: https://indicate.readthedocs.io/
- License: MIT
- Latest release: 0.3.0 (published 10 months ago)
Rankings
Dependent repos count: 6.7%
Dependent packages count: 10.0%
Average: 13.9%
Stargazers count: 16.0%
Downloads: 17.6%
Forks count: 19.1%
Maintainers (2)
Last synced: 10 months ago
Dependencies
.github/workflows/python-publish.yml
actions
- actions/checkout v2 composite
- actions/setup-python v2 composite
- pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
.github/workflows/test.yml
actions
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/setup-python v1 composite
requirements_rtd.txt
pypi
- func-timeout *
- tensorflow ==2.18.0
- tqdm *
- wheel >=0.38.0
pyproject.toml
pypi
- func-timeout *
- importlib-resources >=5.0.0;python_version<'3.9'
- tensorflow >=2.18.0,<3.0.0
- tqdm *