https://github.com/amazon-science/idioms-incontext-mt
idioms in context dataset
Statistics
- Stars: 5
- Watchers: 10
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Idioms in Context Dataset
This repository contains the "Idioms in Context" dataset used in our ACL 2024 paper: The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities.
Description
The dataset consists of idiomatic expressions in context and their human-written translations. It covers 2 language pairs (English-German and English-Russian) with 3 translation directions:
1. English → German
2. German → English
3. Russian → English
The dataset is designed to evaluate the performance of large language models and machine translation systems in handling idiomatic expressions, which can be challenging due to their non-literal meanings.
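As a rough illustration of how the three translation directions might be handled in an evaluation script, the sketch below groups examples by direction before scoring. The record layout (`IdiomExample` and its field names), the two-letter language codes, and the helper `group_by_direction` are all assumptions made for this example; this README does not describe the actual file format, so adapt the fields to whatever the released files contain.

```python
from dataclasses import dataclass

# The three translation directions listed above, as (source, target) pairs.
# The two-letter codes are an assumption; the dataset may label them differently.
DIRECTIONS = [("en", "de"), ("de", "en"), ("ru", "en")]


@dataclass
class IdiomExample:
    """Hypothetical record layout; field names are not taken from the dataset."""
    source_lang: str
    target_lang: str
    source_text: str  # sentence containing an idiomatic expression
    reference: str    # human-written translation of that sentence


def group_by_direction(examples):
    """Bucket examples by (source_lang, target_lang); reject unlisted directions."""
    groups = {direction: [] for direction in DIRECTIONS}
    for ex in examples:
        key = (ex.source_lang, ex.target_lang)
        if key not in groups:
            raise ValueError(f"unsupported translation direction: {key}")
        groups[key].append(ex)
    return groups


if __name__ == "__main__":
    # Placeholder sentence pair invented for illustration, not from the dataset.
    sample = [
        IdiomExample("en", "de", "It is raining cats and dogs.", "Es regnet in Strömen."),
    ]
    for direction, items in group_by_direction(sample).items():
        print(direction, len(items))
```

Grouping first keeps per-direction scores separate, which matters here because English → Russian is not among the listed directions and should not be silently mixed in.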
Usage
If you use this dataset in your work, please cite our paper:
@misc{stap2024-idioms,
title={The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities},
author={David Stap and Eva Hasler and Bill Byrne and Christof Monz and Ke Tran},
year={2024},
eprint={2405.20089},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2405.20089},
}
Security
See CONTRIBUTING for more information.
License
This dataset is licensed under the CC-BY-NC-4.0 License.
Owner
- Name: Amazon Science
- Login: amazon-science
- Kind: organization
- Website: https://amazon.science
- Twitter: AmazonScience
- Repositories: 80
- Profile: https://github.com/amazon-science
GitHub Events
Total
- Watch event: 5
Last Year
- Watch event: 5