https://github.com/cyberagentailab/camera3
CAMERA3: An Evaluation Dataset for Controllable Ad Text Generation in Japanese
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (5.8%) to scientific vocabulary
Repository
CAMERA3: An Evaluation Dataset for Controllable Ad Text Generation in Japanese
Basic Info
- Host: GitHub
- Owner: CyberAgentAILab
- Default Branch: main
- Size: 5.04 MB
Statistics
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
CAMERA3 dataset
CAMERA3 is an evaluation dataset for controllable text generation in the Japanese advertising domain. It includes 3,980 ad texts written by expert annotators, each taking into account various aspects of the ad appeals found on the landing page (LP).
- The annotated data is available in the `data/` directory of this repository, in `json` and `jsonl` formats.
- The LP images are available here (2.3GB).
Dataset
- json: `data/camera3-v1.json`
- jsonl: `data/camera3-v1.jsonl`
- LP images: `camera3-v1-lp-screenshot-sliced/screen-1200-{lpimage_sliced}.png`
| Name | Description |
| --- | ---- |
| instance_id | unique id |
| lp_image_sliced | id associated with LP image |
| annotator_id | annotator id |
| kw | search keyword |
| lp_meta_description | meta description extracted from LP |
| lp_image_sliced_ocr_text | OCR results for LP image |
| ad_appeal_type | ad appeal type |
| ad_text | ad text |
- Example `json` entry:

```json
{
  "instance_id": 0,
  "lp_image_sliced": "screen-1200-100303_00.png",
  "annotator_id": 5,
  "kw": "マイカー 共済",
  "lp_meta_description": "2022年最新の自動車保険のランキングを発表!...",
  "lp_image_sliced_ocr_text": "コのほけん!\n保険比較のコのほけん!>...",
  "ad_appeal_type": "価格",
  "ad_text": "ネットからの契約だと割引あり"
}
```
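As a minimal sketch of how the annotated data might be read, the snippet below loads the `jsonl` file and counts ad texts per appeal type. It assumes that `data/camera3-v1.jsonl` holds one JSON object per line with the same fields as the example entry; the helper names `load_jsonl` and `count_by_appeal` are hypothetical and not part of this repository.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line (assumed layout)."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def count_by_appeal(records):
    """Count records per value of the ad_appeal_type field."""
    return Counter(r["ad_appeal_type"] for r in records)


if __name__ == "__main__":
    # Path taken from the Dataset section; run from the repository root.
    path = Path("data/camera3-v1.jsonl")
    if path.exists():
        records = load_jsonl(path)
        print(len(records), "records loaded")
        for appeal, n in count_by_appeal(records).most_common(5):
            print(appeal, n)
```

Opening the file with `encoding="utf-8"` keeps the Japanese text intact regardless of the platform's default encoding.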
Citation
@inproceedings{inoue-etal-2024-camera3,
title = "CAMERA³: An Evaluation Dataset for Controllable Ad Text Generation in Japanese",
author = "Inoue, Go and
Kato, Akihiko and
Mita, Masato and
Honda, Ukyo and
Zhang, Peinan",
booktitle = "Proceedings of the Fourteenth Language Resources and Evaluation Conference",
month = may,
year = "2024",
address = "Turin, Italy",
publisher = "European Language Resources Association"
}
License
The dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Owner
- Name: CyberAgent AI Lab
- Login: CyberAgentAILab
- Kind: organization
- Location: Japan
- Website: https://cyberagent.ai/ailab/
- Twitter: cyberagent_ai
- Repositories: 7
- Profile: https://github.com/CyberAgentAILab
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1