https://github.com/cyberagentailab/camera3

CAMERA3: An Evaluation Dataset for Controllable Ad Text Generation in Japanese

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.8%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CyberAgentAILab
  • Default Branch: main
  • Size: 5.04 MB
Statistics
  • Stars: 2
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme

README.md

CAMERA3 dataset

CAMERA3 is a Japanese evaluation dataset for controllable text generation in the advertising domain. It includes 3,980 ad texts written by expert annotators, taking into account various aspects of the ad appeals on the landing page (LP).

  • The annotated data is available in the data/ directory of this repository, in JSON and JSONL formats.
  • The LP images are available here (2.3GB).

Dataset

| Name | Description |
| --- | --- |
| instance_id | unique ID |
| lp_image_sliced | ID associated with the LP image |
| annotator_id | annotator ID |
| kw | search keyword |
| lp_meta_description | meta description extracted from the LP |
| lp_image_sliced_ocr_text | OCR results for the LP image |
| ad_appeal_type | ad appeal type |
| ad_text | ad text |

  • Example JSON entry:

```json
{
  "instance_id": 0,
  "lp_image_sliced": "screen-1200-100303_00.png",
  "annotator_id": 5,
  "kw": "マイカー 共済",
  "lp_meta_description": "2022年最新の自動車保険のランキングを発表!...",
  "lp_image_sliced_ocr_text": "コのほけん!\n保険比較のコのほけん!>...",
  "ad_appeal_type": "価格",
  "ad_text": "ネットからの契約だと割引あり"
}
```
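As a rough illustration of how the fields above might be consumed, the following Python sketch parses JSONL records and counts ad texts per appeal type. It assumes each JSONL line is one object with the eight fields listed in the table; the helper names `read_jsonl` and `count_appeal_types` are made up for this example, and the file path in the trailing comment is hypothetical, since the actual file names under data/ are not listed here. The sample record reuses the (truncated) example entry from this README.

```python
import json
from collections import Counter
from typing import Iterable, Iterator

# Field names as they appear in the example entry of this README.
EXPECTED_FIELDS = {
    "instance_id", "lp_image_sliced", "annotator_id", "kw",
    "lp_meta_description", "lp_image_sliced_ocr_text",
    "ad_appeal_type", "ad_text",
}


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per non-empty line, checking for the expected fields."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = EXPECTED_FIELDS - record.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing fields {sorted(missing)}")
        yield record


def count_appeal_types(records: Iterable[dict]) -> Counter:
    """Count how many ad texts were written for each ad appeal type."""
    return Counter(record["ad_appeal_type"] for record in records)


# In-memory sample built from the README's example entry (values truncated as in the README).
SAMPLE_LINE = json.dumps({
    "instance_id": 0,
    "lp_image_sliced": "screen-1200-100303_00.png",
    "annotator_id": 5,
    "kw": "マイカー 共済",
    "lp_meta_description": "2022年最新の自動車保険のランキングを発表!...",
    "lp_image_sliced_ocr_text": "コのほけん!\n保険比較のコのほけん!>...",
    "ad_appeal_type": "価格",
    "ad_text": "ネットからの契約だと割引あり",
}, ensure_ascii=False)

if __name__ == "__main__":
    records = list(read_jsonl([SAMPLE_LINE]))
    print(count_appeal_types(records))
    # With a real file (hypothetical path, adjust to the actual name in data/):
    # with open("data/camera3.jsonl", encoding="utf-8") as f:
    #     print(count_appeal_types(read_jsonl(f)))
```

Because `read_jsonl` accepts any iterable of strings, an open file handle can be passed in directly, which keeps memory use flat even though the dataset itself is small.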

Citation

@inproceedings{inoue-etal-2024-camera3,
    title = "CAMERA³: An Evaluation Dataset for Controllable Ad Text Generation in Japanese",
    author = "Inoue, Go and Kato, Akihiko and Mita, Masato and Honda, Ukyo and Zhang, Peinan",
    booktitle = "Proceedings of the Fourteenth Language Resources and Evaluation Conference",
    month = may,
    year = "2024",
    address = "Turin, Italy",
    publisher = "European Language Resources Association"
}

License

The dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Owner

  • Name: CyberAgent AI Lab
  • Login: CyberAgentAILab
  • Kind: organization
  • Location: Japan

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1