https://github.com/google-deepmind/tell_me_a_story

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: google-deepmind
  • License: apache-2.0
  • Default Branch: main
  • Size: 10.7 KB
Statistics
  • Stars: 25
  • Watchers: 6
  • Forks: 4
  • Open Issues: 4
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License

README.md

Tell Me A Story

This repository includes the Tell Me A Story dataset used in our paper: Agents' Room: Narrative Generation through Multi-step Collaboration.

Abstract

Writing compelling fiction is a multifaceted process combining elements such as crafting a plot, developing interesting characters, and using evocative language. While large language models (LLMs) show promise for story writing, they currently rely heavily on intricate prompting, which limits their use. We propose Agents' Room, a generation framework inspired by narrative theory, that decomposes narrative writing into subtasks tackled by specialized agents. To illustrate our method, we introduce Tell Me A Story, a high-quality dataset of complex writing prompts and human-written stories, and a novel evaluation framework designed specifically for assessing long narratives. We show that Agents' Room generates stories that are preferred by expert evaluators over those produced by baseline systems by leveraging collaboration and specialization to decompose the complex story writing task into tractable components. We provide extensive analysis with automated and human-based metrics of the generated output.

Dataset Description

The Tell Me A Story dataset is available in JSONL format, one file per split. The files can be downloaded directly using:

wget https://storage.googleapis.com/tell-me-a-story/tell-me-a-story-train_encrypted.jsonl
wget https://storage.googleapis.com/tell-me-a-story/tell-me-a-story-validation_encrypted.jsonl
wget https://storage.googleapis.com/tell-me-a-story/tell-me-a-story-test_encrypted.jsonl

Once downloaded, the dataset files take up approximately 3 MB.
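Where wget is not available, the same three files can be fetched from Python. The following is a minimal sketch using only the standard library; the names `BASE_URL`, `SPLITS`, `encrypted_url`, and `download_all` are illustrative and not part of this repository, and the URL pattern is inferred from the wget commands above.

```python
import urllib.request
from pathlib import Path

# Assumption: all splits follow the URL pattern of the wget commands above.
BASE_URL = "https://storage.googleapis.com/tell-me-a-story"
SPLITS = ("train", "validation", "test")


def encrypted_url(split):
    """Build the URL of one encrypted split file."""
    return f"{BASE_URL}/tell-me-a-story-{split}_encrypted.jsonl"


def download_all(dest_dir="."):
    """Download every encrypted split into dest_dir and return the local paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for split in SPLITS:
        url = encrypted_url(split)
        # Keep the remote file name so the decryption step can find it.
        target = dest / url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, target)
        paths.append(target)
    return paths
```

Calling `download_all("data")` would place the three `_encrypted.jsonl` files in a `data` directory; they still need to be decrypted before use.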

Dataset decryption

The files have been encrypted to prevent the dataset from being scraped by automated scraping tools.

This repository contains both the symmetric key skey.key and the private key private_key.pem required to decrypt the files. The symmetric key is itself encrypted and must first be decrypted using the private key.

The files can be decrypted using the Python package cryptography. If you do not have it, you can install it using the following command:

pip install cryptography

Then use the following script to decrypt the files:

```python
import os
import sys

from cryptography.fernet import Fernet
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

if len(sys.argv) > 3:
    filename = sys.argv[1]   # Name of the file to decrypt.
    skey_file = sys.argv[2]  # File containing the (encrypted) symmetric key.
    pkey_file = sys.argv[3]  # File containing the private key.

    # Load the private key.
    with open(pkey_file, 'rb') as f:
        private_key = serialization.load_pem_private_key(
            f.read(),
            password=None,
            backend=default_backend()
        )

    # Load the encrypted symmetric key.
    with open(skey_file, 'rb') as f:
        skey = f.read()

    # Load the file to decrypt.
    with open(filename, 'rb') as f:
        data = f.read()

    # Decrypt the symmetric key with the private key (RSA-OAEP, SHA-256).
    unenc_skey = private_key.decrypt(
        skey,
        padding.OAEP(
            mgf=padding.MGF1(algorithm=hashes.SHA256()),
            algorithm=hashes.SHA256(),
            label=None
        )
    )

    # Decrypt the data with the recovered symmetric key.
    fernet = Fernet(unenc_skey)
    decrypted = fernet.decrypt(data)

    # Write the decrypted data next to the input, without the '_encrypted' suffix.
    out_file = filename.replace('_encrypted.jsonl', '.jsonl')
    with open(out_file, 'wb') as f:
        f.write(decrypted)

else:
    print('Usage: ' + os.path.basename(__file__) +
          ' filename.jsonl skey.key private_key.pem')
```

Dataset columns

There are three data splits: train, validation, and test. The dataset contains the following columns:

  • example_id (str): A unique identifier for each input prompt.
  • inputs (str): The input writing prompt.
  • targets (str): The target fiction story corresponding to the writing prompt.
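Once a split has been decrypted, it can be read line by line with the standard library. This is a minimal sketch, assuming each line of the file holds one JSON object with the three columns listed above; the helper names `load_split` and `summarize` are illustrative and not part of this repository.

```python
import json
from pathlib import Path


def load_split(path):
    """Read one decrypted split into a list of dicts (one JSON object per line)."""
    examples = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # Tolerate blank lines.
                continue
            examples.append(json.loads(line))
    return examples


def summarize(examples):
    """Return (number of examples, mean story length in whitespace-separated words)."""
    if not examples:
        return 0, 0.0
    total_words = sum(len(ex["targets"].split()) for ex in examples)
    return len(examples), total_words / len(examples)
```

For example, `load_split("tell-me-a-story-train.jsonl")` would return the training examples, where the file name is the one the decryption script writes for the train split; each element exposes `example_id`, `inputs`, and `targets` as dictionary keys.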

Citing this work

If you use any of the material here, please cite the following paper:

@article{huot2024agents,
  title={Agents' Room: Narrative Generation through Multi-step Collaboration},
  author={Huot, Fantine and Amplayo, Reinald Kim and Palomaki, Jennimaria and Jakobovits, Alice Shoshana and Clark, Elizabeth and Lapata, Mirella},
  journal={arXiv preprint arXiv:2410.02603},
  year={2024}
}

License and disclaimer

Copyright 2024 DeepMind Technologies Limited

This dataset is licensed under the Creative Commons Attribution 4.0 International License (CC-BY). You may obtain a copy of the CC-BY license at: https://creativecommons.org/licenses/by/4.0/legalcode

Unless required by applicable law or agreed to in writing, all materials distributed here under the CC-BY license are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses.

This is not an official Google product.

Owner

  • Name: Google DeepMind
  • Login: google-deepmind
  • Kind: organization

GitHub Events

Total
  • Issues event: 4
  • Watch event: 26
  • Issue comment event: 1
  • Member event: 1
  • Push event: 2
  • Public event: 1
  • Fork event: 3
  • Create event: 1
Last Year
  • Issues event: 4
  • Watch event: 26
  • Issue comment event: 1
  • Member event: 1
  • Push event: 2
  • Public event: 1
  • Fork event: 3
  • Create event: 1

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 3
  • Total Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 3
  • Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Fantine Huot f****t@g****m 3

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 4
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 4
  • Total pull request authors: 0
  • Average comments per issue: 0.25
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 4
  • Pull request authors: 0
  • Average comments per issue: 0.25
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • sgtziggy (1)
  • chtmp223 (1)
  • ChrisRBXiong (1)
  • bitkira (1)