Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.0%) to scientific vocabulary
Keywords
Repository
The Multitask Long Document Benchmark
Basic Info
Statistics
- Stars: 41
- Watchers: 1
- Forks: 1
- Open Issues: 2
- Releases: 0
Topics
Metadata Files
README.md
MuLD: The Multitask Long Document Benchmark
MuLD (Multitask Long Document Benchmark) is a set of 6 NLP tasks whose inputs consist of at least 10,000 words. The benchmark covers a wide variety of task types, including translation, summarization, question answering, and classification. Additionally, output lengths range from a single-word classification label all the way up to an output longer than the input text.

This repo contains the official code for the paper "MuLD: The Multitask Long Document Benchmark".
Quickstart
The easiest method is to use the Hugging Face Datasets library:
```python
import datasets

ds = datasets.load_dataset("ghomasHudson/muld", "NarrativeQA")
ds = datasets.load_dataset("ghomasHudson/muld", "HotpotQA")
ds = datasets.load_dataset("ghomasHudson/muld", "Character Archetype Classification")
ds = datasets.load_dataset("ghomasHudson/muld", "OpenSubtitles")
ds = datasets.load_dataset("ghomasHudson/muld", "AO3 Style Change Detection")
ds = datasets.load_dataset("ghomasHudson/muld", "VLSP")
```
Or by cloning this repo:
```python
import datasets

ds = datasets.load_dataset("./muld.py", "NarrativeQA")
...
```
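Either way, each call returns a `DatasetDict` keyed by split. A minimal sanity check might look like the sketch below; note that the `input`/`output` field names are an assumption based on the benchmark's text-to-text framing, so inspect `ds["test"].features` if they differ:

```python
import datasets

# Load one MuLD task; each config is exposed as a text-to-text dataset.
ds = datasets.load_dataset("ghomasHudson/muld", "NarrativeQA")

# Assumed field names ("input"/"output") based on the text-to-text
# framing described above; check ds["test"].features to confirm.
example = ds["test"][0]
print(len(example["input"].split()))  # inputs should be at least ~10,000 words
print(example["output"])              # may hold one or more reference answers
```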
Manual Download
If you prefer to download the data files yourself:
- NarrativeQA: Train, Val, Test (Mirror: Train, Val, Test)
- HotpotQA: Train, Val (Mirror: Train, Val)
- Character Archetype Classification: Train, Val, Test (Mirror: Train, Val, Test)
- OpenSubtitles: Train, Test (Mirror: Train, Test)
- Style Change: Train, Val, Test (Mirror: Train, Val, Test)
- VLSP: Test (Mirror: Test)
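Once downloaded, a split can still be read through the `datasets` JSON loader. A hedged sketch, where the file name is a placeholder for whichever split you fetched and the JSON-lines assumption should be checked against the actual files:

```python
import datasets

# Hypothetical local path: substitute the split file you downloaded above.
ds = datasets.load_dataset(
    "json",
    data_files={"test": "narrativeqa_test.jsonl"},  # placeholder file name
)
print(ds["test"])
```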
Citation
If you use our benchmark, please cite the paper:
```bibtex
@InProceedings{hudson-almoubayed:2022:LREC,
  author    = {Hudson, George and Al Moubayed, Noura},
  title     = {MuLD: The Multitask Long Document Benchmark},
  booktitle = {Proceedings of the Language Resources and Evaluation Conference},
  month     = {June},
  year      = {2022},
  address   = {Marseille, France},
  publisher = {European Language Resources Association},
  pages     = {3675--3685},
  url       = {https://aclanthology.org/2022.lrec-1.392}
}
```
Additionally, please cite the datasets we used (particularly NarrativeQA, HotpotQA, and OpenSubtitles, whose data we use directly with limited filtering).
Dataset Metadata
The following table is necessary for this dataset to be indexed by search engines such as Google Dataset Search.
| property | value |
|---|---|
| name | MuLD |
| alternateName | Multitask Long Document Benchmark |
| url | https://github.com/ghomasHudson/muld |
| description | MuLD (Multitask Long Document Benchmark) is a set of 6 NLP tasks where the inputs consist of at least 10,000 words. The benchmark covers a wide variety of task types including translation, summarization, question answering, and classification. Additionally there is a range of output lengths from a single word classification label all the way up to an output longer than the input text. |
| citation | https://arxiv.org/abs/2202.07362 |
| creator | |
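This property/value layout mirrors schema.org/Dataset. As an illustration only (the JSON-LD framing is my assumption; the values come from the table above), the same metadata could be emitted programmatically:

```python
import json

# schema.org/Dataset properties taken from the table above.
metadata = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "MuLD",
    "alternateName": "Multitask Long Document Benchmark",
    "url": "https://github.com/ghomasHudson/muld",
    "citation": "https://arxiv.org/abs/2202.07362",
}
print(json.dumps(metadata, indent=2))
```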
Owner
- Name: Thomas Hudson
- Login: ghomasHudson
- Kind: user
- Company: Durham University
- Website: ghomashudson.github.io
- Repositories: 6
- Profile: https://github.com/ghomasHudson
Research Associate at Durham University. Researching NLP for Veterinary medicine
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
title: 'MuLD: The Multitask Long Document Benchmark'
message: >-
  If you use this dataset, please cite it using the
  metadata from this file.
type: dataset
authors:
  - given-names: G Thomas
    family-names: Hudson
    email: g.t.hudson@durham.ac.uk
    affiliation: Durham University
    orcid: 'https://orcid.org/0000-0003-3562-3593'
  - given-names: Noura
    name-particle: Al
    family-names: Moubayed
    orcid: 'https://orcid.org/0000-0001-8942-355X'
    affiliation: Durham University
identifiers:
  - type: url
    value: 'https://aclanthology.org/2022.lrec-1.392'
abstract: >-
  The impressive progress in NLP techniques has been driven by the
  development of multi-task benchmarks such as GLUE and SuperGLUE. While
  these benchmarks focus on tasks for one or two input sentences, there
  has been exciting work in designing efficient techniques for processing
  much longer inputs. In this paper, we present MuLD: a new long document
  benchmark consisting of only documents over 10,000 tokens. By modifying
  existing NLP tasks, we create a diverse benchmark which requires models
  to successfully model long-term dependencies in the text. We evaluate
  how existing models perform, and find that our benchmark is much more
  challenging than their ‘short document’ equivalents. Furthermore, by
  evaluating both regular and efficient transformers, we show that models
  with increased context length are better able to solve the tasks
  presented, suggesting that future improvements in these models are
  vital for solving similar long document problems. We release the data
  and code for baselines to encourage further research on efficient NLP
  models.
keywords:
  - Long Documents
  - Benchmark
  - Multitask learning
  - NLP
license: CC-BY-NC-4.0
preferred-citation:
  authors:
    - given-names: G Thomas
      family-names: Hudson
      email: g.t.hudson@durham.ac.uk
      affiliation: Durham University
      orcid: 'https://orcid.org/0000-0003-3562-3593'
    - given-names: Noura
      name-particle: Al
      family-names: Moubayed
      orcid: 'https://orcid.org/0000-0001-8942-355X'
      affiliation: Durham University
  title: "MuLD: The Multitask Long Document Benchmark"
  type: conference-paper
  collection-title: Proceedings of the Language Resources and Evaluation Conference
  conference:
    name: Language Resources and Evaluation Conference
    date-start: 2022-06-21
    date-end: 2022-06-23
    address: Marseille, France
  location:
    name: Marseille, France
  start: 3675
  end: 3685
  publisher:
    name: European Language Resources Association
  url: https://aclanthology.org/2022.lrec-1.392
  abstract: >-
    The impressive progress in NLP techniques has been driven by the
    development of multi-task benchmarks such as GLUE and SuperGLUE. While
    these benchmarks focus on tasks for one or two input sentences, there
    has been exciting work in designing efficient techniques for processing
    much longer inputs. In this paper, we present MuLD: a new long document
    benchmark consisting of only documents over 10,000 tokens. By modifying
    existing NLP tasks, we create a diverse benchmark which requires models
    to successfully model long-term dependencies in the text. We evaluate
    how existing models perform, and find that our benchmark is much more
    challenging than their ‘short document’ equivalents. Furthermore, by
    evaluating both regular and efficient transformers, we show that models
    with increased context length are better able to solve the tasks
    presented, suggesting that future improvements in these models are
    vital for solving similar long document problems. We release the data
    and code for baselines to encourage further research on efficient NLP
    models.
```
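Because the file is plain YAML, the metadata can also be consumed programmatically. A minimal sketch using PyYAML, assuming the file lives at the repo root as `CITATION.cff`:

```python
import yaml  # PyYAML

# Read the citation metadata shipped with the repo.
with open("CITATION.cff") as f:
    cff = yaml.safe_load(f)

def full_name(author):
    # Join given names, optional name particle ("Al"), and family name.
    parts = [author.get("given-names"), author.get("name-particle"),
             author.get("family-names")]
    return " ".join(p for p in parts if p)

# The authors ask that the conference paper be cited, not the dataset entry.
paper = cff["preferred-citation"]
print(", ".join(full_name(a) for a in paper["authors"]))
print(f"{paper['title']} ({paper['collection-title']}, "
      f"pp. {paper['start']}-{paper['end']})")
```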
GitHub Events
Total
- Watch event: 2
Last Year
- Watch event: 2
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Thomas Hudson | 0****n@g****m | 38 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 3
- Total pull requests: 0
- Average time to close issues: 22 days
- Average time to close pull requests: N/A
- Total issue authors: 2
- Total pull request authors: 0
- Average comments per issue: 1.33
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- floschne (2)
- yulonglin (1)