dragon

Resources for the DRAGON challenge

https://github.com/diagnijmegen/dragon

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
✓ DOI references: found 4 DOI reference(s) in README
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (2.4%) to scientific vocabulary

Last synced: 11 months ago

Repository

Basic Info

Host: GitHub
Owner: DIAGNijmegen
License: apache-2.0
Language: TeX
Default Branch: main
Size: 10.7 KB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme, License, Citation

README.md

If you are using DRAGON resources, please cite the following article:

J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, and H. Huisman on behalf of the DRAGON consortium. The DRAGON benchmark for clinical NLP. npj Digital Medicine 8, 289 (2025). https://doi.org/10.1038/s41746-025-01626-x

Download the citation file for your reference manager: BibTeX | RIS

Owner

Name: Diagnostic Image Analysis Group
Login: DIAGNijmegen
Kind: organization
Location: Radboud University Medical Center, Nijmegen, The Netherlands

Website: www.diagnijmegen.nl
Repositories: 41
Profile: https://github.com/DIAGNijmegen

Citation (citation.bib)

@article{bosma_dragon_2025,
	title = {The {DRAGON} benchmark for clinical {NLP}},
	volume = {8},
	rights = {https://creativecommons.org/licenses/by/4.0},
	issn = {2398-6352},
	url = {https://www.nature.com/articles/s41746-025-01626-x},
	doi = {10.1038/s41746-025-01626-x},
	abstract = {Artificial Intelligence can mitigate the global shortage of medical diagnostic personnel but requires large-scale annotated datasets to train clinical algorithms. Natural Language Processing ({NLP}), including Large Language Models ({LLMs}), shows great potential for annotating clinical data to facilitate algorithm development but remains underexplored due to a lack of public benchmarks. This study introduces the {DRAGON} challenge, a benchmark for clinical {NLP} with 28 tasks and 28,824 annotated medical reports from five Dutch care centers. It facilitates automated, large-scale, cost-effective data annotation. Foundational {LLMs} were pretrained using four million clinical reports from a sixth Dutch care center. Evaluations showed the superiority of domain-specific pretraining ({DRAGON} 2025 test score of 0.770) and mixed-domain pretraining (0.756), compared to general-domain pretraining (0.734, p {\textless} 0.005). While strong performance was achieved on 18/28 tasks, performance was subpar on 10/28 tasks, uncovering where innovations are needed. Benchmark, code, and foundational {LLMs} are publicly available.},
	number = {1},
	journaltitle = {npj Digital Medicine},
	shortjournal = {npj Digit. Med.},
	author = {Bosma, Joeran S. and Dercksen, Koen and Builtjes, Luc and André, Romain and Roest, Christian and Fransen, Stefan J. and Noordman, Constant R. and Navarro-Padilla, Mar and Lefkes, Judith and Alves, Natália and De Grauw, Max J. J. and Van Eekelen, Leander and Spronck, Joey M. A. and Schuurmans, Megan and De Wilde, Bram and Hendrix, Ward and Aswolinskiy, Witali and Saha, Anindo and Twilt, Jasper J. and van Lohuizen, Quintin and Stegeman, Michelle and Rutten, Karlijn and Smit, Inge M. E. and Stultiens, Gijs and Overduin, Christiaan G. and Rutten, Matthieu J. C. M. and Scholten, Ernst Th. and van der Post, Rachel S. and Grünberg, Katrien and Vos, Shoko and Taken, Elise M. G. and Nagtegaal, Iris D. and Mickan, Anne and Groeneveld, Miriam and Gerke, Paul K. and Meakin, James A. and Looijen-Salamon, M. G. and de Haas, Tijmen L. M. and Hoitsma, Fabian and D’Amato, Marina and Geijs, Daan and Veltman, Jeroen and Yakar, Derya and de Rooij, Maarten and Ciompi, Francesco and Hering, Alessa and Geerdink, Jeroen and Huisman, Henkjan and {On behalf of the DRAGON consortium}},
	urldate = {2025-05-23},
	date = {2025-05-17},
	langid = {english},
	note = {Publisher: Springer Science and Business Media {LLC}},
}
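
The entry above uses biblatex-style field names (`journaltitle`, `date`, `urldate`, `langid`), so it is likely to work best with biblatex and biber rather than a classic BibTeX style. A minimal sketch of citing it, assuming the entry is saved as `citation.bib` next to the document; the `numeric` style and the sentence text are illustrative choices, not something the repository prescribes:

```latex
\documentclass{article}

% biblatex with the biber backend reads fields such as journaltitle and date.
% The numeric style is an arbitrary choice for this sketch.
\usepackage[backend=biber,style=numeric]{biblatex}

% Assumes the citation entry is stored as citation.bib in the same directory.
\addbibresource{citation.bib}

\begin{document}

We evaluate on the DRAGON benchmark~\cite{bosma_dragon_2025}.

\printbibliography

\end{document}
```

Compiling this typically takes a `pdflatex`, `biber`, `pdflatex` sequence (or a single `latexmk -pdf` run) so that the citation and bibliography resolve.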
