negativeresultdetector

NLP: Tool to predict prevalence of positive and negative results in scientific abstracts of clinical psychology and psychotherapy

https://github.com/schiekiera/negativeresultdetector

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.2%) to scientific vocabulary
Last synced: 8 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: schiekiera
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 41.6 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed 12 months ago
Metadata Files
  • Readme
  • License
  • Citation

README.md

NegativeResultDetector

Documentation, code, and data for the study "Classifying Positive Results in Clinical Psychology Using Natural Language Processing" by Louis Schiekiera, Jonathan Diederichs & Helen Niemeyer. The preprint for this study is available on PsyArXiv.

The best-performing model, SciBERT, was deployed on HuggingFace under the name 'NegativeResultDetector'. It can be used directly via a graphical user interface to evaluate single abstracts, or for larger-scale inference by downloading the model files from HuggingFace and using this script from the GitHub repository.
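For larger-scale inference, a minimal batch-classification sketch with the `transformers` library might look like the following. Note that the Hugging Face model identifier used here is an assumption; check the repository or the Hugging Face page for the exact name before running this.

```python
# Sketch of larger-scale inference with the deployed NegativeResultDetector.
# MODEL_ID is an assumed identifier -- verify the exact name on Hugging Face.
MODEL_ID = "schiekiera/NegativeResultDetector"  # assumption, not confirmed

def chunked(items, size):
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def classify_abstracts(abstracts, batch_size=16):
    """Classify abstracts as 'positive results only' vs. 'mixed or negative'."""
    from transformers import pipeline  # requires `pip install transformers torch`
    clf = pipeline("text-classification", model=MODEL_ID, truncation=True)
    preds = []
    for batch in chunked(abstracts, batch_size):
        preds.extend(clf(batch))  # each item is a dict: {'label': ..., 'score': ...}
    return preds
```

Batching keeps memory bounded when classifying tens of thousands of abstracts, as in the trend analysis described below.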

Abstract

Background: This study addresses the gap in machine learning tools for classifying positive results by evaluating the performance of SciBERT, a transformer model pretrained on scientific text, and a random forest classifier on clinical psychology abstracts.

Methods: Over 1,900 abstracts were annotated into two categories: ‘positive results only’ and ‘mixed or negative results’. Model performance was evaluated on three benchmarks. The best-performing model was utilized to analyze trends in over 20,000 psychotherapy study abstracts.

Results: SciBERT outperformed all benchmarks and random forest on both in-domain and out-of-domain data. The trend analysis revealed non-significant effects of publication year on positive results for 1990-2005, but a significant decrease in positive results between 2005 and 2022. When examining the entire time span, significant positive linear and negative quadratic effects were observed.
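The shape of the reported trend (a positive linear and a negative quadratic effect of publication year) can be illustrated with a degree-2 polynomial fit. The yearly proportions below are made up for illustration only; they are not the study's data.

```python
# Illustration of a linear-plus-quadratic trend fit over publication years.
# The yearly shares of 'positive results only' abstracts here are synthetic.
import numpy as np

years = np.arange(1990, 2023)
t = years - years.min()  # center on the first year for a stable fit
share = 0.55 + 0.004 * t - 0.0002 * t ** 2  # hypothetical yearly proportions

def quadratic_trend(x, y):
    """Fit y = a2*x^2 + a1*x + a0; return (a2, a1, a0)."""
    return np.polyfit(x, y, deg=2)

a2, a1, a0 = quadratic_trend(t, share)
# A rise-then-decline pattern shows up as a1 > 0 (linear) and a2 < 0 (quadratic).
```

The study itself models binary outcomes per abstract (logistic regression would be the usual choice there); this least-squares fit on aggregated proportions only sketches the curvature the abstract describes.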

Discussion: Machine learning could support future efforts to understand patterns of positive results in large data sets. The fine-tuned SciBERT model was deployed for public use.


Results

Table 1
*Metric scores for model evaluation on test data from the annotated MAIN corpus, consisting of n = 198 abstracts authored by researchers affiliated with German clinical psychology departments and published between 2012 and 2022*

| Model | Accuracy | Mixed & Negative: F1 | Mixed & Negative: Recall | Mixed & Negative: Precision | Positive Only: F1 | Positive Only: Recall | Positive Only: Precision |
|---|---|---|---|---|---|---|---|
| SciBERT | 0.864 | 0.867 | 0.907 | 0.830 | 0.860 | 0.822 | 0.902 |
| Random Forest | 0.803 | 0.810 | 0.856 | 0.769 | 0.796 | 0.752 | 0.844 |
| Extracted p-values | 0.515 | 0.495 | 0.485 | 0.505 | 0.534 | 0.545 | 0.524 |
| Extracted NL Indicators | 0.530 | 0.497 | 0.474 | 0.523 | 0.559 | 0.584 | 0.536 |
| Number of Words | 0.475 | 0.441 | 0.423 | 0.461 | 0.505 | 0.525 | 0.486 |


Figure 1
*Comparing model performances across in-domain and out-of-domain data. Colored bars represent different model types. Samples: MAIN test (n = 198 abstracts); VAL1 (n = 150 abstracts); VAL2 (n = 150 abstracts).*


Funding & Project

This study was conducted as part of the PANNE Project (German acronym for “publication bias analysis of non-publication and non-reception of results in a disciplinary comparison”) at Freie Universität Berlin and was funded by the Berlin University Alliance.

Citation

If you use the data or the code, please cite the paper as follows:

Schiekiera, L., Niemeyer, H., & Diederichs, J. (2024). Political bias in historiography - an experimental investigation of preferences for publication as a function of political orientation. F1000Research, 14, 320. https://f1000research.com/articles/14-320/v1

Owner

  • Name: Louis Schiekiera
  • Login: schiekiera
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use models, scripts or data, please cite it as below."
authors:
- family-names: "Schiekiera"
  given-names: "Louis"
  orcid: "https://orcid.org/0000-0003-0082-175X"
title: "NegativeResultDetector"
version: 1.0.0
date-released: 2023-09-27
url: "https://github.com/PsyCapsLock/NegativeResultDetector"

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1