negativeresultdetector

NLP: Tool to predict prevalence of positive and negative results in scientific abstracts of clinical psychology and psychotherapy

https://github.com/schiekiera/negativeresultdetector

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.2%) to scientific vocabulary
Last synced: 8 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: schiekiera
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 41.6 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed 12 months ago
Metadata Files
  • Readme
  • License
  • Citation

README.md

NegativeResultDetector

Documentation, code, and data for the study "Classifying Positive Results in Clinical Psychology Using Natural Language Processing" by Louis Schiekiera, Jonathan Diederichs & Helen Niemeyer. The preprint for this study is available on PsyArXiv.

The best-performing model, SciBERT, was deployed on HuggingFace under the name 'NegativeResultDetector'. It can be used directly via a graphical user interface to evaluate single abstracts, or for larger-scale inference by downloading the model files from HuggingFace and using this script from the GitHub repository.
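For larger-scale inference, a minimal batch-classification sketch with the `transformers` library might look like the following. Note that the Hugging Face model identifier used here is an assumption; check the repository or the Hugging Face page for the exact name before running this.

```python
# Sketch of larger-scale inference with the deployed NegativeResultDetector.
# MODEL_ID is an assumed identifier -- verify the exact name on Hugging Face.
MODEL_ID = "schiekiera/NegativeResultDetector"  # assumption, not confirmed

def chunked(items, size):
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def classify_abstracts(abstracts, batch_size=16):
    """Classify abstracts as 'positive results only' vs. 'mixed or negative'."""
    from transformers import pipeline  # requires `pip install transformers torch`
    clf = pipeline("text-classification", model=MODEL_ID, truncation=True)
    preds = []
    for batch in chunked(abstracts, batch_size):
        preds.extend(clf(batch))  # each item is a dict: {'label': ..., 'score': ...}
    return preds
```

Batching keeps memory bounded when classifying tens of thousands of abstracts, as in the trend analysis described below.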

Abstract

Background: This study addresses the gap in machine learning tools for classifying positive results by evaluating the performance of SciBERT, a transformer model pretrained on scientific text, and a random forest classifier on clinical psychology abstracts.

Methods: Over 1,900 abstracts were annotated into two categories: ‘positive results only’ and ‘mixed or negative results’. Model performance was evaluated on three benchmarks. The best-performing model was utilized to analyze trends in over 20,000 psychotherapy study abstracts.

Results: SciBERT outperformed all benchmarks and random forest on both in-domain and out-of-domain data. The trend analysis revealed non-significant effects of publication year on positive results for 1990-2005, but a significant decrease in positive results between 2005 and 2022. When examining the entire time span, significant positive linear and negative quadratic effects were observed.
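The shape of the reported trend (a positive linear and a negative quadratic effect of publication year) can be illustrated with a degree-2 polynomial fit. The yearly proportions below are made up for illustration only; they are not the study's data.

```python
# Illustration of a linear-plus-quadratic trend fit over publication years.
# The yearly shares of 'positive results only' abstracts here are synthetic.
import numpy as np

years = np.arange(1990, 2023)
t = years - years.min()  # center on the first year for a stable fit
share = 0.55 + 0.004 * t - 0.0002 * t ** 2  # hypothetical yearly proportions

def quadratic_trend(x, y):
    """Fit y = a2*x^2 + a1*x + a0; return (a2, a1, a0)."""
    return np.polyfit(x, y, deg=2)

a2, a1, a0 = quadratic_trend(t, share)
# A rise-then-decline pattern shows up as a1 > 0 (linear) and a2 < 0 (quadratic).
```

The study itself models binary outcomes per abstract (logistic regression would be the usual choice there); this least-squares fit on aggregated proportions only sketches the curvature the abstract describes.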

Discussion: Machine learning could support future efforts to understand patterns of positive results in large data sets. The fine-tuned SciBERT model was deployed for public use.


Results

Table 1
*Metric scores for model evaluation on test data from the annotated MAIN corpus, consisting of n = 198 abstracts authored by researchers affiliated with German clinical psychology departments and published between 2012 and 2022*

| Model | Accuracy | Mixed & Negative: F1 | Mixed & Negative: Recall | Mixed & Negative: Precision | Positive Only: F1 | Positive Only: Recall | Positive Only: Precision |
|---|---|---|---|---|---|---|---|
| SciBERT | 0.864 | 0.867 | 0.907 | 0.830 | 0.860 | 0.822 | 0.902 |
| Random Forest | 0.803 | 0.810 | 0.856 | 0.769 | 0.796 | 0.752 | 0.844 |
| Extracted p-values | 0.515 | 0.495 | 0.485 | 0.505 | 0.534 | 0.545 | 0.524 |
| Extracted NL Indicators | 0.530 | 0.497 | 0.474 | 0.523 | 0.559 | 0.584 | 0.536 |
| Number of Words | 0.475 | 0.441 | 0.423 | 0.461 | 0.505 | 0.525 | 0.486 |


Figure 1
*Comparing model performances across in-domain and out-of-domain data. Colored bars represent different model types. Samples: MAIN test (n = 198 abstracts); VAL1 (n = 150 abstracts); VAL2 (n = 150 abstracts).*


Funding & Project

This study was conducted as part of the PANNE Project (German acronym for “publication bias analysis of non-publication and non-reception of results in a disciplinary comparison”) at Freie Universität Berlin and was funded by the Berlin University Alliance.

Citation

If you use the data or the code, please cite the paper as follows:

Schiekiera, L., Niemeyer, H., & Diederichs, J. (2024). Political bias in historiography - an experimental investigation of preferences for publication as a function of political orientation. F1000Research, 14, 320. https://f1000research.com/articles/14-320/v1

Owner

  • Name: Louis Schiekiera
  • Login: schiekiera
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use models, scripts or data, please cite it as below."
authors:
- family-names: "Schiekiera"
  given-names: "Louis"
  orcid: "https://orcid.org/0000-0003-0082-175X"
title: "NegativeResultDetector"
version: 1.0.0
date-released: 2023-09-27
url: "https://github.com/PsyCapsLock/NegativeResultDetector"

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1