Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: mo-arvan
  • Language: Python
  • Default Branch: main
  • Size: 343 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 10 months ago · Last pushed 9 months ago
Metadata Files
Readme Citation

README.md

ReproHum-0744-02

This repository contains tools and scripts for quantified reproducibility assessment of NLP results, based on the analysis of human evaluation data. The project implements the methodology described in Belz, Popović & Mille (2022), "Quantified Reproducibility Assessment of NLP Results" (ACL 2022).

Overview

The project analyzes human evaluation data for different NLP systems, providing:

  • Statistical reliability metrics (Fleiss Kappa, Krippendorff's Alpha)
  • ANOVA and Tukey HSD tests for system comparisons
  • Power analysis and effect size calculations
  • Coefficient of variation (CV) analysis for reproducibility assessment
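As an illustration of the coefficient of variation item above, the following is a minimal sketch of a small-sample corrected CV in the spirit of Belz et al. (2022). It assumes the plain n-1 sample standard deviation, the (1 + 1/(4n)) correction factor, and measurements on a scale that starts at 0. The function name `cv_star` is hypothetical, and the repository's `src/cv.py` may differ in its details.

```python
import statistics


def cv_star(measurements):
    """Small-sample corrected coefficient of variation, in percent.

    Assumption: measurements are on a scale whose minimum is 0
    (shift them beforehand if the original scale starts elsewhere).
    """
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.fmean(measurements)
    if mean == 0:
        raise ValueError("mean is zero; the CV is undefined")
    sd = statistics.stdev(measurements)   # sample standard deviation (n - 1)
    cv = 100.0 * sd / abs(mean)           # uncorrected CV in percent
    return (1.0 + 1.0 / (4.0 * n)) * cv   # small-sample correction


if __name__ == "__main__":
    # Two hypothetical scores for the same system from two studies.
    print(round(cv_star([10.0, 12.0]), 2))
```

A lower value indicates that the repeated measurements agree more closely, which is why the metric is useful for comparing an original study with its reproductions.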

Project Structure

```text
.
├── src/                               # Source code directory
│   ├── analyze_responses.py           # Main analysis script
│   ├── cv.py                          # Coefficient of variation calculations
│   ├── statistical_power_analysis.py  # Statistical power analysis
│   └── preprocess_responses.py        # Data preprocessing
├── responses/                         # Input data directory
├── results/                           # Analysis output directory
│   ├── lab1/                          # Primary results
│   └── original/                      # Original data results
└── power_analysis.r                   # R script for power analysis
```

Prerequisites

  • Docker (optional, for containerized environment)
  • Python 3.x
  • Required packages:
    • pandas
    • scipy
    • statsmodels
    • krippendorff
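Before running the scripts outside Docker, it can help to confirm that the packages listed above are importable. The sketch below uses only the standard library. The helper name `missing_packages` is hypothetical, and it assumes each package's import name matches the name in the list.

```python
import importlib.util

# Packages named in the prerequisites list (import names assumed identical).
REQUIRED = ["pandas", "scipy", "statsmodels", "krippendorff"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```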

Setup

Clone the repository and install the required packages. You can use a virtual environment or Docker for isolation.

```bash
docker build -t reprohum-0744-02 .
docker run -it --rm -v $(pwd):/app reprohum-0744-02
```

Usage

  1. Preprocess the response data. This step requires the original responses; skip it if you are using the preprocessed data included in this repository:

```bash
python src/preprocess_responses.py
```

  2. Run the analysis pipeline:

```bash
python src/analyze_responses.py
```

  3. Generate reproducibility metrics:

```bash
python src/quantified_reproducibility.py
```
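The three steps can also be chained from a single Python entry point. The following is a minimal sketch, not part of the repository: `build_commands` and `run_pipeline` are hypothetical helpers, and the script paths are taken from the commands above and assumed to be run from the repository root.

```python
import subprocess
import sys

# Script paths as given in the usage steps, in execution order.
STEPS = [
    "src/preprocess_responses.py",
    "src/analyze_responses.py",
    "src/quantified_reproducibility.py",
]


def build_commands(skip_preprocess=False, python=sys.executable):
    """Return one argument list per step, optionally omitting preprocessing."""
    steps = STEPS[1:] if skip_preprocess else STEPS
    return [[python, step] for step in steps]


def run_pipeline(skip_preprocess=False):
    """Run each step in order and stop at the first failing script."""
    for cmd in build_commands(skip_preprocess):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Calling `run_pipeline(skip_preprocess=True)` corresponds to starting from the preprocessed data shipped with the repository.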

Output

The analysis generates several outputs in the results/lab1/ directory:

  • Statistical test results (anova_tukeyhsd.txt)
  • Inter-rater reliability metrics (fleiss_kappa.txt, krippendorff_alpha.txt)
  • Dataset usage statistics (tables/datasets_used.csv)
  • System comparison results (tables/results.csv)
  • Detailed reliability data (reliability_data.csv)
  • Coefficient of variation analysis (cv_2_way.csv, cv_summary.csv)
  • Correlation analysis (correlations.csv)
  • Best-Worst system results (results.csv)
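To help interpret the inter-rater reliability files listed above, here is a minimal pure-Python sketch of Fleiss' kappa. It is for illustration only: the function below is hypothetical and is not the repository's implementation, which lists statsmodels and krippendorff as dependencies and may compute these values differently. It assumes every item was rated by the same number of raters.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of category counts.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every row must sum to the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")

    n_cats = len(counts[0])
    total = n_items * n_raters
    # Overall proportion of ratings that fall in each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_cats)]
    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("all ratings fall in one category; kappa is undefined")
    return (p_observed - p_expected) / (1.0 - p_expected)
```

A value of 1 means perfect agreement, while values near 0 mean the raters agree about as often as expected by chance.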

Citation

If you use this software in your research, please cite:

```bibtex
TBA
```

License

This project is licensed under CC-BY-4.0. See the LICENSE file for details.

Owner

  • Name: Mo Arvan
  • Login: mo-arvan
  • Kind: user
  • Location: Chicago
  • Company: University of Illinois at Chicago

Computer Scientist

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: reprohum-0744-02
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Mohammad
    family-names: Arvan
    email: marvan3@uic.edu
    affiliation: University of Illinois at Chicago
  - given-names: Natalie
    family-names: Parde
    affiliation: University of Illinois at Chicago
license: CC-BY-4.0
url: https://github.com/mo-arvan/reprohum-0744-02
date-released: 2025

GitHub Events

Total
  • Push event: 5
  • Create event: 2
Last Year
  • Push event: 5
  • Create event: 2

Dependencies

Dockerfile docker
  • python 3.12.1-alpine3.19 build