dataandcodefse2025

Data and code for the paper "Scientific Open-Source Software Is Less Likely To Become Abandoned Than One Might Think! Lessons from Curating a Catalog of Maintained Scientific Software"

https://github.com/addimt/dataandcodefse2025

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.2%) to scientific vocabulary

Repository


Basic Info
  • Host: GitHub
  • Owner: AddiMT
  • Language: Shell
  • Default Branch: main
  • Size: 0 Bytes
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created 10 months ago · Last pushed 10 months ago

README.md


If you use the dataset or code, please cite our paper: https://doi.org/10.1145/3729369.

This repository contains the data and scripts necessary to replicate the visualizations and models in the paper "Scientific Open-Source Software Is Less Likely To Become Abandoned Than One Might Think! Lessons from Curating a Catalog of Maintained Scientific Software".

README.md - Overview of files found in the replication package and their details.

dataset.tar.gz -> aP4-t.csv - Full dataset used in the paper, containing both the scientific projects and the matching OSS projects. Scientific software is marked with the attribute isSci = 1.

Code/data_collection.sh - Scripts for data collection and case matching.

Code/ModelsAndAnalysisRep.R - Our Cox models for survival and correlation analysis.

Code/KMPlotsRep.py - Kaplan-Meier plots.
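
As a quick orientation to the files listed above, the following is a minimal sketch of how the dataset could be read and split by the isSci attribute, together with a plain Kaplan-Meier estimator of the kind such plots are based on. It is not the code from Code/KMPlotsRep.py or Code/ModelsAndAnalysisRep.R. The helper names (load_projects, split_by_is_sci, kaplan_meier) are hypothetical, and the comma delimiter, header row, and UTF-8 encoding are assumptions about aP4-t.csv that should be checked against the actual file. The estimator takes plain lists because this README does not state which columns hold durations and abandonment events.

```python
import csv
import io
import tarfile
from collections import Counter


def load_projects(archive_path, member_suffix="aP4-t.csv", delimiter=","):
    """Read the CSV stored inside the archive and return a list of row dicts.

    Assumptions (not confirmed by the README): the file has a header row,
    is UTF-8 encoded, and uses the given delimiter.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        member = next(
            m for m in archive.getmembers()
            if m.isfile() and m.name.endswith(member_suffix)
        )
        raw = archive.extractfile(member).read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(raw), delimiter=delimiter))


def split_by_is_sci(rows):
    """Split rows into (scientific, matched) using the isSci attribute."""
    scientific = [r for r in rows if r["isSci"].strip() == "1"]
    matched = [r for r in rows if r["isSci"].strip() != "1"]
    return scientific, matched


def kaplan_meier(durations, events):
    """Return [(time, survival)] for each distinct time with an event.

    durations: observed time per project.
    events: 1 if the project was abandoned at that time, 0 if censored.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # projects leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve
```

For example, `load_projects("dataset.tar.gz")` followed by `split_by_is_sci(rows)` yields the two groups compared in the paper. If the file turns out to use a different separator, pass it explicitly, for instance `delimiter=";"`.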

Owner

  • Name: Addi Malviya Thakur
  • Login: AddiMT
  • Kind: user
  • Company: ORNL ; UT Knoxville

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this dataset or code, please cite the following associated publication."
title: "Scientific Open-Source Software Is Less Likely to Become Abandoned Than One Might Think! Lessons from Curating a Catalog of Maintained Scientific Software"
authors:
  - family-names: Malviya Thakur
    given-names: Addi
    orcid: https://orcid.org/0000-0002-2681-9992
    affiliation: University of Tennessee at Knoxville, USA and Oak Ridge National Lab, USA
  - family-names: Milewicz
    given-names: Reed
    orcid: https://orcid.org/0000-0002-1701-0008
    affiliation: Sandia National Laboratories, USA
  - family-names: Jahanshahi
    given-names: Mahmoud
    orcid: https://orcid.org/0000-0003-4408-1183
    affiliation: University of Tennessee at Knoxville, USA
  - family-names: Paganini
    given-names: Lavínia
    orcid: https://orcid.org/0000-0002-2729-0314
    affiliation: Eindhoven University of Technology, Netherlands
  - family-names: Vasilescu
    given-names: Bogdan
    orcid: https://orcid.org/0000-0003-4418-5783
    affiliation: Carnegie Mellon University, USA
  - family-names: Mockus
    given-names: Audris
    orcid: https://orcid.org/0000-0002-7987-7598
    affiliation: University of Tennessee at Knoxville, USA
abstract: >
  This repository supports the study "Scientific Open-Source Software Is Less Likely to Become Abandoned Than One Might Think!"
  It contains data and code curated to identify scientific open-source projects across 13 STEM domains and to analyze their longevity,
  providing insights into infrastructure sustainability, downstream dependencies, and the comparative survival of scientific versus non-scientific software.
repository-code: https://github.com/AddiMT/DataAndCodeFSE2025
license: MIT
keywords:
  - scientific software
  - open source
  - software sustainability
  - software longevity
  - software ecosystems
  - research software
  - large language models
version: 1.0.0
date-released: 2025-04-25
doi: 10.1145/3729369

GitHub Events

Total
  • Release event: 2
  • Push event: 6
  • Create event: 3
Last Year
  • Release event: 2
  • Push event: 6
  • Create event: 3