research_ai-usage-in-science

A research project to understand how researchers who leverage scientific computing reuse DNNs within their work

https://github.com/nicholassynovic/research_ai-usage-in-science

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: nature.com, plos.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary
Last synced: 10 months ago

Repository

A research project to understand how researchers who leverage scientific computing reuse DNNs within their work

Basic Info
  • Host: GitHub
  • Owner: NicholasSynovic
  • License: agpl-3.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 1.08 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 3
  • Open Issues: 4
  • Releases: 0
Created over 2 years ago · Last pushed 12 months ago
Metadata Files
Readme Contributing Funding License Code of conduct Citation Codeowners Security Support Governance

README.md

An Exploratory Mixed-Methods Study of Deep Neural Network Reuse in Computational Natural Science

TODO: Update authors after double blind review
TODO: Add link to the pre-print article
TODO: Add link to data hosted on Zenodo

This repository contains the source code used for An Exploratory Mixed-Methods Study of Deep Neural Network Reuse in Computational Natural Science. If you are looking for the data used in the study, we have released our SQLite3 database and author agreement Excel workbooks to Zenodo.

About

TODO: Add links to each tutorial section
TODO: Release template Excel (.xlsx) file for author agreement with instructions on how to use it

This repository contains the source code to automatically search and filter for Natural Science publications from Nature and PLOS using OpenAlex. It also includes our code to capture and parse the metadata and to bulk download open-access articles from Nature and PLOS. Finally, it includes the code to run an automated version of our analysis on arbitrary papers using pre-trained foundational reasoning and non-reasoning large language models (LLMs) via Ollama. We have provided a runner script to execute all automated operations and figure generation for our work. For the manual review portions of the study, we have released a template Excel (.xlsx) workbook and instructions on how to perform our author agreement process.

Open-Access Data Collection

TODO: Release PDFs as part of the Zenodo artifact

Our work is based on peer-reviewed, open-access academic articles from PLOS and Nature. As part of our Zenodo artifact, we have released the .pdf documents leveraged in our study from both PLOS and Nature. While it may be possible to apply our methods to non-open-access works, we make no claims about their effectiveness or efficacy in that setting.

TODO: Review TOS to ensure that this is accurate

At the time of this study, both PLOS's and Nature's Terms Of Service (TOS) supported the collection, aggregation, and release of open-access works in scientific pursuit. Prior to bulk downloading any documents from Nature or PLOS, we advise the reader to review both Nature's TOS and PLOS's TOS.

OpenAlex

OpenAlex is an open database of scientific works and their metadata. We leverage OpenAlex extensively to extract academic work metadata in a journal-agnostic manner. Additionally, we rely on OpenAlex's topic identification system to filter for Natural Science (i.e., Chemistry, Biology, Physics, and Environmental Science). You can read more about this system here.
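The topic-based filtering described above can be sketched as a small URL builder for the OpenAlex `/works` endpoint. This is a minimal illustration, not the code used in this repository: the function names are hypothetical, the source and field identifiers in the usage line are placeholders rather than real OpenAlex IDs, and the filter keys (`primary_location.source.id`, `open_access.is_oa`, `primary_topic.field.id`) are our assumption about which OpenAlex filters fit this use case.

```python
from urllib.parse import urlencode

OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def build_filter(source_id: str, field_ids: list[str]) -> str:
    """Combine filters for one journal, open access only, and a set of topic fields.

    OpenAlex joins separate filters with "," (AND) and joins alternative
    values of a single filter with "|" (OR).
    """
    filters = [
        f"primary_location.source.id:{source_id}",
        "open_access.is_oa:true",
        "primary_topic.field.id:" + "|".join(field_ids),
    ]
    return ",".join(filters)


def build_works_url(source_id: str, field_ids: list[str], per_page: int = 25) -> str:
    """Return a percent-encoded /works URL carrying the combined filter."""
    query = {"filter": build_filter(source_id, field_ids), "per-page": per_page}
    return f"{OPENALEX_WORKS_URL}?{urlencode(query)}"


# Placeholder identifiers: substitute real OpenAlex source and field IDs.
print(build_works_url("SOURCE_ID", ["FIELD_ID_1", "FIELD_ID_2"]))
```

Keeping the filter string separate from the URL encoding makes it easy to inspect exactly which constraints are sent before any request is made.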

In our Zenodo release, we store the OpenAlex responses within the SQLite3 artifact in the openalex_responses table. As OpenAlex continuously updates its aggregated works, we recommend that reproducers of our work leverage the stored responses.
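Before relying on the stored responses, it can help to inspect the table first. The sketch below assumes only the table name given above (openalex_responses); it makes no assumption about the column names, which it discovers at runtime. The function name `describe_table` and the database file name in the usage note are hypothetical.

```python
import sqlite3


def describe_table(
    conn: sqlite3.Connection, table: str = "openalex_responses"
) -> tuple[list[str], int]:
    """Return the column names and the row count of `table`."""
    # Identifiers cannot be bound as SQL parameters, so validate the name
    # before interpolating it into the COUNT query below.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    columns = [
        row[0]
        for row in conn.execute("SELECT name FROM pragma_table_info(?)", (table,))
    ]
    if not columns:
        raise LookupError(f"table {table!r} not found")
    (row_count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
    return columns, row_count
```

A possible usage, with a placeholder file name: open the artifact with `sqlite3.connect("artifact.sqlite3")`, pass the connection to `describe_table`, and close the connection afterwards.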

Automated Analysis With Ollama Models

Building on the results of the author agreement process described in our paper, we used pre-trained foundational reasoning and non-reasoning models to automatically review the bulk of the remaining papers.
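As a rough illustration of what a single automated review call might look like, the sketch below builds a non-streaming request for Ollama's local `/api/generate` endpoint. The prompt wording, the function names, and the choice of this endpoint are assumptions made for illustration; they are not taken from the study's code, and the model name is left to the caller.

```python
import json
from urllib import request

# Default address of a locally running Ollama server.
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, paper_text: str) -> request.Request:
    """Build a POST request asking `model` a yes/no question about one paper."""
    # Hypothetical prompt; the study's actual prompts are not reproduced here.
    prompt = (
        "Does the following paper reuse a pre-trained deep neural network? "
        "Answer yes or no, then justify briefly.\n\n" + paper_text
    )
    payload = {"model": model, "prompt": prompt, "stream": False}
    return request.Request(
        OLLAMA_GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(model: str, paper_text: str, timeout: float = 300.0) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with request.urlopen(build_generate_request(model, paper_text), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Separating request construction from sending keeps the payload easy to check without a running Ollama server.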

Dependencies

asdf

How To Install

    make create-dev
    make build

How To Run

We have created a pipeline that you can execute to reproduce the work. Please run ./run.bash {{EMAIL_ADDRESS}}, where {{EMAIL_ADDRESS}} is a valid email address used to access the OpenAlex API polite pool.
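For context on why the email address is needed: OpenAlex places requests that identify themselves with a `mailto` query parameter into its polite pool. The helper below is a hypothetical sketch of how an address could be attached to a request URL; it is not taken from run.bash, and its name and minimal validation are illustrative only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_polite_pool(url: str, email: str) -> str:
    """Return `url` with a single mailto=<email> query parameter appended."""
    if "@" not in email:
        raise ValueError(f"not an email address: {email!r}")
    parts = urlsplit(url)
    # Drop any existing mailto parameter so it is never duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "mailto"
    ]
    query.append(("mailto", email))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_polite_pool("https://api.openalex.org/works?per-page=5", "me@example.org"))
# prints "https://api.openalex.org/works?per-page=5&mailto=me%40example.org"
```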

Tutorial

asdf

Owner

  • Name: Nicholas Synovic
  • Login: NicholasSynovic
  • Kind: user
  • Location: Chicago, IL

Loyola University Chicago Computer Science Student Expected Graduation: May 2022

Citation (CITATION.cff)

# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files

GitHub Events

Total
  • Delete event: 2
  • Push event: 120
  • Pull request event: 2
  • Create event: 5
Last Year
  • Delete event: 2
  • Push event: 120
  • Pull request event: 2
  • Create event: 5

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 17
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 day
  • Total issue authors: 0
  • Total pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.06
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 5
Past Year
  • Issues: 0
  • Pull requests: 17
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 day
  • Issue authors: 0
  • Pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.06
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 5
Top Authors
Issue Authors
Pull Request Authors
  • NicholasSynovic (11)
  • dependabot[bot] (8)
  • KarolinaRyzka (2)
Top Labels
Issue Labels
Pull Request Labels
  • dependencies (8)

Dependencies

poetry.lock pypi
  • feedparser 6.0.11
  • sgmllib3k 1.0.0
pyproject.toml pypi
  • feedparser ^6.0.11
  • python ^3.10
requirements.txt pypi
  • poetry *