research_ai-usage-in-science
A research project to understand how researchers who leverage scientific computing reuse DNNs within their work
https://github.com/nicholassynovic/research_ai-usage-in-science
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to nature.com, plos.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.6%) to scientific vocabulary
Repository
A research project to understand how researchers who leverage scientific computing reuse DNNs within their work
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 3
- Open Issues: 4
- Releases: 0
Metadata Files
README.md
An Exploratory Mixed-Methods Study of Deep Neural Network Reuse in Computational Natural Science
TODO: Update authors after double blind review
TODO: Add link to the pre-print article
TODO: Add link to data hosted on Zenodo
This repository contains the source code used for An Exploratory Mixed-Methods Study of Deep Neural Network Reuse in Computational Natural Science. If you are looking for the data used in the study, we have released our SQLite3 database and author agreement Excel workbooks to Zenodo.
Table Of Contents
About
TODO: Add links to each tutorial section
TODO: Release template Excel (.xlsx) file for author agreement with instructions on how to use it
This repository contains the source code to automatically search for and filter Natural Science publications from Nature and PLOS using OpenAlex. It also includes our code to capture and parse the metadata, and to bulk download open-access articles from Nature and PLOS. Finally, it includes the code to run an automated analysis of our study on arbitrary papers using pre-trained foundational reasoning and non-reasoning large language models (LLMs) via Ollama. We have provided a runner script to execute all autonomous operations and figure generation for our work. For the manual review portions of the study, we have released a template Excel (.xlsx) workbook and instructions on how to perform our author agreement process.
Open-Access Data Collection
TODO: Release PDFs as part of the Zenodo artifact
Our work is based on peer-reviewed, open-access academic articles from PLOS and
Nature. As part of our Zenodo artifact, we have released the .pdf documents
used in our study from both PLOS and Nature. While it may be possible to
apply our methods to non-open-access works, we make no claims about their
effectiveness or efficacy in that setting.
TODO: Review TOS to ensure that this is accurate
At the time of this study, both PLOS's and Nature's Terms Of Service (TOS) supported the collection, aggregation, and release of open-access works in scientific pursuit. Prior to bulk downloading any documents from Nature or PLOS, we advise the reader to review both Nature's TOS and PLOS's TOS.
OpenAlex
OpenAlex is an open database of scientific works and their metadata. We use OpenAlex extensively to extract academic work metadata in a journal-agnostic manner. Additionally, we rely on OpenAlex's topic identification system to filter for Natural Science (i.e., Chemistry, Biology, Physics, and Environmental Science). You can read more about this system here.
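As a rough illustration of the topic-based filtering described above, the sketch below builds an OpenAlex `/works` query URL without sending it. It is not the code from this repository: the function name `build_works_url`, the choice of the `primary_topic.field.id` filter, and the field IDs passed in are assumptions made for illustration and should be checked against the OpenAlex documentation.

```python
from urllib.parse import urlencode

OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def build_works_url(email: str, field_ids: list[str], cursor: str = "*") -> str:
    """Build an OpenAlex /works URL for open-access works in the given fields.

    ASSUMPTION: filtering on `primary_topic.field.id` is one plausible way to
    select Natural Science works; the study's actual filter may differ.
    """
    filters = [
        "open_access.is_oa:true",
        # "|" expresses OR between values of the same filter.
        "primary_topic.field.id:" + "|".join(field_ids),
    ]
    params = {
        "filter": ",".join(filters),  # "," expresses AND between filters
        "per-page": 200,              # largest page size OpenAlex allows
        "cursor": cursor,             # "*" starts cursor-based paging
        "mailto": email,              # opts the request into the polite pool
    }
    return OPENALEX_WORKS_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # The field IDs here are placeholders, not the IDs used in the study.
    print(build_works_url("you@example.org", ["16", "31"]))
```

Keeping URL construction separate from the network call makes the query easy to inspect and log before any request is sent.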
In our Zenodo release, we store the OpenAlex responses within the SQLite3
artifact in the openalex_responses table. As OpenAlex continuously updates its
aggregated works, we recommend that reproducers of our work use the stored
responses.
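A minimal sketch of reading the stored responses back out of the SQLite3 artifact might look like the following. Only the table name `openalex_responses` comes from this README; the `response` column, the assumption that it holds raw JSON text, and both function names are hypothetical, so inspect the released schema before relying on them.

```python
import json
import sqlite3


def load_stored_responses(conn: sqlite3.Connection) -> list[dict]:
    """Return every stored OpenAlex response as a parsed JSON object.

    ASSUMPTION: a text column named `response` holds the raw JSON payload.
    """
    rows = conn.execute("SELECT response FROM openalex_responses ORDER BY rowid")
    return [json.loads(row[0]) for row in rows]


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the released database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE openalex_responses (response TEXT NOT NULL)")
    conn.execute(
        "INSERT INTO openalex_responses (response) VALUES (?)",
        (json.dumps({"results": [{"id": "W1"}]}),),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for payload in load_stored_responses(make_demo_db()):
        print(payload["results"])
```

Against the real artifact, `sqlite3.connect` would point at the downloaded database file instead of the in-memory demo.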
Automated Analysis With Ollama Models
Using the results from the author agreement process described in our paper, we applied pre-trained foundational reasoning and non-reasoning models to automatically review the bulk of the remaining papers.
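To give a sense of what a single automated review call could look like, here is a hedged sketch against a locally running Ollama server using its `/api/generate` endpoint. The prompt wording, the function names, and the model name in the usage example are illustrative assumptions and are not the prompts or models used in the study.

```python
import json
import urllib.request

# Default address of a local Ollama server.
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_generate_payload(model: str, paper_text: str) -> dict:
    """Assemble a non-streaming request body for Ollama's generate endpoint."""
    # ASSUMPTION: this prompt is a placeholder, not the study's prompt.
    prompt = (
        "Does the following paper reuse a pre-trained deep neural network? "
        "Answer 'yes' or 'no'.\n\n" + paper_text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(payload: dict, url: str = OLLAMA_GENERATE_URL) -> str:
    """POST the payload to Ollama and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        # With "stream": False, Ollama replies with a single JSON object.
        return json.loads(reply.read())["response"]


if __name__ == "__main__":
    # "example-model" is a placeholder; substitute a model you have pulled.
    payload = build_generate_payload("example-model", "Full text of a paper...")
    print(ask_ollama(payload))
```

Separating payload construction from the HTTP call allows the prompt to be checked without a running Ollama instance.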
Dependencies
asdf
How To Install
shell
make create-dev
make build
How To Run
We have created a pipeline that you can execute to reproduce the work. Please
run ./run.bash {{EMAIL_ADDRESS}}, where {{EMAIL_ADDRESS}} is a valid email
address used to access the OpenAlex API polite pool.
Tutorial
asdf
Owner
- Name: Nicholas Synovic
- Login: NicholasSynovic
- Kind: user
- Location: Chicago, IL
- Website: https://nicholassynovic.github.io/
- Repositories: 89
- Profile: https://github.com/NicholasSynovic
Loyola University Chicago Computer Science Student Expected Graduation: May 2022
Citation (CITATION.cff)
# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files
GitHub Events
Total
- Delete event: 2
- Push event: 120
- Pull request event: 2
- Create event: 5
Last Year
- Delete event: 2
- Push event: 120
- Pull request event: 2
- Create event: 5
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 0
- Total pull requests: 17
- Average time to close issues: N/A
- Average time to close pull requests: 1 day
- Total issue authors: 0
- Total pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 0.06
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 5
Past Year
- Issues: 0
- Pull requests: 17
- Average time to close issues: N/A
- Average time to close pull requests: 1 day
- Issue authors: 0
- Pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 0.06
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 5
Top Authors
Issue Authors
Pull Request Authors
- NicholasSynovic (11)
- dependabot[bot] (8)
- KarolinaRyzka (2)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- feedparser 6.0.11
- sgmllib3k 1.0.0
- feedparser ^6.0.11
- python ^3.10
- poetry *