Recent Releases of simulacrumWorkflowR

simulacrumWorkflowR - Version 3

Release 3.0 addresses all comments from the JOSS review issue. The following changes have been implemented:

Versioning: The version number has been updated to v3.0 to match the latest Zenodo deposit.

Licensing: The repository and the Zenodo deposit now contain only the MIT license.

Paper Revisions: The paper.md file has been revised to:

  • Correct capitalization and formatting issues under the references.

  • Rewrite the "Key functionalities" section as a complete sentence.

  • Shorten the overall length by removing the "Workflow Illustration" section.

  • Correct grammar and capitalization as requested.

Scientific Software - Peer-reviewed - R
Published by tzuV 5 months ago

simulacrumWorkflowR - v1.0.0

New Features

  • Integrated SQL Environment: Leverages the sqldf package to enable running SQL queries on the Simulacrum dataset directly within R, removing the need for an external Oracle database setup.

  • Workflow Generator: Includes the create_workflow() function, which automatically generates a complete and properly structured R script for submission to the UK's Cancer Analysis System (CAS).

  • Query and Data Helpers: Provides a collection of pre-made SQL queries for common data extraction tasks, data preprocessing functions for cleaning and analysis preparation, and a sqlite2oracle function to assist in query translation.

  • Automated Data Loading: Simplifies the initial setup with the read_simulacrum() function to easily load the dataset's CSV files into the R environment.
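The integrated SQL environment described above can be illustrated with a minimal sketch. It assumes the sqldf package is installed and uses a small hand-made data frame as a stand-in for a loaded Simulacrum table; the table name, column names, and values are illustrative only and are not taken from the real dataset or from the package's own loaders.

```r
library(sqldf)  # assumption: sqldf is installed from CRAN

# Hypothetical toy table standing in for a loaded Simulacrum CSV.
# The name and columns are illustrative, not the real schema.
sim_av_patient <- data.frame(
  patientid = c(1, 2, 3, 4),
  gender = c("M", "F", "F", "M"),
  stringsAsFactors = FALSE
)

# sqldf finds data frames by name in the calling environment and runs the
# query against a temporary SQLite database, so no external database
# server is needed.
counts <- sqldf(
  "SELECT gender, COUNT(*) AS n
     FROM sim_av_patient
    GROUP BY gender
    ORDER BY gender"
)

print(counts)
```

Because the query runs in SQLite, syntax that differs between SQLite and Oracle may still need translating before submission, which is the role the notes assign to the sqlite2oracle function.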

Fixes

This release includes feedback from the JOSS review process. Thank you to our reviewers for their detailed and constructive suggestions.

  • Added a browseURL Function for Convenience: The opensimulacrumrequest() function opens a web browser directly at the data request page.

  • Corrected and Validated Code Examples: All code examples in the paper, README, and vignettes have been reviewed, corrected, and validated to ensure they are functional and consistent.

  • Improved create_workflow() Functionality: The function now correctly handles variable names, ensures outputs from functions like cancer_grouping() are properly assigned, and automatically removes the sim_ prefix from table names to ensure compatibility with CAS.

  • Improved Documentation and Paper Clarity:

    • Addressed all typos and formatting issues in the paper.
    • Added missing citations and references for R packages, the CAS database, and related services.
    • Clarified terminology and added context, such as the "as of 2025" note for the free-tier time limit.
    • Updated Figure 1 for accuracy.

  • Automated Testing: Implemented a full suite of automated tests using the testthat package and created a minimal, non-identifiable dummy dataset to allow for unit testing.

  • Package and Repository:

    • Added a CONTRIBUTING.md file with issue templates for bug reports and feature requests.
    • Addressed and provided explanations for warnings generated by package functions.
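The sim_ prefix removal mentioned for create_workflow() can be sketched in base R. This is an illustration of the idea only: strip_sim_prefix() is a hypothetical helper written for this example, and the notes do not state how the package itself performs the rewrite. The table names in the query are likewise illustrative.

```r
# Hypothetical helper, not part of simulacrumWorkflowR: removes a leading
# "sim_" from identifiers in a SQL string so local table names match the
# unprefixed names expected on the remote system.
strip_sim_prefix <- function(query) {
  # \\b anchors the match at a word boundary, so only identifiers that
  # begin with "sim_" are changed.
  gsub("\\bsim_", "", query, ignore.case = TRUE, perl = TRUE)
}

local_query <- paste(
  "SELECT * FROM sim_av_tumour t",
  "JOIN sim_av_patient p ON t.patientid = p.patientid"
)

cas_query <- strip_sim_prefix(local_query)
print(cas_query)
```

A plain text substitution like this would also rewrite a string literal or alias that happens to start with sim_, so a real implementation may need to be more selective.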

Scientific Software - Peer-reviewed - R
Published by tzuV 5 months ago