Recent Releases of simulacrumWorkflowR
simulacrumWorkflowR - Version 3
Release 3.0 addresses all comments from the JOSS review issue. The following changes have been implemented:
Versioning: The version number has been updated to v3.0 to match the latest Zenodo deposit for consistency.
Licensing: The repository and the Zenodo deposit now contain only the MIT license.
Paper Revisions: The paper.md file has been revised to:
- Correct capitalization and formatting issues in the references.
- Rewrite the "Key functionalities" section as a complete sentence.
- Shorten the overall length by removing the "Workflow Illustration" section.
- Correct grammar and capitalization as requested.
Scientific Software - Peer-reviewed
- R
Published by tzuV 5 months ago
simulacrumWorkflowR - v1.0.0
New Features
Integrated SQL Environment: Leverages the sqldf package to enable running SQL queries on the Simulacrum dataset directly within R, removing the need for an external Oracle database setup.
Workflow Generator: Includes the create_workflow() function, which automatically generates a complete and properly structured R script for submission to the UK's Cancer Analysis System (CAS).
Query and Data Helpers: Provides a collection of pre-made SQL queries for common data extraction tasks, data preprocessing functions for cleaning and analysis preparation, and a sqlite2oracle function to assist in query translation.
Automated Data Loading: Simplifies the initial setup with the read_simulacrum() function to easily load the dataset's CSV files into the R environment.
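To illustrate the idea behind the integrated SQL environment, here is a minimal sketch of querying an in-memory data frame with sqldf. The data frame below is a hypothetical stand-in with made-up rows and illustrative column names, not real Simulacrum data, and the sketch calls sqldf directly rather than any simulacrumWorkflowR helper.

```r
library(sqldf)

# Hypothetical stand-in for a Simulacrum table (assumption: illustrative
# table and column names, made-up rows).
sim_av_patient <- data.frame(
  PATIENTID = c(1, 2, 3),
  GENDER = c(1, 2, 2)
)

# sqldf looks up data frames by name in the calling environment and
# runs the query against a temporary SQLite database.
result <- sqldf(
  "SELECT GENDER, COUNT(*) AS n
   FROM sim_av_patient
   GROUP BY GENDER
   ORDER BY GENDER"
)

print(result)
```

Because the query runs against SQLite, syntax that differs between SQLite and Oracle may still need translating before submission, which is the gap the sqlite2oracle function is described as addressing.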
Fixes
This release includes feedback from the JOSS review process. Thank you to our reviewers for their detailed and constructive suggestions.
Added a browseURL Function for Convenience: Adds the opensimulacrumrequest() function, which opens the data request page directly in a web browser.
Corrected and Validated Code Examples: All code examples in the paper, README, and vignettes have been reviewed, corrected, and validated to ensure they are functional and consistent.
Improved create_workflow() Functionality: The function now correctly handles variable names, ensures outputs from functions like cancer_grouping() are properly assigned, and automatically removes the sim_ prefix from table names to ensure compatibility with CAS.
Improved Documentation and Paper Clarity:
- Addressed all typos and formatting issues in the paper.
- Added missing citations and references for R packages, the CAS database, and related services.
- Clarified terminology and added context, such as the "as of 2025" note for the free-tier time limit.
- Updated Figure 1 for accuracy.
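The sim_ prefix removal mentioned under the create_workflow() improvements can be pictured with a small base R sketch. The helper name strip_sim_prefix and the example query are hypothetical and only illustrate the idea; this is not the package's actual implementation.

```r
# Hypothetical helper (assumption: not part of simulacrumWorkflowR).
# Removes a leading "sim_" from identifiers in a SQL string so that
# locally tested table names match the unprefixed names used on CAS.
strip_sim_prefix <- function(query) {
  # \\b anchors the match at a word boundary, so only identifiers that
  # start with "sim_" are changed; "_" counts as a word character, so
  # a name such as "my_sim_table" is left untouched.
  gsub("\\bsim_", "", query, ignore.case = TRUE, perl = TRUE)
}

# Illustrative query with assumed table names.
local_query <- "SELECT * FROM sim_av_patient p JOIN sim_av_tumour t ON p.patientid = t.patientid"
cas_query <- strip_sim_prefix(local_query)

print(cas_query)
```

A regular expression is a deliberately simple choice here. It would also rewrite a matching string literal inside the query, so a more careful implementation might restrict the replacement to table references.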
Automated Testing: Implemented a full suite of automated tests using the testthat package and created a minimal, non-identifiable dummy dataset to allow for unit testing.
Package and Repository:
- Added a CONTRIBUTING.md file with issue templates for bug reports and feature requests.
- Addressed and provided explanations for warnings generated by package functions.
- Added a
Scientific Software - Peer-reviewed
- R
Published by tzuV 5 months ago