https://github.com/mcusac/scriptcraft-workspace

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Mcusac
  • Language: Batchfile
  • Default Branch: main
  • Size: 17.1 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 12 months ago · Last pushed 10 months ago
Metadata Files
Readme Roadmap

README.md

ScriptCraft Workspace

A comprehensive data processing and quality control workspace for research workflows, featuring automated tools, validation frameworks, and pipeline orchestration.

🚀 Overview

ScriptCraft provides a unified framework for data processing, quality control, and automation in research environments. Built with scalability and reusability in mind, it offers:

  • 🔧 Automated Tools: Data validation, cleaning, comparison, and form automation
  • 📊 Quality Control: Comprehensive validation frameworks with plugin support
  • 🔄 Pipeline Orchestration: Multi-step workflows for complex data processing
  • 📦 Packaging System: Easy distribution of tools to end users
  • 🎯 Research Focus: Specialized tools for clinical and biomarker data processing
  • 🚀 Release Management: Automated PyPI and Git release workflows
  • ⚙️ Config-Driven: Single source of truth configuration system

🛡️ Security & Privacy

This workspace is designed with security and privacy in mind:

  • 🔒 Data Protection: All sensitive research data is excluded from version control
  • 📁 Safe Structure: Directory structure maintained without actual data files
  • ⚙️ Generic Configuration: No institution-specific URLs or credentials
  • 🧪 Template-Based: Uses sample data and placeholder configurations

Important: This repository contains only the framework and tools. Actual research data, credentials, and institution-specific configurations are excluded via .gitignore.

📦 Installation

Prerequisites

  • Python 3.8+
  • Git

Setup

```bash
# Clone the repository
git clone https://github.com/yourusername/ScriptCraft-Workspace.git
cd ScriptCraft-Workspace

# Install the Python package
cd implementations/python-package
pip install -e .
```

🧰 Available Tools

Data Processing

  • Data Content Comparer: Compare datasets for consistency and changes
  • Dictionary Cleaner: Clean and standardize dictionary files
  • Date Format Standardizer: Standardize date formats across datasets
  • Schema Detector: Automatic schema detection and validation
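
To make the idea behind date standardization concrete, here is a minimal standard-library sketch. It is not the Date Format Standardizer's actual implementation; the function name `standardize_date` and the list of accepted input formats are assumptions chosen for illustration.

```python
from datetime import datetime
from typing import Optional

# Input formats to try, in order. This list is an assumption for the sketch,
# not the set of formats the real tool supports.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def standardize_date(value: str) -> Optional[str]:
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    text = value.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # Try the next candidate format.
    return None


for raw in ("03/15/2021", "20210315", "not a date"):
    print(raw, "->", standardize_date(raw))
```

Returning `None` for unparseable values, rather than raising, lets a caller collect every bad cell in a dataset before reporting.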

Quality Control

  • Dictionary Validator: Validate dictionary structures and content
  • Dictionary Driven Checker: Check data against dictionary definitions
  • MedVisit Integrity Validator: Validate medical visit data integrity
  • Score Totals Checker: Validate score calculations and totals
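
As a rough illustration of what checking data against dictionary definitions can mean, the sketch below validates a single row against a small rule table. The rule format (`type`, `min`, `max`, `allowed`), the column `Sex`, and the function `check_row` are all hypothetical; the real Dictionary Driven Checker may work quite differently.

```python
from typing import Dict, List

# Hypothetical data dictionary: column name -> validation rule.
DATA_DICTIONARY = {
    "Visit_ID": {"type": int, "min": 1, "max": 10},
    "Sex": {"type": str, "allowed": {"M", "F"}},
}


def check_row(row: Dict[str, object], dictionary: Dict[str, dict]) -> List[str]:
    """Return a list of human-readable problems found in one row."""
    problems = []
    for column, rule in dictionary.items():
        if column not in row:
            problems.append(f"{column}: missing")
            continue
        value = row[column]
        if not isinstance(value, rule["type"]):
            problems.append(
                f"{column}: expected {rule['type'].__name__}, "
                f"got {type(value).__name__}"
            )
            continue  # Range checks are meaningless on the wrong type.
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{column}: {value!r} is not an allowed value")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{column}: {value} is below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{column}: {value} is above maximum {rule['max']}")
    return problems


print(check_row({"Visit_ID": 3, "Sex": "F"}, DATA_DICTIONARY))
print(check_row({"Visit_ID": 12, "Sex": "X"}, DATA_DICTIONARY))
```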

Automation

  • RHQ Form Autofiller: Automate form filling for research questionnaires
  • Automated Labeler: Generate labels and documentation from schemas
  • Feature Change Checker: Track feature changes across releases

Release Management

  • PyPI Release Tool: Automated PyPI package testing and release
  • Git Workspace Tool: Git repository management and operations
  • Git Submodule Tool: Git submodule synchronization and management
  • Generic Release Tool: Flexible release workflow orchestration

Workflows

  • Dictionary Workflow: Complete dictionary processing pipeline

🚀 Quick Start

Using Individual Tools

```python
import scriptcraft.common as cu
from scriptcraft.tools.data_content_comparer import DataContentComparer

# Initialize tool
comparer = DataContentComparer()

# Process data
comparer.run(
    input_paths=["data1.csv", "data2.csv"],
    output_dir="output/"
)
```
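
For readers who want a feel for what a content comparison involves, the following standard-library sketch diffs two small CSV datasets by a key column. It does not use or reproduce `DataContentComparer`; the helper names `load_rows` and `compare` and the sample data are assumptions made for this example.

```python
import csv
import io
from typing import Dict, List


def load_rows(csv_text: str, key: str) -> Dict[str, dict]:
    """Parse CSV text and index the rows by the given key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: dict(row) for row in reader}


def compare(old: Dict[str, dict], new: Dict[str, dict]) -> Dict[str, List[str]]:
    """Report which keys were added, removed, or changed between two datasets."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Inline sample data so the sketch runs without any files on disk.
OLD_CSV = "Med_ID,score\n1,10\n2,20\n3,30\n"
NEW_CSV = "Med_ID,score\n1,10\n2,25\n4,40\n"

report = compare(load_rows(OLD_CSV, "Med_ID"), load_rows(NEW_CSV, "Med_ID"))
print(report)
```

Keys are compared as strings because `csv.DictReader` does not convert types.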

Using Pipelines

```python
from scriptcraft.pipelines.git_pipelines import create_pypi_test_pipeline

# Create and run a pipeline
pipeline = create_pypi_test_pipeline()
pipeline.run()
```
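
Conceptually, a pipeline is an ordered list of steps that pass a shared context from one to the next. The sketch below shows that pattern in plain Python. `SimplePipeline` and the step functions are hypothetical stand-ins written for this README; they are not ScriptCraft's pipeline classes.

```python
from typing import Callable, Dict, List, Tuple

StepFunc = Callable[[Dict], Dict]


class SimplePipeline:
    """Hypothetical stand-in for a pipeline object; not ScriptCraft's class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.steps: List[Tuple[str, StepFunc]] = []
        self.completed: List[str] = []

    def add_step(self, name: str, func: StepFunc) -> "SimplePipeline":
        """Register a step; returns self so calls can be chained."""
        self.steps.append((name, func))
        return self

    def run(self, context: Dict) -> Dict:
        """Run every step in order, feeding each one the previous result."""
        self.completed = []
        for name, func in self.steps:
            context = func(context)
            self.completed.append(name)
        return context


def strip_values(ctx: Dict) -> Dict:
    """Example step: trim whitespace from every cell."""
    return {**ctx, "rows": [[cell.strip() for cell in row] for row in ctx["rows"]]}


def count_rows(ctx: Dict) -> Dict:
    """Example step: record how many rows were processed."""
    return {**ctx, "row_count": len(ctx["rows"])}


pipeline = SimplePipeline("demo").add_step("strip", strip_values).add_step("count", count_rows)
result = pipeline.run({"rows": [[" a ", "b "], ["c", " d"]]})
print(result["rows"], result["row_count"], pipeline.completed)
```

If a step raises, later steps do not run and `completed` shows how far the pipeline got.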

Command Line Usage

Industry-Standard CLI (Recommended)

```bash
# List available tools and pipelines
scriptcraft list

# Run a specific tool
scriptcraft data_content_comparer

# Run a pipeline
scriptcraft dictionary_pipeline

# Release operations (INDUSTRY STANDARD)
scriptcraft-release pypi-test      # Test PyPI upload
scriptcraft-release pypi-release   # Release to PyPI
scriptcraft-release git-sync       # Sync Git repository
scriptcraft-release git-status     # Check Git status
scriptcraft-release full-release   # Full release workflow

# See docs/RELEASE_USAGE_GUIDE.md for comprehensive release examples
```

Legacy run_all.py (Still Supported)

```bash
# Run a specific tool
python run_all.py --tool data_content_comparer

# Run a pipeline
python run_all.py --pipeline dictionary_pipeline
```

📁 Project Structure

ScriptCraft-Workspace/
├── implementations/python-package/  # Main Python package
│   └── scriptcraft/
│       ├── common/                  # Shared utilities
│       ├── tools/                   # Individual tools
│       └── pipelines/               # Pipeline orchestration
├── data/                            # Workspace data (gitignored)
│   ├── domains/                     # Domain-specific data
│   ├── input/                       # Input files
│   ├── output/                      # Output files
│   └── logs/                        # Log files
├── templates/                       # Tool templates
├── distributables/                  # Packaged tools
└── config.yaml                      # Central configuration

🔧 Configuration

The workspace uses a centralized configuration system in config.yaml:

```yaml
# Example configuration
workspaces:
  data:
    study_name: "RESEARCH_STUDY"
    domains: ["Clinical", "Biomarkers", "Genomics", "Imaging"]
    id_columns: ["Med_ID", "Visit_ID"]

tools:
  data_content_comparer:
    description: "📊 Compares data content between releases"
    packages: [pandas, numpy, openpyxl]
```
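
Once a configuration like the one above has been parsed into a dictionary (for example with a YAML loader), looking up a tool's settings is a plain nested lookup. The sketch below assumes underscore-separated key names such as `data_content_comparer` and `id_columns`, and a hypothetical helper `get_tool_config`; it illustrates the single-source-of-truth idea and is not ScriptCraft's own configuration API.

```python
from typing import Dict

# The example configuration as an already-parsed dictionary.
# Key names are assumptions based on the YAML example.
CONFIG = {
    "workspaces": {
        "data": {
            "study_name": "RESEARCH_STUDY",
            "domains": ["Clinical", "Biomarkers", "Genomics", "Imaging"],
            "id_columns": ["Med_ID", "Visit_ID"],
        }
    },
    "tools": {
        "data_content_comparer": {
            "description": "Compares data content between releases",
            "packages": ["pandas", "numpy", "openpyxl"],
        }
    },
}


def get_tool_config(config: Dict, tool_name: str) -> Dict:
    """Return the settings for one tool, with a clear error if it is undefined."""
    try:
        return config["tools"][tool_name]
    except KeyError:
        raise KeyError(f"Tool {tool_name!r} is not defined in the configuration") from None


tool = get_tool_config(CONFIG, "data_content_comparer")
print(tool["packages"])
print(CONFIG["workspaces"]["data"]["id_columns"])
```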

📦 Packaging & Distribution

ScriptCraft provides multiple distribution methods:

PyPI Distribution

```bash
# Install from PyPI
pip install scriptcraft-python

# Use CLI commands
scriptcraft-release pypi-test
scriptcraft rhq_form_autofiller

# Use release manager for version bumps
python -c "from scriptcraft.tools.release_manager import ReleaseManager; ReleaseManager().run(mode='python_package', version_type='patch', auto_push=True)"
```

Local Packaging

```bash
# Package a tool for distribution
python run_all.py --tool rhq_form_autofiller

# The packaged tool will be available in distributables/
```

Development Installation

```bash
# Install in development mode
cd implementations/python-package
pip install -e .
```

🧪 Testing

```bash
# Run all tests
python -m pytest

# Run specific test categories
python -m pytest tests/unit/
python -m pytest tests/integration/
```

📚 Documentation

  • Tool Documentation: See individual tool README files in implementations/python-package/scriptcraft/tools/
  • API Reference: Available in the main package documentation
  • Examples: Check the templates/ directory for usage examples

🤝 Contributing

We welcome contributions! Please see our contributing guidelines for details on:

  • Code style and standards
  • Testing requirements
  • Documentation updates
  • Security considerations

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

🙏 Acknowledgments

  • Built for the research community
  • Developed with support from research institutions
  • Thanks to all contributors and users

ScriptCraft Workspace - Making research data processing easier, one tool at a time. 🚀

Owner

  • Login: Mcusac
  • Kind: user

GitHub Events

Total
  • Public event: 1
  • Push event: 20
  • Create event: 4
Last Year
  • Public event: 1
  • Push event: 20
  • Create event: 4

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 1,132 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 17
  • Total maintainers: 1
pypi.org: scriptcraft-python

Data processing and quality control tools for research workflows

  • Versions: 17
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,132 Last month
Rankings
Dependent packages count: 8.9%
Average: 29.4%
Dependent repos count: 49.9%
Maintainers (1)
Last synced: 10 months ago

Dependencies

tests/requirements-test.txt pypi
  • black >=21.12b0 test
  • coverage >=5.5.0 test
  • factory-boy >=3.2.1 test
  • faker >=8.12.0 test
  • flake8 >=3.9.0 test
  • memory-profiler >=0.60.0 test
  • mypy >=0.910 test
  • pre-commit >=2.15.0 test
  • psutil >=5.8.0 test
  • pytest >=6.2.5 test
  • pytest-benchmark >=3.4.0 test
  • pytest-cov >=2.12.1 test
  • pytest-mock >=3.6.1 test