putior

Register In- and Outputs for Workflow Visualization.

https://github.com/pjt222/putior

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.0%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: pjt222
  • License: other
  • Language: R
  • Default Branch: main
  • Size: 3.93 MB
Statistics
  • Stars: 4
  • Watchers: 1
  • Forks: 0
  • Open Issues: 3
  • Releases: 0
Created about 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme Changelog License

README.md

putior

Extract beautiful workflow diagrams from your code annotations

putior (PUT + Input + Output + R) is an R package that extracts structured annotations from source code files and creates beautiful Mermaid flowchart diagrams. Perfect for documenting data pipelines, workflows, and understanding complex codebases.

🌟 Key Features

  • Simple annotations - Add structured comments to your existing code
  • Beautiful diagrams - Generate professional Mermaid flowcharts
  • File flow tracking - Automatically connects scripts based on input/output files
  • Multiple themes - 5 built-in themes including GitHub-optimized
  • Cross-language support - Works with R, Python, SQL, shell scripts, and Julia
  • Flexible output - Console, file, or clipboard export
  • Customizable styling - Control colors, direction, and node shapes

📦 Installation

```r
# Install from CRAN (recommended)
install.packages("putior")

# Or install from GitHub (development version)
remotes::install_github("pjt222/putior")

# Or with renv
renv::install("putior")          # CRAN version
renv::install("pjt222/putior")   # GitHub version

# Or with pak (faster)
pak::pkg_install("putior")         # CRAN version
pak::pkg_install("pjt222/putior")  # GitHub version
```

🚀 Quick Start

Step 1: Annotate Your Code

Add structured annotations to your R or Python scripts using #put comments:

01_fetch_data.R
```r
#put label:"Fetch Sales Data", node_type:"input", output:"sales_data.csv"

# Your actual code
library(readr)
sales_data <- fetch_sales_from_api()
write_csv(sales_data, "sales_data.csv")
```

02_clean_data.py
```python
#put label:"Clean and Process", node_type:"process", input:"sales_data.csv", output:"clean_sales.csv"

import pandas as pd
df = pd.read_csv("sales_data.csv")
# ... data cleaning code ...
df.to_csv("clean_sales.csv")
```

Step 2: Extract and Visualize

```r
library(putior)

# Extract workflow from your scripts
workflow <- put("./scripts/")

# Generate diagram
put_diagram(workflow)
```

Result:
```mermaid
flowchart TD
fetch_sales([Fetch Sales Data])
clean_data[Clean and Process]

%% Connections
fetch_sales --> clean_data

%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class fetch_sales inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean_data processStyle

```
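The single arrow in the diagram above comes from matching one script's output file to another script's input file. A minimal base-R sketch of that matching idea follows. The node table layout and the helper names `split_files` and `derive_edges` are assumptions made for illustration; this is not putior's internal code.

```r
# Hypothetical node table mirroring the two annotations from Step 1.
nodes <- data.frame(
  id     = c("fetch_sales", "clean_data"),
  input  = c(NA, "sales_data.csv"),
  output = c("sales_data.csv", "clean_sales.csv"),
  stringsAsFactors = FALSE
)

# Split a comma-separated file list into a trimmed character vector.
split_files <- function(x) {
  if (is.na(x) || !nzchar(x)) return(character(0))
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

# Connect node i to node j whenever an output of i is an input of j.
derive_edges <- function(nodes) {
  from <- character(0)
  to <- character(0)
  for (i in seq_len(nrow(nodes))) {
    for (j in seq_len(nrow(nodes))) {
      if (i == j) next
      shared <- intersect(split_files(nodes$output[i]),
                          split_files(nodes$input[j]))
      if (length(shared) > 0) {
        from <- c(from, nodes$id[i])
        to <- c(to, nodes$id[j])
      }
    }
  }
  data.frame(from = from, to = to, stringsAsFactors = FALSE)
}

edges <- derive_edges(nodes)
print(edges)
```

Because the match is purely on file names, a typo in either annotation would silently drop the connection in a scheme like this one.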

📈 Common Data Science Pattern

Modular Workflow with source()

The most common data science pattern: modularize functions into separate scripts and orchestrate them in a main workflow:

utils.R - Utility functions
```r
#put label:"Data Utilities", node_type:"input"

load_and_clean <- function(file) {
  data <- read.csv(file)
  data[complete.cases(data), ]
}

validate_data <- function(data) {
  stopifnot(nrow(data) > 0)
  return(data)
}
```

analysis.R - Analysis functions
```r
#put label:"Statistical Analysis", input:"utils.R"

perform_analysis <- function(data) {
  # Uses utility functions from utils.R
  cleaned <- validate_data(data)
  summary(cleaned)
}
```

main.R - Workflow orchestrator
```r
#put label:"Main Analysis Pipeline", input:"utils.R,analysis.R", output:"results.csv"

source("utils.R")     # Load utility functions
source("analysis.R")  # Load analysis functions

# Execute the pipeline
data <- load_and_clean("raw_data.csv")
results <- perform_analysis(data)
write.csv(results, "results.csv")
```

Generated Workflow (Simple):
```mermaid
flowchart TD
utils([Data Utilities])
analysis[Statistical Analysis]
main[Main Analysis Pipeline]

%% Connections
utils --> analysis
utils --> main
analysis --> main

%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class utils inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class analysis processStyle
class main processStyle

```

Generated Workflow (With Data Artifacts):
```r
# Show complete data flow including all files
put_diagram(workflow, show_artifacts = TRUE)
```

```mermaid
flowchart TD
utils([Data Utilities])
analysis[Statistical Analysis]
main[Main Analysis Pipeline]
artifact_results_csv[(results.csv)]

%% Connections
utils --> analysis
utils --> main
analysis --> main
main --> artifact_results_csv

%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class utils inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class analysis processStyle
class main processStyle
classDef artifactStyle fill:#f3f4f6,stroke:#6b7280,stroke-width:1px,color:#374151
class artifact_results_csv artifactStyle

```

This pattern clearly shows:
- Function modules (utils.R, analysis.R) are sourced into the main script
- Dependencies between modules (analysis depends on utils)
- Complete data flow with artifacts showing terminal outputs like results.csv
- Two visualization modes: simple (script connections only) vs. complete (with data artifacts)

📊 Visualization Examples

Basic Workflow

```r
# Simple three-step process
workflow <- put("./data_pipeline/")
put_diagram(workflow)
```

Advanced Data Science Pipeline

Here's how putior handles a complete data science workflow:

File Structure:

    data_pipeline/
    ├── 01_fetch_sales.R      # Fetch sales data
    ├── 02_fetch_customers.R  # Fetch customer data
    ├── 03_clean_sales.py     # Clean sales data
    ├── 04_merge_data.R       # Merge datasets
    ├── 05_analyze.py         # Statistical analysis
    └── 06_report.R           # Generate final report

Generated Workflow:
```mermaid
flowchart TD
fetch_sales([Fetch Sales Data])
fetch_customers([Fetch Customer Data])
clean_sales[Clean Sales Data]
merge_data[Merge Datasets]
analyze[Statistical Analysis]
report[[Generate Final Report]]

%% Connections
fetch_sales --> clean_sales
fetch_customers --> merge_data
clean_sales --> merge_data
merge_data --> analyze
analyze --> report

%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class fetch_sales inputStyle
class fetch_customers inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean_sales processStyle
class merge_data processStyle
class analyze processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class report outputStyle

```

📋 Using the Diagrams

Embedding in Documentation

The generated Mermaid code works perfectly in:

  • GitHub README files (native Mermaid support)
  • GitLab documentation
  • Notion pages
  • Obsidian notes
  • Jupyter notebooks (with extensions)
  • Sphinx documentation (with plugins)
  • Any Markdown renderer with Mermaid support

Saving and Sharing

```r
# Save to markdown file
put_diagram(workflow, output = "file", file = "workflow.md")

# Copy to clipboard for pasting
put_diagram(workflow, output = "clipboard")

# Include title for documentation
put_diagram(workflow, output = "file", file = "docs/pipeline.md", title = "Data Processing Pipeline")
```
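For readers curious what a saved file might contain, here is a rough base-R sketch that wraps Mermaid lines in a fenced block and writes them to disk. The function name `write_mermaid_md` is hypothetical, and placing the title in Mermaid front matter is an assumption modeled on the titled diagram shown later in this README; the actual file layout produced by `put_diagram()` may differ.

```r
# Hypothetical helper: wrap Mermaid lines in a fenced block and save them.
write_mermaid_md <- function(mermaid_lines, file, title = NULL) {
  fence <- strrep("`", 3)  # build the fence programmatically
  # Optional Mermaid front matter carrying the title (assumed layout).
  front_matter <- if (!is.null(title)) c("---", paste("title:", title), "---")
  out <- c(paste0(fence, "mermaid"), front_matter, mermaid_lines, fence)
  writeLines(out, file)
  invisible(file)
}

diagram <- c(
  "flowchart TD",
  "fetch_sales([Fetch Sales Data])",
  "clean_data[Clean and Process]",
  "fetch_sales --> clean_data"
)

target <- tempfile(fileext = ".md")
write_mermaid_md(diagram, target, title = "Data Processing Pipeline")
cat(readLines(target), sep = "\n")
```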

🔧 Visualization Modes

putior offers two visualization modes to suit different needs:

Workflow Boundaries Demo

First, let's see how workflow boundaries enhance pipeline visualization:

Pipeline with Boundaries (Default):
```r
# Complete ETL pipeline with clear start/end boundaries
put_diagram(workflow, show_workflow_boundaries = TRUE)
```

```mermaid
flowchart TD
pipeline_start([Data Pipeline Start])
extract_data[Extract Raw Data]
transform_data[Transform Data]
pipeline_end([Pipeline Complete])

%% Connections
pipeline_start --> extract_data
extract_data --> transform_data
transform_data --> pipeline_end

%% Styling
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class extract_data processStyle
class transform_data processStyle
classDef startStyle fill:#fef3c7,stroke:#d97706,stroke-width:3px,color:#92400e
class pipeline_start startStyle
classDef endStyle fill:#dcfce7,stroke:#16a34a,stroke-width:3px,color:#15803d
class pipeline_end endStyle

```

Same Pipeline without Boundaries:
```r
# Clean diagram without workflow control styling
put_diagram(workflow, show_workflow_boundaries = FALSE)
```

```mermaid
flowchart TD
pipeline_start([Data Pipeline Start])
extract_data[Extract Raw Data]
transform_data[Transform Data]
pipeline_end([Pipeline Complete])

%% Connections
pipeline_start --> extract_data
extract_data --> transform_data
transform_data --> pipeline_end

%% Styling
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class extract_data processStyle
class transform_data processStyle

```

Simple Mode (Default)

Shows only script-to-script connections - perfect for understanding code dependencies: `put_diagram(workflow)` (default: simple mode)

Use when:
- Documenting code architecture
- Showing function dependencies
- Clean, simple workflow diagrams

Artifact Mode (Complete Data Flow)

Shows all data files as nodes - provides complete picture of data flow including terminal outputs: `put_diagram(workflow, show_artifacts = TRUE)`

Use when:
- Documenting data pipelines
- Tracking data lineage
- Showing complete input/output flow
- Understanding data dependencies

Comparison Example

Simple Mode:
```mermaid
flowchart TD
load[Load Data] --> process[Process Data]
process --> analyze[Analyze]
```

Artifact Mode:
```mermaid
flowchart TD
load[Load Data]
raw_data[(raw_data.csv)]
process[Process Data]
clean_data[(clean_data.csv)]
analyze[Analyze]
results[(results.json)]

load --> raw_data
raw_data --> process
process --> clean_data
clean_data --> analyze
analyze --> results

```

Key Differences

| Mode | Shows | Best For |
|------|-------|----------|
| Simple | Script connections only | Code architecture, dependencies |
| Artifact | Scripts + data files | Data pipelines, complete data flow |
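To make the difference between the two modes concrete, the following base-R sketch derives artifact-mode edges, where every data file becomes its own node between the scripts that write and read it. The node table and the helper names `split_files` and `artifact_edges` are assumptions for illustration, not putior's implementation.

```r
# Hypothetical node table matching the comparison example above.
nodes <- data.frame(
  id     = c("load", "process", "analyze"),
  input  = c("", "raw_data.csv", "clean_data.csv"),
  output = c("raw_data.csv", "clean_data.csv", "results.json"),
  stringsAsFactors = FALSE
)

# Split a comma-separated file list, dropping empty entries.
split_files <- function(x) {
  parts <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  parts[nzchar(parts)]
}

# In artifact mode each file is a node: file -> reader, writer -> file.
artifact_edges <- function(nodes) {
  from <- character(0)
  to <- character(0)
  for (i in seq_len(nrow(nodes))) {
    for (f in split_files(nodes$input[i])) {
      from <- c(from, f)
      to <- c(to, nodes$id[i])
    }
    for (f in split_files(nodes$output[i])) {
      from <- c(from, nodes$id[i])
      to <- c(to, f)
    }
  }
  data.frame(from = from, to = to, stringsAsFactors = FALSE)
}

edges <- artifact_edges(nodes)
print(edges)
```

The terminal output `results.json` appears as an edge target here even though no script reads it, which is the "complete data flow" the table refers to.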

File Labeling

Add file names to connections for extra clarity:
```r
# Show file names on arrows
put_diagram(workflow, show_artifacts = TRUE, show_files = TRUE)
```

🎨 Theme System

putior provides 5 carefully designed themes optimized for different environments:

```r
# Get list of available themes
get_diagram_themes()
```

Theme Overview

| Theme | Best For | Description |
|-------|----------|-------------|
| light | Documentation sites, tutorials | Default light theme with bright colors |
| dark | Dark mode apps, terminals | Dark theme with muted colors |
| auto | GitHub README files | GitHub-adaptive theme that works in both modes |
| minimal | Business reports, presentations | Grayscale professional theme |
| github | GitHub README (recommended) | Optimized for maximum GitHub compatibility |

Theme Examples

Light Theme

```r
put_diagram(workflow, theme = "light")
```

```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]

%% Connections
fetch_data --> clean_data
clean_data --> generate_report

%% Styling
classDef inputStyle fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000000
class fetch_data inputStyle
classDef processStyle fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#000000
class clean_data processStyle
classDef outputStyle fill:#e8f5e8,stroke:#1b5e20,stroke-width:2px,color:#000000
class generate_report outputStyle

```

Dark Theme

```r
put_diagram(workflow, theme = "dark")
```

```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]

%% Connections
fetch_data --> clean_data
clean_data --> generate_report

%% Styling
classDef inputStyle fill:#1a237e,stroke:#3f51b5,stroke-width:2px,color:#ffffff
class fetch_data inputStyle
classDef processStyle fill:#4a148c,stroke:#9c27b0,stroke-width:2px,color:#ffffff
class clean_data processStyle
classDef outputStyle fill:#1b5e20,stroke:#4caf50,stroke-width:2px,color:#ffffff
class generate_report outputStyle

```

Auto Theme (GitHub Adaptive)

```r
put_diagram(workflow, theme = "auto")  # Recommended for GitHub!
```

```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]

%% Connections
fetch_data --> clean_data
clean_data --> generate_report

%% Styling
classDef inputStyle fill:#3b82f6,stroke:#1d4ed8,stroke-width:2px,color:#ffffff
class fetch_data inputStyle
classDef processStyle fill:#8b5cf6,stroke:#6d28d9,stroke-width:2px,color:#ffffff
class clean_data processStyle
classDef outputStyle fill:#10b981,stroke:#047857,stroke-width:2px,color:#ffffff
class generate_report outputStyle

```

GitHub Theme (Maximum Compatibility)

```r
put_diagram(workflow, theme = "github")  # Best for GitHub README
```

```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]

%% Connections
fetch_data --> clean_data
clean_data --> generate_report

%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class fetch_data inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean_data processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class generate_report outputStyle

```

Minimal Theme

```r
put_diagram(workflow, theme = "minimal")  # Professional documents
```

```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]

%% Connections
fetch_data --> clean_data
clean_data --> generate_report

%% Styling
classDef inputStyle fill:#f8fafc,stroke:#64748b,stroke-width:1px,color:#1e293b
class fetch_data inputStyle
classDef processStyle fill:#f1f5f9,stroke:#64748b,stroke-width:1px,color:#1e293b
class clean_data processStyle
classDef outputStyle fill:#f8fafc,stroke:#64748b,stroke-width:1px,color:#1e293b
class generate_report outputStyle

```

When to Use Each Theme

| Theme | Use Case | Environment |
|-------|----------|-------------|
| light | Documentation sites, tutorials | Light backgrounds |
| dark | Dark mode apps, terminals | Dark backgrounds |
| auto | GitHub README files | Adapts automatically |
| github | GitHub README (recommended) | Maximum compatibility |
| minimal | Business reports, presentations | Print-friendly |

Pro Tips

  • For GitHub: Use theme = "github" for maximum compatibility, or theme = "auto" for adaptive colors
  • For Documentation: Use theme = "light" or theme = "dark" to match your site
  • For Reports: Use theme = "minimal" for professional, print-friendly diagrams
  • For Demos: Light theme usually shows colors best in presentations

Theme Usage Examples

```r
# For GitHub README (recommended)
put_diagram(workflow, theme = "github")

# For GitHub README (adaptive)
put_diagram(workflow, theme = "auto")

# For dark documentation sites
put_diagram(workflow, theme = "dark", direction = "LR")

# For professional reports
put_diagram(workflow, theme = "minimal", output = "file", file = "report.md")

# Save all themes for comparison
themes <- c("light", "dark", "auto", "github", "minimal")
for (theme in themes) {
  put_diagram(workflow,
              theme = theme,
              output = "file",
              file = paste0("workflow_", theme, ".md"),
              title = paste("Workflow -", stringr::str_to_title(theme), "Theme"))
}
```
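The `classDef` lines in the theme examples above follow a regular pattern, so a theme can be thought of as a lookup table from node type to colors. The sketch below rebuilds the GitHub theme's style lines from such a table. The list structure and the function name `class_def` are hypothetical; only the color values are taken from the diagrams shown in this README.

```r
# Colors copied from the GitHub theme example; the list layout is an assumption.
github_theme <- list(
  input   = c(fill = "#dbeafe", stroke = "#2563eb", color = "#1e40af"),
  process = c(fill = "#ede9fe", stroke = "#7c3aed", color = "#5b21b6"),
  output  = c(fill = "#dcfce7", stroke = "#16a34a", color = "#15803d")
)

# Build one Mermaid classDef line for a node type.
class_def <- function(node_type, theme, stroke_width = "2px") {
  style <- theme[[node_type]]
  sprintf(
    "classDef %sStyle fill:%s,stroke:%s,stroke-width:%s,color:%s",
    node_type, style[["fill"]], style[["stroke"]], stroke_width, style[["color"]]
  )
}

style_lines <- vapply(names(github_theme), class_def, character(1),
                      theme = github_theme)
cat(style_lines, sep = "\n")
```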

🔧 Customization Options

Flow Direction

```r
put_diagram(workflow, direction = "TD")  # Top to bottom (default)
put_diagram(workflow, direction = "LR")  # Left to right
put_diagram(workflow, direction = "BT")  # Bottom to top
put_diagram(workflow, direction = "RL")  # Right to left
```

Node Labels

```r
put_diagram(workflow, node_labels = "name")   # Show node IDs
put_diagram(workflow, node_labels = "label")  # Show descriptions (default)
put_diagram(workflow, node_labels = "both")   # Show name: description
```

File Connections

```r
# Show file names on arrows
put_diagram(workflow, show_files = TRUE)

# Clean arrows without file names
put_diagram(workflow, show_files = FALSE)
```

Styling Control

```r
# Include colored styling (default)
put_diagram(workflow, style_nodes = TRUE)

# Plain diagram without colors
put_diagram(workflow, style_nodes = FALSE)

# Control workflow boundary styling
put_diagram(workflow, show_workflow_boundaries = TRUE)   # Special start/end styling (default)
put_diagram(workflow, show_workflow_boundaries = FALSE)  # Regular node styling
```

Workflow Boundaries

```r
# Enable workflow boundaries (default) - start/end get special styling
put_diagram(workflow, show_workflow_boundaries = TRUE)

# Disable workflow boundaries - start/end render as regular nodes
put_diagram(workflow, show_workflow_boundaries = FALSE)
```

Output Options

```r
# Console output (default)
put_diagram(workflow)

# Save to markdown file
put_diagram(workflow, output = "file", file = "my_workflow.md")

# Copy to clipboard for pasting
put_diagram(workflow, output = "clipboard")
```

📝 Annotation Reference

Basic Syntax

All PUT annotations follow this format:
```r
#put property1:"value1", property2:"value2", property3:"value3"
```

Alternative Formats (All Valid)

```r
#put id:"node_id", label:"Description"    # Standard
# put id:"node_id", label:"Description"   # Space after #
#put| id:"node_id", label:"Description"   # Pipe separator
#put: id:"node_id", label:"Description"   # Colon separator
```
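To illustrate how comment lines in these formats could be turned into key/value pairs, here is a small base-R sketch. The function `parse_put_line` and its regular expressions are assumptions written for this example; putior's own parser may accept or reject different inputs.

```r
# Hypothetical parser for a single annotation line.
parse_put_line <- function(line) {
  # Accept "#put", "# put", "#put|" and "#put:" prefixes.
  prefix <- "^\\s*#\\s*put[|:]?\\s+"
  if (!grepl(prefix, line)) return(NULL)
  body <- sub(prefix, "", line)

  # Each property is written as name:"value".
  property <- '[A-Za-z_][A-Za-z0-9_]*:"[^"]*"'
  matches <- regmatches(body, gregexpr(property, body))[[1]]

  keys <- sub(':".*$', "", matches)
  values <- sub('"$', "", sub('^[^:]+:"', "", matches))
  as.list(stats::setNames(values, keys))
}

parsed <- parse_put_line(
  '#put id:"fetch_data", label:"Fetch Sales Data", output:"sales_data.csv"'
)
str(parsed)

# Lines that are not annotations yield NULL.
is.null(parse_put_line("x <- 1"))
```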

Annotations

| Annotation | Description | Example | Required |
|------------|-------------|---------|----------|
| id | Unique identifier for the node (auto-generated if omitted) | "fetch_data", "clean_sales" | Optional* |
| label | Human-readable description | "Fetch Sales Data", "Clean and Process" | Recommended |

*Note: If id is omitted, a UUID will be automatically generated. If you provide an empty id (e.g., id:""), you'll get a validation warning.

Optional Annotations

| Annotation | Description | Example | Default |
|------------|-------------|---------|---------|
| node_type | Visual shape of the node | "input", "process", "output", "decision", "start", "end" | "process" |
| input | Input files (comma-separated) | "raw_data.csv, config.json" | None |
| output | Output files (comma-separated) | "processed_data.csv, summary.txt" | Current file name* |

*Note: If output is omitted, it defaults to the name of the file containing the annotation. This ensures nodes can be connected in workflows.

Node Types and Shapes

putior uses a data-centric approach with workflow boundaries as special control elements:

Data Processing Nodes:
- "input" - Data sources, APIs, file readers → Stadium shape ([text])
- "process" - Data transformation, analysis → Rectangle [text]
- "output" - Final results, reports, exports → Subroutine [[text]]
- "decision" - Conditional logic, branching → Diamond {text}

Workflow Control Nodes:
- "start" - Workflow entry point → Stadium shape with orange styling
- "end" - Workflow termination → Stadium shape with green styling
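The mapping from node type to Mermaid shape listed above can be sketched as a small lookup. The function name `mermaid_node` and the rectangle fallback for unknown types are assumptions for illustration only.

```r
# Hypothetical helper: render one Mermaid node definition for a node type.
mermaid_node <- function(id, label, node_type = "process") {
  shape <- switch(node_type,
    input    = c("([", "])"),  # stadium
    process  = c("[", "]"),    # rectangle
    output   = c("[[", "]]"),  # subroutine
    decision = c("{", "}"),    # diamond
    start    = c("([", "])"),  # stadium, styled separately
    end      = c("([", "])"),  # stadium, styled separately
    c("[", "]")                # assumed fallback for unknown types
  )
  paste0(id, shape[1], label, shape[2])
}

mermaid_node("fetch_sales", "Fetch Sales Data", "input")
mermaid_node("clean_data", "Clean and Process")
mermaid_node("report", "Generate Final Report", "output")
```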

Workflow Boundaries

Control the visualization of workflow start/end points with show_workflow_boundaries:

```r
# Special workflow boundary styling (default)
put_diagram(workflow, show_workflow_boundaries = TRUE)

# Regular nodes without special workflow styling
put_diagram(workflow, show_workflow_boundaries = FALSE)
```

With boundaries enabled (default):
- node_type:"start" gets distinctive orange styling with thicker borders
- node_type:"end" gets distinctive green styling with thicker borders

With boundaries disabled:
- Start/end nodes render as regular stadium shapes without special colors

Example Annotations

R Scripts:
```r
#put id:"load_sales_data", label:"Load Sales Data from API", node_type:"input", output:"raw_sales.csv, metadata.json"

#put id:"validate_data", label:"Validate and Clean Data", node_type:"process", input:"raw_sales.csv", output:"clean_sales.csv"

#put id:"generate_report", label:"Generate Executive Summary", node_type:"output", input:"clean_sales.csv, metadata.json", output:"executive_summary.pdf"
```

Python Scripts:
```python
#put id:"collect_data", label:"Collect Raw Data", node_type:"input", output:"raw_data.csv"

#put id:"train_model", label:"Train ML Model", node_type:"process", input:"features.csv", output:"model.pkl"

#put id:"predict", label:"Generate Predictions", node_type:"output", input:"model.pkl, test_data.csv", output:"predictions.csv"
```

Multiple Annotations Per File:
```r
# analysis.R
#put id:"create_summary", label:"Calculate Summary Stats", node_type:"process", input:"processed_data.csv", output:"summary_stats.json"

#put id:"create_report", label:"Generate Sales Report", node_type:"output", input:"processed_data.csv", output:"sales_report.html"

# Your R code here...
```

Workflow Entry and Exit Points:
```r
# main_workflow.R
#put id:"workflow_start", label:"Start Analysis Pipeline", node_type:"start", output:"config.json"

#put id:"workflow_end", label:"Pipeline Complete", node_type:"end", input:"final_report.pdf"
```

Workflow Boundary Examples:
```r
# Complete pipeline with boundaries
#put id:"pipeline_start", label:"Data Pipeline Start", node_type:"start", output:"raw_config.json"

#put id:"extract_data", label:"Extract Raw Data", node_type:"process", input:"raw_config.json", output:"raw_data.csv"

#put id:"transform_data", label:"Transform Data", node_type:"process", input:"raw_data.csv", output:"clean_data.csv"

#put id:"pipeline_end", label:"Pipeline Complete", node_type:"end", input:"clean_data.csv"
```

Generated Workflow with Boundaries:
```mermaid
flowchart TD
pipeline_start([Data Pipeline Start])
extract_data[Extract Raw Data]
transform_data[Transform Data]
pipeline_end([Pipeline Complete])

pipeline_start --> extract_data
extract_data --> transform_data
transform_data --> pipeline_end

classDef startStyle fill:#e8f5e8,stroke:#2e7d32,stroke-width:3px,color:#1b5e20
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
classDef endStyle fill:#ffebee,stroke:#c62828,stroke-width:3px,color:#b71c1c
class pipeline_start startStyle
class extract_data,transform_data processStyle
class pipeline_end endStyle

```

Supported File Types

putior automatically detects and processes these file types:
- R: .R, .r
- Python: .py
- SQL: .sql
- Shell: .sh
- Julia: .jl

🛠️ Advanced Usage

Directory Scanning

```r
# Scan current directory
workflow <- put(".")

# Scan specific directory
workflow <- put("./src/")

# Recursive scanning (include subdirectories)
workflow <- put("./project/", recursive = TRUE)

# Custom file patterns (the backslash must be doubled inside an R string)
workflow <- put("./analysis/", pattern = "\\.(R|py)$")

# Single file
workflow <- put("./script.R")
```
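The `pattern` and `recursive` arguments read like the corresponding arguments of base R's `list.files()`. The sketch below shows how such a pattern selects source files in a throwaway directory. The helper `find_sources` and its default pattern are assumptions built from the supported file types listed above, not putior's actual scanning code.

```r
# Hypothetical helper: list source files matching a regular expression.
find_sources <- function(path,
                         pattern = "\\.(R|r|py|sql|sh|jl)$",
                         recursive = FALSE) {
  list.files(path, pattern = pattern, recursive = recursive, full.names = TRUE)
}

# Build a small demo tree in a temporary directory.
demo_dir <- tempfile("putior_demo_")
dir.create(file.path(demo_dir, "sub"), recursive = TRUE)
invisible(file.create(file.path(demo_dir, c("01_fetch.R", "02_clean.py", "notes.txt"))))
invisible(file.create(file.path(demo_dir, "sub", "03_model.jl")))

top_level <- basename(find_sources(demo_dir))
all_levels <- basename(find_sources(demo_dir, recursive = TRUE))
only_r_py <- basename(find_sources(demo_dir, pattern = "\\.(R|py)$", recursive = TRUE))

print(top_level)
print(all_levels)
print(only_r_py)
```

Note the doubled backslash: in an R string literal, `"\\."` is the regular expression `\.`, which matches a literal dot.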

Debugging and Validation

```r
# Include line numbers for debugging
workflow <- put("./src/", include_line_numbers = TRUE)

# Disable validation warnings
workflow <- put("./src/", validate = FALSE)

# Test annotation syntax
is_valid_put_annotation('#put id:"test", label:"Test Node"')  # TRUE
is_valid_put_annotation("#put invalid syntax")                # FALSE
```

UUID Auto-Generation

When you omit the id field, putior automatically generates a unique UUID:

```r
# Annotations without explicit IDs
#put label:"Load Data", node_type:"input", output:"data.csv"
#put label:"Process Data", node_type:"process", input:"data.csv"

# Extract workflow - IDs will be auto-generated
workflow <- put("./")
print(workflow$id)
# [1] "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
# [2] "b2c3d4e5-f6a7-8901-bcde-f23456789012"
```

This feature is perfect for:
- Quick prototyping without worrying about unique IDs
- Temporary workflows where IDs don't matter
- Ensuring uniqueness across large codebases

Note: If you provide an empty id (e.g., id:""), you'll get a validation warning.
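As a rough picture of the fallback, the sketch below fills missing ids with random UUID-shaped strings using only base R. The names `make_uuid_like` and `ensure_ids` are hypothetical, and `sample()` is not a cryptographic random source, so treat this as an illustration of the behavior rather than the generator putior uses.

```r
# Hypothetical generator for a version-4-shaped identifier.
make_uuid_like <- function() {
  hex <- sample(c(0:9, letters[1:6]), 32, replace = TRUE)
  hex[13] <- "4"                               # version nibble
  hex[17] <- sample(c("8", "9", "a", "b"), 1)  # variant nibble
  paste0(
    paste(hex[1:8], collapse = ""), "-",
    paste(hex[9:12], collapse = ""), "-",
    paste(hex[13:16], collapse = ""), "-",
    paste(hex[17:20], collapse = ""), "-",
    paste(hex[21:32], collapse = "")
  )
}

# Replace missing ids (NA) and keep explicit ones untouched.
ensure_ids <- function(ids) {
  missing_id <- is.na(ids)
  ids[missing_id] <- vapply(
    seq_len(sum(missing_id)),
    function(i) make_uuid_like(),
    character(1)
  )
  ids
}

ids <- ensure_ids(c("load_data", NA, NA))
print(ids)
```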

Tracking Source Relationships

When you have a main script that sources other scripts, annotate them to show the sourcing relationships:

```r
# main.R - sources other scripts
#put label:"Main Workflow", input:"utils.R,analysis.R", output:"results.csv"
source("utils.R")     # Reading utils.R into main.R
source("analysis.R")  # Reading analysis.R into main.R

# utils.R - sourced by main.R
#put label:"Utility Functions", node_type:"input"
# output defaults to "utils.R"

# analysis.R - sourced by main.R, depends on utils.R
#put label:"Analysis Functions", input:"utils.R"
# output defaults to "analysis.R"
```

This creates a diagram showing:
- utils.R → main.R (sourced into)
- analysis.R → main.R (sourced into)
- utils.R → analysis.R (dependency)
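These three connections follow from the documented default that a missing `output` falls back to the annotated file's own name. A minimal base-R sketch of that reasoning, using a hand-written annotation table (the column names and the loop are assumptions for illustration, not putior's code):

```r
# Hand-written stand-in for the three annotations above.
annotations <- data.frame(
  file   = c("main.R", "utils.R", "analysis.R"),
  input  = c("utils.R,analysis.R", NA, "utils.R"),
  output = c("results.csv", NA, NA),
  stringsAsFactors = FALSE
)

# Documented default: a missing output falls back to the file name.
annotations$output <- ifelse(is.na(annotations$output),
                             annotations$file,
                             annotations$output)

# A script feeds another script when its output appears among the inputs.
edges <- character(0)
for (i in seq_len(nrow(annotations))) {
  if (is.na(annotations$input[i])) next
  inputs <- trimws(strsplit(annotations$input[i], ",", fixed = TRUE)[[1]])
  for (src in annotations$file[annotations$output %in% inputs]) {
    edges <- c(edges, paste(src, "-->", annotations$file[i]))
  }
}

print(edges)
```

Without the default, utils.R and analysis.R would have no output to match against, and the sourcing relationships would not appear in a scheme like this.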

🔄 Self-Documentation: putior Documents Itself!

As a demonstration of putior's capabilities, we've added PUT annotations to putior's own source code. This creates a beautiful visualization of how the package works internally:

```r
# Extract putior's own workflow
workflow <- put("./R/")
put_diagram(workflow, theme = "github", title = "putior Package Internals")
```

Result:

```mermaid
---
title: putior Package Internals
---
flowchart TD
put_entry([Entry Point - Scan Files])
process_file[Process Single File]
parser[Parse Annotation Syntax]
convert_df[Convert to Data Frame]
diagram_gen[Generate Mermaid Diagram]
node_defs[Create Node Definitions]
connections[Generate Node Connections]
output_handler([Output Final Diagram])

%% Connections
put_entry --> process_file
process_file --> parser
parser --> convert_df
convert_df --> diagram_gen
diagram_gen --> node_defs
node_defs --> connections
connections --> output_handler

%% Styling
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class process_file processStyle
class parser processStyle
class convert_df processStyle
class diagram_gen processStyle
class node_defs processStyle
class connections processStyle
classDef startStyle fill:#fef3c7,stroke:#d97706,stroke-width:3px,color:#92400e
class put_entry startStyle
classDef endStyle fill:#dcfce7,stroke:#16a34a,stroke-width:3px,color:#15803d
class output_handler endStyle

```

This self-documentation shows the two main phases of putior:
1. Parsing Phase: Scanning files → extracting annotations → converting to workflow data
2. Diagram Generation Phase: Taking workflow data → creating nodes/connections → outputting diagram

To see the complete data flow with intermediate files, run: `put_diagram(workflow, show_artifacts = TRUE, theme = "github")`

🤝 Contributing

Contributions welcome! Please open an issue or pull request on GitHub.

Development Setup:
```bash
git clone https://github.com/pjt222/putior.git
cd putior

# Install dev dependencies
Rscript -e "devtools::install_dev_deps()"

# Run tests
Rscript -e "devtools::test()"

# Check package
Rscript -e "devtools::check()"
```

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📊 How putior Compares to Other R Packages

putior fills a unique niche in the R ecosystem by combining annotation-based workflow extraction with beautiful diagram generation:

| Package | Focus | Approach | Output | Best For |
|---------|-------|----------|--------|----------|
| putior | Data workflow visualization | Code annotations | Mermaid diagrams | Pipeline documentation |
| CodeDepends | Code dependency analysis | Static analysis | Variable graphs | Understanding code structure |
| DiagrammeR | General diagramming | Manual diagram code | Interactive graphs | Custom diagrams |
| visNetwork | Interactive networks | Manual network definition | Interactive vis.js | Complex network exploration |
| dm | Database relationships | Schema analysis | ER diagrams | Database documentation |
| flowchart | Study flow diagrams | Dataframe input | ggplot2 charts | Clinical trials |

Key Advantages of putior

  • 📝 Annotation-Based: Workflow documentation lives in your code comments
  • 🔄 Multi-Language: Works across R, Python, SQL, Shell, and Julia
  • 📁 File Flow Tracking: Automatically connects scripts based on input/output files
  • 🎨 Beautiful Output: GitHub-ready Mermaid diagrams with multiple themes
  • 📦 Lightweight: Minimal dependencies (only requires tools package)
  • 🔍 Two Views: Simple script connections + complete data artifact flow

🙏 Acknowledgments

  • Built with Mermaid for beautiful diagram generation
  • Inspired by the need for better code documentation and workflow visualization
  • Thanks to the R community for excellent development tooling

👥 Contributors

  • Philipp Thoss (@pjt222) - Primary author and maintainer
  • Claude (Anthropic) - Co-author on 38 commits, contributing to package development, documentation, and testing

Note: While GitHub's contributor graph only displays primary commit authors, Claude's contributions are properly attributed through Co-Authored-By tags in the commit messages. To see all contributions, use: `git log --grep="Co-Authored-By: Claude"`

🌟 Shoutout to Related R Packages

putior stands on the shoulders of giants in the R visualization and workflow ecosystem:

  • CodeDepends by Duncan Temple Lang - pioneering work in R code dependency analysis
  • targets by William Michael Landau - powerful pipeline toolkit for reproducible computation
  • DiagrammeR by Richard Iannone - bringing beautiful graph visualization to R
  • ggraph by Thomas Lin Pedersen - grammar of graphics for networks and trees
  • visNetwork by Almende B.V. - interactive network visualization excellence
  • networkD3 by Christopher Gandrud - D3.js network graphs in R
  • dm by energie360° AG - relational data model visualization
  • flowchart by Pau Satorra and colleagues - participant flow diagrams
  • igraph by Gábor Csárdi & Tamás Nepusz - the foundation of network analysis in R

Each of these packages excels in their domain, and putior complements them by focusing specifically on code workflow documentation through annotations.


Made with ❤️ for polyglot data science workflows across R, Python, Julia, SQL, Shell, and beyond

Owner

  • Name: Philipp Thoss
  • Login: pjt222
  • Kind: user

Data Scientist, Chemist, Maille Artisan

GitHub Events

Total
  • Issues event: 3
  • Watch event: 4
  • Delete event: 2
  • Issue comment event: 7
  • Push event: 44
  • Create event: 2
Last Year
  • Issues event: 3
  • Watch event: 4
  • Delete event: 2
  • Issue comment event: 7
  • Push event: 44
  • Create event: 2

Packages

  • Total packages: 1
  • Total downloads:
    • cran 163 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
cran.r-project.org: putior

"Register In- and Outputs for Workflow Visualization"

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 163 Last month
Rankings
Dependent packages count: 26.2%
Dependent repos count: 32.3%
Average: 48.3%
Downloads: 86.4%
Maintainers (1)
Last synced: 11 months ago

Dependencies

.github/workflows/r.yml actions
  • actions/checkout v4 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • R >= 3.5.0 depends
  • tools * imports
  • clipr * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests