Repository
Register In- and Outputs for Workflow Visualization.
Basic Info
- Host: GitHub
- Owner: pjt222
- License: other
- Language: R
- Default Branch: main
- Size: 3.93 MB
Statistics
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
putior 
Extract beautiful workflow diagrams from your code annotations
putior (PUT + Input + Output + R) is an R package that extracts structured annotations from source code files and creates beautiful Mermaid flowchart diagrams. Perfect for documenting data pipelines, workflows, and understanding complex codebases.
🌟 Key Features
- Simple annotations - Add structured comments to your existing code
- Beautiful diagrams - Generate professional Mermaid flowcharts
- File flow tracking - Automatically connects scripts based on input/output files
- Multiple themes - 5 built-in themes including GitHub-optimized
- Cross-language support - Works with R, Python, SQL, shell scripts, and Julia
- Flexible output - Console, file, or clipboard export
- Customizable styling - Control colors, direction, and node shapes
📦 Installation
```r
# Install from CRAN (recommended)
install.packages("putior")

# Or install from GitHub (development version)
remotes::install_github("pjt222/putior")

# Or with renv
renv::install("putior")         # CRAN version
renv::install("pjt222/putior")  # GitHub version

# Or with pak (faster)
pak::pkg_install("putior")         # CRAN version
pak::pkg_install("pjt222/putior")  # GitHub version
```
🚀 Quick Start
Step 1: Annotate Your Code
Add structured annotations to your R or Python scripts using #put comments:
01_fetch_data.R
```r
#put label:"Fetch Sales Data", node_type:"input", output:"sales_data.csv"

# Your actual code
library(readr)
sales_data <- fetch_sales_from_api()
write_csv(sales_data, "sales_data.csv")
```
02_clean_data.py
```python
#put label:"Clean and Process", node_type:"process", input:"sales_data.csv", output:"clean_sales.csv"

import pandas as pd

df = pd.read_csv("sales_data.csv")
# ... data cleaning code ...
df.to_csv("clean_sales.csv")
```
Step 2: Extract and Visualize
```r
library(putior)

# Extract workflow from your scripts
workflow <- put("./scripts/")

# Generate diagram
put_diagram(workflow)
```
Result:
```mermaid
flowchart TD
fetch_sales([Fetch Sales Data])
clean_data[Clean and Process]
%% Connections
fetch_sales --> clean_data
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class fetch_sales inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean_data processStyle
```
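The arrow in this diagram comes from matching one script's output file against another script's input file. The idea can be sketched in base R as follows. The `nodes` data frame and the `derive_edges()` helper are hypothetical illustrations of the matching rule, not putior's internal implementation.

```r
# Hypothetical node table mirroring the two annotations from Step 1
nodes <- data.frame(
  id     = c("fetch_sales", "clean_data"),
  input  = c(NA, "sales_data.csv"),
  output = c("sales_data.csv", "clean_sales.csv"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of putior): link a producer to every
# consumer whose input list contains one of the producer's output files
derive_edges <- function(nodes) {
  split_files <- function(x) {
    if (is.na(x)) character(0) else trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  }
  from <- character(0)
  to   <- character(0)
  for (i in seq_len(nrow(nodes))) {
    outs <- split_files(nodes$output[i])
    for (j in seq_len(nrow(nodes))) {
      if (i != j && any(outs %in% split_files(nodes$input[j]))) {
        from <- c(from, nodes$id[i])
        to   <- c(to, nodes$id[j])
      }
    }
  }
  data.frame(from = from, to = to, stringsAsFactors = FALSE)
}

edges <- derive_edges(nodes)
cat(sprintf("%s --> %s\n", edges$from, edges$to), sep = "")
```

Because `clean_sales.csv` is not consumed by any other script, it produces no edge in this sketch.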
📈 Common Data Science Pattern
Modular Workflow with source()
The most common data science pattern: modularize functions into separate scripts and orchestrate them in a main workflow:
utils.R - Utility functions
```r
#put label:"Data Utilities", node_type:"input"

load_and_clean <- function(file) {
  data <- read.csv(file)
  data[complete.cases(data), ]
}

validate_data <- function(data) {
  stopifnot(nrow(data) > 0)
  return(data)
}
```
analysis.R - Analysis functions
```r
#put label:"Statistical Analysis", input:"utils.R"

perform_analysis <- function(data) {
  # Uses utility functions from utils.R
  cleaned <- validate_data(data)
  summary(cleaned)
}
```
main.R - Workflow orchestrator
```r
#put label:"Main Analysis Pipeline", input:"utils.R,analysis.R", output:"results.csv"

source("utils.R")     # Load utility functions
source("analysis.R")  # Load analysis functions

# Execute the pipeline
data <- load_and_clean("raw_data.csv")
results <- perform_analysis(data)
write.csv(results, "results.csv")
```
Generated Workflow (Simple):
```mermaid
flowchart TD
utils([Data Utilities])
analysis[Statistical Analysis]
main[Main Analysis Pipeline]
%% Connections
utils --> analysis
utils --> main
analysis --> main
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class utils inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class analysis processStyle
class main processStyle
```
Generated Workflow (With Data Artifacts):
```r
# Show complete data flow including all files
put_diagram(workflow, show_artifacts = TRUE)
```
```mermaid
flowchart TD
utils([Data Utilities])
analysis[Statistical Analysis]
main[Main Analysis Pipeline]
artifact_results_csv[(results.csv)]
%% Connections
utils --> analysis
utils --> main
analysis --> main
main --> artifact_results_csv
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class utils inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class analysis processStyle
class main processStyle
classDef artifactStyle fill:#f3f4f6,stroke:#6b7280,stroke-width:1px,color:#374151
class artifact_results_csv artifactStyle
```
This pattern clearly shows:
- Function modules (utils.R, analysis.R) are sourced into the main script
- Dependencies between modules (analysis depends on utils)
- Complete data flow with artifacts showing terminal outputs like results.csv
- Two visualization modes: simple (script connections only) vs. complete (with data artifacts)
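In the artifact diagram, the file results.csv appears as a node named artifact_results_csv. One plausible way to derive such an identifier is sketched below in base R. `artifact_node_id()` is a hypothetical helper inferred from that single example; putior's actual naming rule may differ.

```r
# Hypothetical helper: turn a file name into a Mermaid-safe node ID by
# replacing every run of non-alphanumeric characters with an underscore
artifact_node_id <- function(file_name) {
  paste0("artifact_", gsub("[^A-Za-z0-9]+", "_", file_name))
}

artifact_node_id("results.csv")
# [1] "artifact_results_csv"
```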
📊 Visualization Examples
Basic Workflow
```r
# Simple three-step process
workflow <- put("./data_pipeline/")
put_diagram(workflow)
```
Advanced Data Science Pipeline
Here's how putior handles a complete data science workflow:
File Structure:
data_pipeline/
├── 01_fetch_sales.R # Fetch sales data
├── 02_fetch_customers.R # Fetch customer data
├── 03_clean_sales.py # Clean sales data
├── 04_merge_data.R # Merge datasets
├── 05_analyze.py # Statistical analysis
└── 06_report.R # Generate final report
Generated Workflow:
```mermaid
flowchart TD
fetch_sales([Fetch Sales Data])
fetch_customers([Fetch Customer Data])
clean_sales[Clean Sales Data]
merge_data[Merge Datasets]
analyze[Statistical Analysis]
report[[Generate Final Report]]
%% Connections
fetch_sales --> clean_sales
fetch_customers --> merge_data
clean_sales --> merge_data
merge_data --> analyze
analyze --> report
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class fetch_sales inputStyle
class fetch_customers inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean_sales processStyle
class merge_data processStyle
class analyze processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class report outputStyle
```
📋 Using the Diagrams
Embedding in Documentation
The generated Mermaid code works perfectly in:
- GitHub README files (native Mermaid support)
- GitLab documentation
- Notion pages
- Obsidian notes
- Jupyter notebooks (with extensions)
- Sphinx documentation (with plugins)
- Any Markdown renderer with Mermaid support
Saving and Sharing
```r
# Save to markdown file
put_diagram(workflow, output = "file", file = "workflow.md")

# Copy to clipboard for pasting
put_diagram(workflow, output = "clipboard")

# Include title for documentation
put_diagram(workflow, output = "file", file = "docs/pipeline.md", title = "Data Processing Pipeline")
```
🔧 Visualization Modes
putior offers two visualization modes to suit different needs:
Workflow Boundaries Demo
First, let's see how workflow boundaries enhance pipeline visualization:
Pipeline with Boundaries (Default):
```r
# Complete ETL pipeline with clear start/end boundaries
put_diagram(workflow, show_workflow_boundaries = TRUE)
```
```mermaid
flowchart TD
pipeline_start([Data Pipeline Start])
extract_data[Extract Raw Data]
transform_data[Transform Data]
pipeline_end([Pipeline Complete])
%% Connections
pipeline_start --> extract_data
extract_data --> transform_data
transform_data --> pipeline_end
%% Styling
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class extract_data processStyle
class transform_data processStyle
classDef startStyle fill:#fef3c7,stroke:#d97706,stroke-width:3px,color:#92400e
class pipeline_start startStyle
classDef endStyle fill:#dcfce7,stroke:#16a34a,stroke-width:3px,color:#15803d
class pipeline_end endStyle
```
Same Pipeline without Boundaries:
```r
# Clean diagram without workflow control styling
put_diagram(workflow, show_workflow_boundaries = FALSE)
```
```mermaid
flowchart TD
pipeline_start([Data Pipeline Start])
extract_data[Extract Raw Data]
transform_data[Transform Data]
pipeline_end([Pipeline Complete])
%% Connections
pipeline_start --> extract_data
extract_data --> transform_data
transform_data --> pipeline_end
%% Styling
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class extract_data processStyle
class transform_data processStyle
```
Simple Mode (Default)
Shows only script-to-script connections - perfect for understanding code dependencies:
```r
put_diagram(workflow)  # Default: simple mode
```
Use when:
- Documenting code architecture
- Showing function dependencies
- Clean, simple workflow diagrams
Artifact Mode (Complete Data Flow)
Shows all data files as nodes - provides complete picture of data flow including terminal outputs:
```r
put_diagram(workflow, show_artifacts = TRUE)
```
Use when:
- Documenting data pipelines
- Tracking data lineage
- Showing complete input/output flow
- Understanding data dependencies
Comparison Example
Simple Mode:
```mermaid
flowchart TD
load[Load Data] --> process[Process Data]
process --> analyze[Analyze]
```
Artifact Mode:
```mermaid
flowchart TD
load[Load Data]
raw_data[(raw_data.csv)]
process[Process Data]
clean_data[(clean_data.csv)]
analyze[Analyze]
results[(results.json)]
load --> raw_data
raw_data --> process
process --> clean_data
clean_data --> analyze
analyze --> results
```
Key Differences
| Mode | Shows | Best For |
|------|-------|----------|
| Simple | Script connections only | Code architecture, dependencies |
| Artifact | Scripts + data files | Data pipelines, complete data flow |
File Labeling
Add file names to connections for extra clarity:
```r
# Show file names on arrows
put_diagram(workflow, show_artifacts = TRUE, show_files = TRUE)
```
🎨 Theme System
putior provides 5 carefully designed themes optimized for different environments:
```r
# Get list of available themes
get_diagram_themes()
```
Theme Overview
| Theme | Best For | Description |
|-------|----------|-------------|
| light | Documentation sites, tutorials | Default light theme with bright colors |
| dark | Dark mode apps, terminals | Dark theme with muted colors |
| auto | GitHub README files | GitHub-adaptive theme that works in both modes |
| minimal | Business reports, presentations | Grayscale professional theme |
| github | GitHub README (recommended) | Optimized for maximum GitHub compatibility |
Theme Examples
Light Theme
```r
put_diagram(workflow, theme = "light")
```
```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]
%% Connections
fetch_data --> clean_data
clean_data --> generate_report
%% Styling
classDef inputStyle fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000000
class fetch_data inputStyle
classDef processStyle fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#000000
class clean_data processStyle
classDef outputStyle fill:#e8f5e8,stroke:#1b5e20,stroke-width:2px,color:#000000
class generate_report outputStyle
```
Dark Theme
```r
put_diagram(workflow, theme = "dark")
```
```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]
%% Connections
fetch_data --> clean_data
clean_data --> generate_report
%% Styling
classDef inputStyle fill:#1a237e,stroke:#3f51b5,stroke-width:2px,color:#ffffff
class fetch_data inputStyle
classDef processStyle fill:#4a148c,stroke:#9c27b0,stroke-width:2px,color:#ffffff
class clean_data processStyle
classDef outputStyle fill:#1b5e20,stroke:#4caf50,stroke-width:2px,color:#ffffff
class generate_report outputStyle
```
Auto Theme (GitHub Adaptive)
```r
put_diagram(workflow, theme = "auto")  # Recommended for GitHub!
```
```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]
%% Connections
fetch_data --> clean_data
clean_data --> generate_report
%% Styling
classDef inputStyle fill:#3b82f6,stroke:#1d4ed8,stroke-width:2px,color:#ffffff
class fetch_data inputStyle
classDef processStyle fill:#8b5cf6,stroke:#6d28d9,stroke-width:2px,color:#ffffff
class clean_data processStyle
classDef outputStyle fill:#10b981,stroke:#047857,stroke-width:2px,color:#ffffff
class generate_report outputStyle
```
GitHub Theme (Maximum Compatibility)
```r
put_diagram(workflow, theme = "github")  # Best for GitHub README
```
```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]
%% Connections
fetch_data --> clean_data
clean_data --> generate_report
%% Styling
classDef inputStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
class fetch_data inputStyle
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class clean_data processStyle
classDef outputStyle fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#15803d
class generate_report outputStyle
```
Minimal Theme
```r
put_diagram(workflow, theme = "minimal")  # Professional documents
```
```mermaid
flowchart TD
fetch_data([Fetch API Data])
clean_data[Clean and Validate]
generate_report[[Generate Final Report]]
%% Connections
fetch_data --> clean_data
clean_data --> generate_report
%% Styling
classDef inputStyle fill:#f8fafc,stroke:#64748b,stroke-width:1px,color:#1e293b
class fetch_data inputStyle
classDef processStyle fill:#f1f5f9,stroke:#64748b,stroke-width:1px,color:#1e293b
class clean_data processStyle
classDef outputStyle fill:#f8fafc,stroke:#64748b,stroke-width:1px,color:#1e293b
class generate_report outputStyle
```
When to Use Each Theme
| Theme | Use Case | Environment |
|-------|----------|-------------|
| light | Documentation sites, tutorials | Light backgrounds |
| dark | Dark mode apps, terminals | Dark backgrounds |
| auto | GitHub README files | Adapts automatically |
| github | GitHub README (recommended) | Maximum compatibility |
| minimal | Business reports, presentations | Print-friendly |
Pro Tips
- For GitHub: Use theme = "github" for maximum compatibility, or theme = "auto" for adaptive colors
- For Documentation: Use theme = "light" or theme = "dark" to match your site
- For Reports: Use theme = "minimal" for professional, print-friendly diagrams
- For Demos: Light theme usually shows colors best in presentations
Theme Usage Examples
```r
# For GitHub README (recommended)
put_diagram(workflow, theme = "github")

# For GitHub README (adaptive)
put_diagram(workflow, theme = "auto")

# For dark documentation sites
put_diagram(workflow, theme = "dark", direction = "LR")

# For professional reports
put_diagram(workflow, theme = "minimal", output = "file", file = "report.md")

# Save all themes for comparison
themes <- c("light", "dark", "auto", "github", "minimal")
for (theme in themes) {
  put_diagram(workflow,
              theme = theme,
              output = "file",
              file = paste0("workflow_", theme, ".md"),
              title = paste("Workflow -", stringr::str_to_title(theme), "Theme"))
}
```
🔧 Customization Options
Flow Direction
```r
put_diagram(workflow, direction = "TD")  # Top to bottom (default)
put_diagram(workflow, direction = "LR")  # Left to right
put_diagram(workflow, direction = "BT")  # Bottom to top
put_diagram(workflow, direction = "RL")  # Right to left
```
Node Labels
```r
put_diagram(workflow, node_labels = "name")   # Show node IDs
put_diagram(workflow, node_labels = "label")  # Show descriptions (default)
put_diagram(workflow, node_labels = "both")   # Show name: description
```
File Connections
```r
# Show file names on arrows
put_diagram(workflow, show_files = TRUE)

# Clean arrows without file names
put_diagram(workflow, show_files = FALSE)
```
Styling Control
```r
# Include colored styling (default)
put_diagram(workflow, style_nodes = TRUE)

# Plain diagram without colors
put_diagram(workflow, style_nodes = FALSE)

# Control workflow boundary styling
put_diagram(workflow, show_workflow_boundaries = TRUE)   # Special start/end styling (default)
put_diagram(workflow, show_workflow_boundaries = FALSE)  # Regular node styling
```
Workflow Boundaries
```r
# Enable workflow boundaries (default) - start/end get special styling
put_diagram(workflow, show_workflow_boundaries = TRUE)

# Disable workflow boundaries - start/end render as regular nodes
put_diagram(workflow, show_workflow_boundaries = FALSE)
```
Output Options
```r
# Console output (default)
put_diagram(workflow)

# Save to markdown file
put_diagram(workflow, output = "file", file = "my_workflow.md")

# Copy to clipboard for pasting
put_diagram(workflow, output = "clipboard")
```
📝 Annotation Reference
Basic Syntax
All PUT annotations follow this format:
```r
#put property1:"value1", property2:"value2", property3:"value3"
```
Alternative Formats (All Valid)
```r
#put id:"node_id", label:"Description"    # Standard
# put id:"node_id", label:"Description"   # Space after #
#put| id:"node_id", label:"Description"   # Pipe separator
#put: id:"node_id", label:"Description"   # Colon separator
```
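To make the accepted prefixes concrete (#put, # put, #put| and #put:), here is a minimal base R sketch of a pattern that recognizes all four. The regular expression and the `is_put_line()` helper are assumptions for illustration only. They are not putior's parser, and the package's own checker may accept or reject lines differently.

```r
# Assumed pattern: optional whitespace, "#", optional space, "put",
# an optional "|" or ":" separator, then at least one space
put_prefix <- "^\\s*#\\s*put[|:]?\\s+"

# Hypothetical helper, not putior's own parser
is_put_line <- function(lines) grepl(put_prefix, lines, perl = TRUE)

lines <- c(
  '#put id:"node_id", label:"Description"',
  '# put id:"node_id", label:"Description"',
  '#put| id:"node_id", label:"Description"',
  '#put: id:"node_id", label:"Description"',
  '# an ordinary comment'
)

is_put_line(lines)
# [1]  TRUE  TRUE  TRUE  TRUE FALSE
```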
Annotations
| Annotation | Description | Example | Required |
|------------|-------------|---------|----------|
| id | Unique identifier for the node (auto-generated if omitted) | "fetch_data", "clean_sales" | Optional* |
| label | Human-readable description | "Fetch Sales Data", "Clean and Process" | Recommended |
*Note: If id is omitted, a UUID will be automatically generated. If you provide an empty id (e.g., id:""), you'll get a validation warning.
Optional Annotations
| Annotation | Description | Example | Default |
|------------|-------------|---------|---------|
| node_type | Visual shape of the node | "input", "process", "output", "decision", "start", "end" | "process" |
| input | Input files (comma-separated) | "raw_data.csv, config.json" | None |
| output | Output files (comma-separated) | "processed_data.csv, summary.txt" | Current file name* |
*Note: If output is omitted, it defaults to the name of the file containing the annotation. This ensures nodes can be connected in workflows.
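The defaults in the table above can be illustrated with a short base R sketch. The helpers `parse_properties()`, `apply_defaults()`, and `split_files()` are hypothetical names made up for this example. The sketch only mirrors the documented behavior (node_type falls back to "process", output falls back to the file name, file lists are comma-separated) and does not show how putior implements it.

```r
# Hypothetical helper: pull key:"value" pairs out of one annotation line
parse_properties <- function(line) {
  pairs  <- regmatches(line, gregexpr('[A-Za-z_]+:"[^"]*"', line))[[1]]
  keys   <- sub(':".*$', "", pairs)
  values <- sub('^[^:]+:"', "", sub('"$', "", pairs))
  as.list(setNames(values, keys))
}

# Hypothetical helper: apply the defaults described in the table
apply_defaults <- function(props, file_name) {
  if (is.null(props$node_type)) props$node_type <- "process"
  if (is.null(props$output))    props$output    <- file_name
  props
}

# Hypothetical helper: split a comma-separated file list into a vector
split_files <- function(x) trimws(strsplit(x, ",", fixed = TRUE)[[1]])

props <- parse_properties('#put label:"Utility Functions", input:"raw_data.csv, config.json"')
props <- apply_defaults(props, file_name = "utils.R")

props$output              # falls back to the file name
split_files(props$input)  # comma-separated list becomes a character vector
```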
Node Types and Shapes
putior uses a data-centric approach with workflow boundaries as special control elements:
Data Processing Nodes:
- "input" - Data sources, APIs, file readers → Stadium shape ([text])
- "process" - Data transformation, analysis → Rectangle [text]
- "output" - Final results, reports, exports → Subroutine [[text]]
- "decision" - Conditional logic, branching → Diamond {text}
Workflow Control Nodes:
- "start" - Workflow entry point → Stadium shape with orange styling
- "end" - Workflow termination → Stadium shape with green styling
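The mapping from node type to Mermaid shape can be written down as a small lookup. The sketch below is base R only, and `mermaid_node()` is a hypothetical helper, not a putior function. It simply wraps a label in the delimiters listed above.

```r
# Hypothetical helper: wrap a label in the Mermaid delimiters for a node type
mermaid_node <- function(id, label, node_type = "process") {
  shapes <- list(
    input    = c("([", "])"),  # stadium
    process  = c("[", "]"),    # rectangle
    output   = c("[[", "]]"),  # subroutine
    decision = c("{", "}"),    # diamond
    start    = c("([", "])"),  # stadium (boundary colors are applied separately)
    end      = c("([", "])")   # stadium (boundary colors are applied separately)
  )
  shape <- shapes[[node_type]]
  if (is.null(shape)) stop("Unknown node_type: ", node_type)
  paste0(id, shape[1], label, shape[2])
}

mermaid_node("fetch_data", "Fetch API Data", "input")
# [1] "fetch_data([Fetch API Data])"
mermaid_node("generate_report", "Generate Final Report", "output")
# [1] "generate_report[[Generate Final Report]]"
```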
Workflow Boundaries
Control the visualization of workflow start/end points with show_workflow_boundaries:
```r
# Special workflow boundary styling (default)
put_diagram(workflow, show_workflow_boundaries = TRUE)

# Regular nodes without special workflow styling
put_diagram(workflow, show_workflow_boundaries = FALSE)
```
With boundaries enabled (default):
- node_type:"start" gets distinctive orange styling with thicker borders
- node_type:"end" gets distinctive green styling with thicker borders
With boundaries disabled:
- Start/end nodes render as regular stadium shapes without special colors
Example Annotations
R Scripts:
```r
#put id:"load_sales_data", label:"Load Sales Data from API", node_type:"input", output:"raw_sales.csv, metadata.json"
#put id:"validate_data", label:"Validate and Clean Data", node_type:"process", input:"raw_sales.csv", output:"clean_sales.csv"
#put id:"generate_report", label:"Generate Executive Summary", node_type:"output", input:"clean_sales.csv, metadata.json", output:"executive_summary.pdf"
```
Python Scripts:
```python
#put id:"collect_data", label:"Collect Raw Data", node_type:"input", output:"raw_data.csv"
#put id:"train_model", label:"Train ML Model", node_type:"process", input:"features.csv", output:"model.pkl"
#put id:"predict", label:"Generate Predictions", node_type:"output", input:"model.pkl, test_data.csv", output:"predictions.csv"
```
Multiple Annotations Per File:
```r
# analysis.R
#put id:"create_summary", label:"Calculate Summary Stats", node_type:"process", input:"processed_data.csv", output:"summary_stats.json"
#put id:"create_report", label:"Generate Sales Report", node_type:"output", input:"processed_data.csv", output:"sales_report.html"

# Your R code here...
```
Workflow Entry and Exit Points:
```r
# main_workflow.R
#put id:"workflow_start", label:"Start Analysis Pipeline", node_type:"start", output:"config.json"
#put id:"workflow_end", label:"Pipeline Complete", node_type:"end", input:"final_report.pdf"
```
Workflow Boundary Examples:
```r
# Complete pipeline with boundaries
#put id:"pipeline_start", label:"Data Pipeline Start", node_type:"start", output:"raw_config.json"
#put id:"extract_data", label:"Extract Raw Data", node_type:"process", input:"raw_config.json", output:"raw_data.csv"
#put id:"transform_data", label:"Transform Data", node_type:"process", input:"raw_data.csv", output:"clean_data.csv"
#put id:"pipeline_end", label:"Pipeline Complete", node_type:"end", input:"clean_data.csv"
```
Generated Workflow with Boundaries:
```mermaid
flowchart TD
pipeline_start([Data Pipeline Start])
extract_data[Extract Raw Data]
transform_data[Transform Data]
pipeline_end([Pipeline Complete])
pipeline_start --> extract_data
extract_data --> transform_data
transform_data --> pipeline_end
classDef startStyle fill:#e8f5e8,stroke:#2e7d32,stroke-width:3px,color:#1b5e20
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
classDef endStyle fill:#ffebee,stroke:#c62828,stroke-width:3px,color:#b71c1c
class pipeline_start startStyle
class extract_data,transform_data processStyle
class pipeline_end endStyle
```
Supported File Types
putior automatically detects and processes these file types:
- R: .R, .r
- Python: .py
- SQL: .sql
- Shell: .sh
- Julia: .jl
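For reference, the extensions above can be expressed as a single regular expression of the kind accepted by the pattern argument of base R's `list.files()`. The pattern below is an assumption written for this example. Whether putior's default pattern looks exactly like this is not confirmed here.

```r
# Assumed regular expression covering the extensions listed above
source_pattern <- "\\.(R|r|py|sql|sh|jl)$"

# Build a throwaway directory so the example runs anywhere
demo_dir <- file.path(tempdir(), "putior_pattern_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("load.R", "clean.py", "query.sql", "notes.txt")))

# Only the source files match; notes.txt is ignored
matched <- list.files(demo_dir, pattern = source_pattern)
matched
```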
🛠️ Advanced Usage
Directory Scanning
```r
# Scan current directory
workflow <- put(".")

# Scan specific directory
workflow <- put("./src/")

# Recursive scanning (include subdirectories)
workflow <- put("./project/", recursive = TRUE)

# Custom file patterns (the backslash must be doubled inside an R string)
workflow <- put("./analysis/", pattern = "\\.(R|py)$")

# Single file
workflow <- put("./script.R")
```
Debugging and Validation
```r
# Include line numbers for debugging
workflow <- put("./src/", include_line_numbers = TRUE)

# Disable validation warnings
workflow <- put("./src/", validate = FALSE)

# Test annotation syntax
is_valid_put_annotation('#put id:"test", label:"Test Node"')  # TRUE
is_valid_put_annotation("#put invalid syntax")                # FALSE
```
UUID Auto-Generation
When you omit the id field, putior automatically generates a unique UUID:
```r
# Annotations without explicit IDs
#put label:"Load Data", node_type:"input", output:"data.csv"
#put label:"Process Data", node_type:"process", input:"data.csv"

# Extract workflow - IDs will be auto-generated
workflow <- put("./")
print(workflow$id)
# [1] "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
# [2] "b2c3d4e5-f6a7-8901-bcde-f23456789012"
```
This feature is perfect for:
- Quick prototyping without worrying about unique IDs
- Temporary workflows where IDs don't matter
- Ensuring uniqueness across large codebases
Note: If you provide an empty id (e.g., id:""), you'll get a validation warning.
Tracking Source Relationships
When you have a main script that sources other scripts, annotate them to show the sourcing relationships:
```r
# main.R - sources other scripts
#put label:"Main Workflow", input:"utils.R,analysis.R", output:"results.csv"
source("utils.R")     # Reading utils.R into main.R
source("analysis.R")  # Reading analysis.R into main.R

# utils.R - sourced by main.R
#put label:"Utility Functions", node_type:"input"
# output defaults to "utils.R"

# analysis.R - sourced by main.R, depends on utils.R
#put label:"Analysis Functions", input:"utils.R"
# output defaults to "analysis.R"
```
This creates a diagram showing:
- utils.R → main.R (sourced into)
- analysis.R → main.R (sourced into)
- utils.R → analysis.R (dependency)
🔄 Self-Documentation: putior Documents Itself!
As a demonstration of putior's capabilities, we've added PUT annotations to putior's own source code. This creates a beautiful visualization of how the package works internally:
```r
# Extract putior's own workflow
workflow <- put("./R/")
put_diagram(workflow, theme = "github", title = "putior Package Internals")
```
Result:
```mermaid
---
title: putior Package Internals
---
flowchart TD
put_entry([Entry Point - Scan Files])
process_file[Process Single File]
parser[Parse Annotation Syntax]
convert_df[Convert to Data Frame]
diagram_gen[Generate Mermaid Diagram]
node_defs[Create Node Definitions]
connections[Generate Node Connections]
output_handler([Output Final Diagram])
%% Connections
put_entry --> process_file
process_file --> parser
parser --> convert_df
convert_df --> diagram_gen
diagram_gen --> node_defs
node_defs --> connections
connections --> output_handler
%% Styling
classDef processStyle fill:#ede9fe,stroke:#7c3aed,stroke-width:2px,color:#5b21b6
class process_file processStyle
class parser processStyle
class convert_df processStyle
class diagram_gen processStyle
class node_defs processStyle
class connections processStyle
classDef startStyle fill:#fef3c7,stroke:#d97706,stroke-width:3px,color:#92400e
class put_entry startStyle
classDef endStyle fill:#dcfce7,stroke:#16a34a,stroke-width:3px,color:#15803d
class output_handler endStyle
```
This self-documentation shows the two main phases of putior:
1. Parsing Phase: Scanning files → extracting annotations → converting to workflow data
2. Diagram Generation Phase: Taking workflow data → creating nodes/connections → outputting diagram
To see the complete data flow with intermediate files, run:
```r
put_diagram(workflow, show_artifacts = TRUE, theme = "github")
```
🤝 Contributing
Contributions welcome! Please open an issue or pull request on GitHub.
Development Setup:
```bash
git clone https://github.com/pjt222/putior.git
cd putior

# Install dev dependencies
Rscript -e "devtools::install_dev_deps()"

# Run tests
Rscript -e "devtools::test()"

# Check package
Rscript -e "devtools::check()"
```
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
📊 How putior Compares to Other R Packages
putior fills a unique niche in the R ecosystem by combining annotation-based workflow extraction with beautiful diagram generation:
| Package | Focus | Approach | Output | Best For |
|---------|-------|----------|--------|----------|
| putior | Data workflow visualization | Code annotations | Mermaid diagrams | Pipeline documentation |
| CodeDepends | Code dependency analysis | Static analysis | Variable graphs | Understanding code structure |
| DiagrammeR | General diagramming | Manual diagram code | Interactive graphs | Custom diagrams |
| visNetwork | Interactive networks | Manual network definition | Interactive vis.js | Complex network exploration |
| dm | Database relationships | Schema analysis | ER diagrams | Database documentation |
| flowchart | Study flow diagrams | Dataframe input | ggplot2 charts | Clinical trials |
Key Advantages of putior
- 📝 Annotation-Based: Workflow documentation lives in your code comments
- 🔄 Multi-Language: Works across R, Python, SQL, Shell, and Julia
- 📁 File Flow Tracking: Automatically connects scripts based on input/output files
- 🎨 Beautiful Output: GitHub-ready Mermaid diagrams with multiple themes
- 📦 Lightweight: Minimal dependencies (only requires the tools package)
- 🔍 Two Views: Simple script connections + complete data artifact flow
🙏 Acknowledgments
- Built with Mermaid for beautiful diagram generation
- Inspired by the need for better code documentation and workflow visualization
- Thanks to the R community for excellent development tooling
👥 Contributors
- Philipp Thoss (@pjt222) - Primary author and maintainer
- Claude (Anthropic) - Co-author on 38 commits, contributing to package development, documentation, and testing
Note: While GitHub's contributor graph only displays primary commit authors, Claude's contributions are properly attributed through Co-Authored-By tags in the commit messages. To see all contributions, use: git log --grep="Co-Authored-By: Claude"
🌟 Shoutout to Related R Packages
putior stands on the shoulders of giants in the R visualization and workflow ecosystem:
- CodeDepends by Duncan Temple Lang - pioneering work in R code dependency analysis
- targets by William Michael Landau - powerful pipeline toolkit for reproducible computation
- DiagrammeR by Richard Iannone - bringing beautiful graph visualization to R
- ggraph by Thomas Lin Pedersen - grammar of graphics for networks and trees
- visNetwork by Almende B.V. - interactive network visualization excellence
- networkD3 by Christopher Gandrud - D3.js network graphs in R
- dm by energie360° AG - relational data model visualization
- flowchart by Adrian Antico - participant flow diagrams
- igraph by Gábor Csárdi & Tamás Nepusz - the foundation of network analysis in R
Each of these packages excels in their domain, and putior complements them by focusing specifically on code workflow documentation through annotations.
Made with ❤️ for polyglot data science workflows across R, Python, Julia, SQL, Shell, and beyond
Owner
- Name: Philipp Thoss
- Login: pjt222
- Kind: user
- Repositories: 6
- Profile: https://github.com/pjt222
Data Scientist, Chemist, Maille Artisan
GitHub Events
Total
- Issues event: 3
- Watch event: 4
- Delete event: 2
- Issue comment event: 7
- Push event: 44
- Create event: 2
Last Year
- Issues event: 3
- Watch event: 4
- Delete event: 2
- Issue comment event: 7
- Push event: 44
- Create event: 2
Packages
- Total packages: 1
- Total downloads:
  - cran: 163 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 1
cran.r-project.org: putior
"Register In- and Outputs for Workflow Visualization"
- Homepage: https://pjt222.github.io/putior/
- Documentation: http://cran.r-project.org/web/packages/putior/putior.pdf
- License: MIT + file LICENSE
- Latest release: 0.1.0 (published about 1 year ago)
Dependencies
- actions/checkout v4 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
- R >= 3.5.0 depends
- tools * imports
- clipr * suggests
- knitr * suggests
- rmarkdown * suggests
- testthat >= 3.0.0 suggests