https://github.com/graphbookai/graphbook
Visual AI development framework for training and inference of ML models, scaling pipelines, and automating workflows with Python. ⭐ Leave a star to support us!
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ✓ .zenodo.json file (found .zenodo.json file)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: graphbookai
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://docs.graphbook.ai/
- Size: 1.9 MB
Statistics
- Stars: 41
- Watchers: 3
- Forks: 5
- Open Issues: 6
- Releases: 23
Metadata Files
README.md
Graphbook
The Framework for AI-driven Data Pipelines
Overview
Graphbook is a framework for building efficient, interactive, DAG-structured AI data pipelines and workflows composed of nodes written in Python. It provides common ML processing features such as multiprocessing I/O and automatic batching for PyTorch tensors, along with a web-based UI for assembling, monitoring, and executing data processing workflows. It can be used to prepare training data for custom ML models, to experiment with custom-trained or off-the-shelf models, and to build ML-based ETL applications. Custom nodes are written in Python, and Graphbook acts as the framework that calls lifecycle methods on those nodes.
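To make the "framework calls lifecycle methods on nodes" idea concrete, here is a minimal, generic sketch in plain Python. It is not Graphbook's API: the `Node` class, the `on_start`/`on_item`/`on_end` method names, and `run_pipeline` are all hypothetical names chosen for illustration. It only shows how a runner can order nodes in a DAG and invoke their hooks.

```python
from graphlib import TopologicalSorter


class Node:
    """Hypothetical base class for illustration; not Graphbook's real API."""

    def on_start(self):
        """Called once before any data flows."""

    def on_item(self, item):
        """Called for each item; returns the transformed item."""
        return item

    def on_end(self):
        """Called once after all data has been processed."""


class AddOne(Node):
    def on_item(self, item):
        return item + 1


class Double(Node):
    def on_item(self, item):
        return item * 2


def run_pipeline(nodes, deps, items):
    """Run items through the nodes in dependency order.

    nodes: dict mapping node name -> Node instance
    deps:  dict mapping node name -> set of upstream node names
    items: source data fed to nodes that have no upstream
    """
    order = list(TopologicalSorter(deps).static_order())
    outputs = {}
    for name in order:
        nodes[name].on_start()
    for name in order:
        upstream = sorted(deps.get(name, ()))
        if upstream:
            # Concatenate the outputs of all upstream nodes.
            inputs = [x for u in upstream for x in outputs[u]]
        else:
            inputs = items
        outputs[name] = [nodes[name].on_item(x) for x in inputs]
    for name in order:
        nodes[name].on_end()
    return outputs


nodes = {"add": AddOne(), "double": Double()}
deps = {"add": set(), "double": {"add"}}
result = run_pipeline(nodes, deps, [1, 2, 3])
```

In this sketch the user only writes the node subclasses; the runner decides when each hook fires, which is the inversion of control the overview describes.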
Try out the demo!
Applications
- Clean and curate custom large scale datasets
- Demo ML apps on Huggingface Spaces
- Build and deliver customizable no-code or hybrid low-code ML apps and services
- Quickly experiment with different ML models and adjust hyperparameters
- Maximize GPU utilization, parallelize IO, and scale across clusters
- Wrap your Ray DAGs with a frontend for end users
Status
Graphbook is in a very early stage of development, so expect minor bugs and rapid design changes through the coming releases. If you would like to report a bug or request a feature, please feel free to do so. We aim to make Graphbook serve our users in the best way possible.
Current Features
- Graph-based visual editor to experiment and create complex ML workflows
- Workflows can be serialized as Python and JSON files
- Caches outputs and re-executes only the parts of the workflow that change between executions
- UI monitoring components for logs and outputs per node
- Custom buildable nodes with Python via OOP and functional patterns
- Multiprocessing I/O to and from disk and network
- Customizable multiprocessing functions
- Ability to execute entire graphs, or individual subgraphs/nodes
- Ability to execute singular batches of data
- Ability to pause graph execution
- Basic nodes for filtering, loading, and saving outputs
- Node grouping and subflows
- Autosaving and shareable serialized workflow files
- Registers node code changes without needing a restart
- Monitorable system CPU and GPU resource usage
- Monitorable worker queue sizes for optimal worker scaling
- Human-in-the-loop prompting for interactivity and manual control during DAG execution
- Can switch to threaded processing per client session for demoing apps to multiple simultaneous users
- Scale with Ray: Build all-code workflows and scale pipelines on Ray clusters
- (BETA) Third Party Plugins *
* We plan on adding documentation for the community to build plugins, but for now, an example can be seen at example_plugin and graphbook-huggingface
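The output-caching feature listed above can be illustrated with a small, generic sketch. This is an assumption about the general technique (keying a step's cached output on a hash of its inputs), not a description of how Graphbook implements it; `CachedStep`, `scale`, `shift`, and `run_workflow` are hypothetical names, and inputs are assumed to be JSON-serializable.

```python
import hashlib
import json


class CachedStep:
    """Hypothetical wrapper that skips re-execution when inputs are unchanged."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0      # number of times fn actually ran
        self._key = None    # hash of the inputs from the last real run
        self._value = None  # cached output from the last real run

    def __call__(self, *args):
        # Hash the inputs; assumes they are JSON-serializable.
        key = hashlib.sha256(json.dumps(args).encode()).hexdigest()
        if key != self._key:
            self._value = self.fn(*args)
            self._key = key
            self.calls += 1
        return self._value


# Two independent steps of a toy workflow.
scale = CachedStep(lambda x: x * 10)
shift = CachedStep(lambda y: y + 1)


def run_workflow(x, y):
    return scale(x), shift(y)


first = run_workflow(2, 3)   # both steps execute
second = run_workflow(2, 4)  # only `shift` sees a changed input
```

After the second run, `scale.calls` is still 1 while `shift.calls` is 2, since only the step whose input changed was re-executed.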
Supported OS
The following operating systems are supported, in order of most to least recommended:
- Linux
- Mac
- Windows (not recommended) *
* There may be issues with running Graphbook on Windows. With limited resources, we can only focus testing and development on Linux.
Getting Started
Install from PyPI
- pip install graphbook
- graphbook
- Visit http://localhost:8005
Install with Docker
- Pull and run the downloaded image:
  docker run --rm -p 8005:8005 -v $PWD/workflows:/app/workflows rsamf/graphbook:latest
- Visit http://localhost:8005
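Both install paths end with visiting port 8005 on localhost. If the page does not load, a quick way to tell whether anything is listening on that port is a plain TCP connection attempt. The helper below is a generic sketch using only the Python standard library; `is_port_open` is a hypothetical name and is not part of Graphbook.

```python
import socket


def is_port_open(host="localhost", port=8005, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open() else "not reachable"
    print(f"localhost:8005 is {state}")
```

A successful connection only shows that some process is accepting connections on the port, not that it is the Graphbook server.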
Recommended Plugins
Visit the docs to learn more on how to create custom nodes and workflows with Graphbook.
Examples
See plugin and workflow examples here in this folder.
Collaboration
Graphbook is in active development and very much welcomes contributors. If you would like to be actively involved in making Graphbook great, join our discord.
Run Graphbook in Development Mode
This is a guide on how to run Graphbook in development mode. If you are simply using Graphbook, view the Getting Started section.
You can use any other virtual environment solution, but it is highly advised to use poetry since our dependencies are specified in poetry's format.
1. Clone the repo and cd graphbook
2. poetry install --with dev
3. poetry shell
4. python graphbook/core/cli.py
5. cd web
6. deno install
7. deno run dev
8. In your browser, navigate to localhost:5173, and in the settings, change your Graph Server Host to localhost:8005.
Owner
- Name: Graphbook AI
- Login: graphbookai
- Kind: organization
- Email: support@heddle.ai
- Repositories: 1
- Profile: https://github.com/graphbookai
Developers of the interactive and extensible editor for ML workflows
GitHub Events
Total
- Create event: 47
- Release event: 12
- Issues event: 50
- Watch event: 31
- Delete event: 28
- Issue comment event: 14
- Push event: 183
- Pull request review comment event: 5
- Pull request review event: 20
- Pull request event: 79
- Fork event: 3
Last Year
- Create event: 47
- Release event: 12
- Issues event: 50
- Watch event: 31
- Delete event: 28
- Issue comment event: 14
- Push event: 183
- Pull request review comment event: 5
- Pull request review event: 20
- Pull request event: 79
- Fork event: 3
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 73
- Total pull requests: 117
- Average time to close issues: 25 days
- Average time to close pull requests: about 4 hours
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.36
- Average comments per pull request: 0.08
- Merged pull requests: 114
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 36
- Pull requests: 81
- Average time to close issues: 15 days
- Average time to close pull requests: about 6 hours
- Issue authors: 1
- Pull request authors: 2
- Average comments per issue: 0.19
- Average comments per pull request: 0.11
- Merged pull requests: 78
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- rsamf (78)
Pull Request Authors
- rsamf (173)
- davidspector67 (7)
Packages
- Total packages: 2
- Total downloads: pypi 88 last-month
- Total dependent packages: 0 (may contain duplicates)
- Total dependent repositories: 0 (may contain duplicates)
- Total versions: 28
- Total maintainers: 1
pypi.org: graphbook_huggingface
Graphbook Hugging Face Plugin for no-code Hugging Face AI pipelines
- Homepage: https://graphbook.ai
- Documentation: https://docs.graphbook.ai
- License: MIT
- Latest release: 0.0.6 (published 11 months ago)
pypi.org: graphbook
The AI-driven data pipeline and workflow framework for data scientists and machine learning engineers.
- Homepage: https://graphbook.ai
- Documentation: https://docs.graphbook.ai
- License: MIT
- Latest release: 0.13.3 (published 10 months ago)