https://github.com/astorfi/ml-workflow
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: astorfi
- Language: Python
- Default Branch: master
- Size: 327 MB
Statistics
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
MLOps
Cloud-agnostic tech stack for starting an MLOps platform (Level 1)
"We'll build a pipeline - after we deploy the model."

Model drift will hit when it's least convenient for you
To run: Make sure Docker is running and you have Docker Compose installed.
- Clone the project

      git clone https://github.com/jmeisele/ml-ops-kafka.git

- Change directories into the repo (`git clone` names the directory after the repository, so use that name if it differs)

      cd ml-ops

- Run database migrations and create the first Airflow user account.

      docker-compose up airflow-init

- Build our images and launch with Docker Compose

      docker-compose pull && docker-compose up

- Open a browser and log in to MinIO
user: minioadmin
password: minioadmin
Create a bucket called mlflow
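
The bucket can also be created from a script instead of the MinIO web UI. The sketch below is one way to do that idempotently. It assumes the `minio` Python package and a MinIO API endpoint at `localhost:9000` without TLS; neither is stated in this README, so check `docker-compose.yml` for the actual port. The helper names `ensure_bucket` and `connect_and_create` are illustrative.

```python
"""Sketch: create the `mlflow` bucket in MinIO if it does not exist yet."""


def ensure_bucket(client, name="mlflow"):
    """Create bucket `name` unless it already exists.

    `client` is any object exposing `bucket_exists(name)` and
    `make_bucket(name)`, which matches the `minio.Minio` client.
    Returns True if the bucket was created, False if it was already there.
    """
    if client.bucket_exists(name):
        return False
    client.make_bucket(name)
    return True


def connect_and_create(endpoint="localhost:9000"):
    """Connect with the demo credentials and make sure `mlflow` exists.

    Assumption: the `minio` package is installed and the endpoint is
    reachable without TLS. The import is local so that `ensure_bucket`
    stays usable without the package.
    """
    from minio import Minio

    client = Minio(
        endpoint,
        access_key="minioadmin",
        secret_key="minioadmin",
        secure=False,
    )
    return ensure_bucket(client)
```

Calling `connect_and_create()` once after `docker-compose up` should be enough; running it again leaves an existing bucket untouched.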

- Open a browser and log in to Grafana
user: admin
password: admin

Both Prometheus and InfluxDB data sources have already been provisioned, along with an MLOps Demo Dashboard and a Notification Channel.
- Add the alarm channel to some panels

- Start the send_data.py script, which sends a POST request every 0.1 seconds
- Open a browser and turn on the Airflow DAG used to retrain our ML model
user: airflow
password: airflow
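
For readers who want to see what a fixed-interval sender looks like, here is a minimal sketch of the pacing loop. It is not the repository's `send_data.py`: the function names and the value ranges are made up for illustration, and the `send` callable stands in for whatever performs the actual POST. The field names are the ones used by the optional prediction request later in this README.

```python
import random
import time


def random_record(rng):
    """Build one synthetic housing record (value ranges are illustrative)."""
    return {
        "median_income_in_block": round(rng.uniform(0.5, 15.0), 4),
        "median_house_age_in_block": rng.randint(1, 52),
        "average_rooms": rng.randint(1, 10),
        "average_bedrooms": rng.randint(1, 5),
        "population_per_block": rng.randint(3, 5000),
        "average_house_occupancy": round(rng.uniform(1.0, 6.0), 2),
        "block_latitude": round(rng.uniform(32.5, 42.0), 2),
        "block_longitude": round(rng.uniform(-124.3, -114.3), 2),
    }


def send_periodically(send, interval=0.1, count=None, rng=None, sleep=time.sleep):
    """Call `send(record)` with a fresh record, then wait `interval` seconds.

    `count=None` loops until interrupted; a number stops after that many
    sends. Returns how many records were sent.
    """
    rng = rng or random.Random()
    sent = 0
    while count is None or sent < count:
        send(random_record(rng))
        sent += 1
        sleep(interval)
    return sent


# Example: collect three records in memory instead of POSTing them.
collected = []
send_periodically(collected.append, count=3, sleep=lambda seconds: None)
print(len(collected))
```

Passing the sender and the sleep function as arguments keeps the loop easy to test; in a real run `send` would POST each record to the model API and `sleep` would stay at its default.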

- Lower the alarm threshold to see the Airflow DAG pipeline get triggered

- Check MLflow after the Airflow DAG has run to see the model artifacts stored using MinIO as the object storage layer.
- (Optional) Send a POST request to our model service API endpoint

      curl -v -H "Content-Type: application/json" -X POST \
        -d '{ "median_income_in_block": 8.3252, "median_house_age_in_block": 41, "average_rooms": 6, "average_bedrooms": 1, "population_per_block": 322, "average_house_occupancy": 2.55, "block_latitude": 37.88, "block_longitude": -122.23 }' \
        http://localhost/model/predict

- (Optional) If you are so bold, you can also simulate production traffic using Locust. Keep in mind that you have a lot of services running on your local machine; you would never deploy a production ML API on your local machine to handle production traffic.
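
The optional `curl` request can also be issued from Python with only the standard library. The sketch below reuses the URL and payload from that command; the helper names are illustrative, and because the response schema is not documented here, the body is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request

# Same URL and payload as the curl example in this README.
PREDICT_URL = "http://localhost/model/predict"

EXAMPLE_FEATURES = {
    "median_income_in_block": 8.3252,
    "median_house_age_in_block": 41,
    "average_rooms": 6,
    "average_bedrooms": 1,
    "population_per_block": 322,
    "average_house_occupancy": 2.55,
    "block_latitude": 37.88,
    "block_longitude": -122.23,
}


def build_predict_request(features, url=PREDICT_URL):
    """Return a urllib Request that POSTs `features` as a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, url=PREDICT_URL, timeout=5.0):
    """Send the request and return the decoded JSON response.

    Raises urllib.error.URLError (an OSError) if the service is unreachable
    or answers with an HTTP error status.
    """
    request = build_predict_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the stack running, `predict(EXAMPLE_FEATURES)` sends the same request as the `curl` command.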
Level 1 Workflow & Platform Architecture
Model Serving Architecture
Services
- nginx: Load Balancer
- python-model-service1: FastAPI Machine Learning API 1
- python-model-service2: FastAPI Machine Learning API 2
- postgresql: RDBMS
- kafka: Event streaming platform
- locust: Load testing and production traffic simulation
- prometheus: Metrics scraping
- minio: Object storage
- mlflow: Machine Learning Experiment Management
- influxdb: Time Series Database
- chronograf: Admin & WebUI for InfluxDB
- grafana: Performance Monitoring
- redis: Cache
- airflow: Workflow Orchestrator
- bridge server: Receives webhook from Grafana and translates to Airflow REST API
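
To make the bridge server's role concrete, here is a rough sketch of the translation step: take a Grafana alert webhook body and turn it into a "trigger DAG run" call against Airflow's stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`). This is not the repository's implementation. The Airflow address, the DAG id `retrain_model`, the use of basic auth with the demo account, and the reliance on the `state`, `ruleName`, and `message` fields of Grafana's legacy webhook payload are all assumptions made for illustration.

```python
import base64
import json
import urllib.request

# Assumptions, not taken from the README: webserver address and DAG id.
AIRFLOW_URL = "http://localhost:8080"
DAG_ID = "retrain_model"  # hypothetical DAG id


def should_trigger(alert):
    """Trigger only while the alert is firing, not when it resolves."""
    return alert.get("state") == "alerting"


def build_dag_run_request(alert, dag_id=DAG_ID, base_url=AIRFLOW_URL,
                          user="airflow", password="airflow"):
    """Translate a Grafana alert dict into an Airflow dagRuns POST request."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    # Pass a little alert context along so the DAG run can log why it started.
    body = {"conf": {"alert": alert.get("ruleName"),
                     "message": alert.get("message")}}
    return urllib.request.Request(
        f"{base_url}/api/v1/dags/{dag_id}/dagRuns",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def handle_webhook(alert, opener=urllib.request.urlopen):
    """Forward a firing alert to Airflow; return the HTTP status or None."""
    if not should_trigger(alert):
        return None
    with opener(build_dag_run_request(alert), timeout=10) as response:
        return response.status
```

A web framework route would parse the incoming JSON and pass it to `handle_webhook`. Basic auth only works if Airflow's API auth backend is configured to accept it.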
gotchas:
Postgres:
Warning: scripts in /docker-entrypoint-initdb.d are only run if you start the container with a data directory that is empty; any pre-existing database will be left untouched on container startup.
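
If the data directory is bind-mounted from the host, a quick check can tell you whether the init scripts will run on the next start. The sketch below mirrors the wording of the warning above (an empty or missing directory) rather than the image's exact entrypoint logic, and the function name is illustrative. It does not apply to Docker named volumes, which are not plain host paths.

```python
import tempfile
from pathlib import Path


def init_scripts_will_run(data_dir):
    """Return True when `data_dir` is missing or empty.

    Per the warning above, that is when the scripts in
    /docker-entrypoint-initdb.d are executed on container startup.
    """
    path = Path(data_dir)
    if not path.exists():
        return True
    return not any(path.iterdir())


# Example with a throwaway directory standing in for the Postgres data dir.
with tempfile.TemporaryDirectory() as tmp:
    print(init_scripts_will_run(tmp))   # True: the directory is empty
    (Path(tmp) / "PG_VERSION").write_text("16\n")
    print(init_scripts_will_run(tmp))   # False: the directory has content
```

If the check returns False and you need the scripts to run again, the data directory has to be emptied first, which deletes the existing database.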
Owner
- Name: Sina Torfi
- Login: astorfi
- Kind: user
- Location: San Jose
- Company: Meta
- Website: https://astorfi.github.io/
- Repositories: 196
- Profile: https://github.com/astorfi
PhD & Developer working on Deep Learning, Computer Vision & NLP
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1