hello-penguins

Machine learning experiments with the Palmer Penguins dataset

https://github.com/h-fuzzy-logic/hello-penguins

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.2%) to scientific vocabulary

Keywords

explainable-ml machine-learning mlflow python

Last synced: 6 months ago

Repository

Machine learning experiments with the Palmer Penguins dataset

Basic Info

Host: GitHub
Owner: h-fuzzy-logic
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 1.36 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Topics

explainable-ml machine-learning mlflow python

Created 11 months ago · Last pushed 11 months ago

Metadata Files

Readme License Citation

Hello, Penguins

Machine learning experiments with the Palmer Penguins dataset

Palmer Penguins illustration

Illustration by @allison_horst

Welcome

Welcome to the “Hello Penguins” repository, a collection of machine learning experiments with the Palmer Penguins dataset.

Inspired by the “Hello, World!” programming tradition, this repository is a series of small experiments to illustrate foundational machine learning concepts. Each experiment includes evaluation metrics and visuals to verify the model predictions make sense and are explainable.

Software engineering concepts are used to ensure the code is testable and reproducible.

To learn more about the dataset, check out the official Palmer Penguins GitHub repo.

Training Approach and Technology

MLflow is used for model training and evaluation instead of notebooks.

Training runs locally, and the experiment results are shared in an MLflow portfolio hosted on Google Cloud Run. The portfolio is meant to be highly available, but it may occasionally be offline. The Docker container files for the portfolio are in the docker-portfolio directory.

Pre-Training Checks

Consider data bias
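
One simple way to start considering data bias is to look at how evenly the categories are represented before training. The sketch below is an assumption about what such a check could look like, not the repository's own code: the helper names and the 0.2 threshold are hypothetical, and it assumes the CSV has columns such as species, island, and sex, with missing values written as NA.

```python
import csv
from collections import Counter
from pathlib import Path


def category_shares(csv_path, column):
    """Return each category's share of the non-missing values in `column`."""
    with Path(csv_path).open(newline="") as f:
        values = [row[column] for row in csv.DictReader(f)]
    # Assumption: missing values appear as "NA" or as empty strings.
    present = [v for v in values if v not in ("", "NA")]
    counts = Counter(present)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


def underrepresented(shares, threshold=0.2):
    """List categories whose share is below `threshold` (an arbitrary cutoff)."""
    return sorted(name for name, share in shares.items() if share < threshold)


if __name__ == "__main__":
    path = Path("data/penguins.csv")
    if path.exists():
        for column in ("species", "island", "sex"):
            shares = category_shares(path, column)
            print(column, shares, underrepresented(shares))
```

A skewed share does not by itself prove the model will be biased, but it flags groups whose per-class metrics deserve a closer look after training.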

Acknowledgements and Sources

This repo builds on many foundations:

Allison Horst’s Palmer Penguins repo
- Data downloaded 3/16/2025 with: curl -o data/penguins.csv https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv
Lynn Langit’s mentorship and amazing resources for learning cloud
Santiago Valdarrama’s ML School repo
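
For reproducibility, the curl download listed under the Palmer Penguins entry above could also be scripted in Python. This is a sketch using only the standard library; the function name download_if_missing is hypothetical, and the URL and destination path are the ones from the curl command.

```python
from pathlib import Path
from urllib.request import urlopen

# URL and destination copied from the curl command in the list above.
DATA_URL = (
    "https://raw.githubusercontent.com/allisonhorst/palmerpenguins/"
    "master/inst/extdata/penguins.csv"
)
DATA_PATH = Path("data/penguins.csv")


def download_if_missing(url, dest):
    """Fetch `url` into `dest` unless the file already exists.

    Returns True when a download happened and False when it was skipped.
    """
    dest = Path(dest)
    if dest.exists():
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(url) as response:
        dest.write_bytes(response.read())
    return True
```

Calling download_if_missing(DATA_URL, DATA_PATH) fetches the file once and leaves an existing copy untouched on later runs.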

Owner

Name: Heather Woods
Login: h-fuzzy-logic
Kind: user
Location: United States

Website: https://h-fuzzy-logic.github.io/blog/
Repositories: 1
Profile: https://github.com/h-fuzzy-logic

Always learning new ways to use technology for the greater good. Software Engineer with expertise in Data Engineering and Data Science.

GitHub Events

Total

Delete event: 3
Push event: 7
Pull request event: 5
Create event: 5

Last Year

Delete event: 3
Push event: 7
Pull request event: 5
Create event: 5

Committers

Last synced: 7 months ago

All Time

Total Commits: 15
Total Committers: 1
Avg Commits per committer: 15.0
Development Distribution Score (DDS): 0.0

Past Year

Commits: 15
Committers: 1
Avg Commits per committer: 15.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
h-fuzzy-logic	h**c@g**m	15

Issues and Pull Requests

Last synced: 7 months ago

All Time

Total issues: 0
Total pull requests: 3
Average time to close issues: N/A
Average time to close pull requests: 4 minutes
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 3
Average time to close issues: N/A
Average time to close pull requests: 4 minutes
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

Pull Request Authors

h-fuzzy-logic (5)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

pyproject.toml pypi

awscurl >=0.36
azure-ai-ml >=1.22.4
azureml-mlflow >=1.58.0
evidently >=0.5.0
ipykernel >=6.29.5
jax[cpu] >=0.4.20,<0.5.0
jupyter >=1.1.1
keras >=3.7.0
metaflow >=2.13
metaflow-card-html >=1.0.2
mlflow[extras] >=2.18.0
mlserver >=1.6.1
mlserver-mlflow >=1.6.1
numpy >=2.0.2
pandas >=2.2.3
pylint >=3.3.2
pytest >=8.3.4
scikit-learn >=1.6.0
seaborn >=0.13.2

docker-portfolio/Dockerfile docker

python 3.12.0-slim-bookworm build

docker-portfolio/requirements.txt pypi

mlflow *
