hello-penguins
Machine learning experiments with the Palmer Penguins dataset
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.2%) to scientific vocabulary
Repository
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Hello, Penguins
Machine learning experiments with the Palmer Penguins dataset
Illustration by @allison_horst
Welcome
Welcome to the “Hello Penguins” repository, a collection of machine learning experiments with the Palmer Penguins dataset.
Inspired by the “Hello, World!” programming tradition, this repository is a series of small experiments that illustrate foundational machine learning concepts. Each experiment includes evaluation metrics and visuals to verify that the model's predictions make sense and are explainable.
Software engineering concepts are used to ensure the code is testable and reproducible.
To learn more about the dataset, check out the official Palmer Penguins GitHub repo.
Training Approach and Technology
MLflow is used for model training and evaluation instead of notebooks.
Training happens locally, and the experiment results are shared in an MLflow portfolio hosted on Google Cloud Run. The goal is to keep the portfolio highly available, but it may occasionally be offline. The Docker container files for the portfolio are in the docker-portfolio directory.
Pre-Training Checks
- Consider data bias
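One way to start on the data bias check above is to look at how evenly the categories are represented and where values are missing. The sketch below uses only the standard library. The sample rows are made up and cover only a subset of the columns, the helper names `class_shares` and `missing_counts` are hypothetical, and treating `NA` as the missing-value token is an assumption about how the CSV encodes gaps.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows using a subset of the penguins.csv columns;
# the values are made up for illustration, not taken from the dataset.
SAMPLE_CSV = """species,island,bill_length_mm,sex
Adelie,Torgersen,39.1,male
Adelie,Torgersen,NA,NA
Adelie,Biscoe,37.8,female
Gentoo,Biscoe,46.1,female
Chinstrap,Dream,46.5,female
"""


def class_shares(rows, column):
    """Return each category's share of the rows for one column."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def missing_counts(rows, missing_token="NA"):
    """Count missing values per column (assumes a single missing-value token)."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value == missing_token:
                missing[column] += 1
    return dict(missing)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
species_shares = class_shares(rows, "species")
missing = missing_counts(rows)

print("Species shares:", species_shares)
print("Missing values per column:", missing)
```

To run the same checks on the downloaded file, replace the `io.StringIO(SAMPLE_CSV)` source with an open handle on `data/penguins.csv`. A heavily skewed share or a column with many missing values is a prompt to look closer before training, not a verdict on its own.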
Acknowledgements and Sources
This repo builds on many foundations:
- Allison Horst’s Palmer Penguins repo
- Data downloaded 3/16/2025
curl -o data/penguins.csv https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv
- Lynn Langit’s mentorship and amazing resources for learning cloud
- Santiago Valdarrama’s ML School repo
Owner
- Name: Heather Woods
- Login: h-fuzzy-logic
- Kind: user
- Location: United States
- Website: https://h-fuzzy-logic.github.io/blog/
- Repositories: 1
- Profile: https://github.com/h-fuzzy-logic
Always learning new ways to use technology for the greater good. Software Engineer with expertise in Data Engineering and Data Science.
GitHub Events
Total
- Delete event: 3
- Push event: 7
- Pull request event: 5
- Create event: 5
Last Year
- Delete event: 3
- Push event: 7
- Pull request event: 5
- Create event: 5
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| h-fuzzy-logic | h****c@g****m | 15 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 0
- Total pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: 4 minutes
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: 4 minutes
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- h-fuzzy-logic (5)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- awscurl >=0.36
- azure-ai-ml >=1.22.4
- azureml-mlflow >=1.58.0
- evidently >=0.5.0
- ipykernel >=6.29.5
- jax[cpu] >=0.4.20,<0.5.0
- jupyter >=1.1.1
- keras >=3.7.0
- metaflow >=2.13
- metaflow-card-html >=1.0.2
- mlflow[extras] >=2.18.0
- mlserver >=1.6.1
- mlserver-mlflow >=1.6.1
- numpy >=2.0.2
- pandas >=2.2.3
- pylint >=3.3.2
- pytest >=8.3.4
- scikit-learn >=1.6.0
- seaborn >=0.13.2
- python 3.12.0-slim-bookworm build
- mlflow *