https://github.com/apache/hamilton
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. It runs and scales everywhere Python does.
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic links in README
- ✓ Committers with academic emails: 3 of 77 committers (3.9%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (2.4%) to scientific vocabulary
Keywords
dag
data-analysis
data-engineering
data-science
dataframe
etl
etl-framework
etl-pipeline
feature-engineering
hacktoberfest
lineage
llmops
machine-learning
mlops
orchestration
pandas
python
rag
software-engineering
Keywords from Contributors
web-crawler
agent
alignment
flexible
transformer
document-parser
unit-testing
gpt-4
jax
gemini
Last synced: 5 months ago
Repository
Basic Info
- Host: GitHub
- Owner: apache
- License: apache-2.0
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://hamilton.apache.org/
- Size: 102 MB
Statistics
- Stars: 2,251
- Watchers: 21
- Forks: 158
- Open Issues: 127
- Releases: 117
Created almost 3 years ago · Last pushed 6 months ago
Metadata Files
Readme
Contributing
License
Security
Owner
- Name: The Apache Software Foundation
- Login: apache
- Kind: organization
- Website: https://www.apache.org/
- Repositories: 2,814
- Profile: https://github.com/apache
GitHub Events
Total
- Issues event: 10
- Watch event: 70
- Delete event: 28
- Issue comment event: 68
- Push event: 78
- Pull request review event: 68
- Pull request review comment event: 30
- Pull request event: 67
- Fork event: 7
- Create event: 34
Last Year
- Issues event: 10
- Watch event: 70
- Delete event: 28
- Issue comment event: 68
- Push event: 78
- Pull request review event: 68
- Pull request review comment event: 30
- Pull request event: 67
- Fork event: 7
- Create event: 34
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Stefan Krawczyk | s****n@d****o | 756 |
| elijahbenizzy | e****y@a****u | 605 |
| Thierry Jean | 6****o | 90 |
| jernejfrank | j****k@g****m | 36 |
| Charles Swartz | c****i@g****m | 27 |
| James Lamb | j****0@g****m | 26 |
| zilto | t****n@D****2 | 23 |
| Bryan Galindo | g****8@g****m | 18 |
| JoJo10Smith | j****2@g****m | 13 |
| Swapnil Dewalkar | s****r@g****m | 12 |
| Dev-iL | 6****L | 11 |
| Stefan Krawczyk | s****n@i****m | 10 |
| zilto | t****n@D****n | 10 |
| Sarah Haskins | s****s@g****m | 8 |
| rinsoft-sf | r****t@s****m | 8 |
| Konstantin Tyapochkin | t****s@g****m | 6 |
| Subham Chakravorty | s****2@g****m | 6 |
| Roel Bertens | r****s@g****m | 5 |
| Christopher Prohm | m****l@c****e | 5 |
| Sarah Haskins | s****s@S****m | 4 |
| PJ Fanning | p****g | 4 |
| Ryan Whitten | r****n@b****m | 4 |
| Shelly Jang | 4****g | 4 |
| bovem | a****h@o****m | 4 |
| sT0v | 7****v | 4 |
| flaviassantos | s****a@h****m | 4 |
| Alexander Cai | a****i@q****m | 3 |
| Fran Boon | f****n@g****m | 3 |
| Yaser Martinez Palenzuela | y****z@g****m | 3 |
| AnupJoseph | a****h@g****m | 3 |
| and 47 more... | | |
Committer Domains (Top 20 + Academic)
stitchfix.com: 2
gtri.gatech.edu: 2
dagworks.io: 1
alumni.brown.edu: 1
idibon.com: 1
desktop-v6jdcs2.localdomain: 1
godatadriven.com: 1
cprohm.de: 1
sarahs-air.hvc.rr.com: 1
bestegg.com: 1
quantco.com: 1
ifit.com: 1
spicule.co.uk: 1
creative-resort.com: 1
amazon.com: 1
ibm.com: 1
weaviate.io: 1
dbtlabs.com: 1
estateably.com: 1
siedlaczek.me: 1
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 9
- Total pull requests: 73
- Average time to close issues: 21 days
- Average time to close pull requests: 19 days
- Total issue authors: 4
- Total pull request authors: 10
- Average comments per issue: 0.78
- Average comments per pull request: 0.44
- Merged pull requests: 26
- Bot issues: 0
- Bot pull requests: 43
Past Year
- Issues: 9
- Pull requests: 72
- Average time to close issues: 21 days
- Average time to close pull requests: 14 days
- Issue authors: 4
- Pull request authors: 9
- Average comments per issue: 0.78
- Average comments per pull request: 0.43
- Merged pull requests: 26
- Bot issues: 0
- Bot pull requests: 43
Top Authors
Issue Authors
- pjfanning (5)
- skrawcz (2)
- eric-czech (1)
- lorenzwalthert (1)
Pull Request Authors
- dependabot[bot] (43)
- pjfanning (11)
- skrawcz (9)
- datashaman (2)
- cswartzvi (2)
- jernejfrank (2)
- elijahbenizzy (1)
- Phrogz (1)
- zilto (1)
- Gophersen (1)
Top Labels
Issue Labels
- documentation (1)
- triage (1)
- repo-hygiene (1)
Pull Request Labels
- dependencies (43)
- python (24)
- javascript (19)
- core-work (1)