kedro_fork

https://github.com/noklam/kedro_fork

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.0%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: noklam
License: apache-2.0
Language: Python
Default Branch: main
Size: 184 MB

Statistics

Stars: 0
Watchers: 2
Forks: 0
Open Issues: 9
Releases: 0

Created almost 3 years ago · Last pushed almost 3 years ago

Metadata Files

Readme, Contributing, License, Code of conduct, Citation, Codeowners

README.md

What is Kedro?

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.

Kedro is an open-source Python framework hosted by the LF AI & Data Foundation.

How do I install Kedro?

To install Kedro from the Python Package Index (PyPI) run:

pip install kedro

It is also possible to install Kedro using conda:

conda install -c conda-forge kedro

Our Get Started guide contains full installation instructions, and includes how to set up Python virtual environments.

What are the main features of Kedro?

A pipeline visualisation generated using Kedro-Viz

| Feature | What is this? |
| -------------------- | ------------- |
| Project Template | A standard, modifiable and easy-to-use project template based on Cookiecutter Data Science. |
| Data Catalog | A series of lightweight data connectors used to save and load data across many different file formats and file systems, including local and network file systems, cloud object stores, and HDFS. The Data Catalog also includes data and model versioning for file-based systems. |
| Pipeline Abstraction | Automatic resolution of dependencies between pure Python functions and data pipeline visualisation using Kedro-Viz. |
| Coding Standards | Test-driven development using pytest, produce well-documented code using Sphinx, create linted code with support for flake8, isort and black and make use of the standard Python logging library. |
| Flexible Deployment | Deployment strategies that include single or distributed-machine deployment as well as additional support for deploying on Argo, Prefect, Kubeflow, AWS Batch and Databricks. |
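
To make the Data Catalog row more concrete, here is a minimal sketch of the underlying idea: code asks for data by name, and a connector registered under that name decides how and where it is stored. This is not Kedro's API. `JSONConnector`, `CSVConnector` and `MiniCatalog` are hypothetical names invented for illustration, and the sketch uses only the Python standard library.

```python
import csv
import json
import tempfile
from pathlib import Path


class JSONConnector:
    """Hypothetical connector: loads and saves a Python object as JSON."""

    def __init__(self, filepath):
        self.filepath = Path(filepath)

    def load(self):
        with self.filepath.open(encoding="utf-8") as f:
            return json.load(f)

    def save(self, data):
        with self.filepath.open("w", encoding="utf-8") as f:
            json.dump(data, f)


class CSVConnector:
    """Hypothetical connector: loads and saves a list of dict rows as CSV."""

    def __init__(self, filepath):
        self.filepath = Path(filepath)

    def load(self):
        with self.filepath.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))

    def save(self, rows):
        with self.filepath.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)


class MiniCatalog:
    """Maps dataset names to connectors so calling code never handles paths."""

    def __init__(self, connectors):
        self._connectors = dict(connectors)

    def load(self, name):
        return self._connectors[name].load()

    def save(self, name, data):
        self._connectors[name].save(data)


# Register two datasets in a temporary directory, then round-trip them by name.
workdir = Path(tempfile.mkdtemp())
catalog = MiniCatalog({
    "params": JSONConnector(workdir / "params.json"),
    "flights": CSVConnector(workdir / "flights.csv"),
})
catalog.save("params", {"test_size": 0.2})
catalog.save("flights", [{"origin": "LHR", "seats": "180"}])
loaded_params = catalog.load("params")
loaded_flights = catalog.load("flights")
```

Because the calling code only refers to `"params"` and `"flights"`, changing a file format or storage location in this sketch means changing the catalog entry rather than the code that uses the data. Kedro's own Data Catalog is considerably richer (for example, the versioning mentioned in the table), so treat this only as an illustration of the concept.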

How do I use Kedro?

The Kedro documentation first explains how to install Kedro and then introduces key Kedro concepts.

The first example illustrates the basics of a Kedro project using the Iris dataset
You can then review the spaceflights tutorial to build a Kedro project for hands-on experience
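
Before working through the tutorials, it may help to see the idea behind the Pipeline Abstraction feature (automatic resolution of dependencies between pure Python functions) in miniature. The sketch below is not Kedro code: the `steps` list and the `run` helper are hypothetical, and the ordering is done with `graphlib.TopologicalSorter` from the standard library (Python 3.9+) rather than with anything Kedro provides.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def clean(raw):
    """Drop missing values."""
    return [x for x in raw if x is not None]


def total(cleaned):
    """Sum the cleaned values."""
    return sum(cleaned)


def mean(cleaned, total_value):
    """Average of the cleaned values, reusing the precomputed total."""
    return total_value / len(cleaned)


# Each step is (function, input names, output name).
# The steps are deliberately listed out of execution order.
steps = [
    (mean, ["cleaned", "total"], "mean"),
    (total, ["cleaned"], "total"),
    (clean, ["raw"], "cleaned"),
]


def run(steps, data):
    """Run steps in dependency order; returns (all datasets, execution order)."""
    data = dict(data)
    # Which step produces each named dataset?
    producer = {output: i for i, (_, _, output) in enumerate(steps)}
    # Map each step to the steps it depends on (its predecessors).
    graph = {
        i: {producer[name] for name in inputs if name in producer}
        for i, (_, inputs, _) in enumerate(steps)
    }
    order = []
    for i in TopologicalSorter(graph).static_order():
        func, inputs, output = steps[i]
        data[output] = func(*(data[name] for name in inputs))
        order.append(func.__name__)
    return data, order


results, order = run(steps, {"raw": [1, None, 2, 3]})
```

Here `order` comes out as `clean`, `total`, `mean` even though the steps were declared in the opposite order, because the execution order is derived from the dataset names each function consumes and produces.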

For new and intermediate Kedro users, there's a comprehensive section on how to visualise Kedro projects using Kedro-Viz and how to work with Kedro and Jupyter notebooks.

Further documentation is available for more advanced Kedro usage and deployment. We also recommend the glossary and the API reference documentation for additional information.

Why does Kedro exist?

Kedro is built upon our collective best practices (and mistakes) from trying to deliver real-world ML applications that involve vast amounts of raw, unvetted data. We developed Kedro to achieve the following:

To address the main shortcomings of Jupyter notebooks, one-off scripts, and glue code by focusing on creating maintainable data science code
To enhance team collaboration when different team members have varied exposure to software engineering concepts
To increase efficiency, because applied concepts like modularity and separation of concerns inspire the creation of reusable analytics code

The humans behind Kedro

The Kedro product team and a number of open source contributors from across the world maintain Kedro.

Can I contribute?

Yes! Want to help build Kedro? Check out our guide to contributing to Kedro.

Where can I learn more?

There is a growing community around Kedro. Have a look at the Kedro FAQs to find projects using Kedro and links to articles, podcasts and talks.

Who likes Kedro?

There are Kedro users across the world, who work at start-ups, major enterprises and academic institutions like Absa, Acensi, Advanced Programming Solutions SL, AI Singapore, AMAI GmbH, Augment Partners, AXA UK, Belfius, Beamery, Caterpillar, CRIM, Dendra Systems, Element AI, GetInData, GMO, Indicium, Imperial College London, ING, Jungle Scout, Helvetas, Leapfrog, McKinsey & Company, Mercado Libre Argentina, Modec, Mosaic Data Science, NaranjaX, NASA, NHS AI Lab, Open Data Science LatAm, Prediqt, QuantumBlack, ReSpo.Vision, Retrieva, Roche, Sber, Société Générale, Telkomsel, Universidad Rey Juan Carlos, UrbanLogiq, Wildlife Studios, WovenLight and XP.

Kedro won Best Technical Tool or Framework for AI in the 2019 Awards AI competition and a merit award for the 2020 UK Technical Communication Awards. It is listed on the 2020 ThoughtWorks Technology Radar and the 2020 Data & AI Landscape. Kedro has received an honorable mention in the User Experience category in Fast Company's 2022 Innovation by Design Awards.

How can I cite Kedro?

If you're an academic, Kedro can also help you, for example as a tool for making research reproducible. Use the "Cite this repository" button on our repository to generate a citation from the CITATION.cff file.

Owner

Name: Nok Lam Chan
Login: noklam
Kind: user
Location: London
Company: @kedro-org @quantumblacklabs

Website: https://noklam.github.io/blog
Twitter: mediumnok
Repositories: 187
Profile: https://github.com/noklam

Kedro / vscode-kedro maintainer 🇭🇰 Born and raised https://medium.com/@noklam-data (Medium)

Citation (CITATION.cff)

cff-version: 1.2.0
message: If you use this software, please cite it as below.
authors:
- family-names: Alam
  given-names: Sajid
- family-names: Chan
  given-names: Nok Lam
- family-names: Dada
  given-names: Yetunde
- family-names: Danov
  given-names: Ivan
- family-names: Datta
  given-names: Deepyaman
- family-names: DeBold
  given-names: Tynan
- family-names: Holzer
  given-names: Jannic
- family-names: Kaiser
  given-names: Stephanie
- family-names: Kanchwala
  given-names: Rashida
- family-names: Katiyar
  given-names: Ankita
- family-names: Kumar Pilla
  given-names: Ravi
- family-names: Koh
  given-names: Amanda
- family-names: Mackay
  given-names: Andrew
- family-names: Merali
  given-names: Ahdra
- family-names: Milne
  given-names: Antony
- family-names: Nguyen
  given-names: Huong
- family-names: Okwa
  given-names: Nero
- family-names: Cano Rodríguez
  given-names: Juan Luis
  orcid: https://orcid.org/0000-0002-2187-161X
- family-names: Schwarzmann
  given-names: Joel
- family-names: Stichbury
  given-names: Jo
- family-names: Theisen
  given-names: Merel
title: Kedro
version: 0.18.11
date-released: 2023-07-03
url: https://github.com/kedro-org/kedro

Issues and Pull Requests

Last synced: about 1 year ago

All Time

Total issues: 4
Total pull requests: 5
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 1.0
Merged pull requests: 0
Bot issues: 4
Bot pull requests: 5

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

github-actions[bot] (4)

Pull Request Authors

dependabot[bot] (5)

Dependencies

.github/workflows/issues_metrics.yml actions

github/issue-metrics v2 composite
peter-evans/create-issue-from-file v4 composite

tools/circleci/docker_build_img/Dockerfile docker

cimg/python 3.8 build

features/steps/test_plugin/setup.py pypi

features/steps/test_starter/{{ cookiecutter.repo_name }}/pyproject.toml pypi

features/steps/test_starter/{{ cookiecutter.repo_name }}/src/requirements.txt pypi

black * test
flake8 >=3.7.9,<5.0 test
ipython >=7.31.1,<8.0 test
ipython * test
isort * test
jupyter * test
jupyterlab * test
jupyterlab_server >=2.11.1,<2.16.0 test
kedro == test
kedro-telemetry * test
nbstripout * test
pytest * test
pytest-cov * test
pytest-mock >=1.7.1,<2.0 test

features/steps/test_starter/{{ cookiecutter.repo_name }}/src/setup.py pypi

kedro/templates/project/{{ cookiecutter.repo_name }}/pyproject.toml pypi

kedro/templates/project/{{ cookiecutter.repo_name }}/src/requirements.txt pypi

black *
flake8 >=3.7.9,<5.0
ipython >=7.31.1,<8.0
ipython *
isort *
jupyter *
jupyterlab *
jupyterlab_server >=2.11.1,<2.16.0
kedro *
kedro-telemetry *
nbstripout *
pytest *
pytest-cov *
pytest-mock >=1.7.1,<2.0

kedro/templates/project/{{ cookiecutter.repo_name }}/src/setup.py pypi

pyproject.toml pypi

PyYAML >=4.2, <7.0
anyconfig ~=0.10.0
attrs >=21.3
build *
cachetools ~=5.3
click <9.0
cookiecutter >=2.1.1, <3.0
dynaconf >=3.1.2, <4.0
fsspec >=2021.4, <2024.1
gitpython ~=3.0
importlib-metadata >=3.6; python_version >= '3.8'
importlib_metadata >=3.6, <5.0; python_version < '3.8'
importlib_resources >=1.3
jmespath >=0.9.5, <1.0
more_itertools ~=9.0
omegaconf ~=2.3
parse ~=1.19.0
pip-tools ~=6.5
pluggy ~=1.0
rich >=12.0, <14.0
rope >=0.21, <2.0
setuptools >=65.5.1
toml ~=0.10
toposort ~=1.5

setup.py pypi

tools/circleci/requirements.txt pypi

pip >=21.2
setuptools >=65.5.1
twine *
