great_expectations
Always know what to expect from your data.
Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ✓ Committers with academic emails: 11 of 433 committers (2.5%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.5%) to scientific vocabulary
Repository
Always know what to expect from your data.
Basic Info
- Host: GitHub
- Owner: great-expectations
- License: apache-2.0
- Language: Python
- Default Branch: develop
- Homepage: https://docs.greatexpectations.io/
- Size: 222 MB
Statistics
- Stars: 10,690
- Watchers: 86
- Forks: 1,606
- Open Issues: 91
- Releases: 311
Metadata Files
README.md

About GX Core
GX Core combines the collective wisdom of thousands of community members with a proven track record in data quality deployments worldwide, wrapped into a super-simple package for data teams.
Its powerful technical tools start with Expectations: expressive and extensible unit tests for your data. Expectations foster collaboration by giving teams a common language to express data quality tests in an intuitive way. You can automatically generate documentation for each set of validation results, making it easy for everyone to stay on the same page. This not only simplifies your data quality processes, but helps preserve your organization’s institutional knowledge about its data.
Learn more about how data teams are using GX Core in our featured case studies.
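The idea of an Expectation as a unit test for data can be illustrated without any library at all. The sketch below is a plain-Python stand-in, not the GX Core API: the `SimpleExpectation` class, its fields, and the `passenger_count` sample rows are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class SimpleExpectation:
    """Hypothetical stand-in for an Expectation: a named check on one column."""

    description: str
    column: str
    predicate: Callable[[Any], bool]

    def validate(self, rows: List[Row]) -> Dict[str, Any]:
        # Collect every value in the column that fails the predicate.
        unexpected = [
            row[self.column] for row in rows if not self.predicate(row[self.column])
        ]
        return {
            "expectation": self.description,
            "success": not unexpected,
            "unexpected_values": unexpected,
        }


# Illustrative data only; not taken from any real dataset.
rows = [
    {"passenger_count": 1},
    {"passenger_count": 4},
    {"passenger_count": 9},
]

expectation = SimpleExpectation(
    description="passenger_count is between 1 and 6",
    column="passenger_count",
    predicate=lambda value: 1 <= value <= 6,
)

result = expectation.validate(rows)
print(result)
```

The human-readable `description` travels with the result, which is a small-scale version of how a validation result can double as documentation of what the data is expected to look like.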
Integration support policy
GX Core supports Python 3.9 through 3.12.
Experimental support for Python 3.13 and later can be enabled by setting a GX_PYTHON_EXPERIMENTAL environment variable when installing great_expectations.
For data sources and other integrations that GX supports, see the compatibility reference for additional information.
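As a rough illustration of the policy above, the following sketch classifies a Python version as supported (3.9 through 3.12), experimental (3.13 and later), or unsupported. The `support_level` helper and the two constants are hypothetical and are not part of `great_expectations`; the sketch also does not read or set `GX_PYTHON_EXPERIMENTAL`.

```python
import sys
from typing import Tuple

# Version bounds taken from the support policy stated above.
SUPPORTED_MIN = (3, 9)
SUPPORTED_MAX = (3, 12)


def support_level(version: Tuple[int, int]) -> str:
    """Classify a (major, minor) Python version against the stated policy."""
    if version < SUPPORTED_MIN:
        return "unsupported"
    if version <= SUPPORTED_MAX:
        return "supported"
    # 3.13 and later: experimental support only.
    return "experimental"


current = (sys.version_info.major, sys.version_info.minor)
print(f"Python {current[0]}.{current[1]}: {support_level(current)}")
```

Running a check like this before installation can help decide whether the experimental path is needed for the current interpreter.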
Get started
GX recommends deploying GX Core within a virtual environment. For more information about getting started with GX Core, see Introduction to GX Core.
Run the following command in an empty base directory inside a Python virtual environment to install GX Core:
```bash title="Terminal input"
pip install great_expectations
```

Run the following command to import the `great_expectations` module and create a Data Context:

```python
import great_expectations as gx

context = gx.get_context()
```
Get support from GX and the community
The following support channels are listed in the order in which GX prioritizes support issues:
- Issues and PRs in the GX GitHub repository
- Questions posted to the GX Core Discourse forum
- Questions posted to the GX Slack community channel
Contribute
We deeply value the contributions of our community. We're now accepting PRs for bug fixes.
To ensure the long-term quality of the GX Core codebase, we're not yet ready to accept feature contributions to the parts of the codebase that don't have clear interfaces for extensions. We're actively working to increase the surface area for contributions. Thank you for being a crucial part of GX Core!
Levels of contribution readiness
🟢 Ready. Have a clear and public interface for extensions.
🟡 Partially ready. Case-by-case.
🔴 Not ready. Will accept contributions that fix existing bugs or workflows.
| GX Component | Readiness | Notes |
|---|---|---|
| CredentialStore | 🟢 Ready | |
| BatchDefinition | 🟡 Partially ready | Formerly known as splitters |
| Action | 🟢 Ready | |
| DataSource | 🔴 Not ready | Includes MetricProvider and ExecutionEngine |
| DataContext | 🔴 Not ready | Also known as Configuration Stores |
| DataAsset | 🔴 Not ready | |
| Expectation | 🔴 Not ready | |
| ValidationDefinition | 🔴 Not ready | |
| Checkpoint | 🔴 Not ready | |
| CustomExpectations | 🔴 Not ready | |
| Data Docs | 🔴 Not ready | Also known as Renderers |
Code of conduct
Everyone interacting in GX Core project codebases, Discourse forums, Slack channels, and email communications is expected to adhere to the GX Community Code of Conduct.
Owner
- Name: Great Expectations Core
- Login: great-expectations
- Kind: organization
- Email: info@greatexpectations.io
- Location: United States of America
- Website: https://greatexpectations.io
- Twitter: expectgreatdata
- Repositories: 10
- Profile: https://github.com/great-expectations
Revolutionizing the speed and integrity of data collaboration.
Citation (CITATION.cff)
abstract: Great Expectations is a shared, open standard for data quality. It helps
  data teams eliminate pipeline debt, through data testing, documentation, and profiling.
authors:
- family-names: Gong
  given-names: Abe
- family-names: Campbell
  given-names: James
- name: Great Expectations
  website: https://greatexpectations.io
  email: team@greatexpectations.io
cff-version: 1.2.0
identifiers:
- description: This is the collection of all archived snapshots of all versions of
    Great Expectations
  type: doi
  value: 10.5281/zenodo.5683574
keywords:
- data quality
- pipeline testing
- data testing
- pipeline debt
- data observability
- data monitoring
- data profiling
- data documentation
license: Apache-2.0
message: If you use this software, please cite it using these metadata.
repository-code: https://github.com/great-expectations/great_expectations
title: Great Expectations
Committers
Last synced: 11 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| James Campbell | j****l@g****m | 1,915 |
| Abe | a****g@g****m | 1,230 |
| Chetan Kini | c****n@s****m | 990 |
| Robert Moses Lim | r****m@g****m | 767 |
| Alex Sherstinsky | a****y | 661 |
| Eugene Mandel | e****e@s****m | 511 |
| Anthony Burdi | a****y@g****o | 482 |
| Aylr | A****r | 379 |
| Gabriel | g****g@g****m | 361 |
| Nathan Farmer | N****r | 308 |
| William Shin | w****l@s****m | 293 |
| Bill Dirks | b****l@g****o | 266 |
| Tyler Hoffman | t****n@g****m | 260 |
| Rachel-Reverie | 9****e | 216 |
| Rob Gray | 1****k | 160 |
| kenwade4 | 9****4 | 134 |
| Joshua Stauffer | 6****r | 130 |
| ccnobbli | c****i@n****u | 96 |
| William Shin | w****l@g****o | 94 |
| Austin Ziech Robinson | 4****r | 81 |
| Don Heppner | d****r@g****m | 74 |
| ayirplm | p****a@s****m | 73 |
| talagluck | t****l@s****m | 70 |
| dependabot[bot] | 4****] | 66 |
| anhollis | a****s@n****u | 59 |
| Derek Martin | 4****3 | 59 |
| Christian Selig | c****g@u****u | 58 |
| T Pham | 2****m | 57 |
| Kristen Lavavej | 3****j | 51 |
| Péter Szécsi | s****4@s****u | 50 |
| and 403 more... | | |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 507
- Total pull requests: 4,369
- Average time to close issues: 7 months
- Average time to close pull requests: 15 days
- Total issue authors: 372
- Total pull request authors: 203
- Average comments per issue: 2.54
- Average comments per pull request: 2.04
- Merged pull requests: 3,220
- Bot issues: 0
- Bot pull requests: 84
Past Year
- Issues: 159
- Pull requests: 1,466
- Average time to close issues: 15 days
- Average time to close pull requests: 7 days
- Issue authors: 112
- Pull request authors: 64
- Average comments per issue: 1.1
- Average comments per pull request: 2.0
- Merged pull requests: 1,068
- Bot issues: 0
- Bot pull requests: 43
Top Authors
Issue Authors
- kujaska (9)
- data-han (7)
- jmcorreia (7)
- MarcelBeining (7)
- victorgrcp (7)
- Erua-chijioke (6)
- Chr96er (6)
- franciskuttivelil (5)
- jschra (5)
- leodrivera (5)
- itaise (5)
- satniks (4)
- gerileka (4)
- VolkovGeoPhy (4)
- tyler-hoffman (4)
Pull Request Authors
- cdkini (625)
- tyler-hoffman (599)
- Kilo59 (330)
- NathanFarmer (296)
- joshua-stauffer (295)
- billdirks (292)
- kwcanuck (184)
- Rachel-Reverie (174)
- klavavej (162)
- Shinnnyshinshin (124)
- anthonyburdi (110)
- JessSaavedra (100)
- deborahniesz (86)
- dependabot[bot] (64)
- TrangPham (54)
Packages
- Total packages: 7
- Total downloads:
  - pypi 22,659,103 last-month
- Total docker downloads: 6,750,002
- Total dependent packages: 62 (may contain duplicates)
- Total dependent repositories: 286 (may contain duplicates)
- Total versions: 1,033
- Total maintainers: 11
pypi.org: great-expectations
Always know what to expect from your data.
- Homepage: https://greatexpectations.io
- Documentation: https://great-expectations.readthedocs.io/
- License: Apache-2.0
- Latest release: 1.5.10, published 6 months ago
Maintainers (8)
pypi.org: great-expectations-experimental
Always know what to expect from your data.
- Homepage: https://github.com/great-expectations/great_expectations
- Documentation: https://great-expectations-experimental.readthedocs.io/
- License: Apache-2.0
- Latest release: 0.1.20240917055, published over 1 year ago
Maintainers (4)
proxy.golang.org: github.com/great-expectations/great_expectations
- Documentation: https://pkg.go.dev/github.com/great-expectations/great_expectations#section-documentation
- License: apache-2.0
- Latest release: v0.7.10, published over 6 years ago
conda-forge.org: great-expectations
Great Expectations helps teams save time and promote analytic integrity by offering a unique approach to automated testing: pipeline tests. Pipeline tests are applied to data (instead of code) and at batch time (instead of compile or deploy time). Pipeline tests are like unit tests for datasets: they help you guard against upstream data changes and monitor data quality. Software developers have long known that automated testing is essential for managing complex codebases. Great Expectations brings the same discipline, confidence, and acceleration to data science and engineering teams.
- Homepage: https://github.com/great-expectations/great_expectations
- License: Apache-2.0
- Latest release: 0.15.32, published over 3 years ago
pypi.org: great-expectations-cta
Always know what to expect from your data.
- Homepage: https://github.com/great-expectations/great_expectations
- Documentation: https://great-expectations-cta.readthedocs.io/
- License: Apache-2.0
- Latest release: 0.15.43, published about 3 years ago
Maintainers (1)
pypi.org: acryl-great-expectations
Always know what to expect from your data.
- Homepage: https://github.com/great-expectations/great_expectations
- Documentation: https://acryl-great-expectations.readthedocs.io/
- License: Apache-2.0
- Latest release: 0.15.50.1, published 10 months ago
Maintainers (2)
anaconda.org: great-expectations
Great Expectations helps teams save time and promote analytic integrity by offering a unique approach to automated testing: pipeline tests. Pipeline tests are applied to data (instead of code) and at batch time (instead of compile or deploy time). Pipeline tests are like unit tests for datasets: they help you guard against upstream data changes and monitor data quality. Software developers have long known that automated testing is essential for managing complex codebases. Great Expectations brings the same discipline, confidence, and acceleration to data science and engineering teams.
- Homepage: https://github.com/great-expectations/great_expectations
- License: Apache-2.0
- Latest release: 1.4.1, published 10 months ago