frictionless
Data management framework for Python that provides functionality to describe, extract, validate, and transform tabular data
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: frictionlessdata
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://framework.frictionlessdata.io
- Size: 136 MB
Statistics
- Stars: 768
- Watchers: 29
- Forks: 155
- Open Issues: 221
- Releases: 0
Metadata Files
README.md
frictionless-py
Migrating from an older version? Please read the **[v5](blog/2022/08-22-frictionless-framework-v5.html)** announcement and migration guide.
Data management framework for Python that provides functionality to describe, extract, validate, and transform tabular data (DEVT Framework). It supports a wide range of data sources and formats and provides integrations with popular platforms. The framework is powered by the lightweight yet comprehensive Frictionless Standards.
Purpose
- Describe your data: You can infer, edit, and save metadata for your data tables. This is the first step toward ensuring data quality and usability. Frictionless metadata includes general information about your data, such as a textual description, as well as field types and other tabular data details.
- Extract your data: You can read your data using a unified tabular interface. Data quality and consistency are guaranteed by a schema. Frictionless supports various file schemes, such as HTTP, FTP, and S3, and data formats such as CSV, XLS, JSON, SQL, and others.
- Validate your data: You can validate data tables, resources, and datasets. Frictionless generates a unified validation report and supports many options to customize the validation process.
- Transform your data: You can clean, reshape, and transfer your data tables and datasets. Frictionless provides a pipeline capability and a lower-level interface to work with the data.
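To give a feel for what the "describe" step means, here is a toy sketch of field-type inference written with only the Python standard library. It is not the Frictionless implementation or its API: the helper names `infer_type` and `describe_csv`, the sample data, and the simplified type names are assumptions made for illustration.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type name that fits every non-empty value."""
    present = [v for v in values if v != ""]
    if not present:
        return "any"
    # Try the narrowest type first, then widen.
    for name, cast in (("integer", int), ("number", float)):
        try:
            for v in present:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


def describe_csv(text):
    """Build a minimal schema-like dict from CSV text (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    fields = []
    for i, name in enumerate(header):
        column = [row[i] for row in body if i < len(row)]
        fields.append({"name": name, "type": infer_type(column)})
    return {"fields": fields}


# Hypothetical sample table, not taken from the project.
SAMPLE = "id,name,price\n1,apple,0.5\n2,pear,1\n"
schema = describe_csv(SAMPLE)
print(schema)
```

The framework itself handles many more types, formats, and edge cases; the sketch only shows the idea of deriving metadata from the data.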
Features
- Open Source (MIT)
- Powerful Python framework
- Convenient command-line interface
- Low memory consumption for data of any size
- Reasonable performance on big data
- Support for compressed files
- Custom checks and formats
- Fully pluggable architecture
- More than 1,000 tests
Installation
```bash
$ pip install frictionless
```
Example
```bash
$ frictionless validate data/invalid.csv
[invalid] data/invalid.csv

  row  field  code              message
           3  blank-header      Header in field at position "3" is blank
           4  duplicate-header  Header "name" in field "4" is duplicated
    2      3  missing-cell      Row "2" has a missing cell in field "field3"
    2      4  missing-cell      Row "2" has a missing cell in field "name2"
    3      3  missing-cell      Row "3" has a missing cell in field "field3"
    3      4  missing-cell      Row "3" has a missing cell in field "name2"
    4         blank-row         Row "4" is completely blank
    5      5  extra-cell        Row "5" has an extra value in field "5"
```
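To make the error codes in that report concrete, the following standard-library sketch runs a few comparable structural checks. It is an illustration only, not how Frictionless implements validation: the function `check_table` is hypothetical, and the sample text is an assumed input that would plausibly lead to such a report, not necessarily the contents of `data/invalid.csv`.

```python
import csv
import io


def check_table(text):
    """Return (row, field, code) tuples for a few structural problems."""
    rows = list(csv.reader(io.StringIO(text)))
    header = rows[0]
    errors = []

    # Header checks: blank and duplicated labels (positions are 1-based).
    seen = set()
    for pos, label in enumerate(header, start=1):
        if label == "":
            errors.append((None, pos, "blank-header"))
        elif label in seen:
            errors.append((None, pos, "duplicate-header"))
        seen.add(label)

    # Row checks: the header counts as row 1, so data rows start at 2.
    for row_no, cells in enumerate(rows[1:], start=2):
        if all(cell == "" for cell in cells):
            errors.append((row_no, None, "blank-row"))
            continue
        for pos in range(len(cells) + 1, len(header) + 1):
            errors.append((row_no, pos, "missing-cell"))
        for pos in range(len(header) + 1, len(cells) + 1):
            errors.append((row_no, pos, "extra-cell"))
    return errors


# Assumed sample input, chosen to trigger every check above.
SAMPLE = "id,name,,name\n1,english\n1,english\n\n2,german,1,2,3\n"
errors = check_table(SAMPLE)
for row, field, code in errors:
    print(row, field, code)
```

Each tuple corresponds to one line of a report like the one above, with `None` standing in for an empty row or field column.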
Documentation
Please visit our documentation portal: https://framework.frictionlessdata.io
Owner
- Name: Frictionless Data
- Login: frictionlessdata
- Kind: organization
- Location: Internet
- Website: http://frictionlessdata.io/
- Twitter: frictionlessd8a
- Repositories: 126
- Profile: https://github.com/frictionlessdata
Lightweight specifications and software to shorten the path from data to insight. Code of Conduct: https://frictionlessdata.io/code-of-conduct/
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: 'frictionless: Python library for Data Packages'
message: >-
  To cite the Frictionless Python Framework in publications
  please use:
type: software
authors:
  - given-names: Evgeny
    family-names: Karev
    affiliation: Datist
    email: eskarev@gmail.com
  - given-names: Pierre
    family-names: Camilleri
    affiliation: multi.coop
    email: pierre.camilleri@multi.coop
  - given-names: Vitor
    family-names: Baptista
    affiliation: Fiquem Sabendo
  - given-names: Georgiana
    family-names: Bere
  - given-names: Andrea
    family-names: Borruso
    affiliation: OnData
  - given-names: Peter
    family-names: Desmet
    orcid: 'https://orcid.org/0000-0002-8442-8025'
    affiliation: Research Institute for Nature and Forest (INBO)
  - given-names: Shashi
    family-names: Gharti
    affiliation: Robust IT Concepts
  - given-names: Augusto
    family-names: Herrmann
    affiliation: >-
      Ministry of Management and Innovation in Public
      Services in Brazil
  - given-names: Adam
    family-names: Kariv
    affiliation: While True Industries
  - given-names: Chris
    family-names: Shaw
    affiliation: Democracy Club
  - given-names: Paul
    family-names: Walsh
    affiliation: LinkDigital
  - given-names: Lilly
    family-names: Winfree
    affiliation: Anaconda, Inc.
    orcid: 'https://orcid.org/0000-0001-7120-8536'
  - given-names: Edgar
    family-names: Zanella Alvarenga
    affiliation: Digi Sapiens
  - given-names: Jesper
    family-names: Zedlitz
    orcid: 'https://orcid.org/0000-0003-2664-5010'
  - name: Open Knowledge Foundation
    city: London
    country: GB
  - given-names: Sara
    family-names: Petti
    affiliation: Open Knowledge Foundation
    email: sara.petti@okfn.org
identifiers:
  - type: doi
    value: https://doi.org/10.5281/zenodo.4663759
repository: 'https://pypi.org/project/frictionless/'
repository-code: 'https://github.com/frictionlessdata/frictionless-py'
url: 'https://framework.frictionlessdata.io/'
abstract: >-
  Data management framework for Python that provides
  functionality to describe, extract, validate, and
  transform tabular data (DEVT Framework). It supports a
  great deal of data sources and formats, as well as
  provides popular platforms integrations. The framework is
  powered by the lightweight yet comprehensive Frictionless
  Data Package (https://datapackage.org/).
license: MIT
GitHub Events
Total
- Create event: 11
- Commit comment event: 1
- Release event: 1
- Issues event: 48
- Watch event: 54
- Delete event: 7
- Issue comment event: 63
- Push event: 54
- Pull request review event: 21
- Pull request review comment event: 18
- Pull request event: 24
- Fork event: 7
Last Year
- Create event: 11
- Commit comment event: 1
- Release event: 1
- Issues event: 48
- Watch event: 54
- Delete event: 7
- Issue comment event: 63
- Push event: 54
- Pull request review event: 21
- Pull request review comment event: 18
- Pull request event: 24
- Fork event: 7
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 17
- Total pull requests: 9
- Average time to close issues: 3 months
- Average time to close pull requests: 4 days
- Total issue authors: 12
- Total pull request authors: 5
- Average comments per issue: 0.29
- Average comments per pull request: 0.22
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 16
- Pull requests: 9
- Average time to close issues: 10 days
- Average time to close pull requests: 4 days
- Issue authors: 11
- Pull request authors: 5
- Average comments per issue: 0.31
- Average comments per pull request: 0.22
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- megin1989 (16)
- pierrecamilleri (11)
- amelie-rondot (5)
- jze (4)
- richardt-engineb (3)
- diego-oncoramedical (3)
- pdelboca (2)
- fjuniorr (2)
- dafeder (2)
- mingjiecn (2)
- ebAbhay (1)
- davidgasquez (1)
- adrien-owkin (1)
- samqi (1)
- paulgirard (1)
Pull Request Authors
- pierrecamilleri (14)
- amelie-rondot (7)
- dependabot[bot] (6)
- jze (5)
- roll (3)
- afuetterer (3)
- richardt-engineb (2)
- barbuz (1)
- hansendx (1)
- sapetti9 (1)
- lwjohnst86 (1)
- areleu (1)
- megin1989 (1)
- pdelboca (1)
- LincolnPuzey (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: unknown
- Total dependent packages: 6
- Total dependent repositories: 2
- Total versions: 73
conda-forge.org: frictionless
- Homepage: http://github.com/frictionlessdata/frictionless-py
- License: MIT
- Latest release: 4.40.8 (published over 3 years ago)
Rankings
Dependencies
- actions/checkout v2 composite
- actions/setup-python v3 composite
- actions/setup-python v2 composite
- codecov/codecov-action v2 composite
- pypa/gh-action-pypi-publish release/v1 composite
- softprops/action-gh-release v1 composite
- stefanzweifel/git-auto-commit-action v4 composite
- mysql 8 docker
- postgres 12 docker
- leonsteinhaeuser/project-beta-automations v1.2.1 composite
- ubuntu 22.04 build