Science Score: 75.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Committers with academic emails
- ✓ Institutional organization owner: Organization sscu-budapest has institutional domain (sscu-budapest.github.io)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.9%) to scientific vocabulary
Repository
Data artifact orchestrator
Basic Info
- Host: GitHub
- Owner: sscu-budapest
- License: mit
- Language: Python
- Default Branch: main
- Size: 765 KB
Statistics
- Stars: 3
- Watchers: 0
- Forks: 2
- Open Issues: 3
- Releases: 38
Metadata Files
README.md
datazimmer
To create a new project
- make sure that `python` points to `python>=3.8` and that you have `pip` and `git`, then run `pip install datazimmer`
- run `dz init project-name`
  - this pulls project-template
- add a remote
  - both to git and dvc (you can run `dz build-meta` to see the available dvc remotes)
  - the git remote can be given with `dz init`
- create, register and document the steps of a pipeline that you will run in different environments
- build metadata into an exportable, serialized format with `dz build-meta`
  - if you defined importable data from other artifacts in the config, you can import it with `load-external-data`
  - ensure that you import envs that are served from sources you have access to
- build and run pipeline steps with `dz run`
- validate that the data matches the datascript description with `dz validate`
Scheduling
- a project as a whole has a cron expression in `zimmer.yaml` to determine the schedule of reruns
- additionally, aswan projects within the dz project can have different cron expressions for scheduling new runs of the aswan projects
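As an illustration of what such a schedule encodes, here is a minimal sketch that splits a cron expression into its five standard fields. It assumes the conventional five-field cron syntax; the helper `split_cron` and both example expressions are hypothetical and are not part of datazimmer.

```python
from typing import Dict

# Field order of a conventional five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> Dict[str, str]:
    """Map each whitespace-separated cron field to its name (hypothetical helper)."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# Hypothetical project-level schedule: Mondays at 03:00.
project_schedule = split_cron("0 3 * * 1")
# Hypothetical schedule for a nested aswan project: every 30 minutes.
aswan_schedule = split_cron("*/30 * * * *")

print(project_schedule)
print(aswan_schedule)
```

The two expressions differ on purpose, mirroring the point above that a nested aswan project may run on a different schedule from the project that contains it.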
Test projects
TODO: document dogshow and everything else much better here
Lookahead
- overlapping names convention
- resolve naming confusion with colassigner, colaccessor and table feature / composite type / index base classes
- abstract composite type + subclass of entity class
- import ACT, inherit from it and specify
- importing composite type is impossible now if it contains foreign key :(
- add option to infer data type of assigned feature
- can be problematic b/c pandas int/float/nan issue
- create similar sets of features in a dry way
- overlapping in entities
- detect / signal the same type of entity
- exports: postgres, postgis, superset
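The pandas int/float/nan issue mentioned in the list above can be reproduced in a few lines. This sketch uses only plain pandas and none of datazimmer's own classes; it shows why inferring a feature's data type from assigned values can be misleading.

```python
import pandas as pd

# An integer column keeps an integer dtype while no value is missing.
complete = pd.Series([1, 2, 3])

# A single missing value makes pandas upcast the whole column to float64,
# so dtype inference would report a float feature for integer data.
with_gap = pd.Series([1, None, 3])

# The nullable "Int64" extension dtype keeps the integers and stores
# the missing value as pd.NA instead of NaN.
nullable = pd.Series([1, None, 3], dtype="Int64")

print(complete.dtype, with_gap.dtype, nullable.dtype)
```

An inference option would therefore need a rule for columns that are integer-valued apart from missing entries, for example by preferring the nullable dtype.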
W3C compliancy plan
- test suite for compliance: https://w3c.github.io/csvw/publishing-snapshots/PR-earl/earl.html
- https://github.com/w3c/csvw
- https://www.w3.org/TR/2015/REC-tabular-data-model-20151217/
- https://www.w3.org/TR/tabular-metadata/
@article{tennison2015model,
title={Model for tabular data and metadata on the web},
author={Tennison, Jeni and Kellogg, Gregg and Herman, Ivan},
year={2015}
}
@article{pollock2015metadata,
title={Metadata vocabulary for tabular data},
author={Pollock, Rufus and Tennison, Jeni and Kellogg, Gregg and Herman, Ivan},
journal={W3C Recommendation},
volume={17},
year={2015}
}
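For orientation, the sketch below builds a minimal table description using the vocabulary of the W3C tabular metadata recommendation linked above (`@context`, `url`, `tableSchema`, `columns`, `primaryKey`). The helper `csvw_table_metadata`, the file name and the column names are hypothetical; nothing here reflects how datazimmer serializes its metadata today.

```python
import json
from typing import Dict, List, Optional, Tuple


def csvw_table_metadata(
    csv_url: str,
    columns: List[Tuple[str, str]],
    primary_key: Optional[str] = None,
) -> Dict:
    """Build a minimal CSVW table description (hypothetical helper)."""
    schema: Dict = {
        "columns": [
            {"name": name, "titles": name, "datatype": datatype}
            for name, datatype in columns
        ]
    }
    if primary_key is not None:
        schema["primaryKey"] = primary_key
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,
        "tableSchema": schema,
    }


# Hypothetical table with an integer key and a string feature.
metadata = csvw_table_metadata(
    "dogs.csv",
    [("dog_id", "integer"), ("name", "string")],
    primary_key="dog_id",
)
print(json.dumps(metadata, indent=2))
```

A document of this shape is what the compliance test suite linked above would consume, alongside the CSV file it describes.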
Owner
- Name: Social Science Computing Unit Budapest
- Login: sscu-budapest
- Kind: organization
- Email: borza.endre@krtk.mta.hu
- Website: https://sscu-budapest.github.io/
- Repositories: 13
- Profile: https://github.com/sscu-budapest
Citation (CITATION.cff)
cff-version: 1.2.0
message: If you use this software, please cite it as below.
url: https://github.com/sscu-budapest/datazimmer
authors:
  - family-names: Borza
    given-names: Endre Márk
    orcid: https://orcid.org/0000-0002-8804-4520
  - family-names: Kovács
    given-names: Bence
    orcid: https://orcid.org/0000-0002-2225-9895
  - family-names: Pap
    given-names: Sebestyén
    orcid: https://orcid.org/0000-0002-1987-845X
title: sscu-budapest/datazimmer
version: 0.5.4
date-released: 2023-07-25
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 724
- Total Committers: 3
- Avg Commits per committer: 241.333
- Development Distribution Score (DDS): 0.012
Top Committers
| Name | Email | Commits |
|---|---|---|
| Endre Márk Borza | e****a@g****m | 715 |
| papsebestyen | p****n@g****m | 7 |
| kbenya | k****5@g****m | 2 |
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 2
- Total pull requests: 8
- Average time to close issues: N/A
- Average time to close pull requests: 3 days
- Total issue authors: 1
- Total pull request authors: 3
- Average comments per issue: 0.0
- Average comments per pull request: 0.75
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- endremborza (2)
Pull Request Authors
- endremborza (4)
- papsebestyen (3)
- renovate[bot] (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 1,529 last month
- Total dependent packages: 0
- Total dependent repositories: 4
- Total versions: 38
- Total maintainers: 1
pypi.org: datazimmer
sscu-budapest utilities for scientific data engineering
- Homepage: https://github.com/sscu-budapest/datazimmer
- Documentation: https://datazimmer.readthedocs.io/
- License: mit
- Latest release: 0.5.3 (published over 2 years ago)
Maintainers (1)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- codecov/codecov-action v3 composite
- postgres * docker
- actions/checkout v3 composite
- actions/setup-python v4 composite
- colassigner >=0.2.2
- cookiecutter *
- flit *
- metazimmer *
- pandas >=2.0.1
- parquetranger >=0.2.3
- pip >=22.0.0
- pyyaml *
- setuptools >=60.0.0
- sqlalchemy >=2.0.0
- sqlmermaid *
- structlog *
- toml *
- typer *
- wheel >=0.37.0
- zimmauth >=0.1.0