common-data-format-validator
JSON Schema Validation for the Soccer Common Data Format
https://github.com/unravelsports/common-data-format-validator
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.7%) to scientific vocabulary
Repository
JSON Schema Validation for the Soccer Common Data Format
Basic Info
- Host: GitHub
- Owner: UnravelSports
- License: mit
- Language: Python
- Default Branch: main
- Size: 110 KB
Statistics
- Stars: 9
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
⚽ Common Data Format Schema Validator
JSON and JSONLines Schema Validation for the Soccer Common Data Format.
Anzer, G., Arnsmeyer, K., Bauer, P., Bekkers, J., Brefeld, U., Davis, J., Evans, N., Kempe, M., Robertson, S. J., Smith, J. W., & Van Haaren, J. (2025). Common Data Format (CDF)—a Standardized Format for Match-Data in Football (Soccer). [Unpublished manuscript / Preprint].
Changelog
See CHANGELOG.md
How To
1. Install package
pip install common-data-format-validator
2. Create your own schema
Create your data schema according to the Common Data Format specifications for any of:
- Official Match Data
- Meta Data
- Event Data
- Tracking Data
- Skeletal Tracking Data
3. Test your schema
Once you have created your schema, you can check its validity using the SchemaValidator available for each of the data types mentioned above.
```python
import cdf

# Example valid tracking data
validator = cdf.TrackingSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/tracking.jsonl")

# Example valid meta data
validator = cdf.MetaSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/meta.json")

# Example valid event data
validator = cdf.EventSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/event.jsonl")

# Example valid match data
validator = cdf.MatchSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/match.json")

# Example valid skeletal data
validator = cdf.SkeletalSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/skeletal.jsonl")

# Example valid video data
validator = cdf.VideoSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/video.json")
```
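When several files need checking at once, the calls above can be wrapped in a small loop. The following is a minimal sketch, not part of the package: `run_validators` is a hypothetical helper, and it assumes that a failed validation surfaces as a raised exception, which the README does not state. If the validators report problems some other way (for example by printing), the `try`/`except` would need adapting.

```python
from typing import Callable, Dict, List, Tuple


def run_validators(
    jobs: Dict[str, Tuple[Callable[..., object], str]]
) -> List[Tuple[str, str]]:
    """Call each validate function on its sample path.

    Returns a list of (job name, error message) for every job that raised.
    Hypothetical helper; assumes failures are reported as exceptions.
    """
    failures = []
    for name, (validate, sample_path) in jobs.items():
        try:
            validate(sample=sample_path)
        except Exception as exc:  # assumption: any exception means "invalid"
            failures.append((name, f"{type(exc).__name__}: {exc}"))
    return failures


# Possible usage with the package (left as comments so the sketch runs
# without the package installed):
# import cdf
# base = f"cdf/files/v{cdf.VERSION}/sample"
# jobs = {
#     "tracking": (cdf.TrackingSchemaValidator().validate_schema, f"{base}/tracking.jsonl"),
#     "meta": (cdf.MetaSchemaValidator().validate_schema, f"{base}/meta.json"),
# }
# print(run_validators(jobs))
```

Passing the bound `validate_schema` methods keeps the helper independent of the concrete validator classes.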
Note
The validator checks:
- All mandatory fields are provided
- Keys and values adhere to snake case (except for player names, city names, venue names etc.)
- Data types are correct (e.g. boolean, integer etc.)
- Value entries for specific fields are correct (e.g. period type can only be one of 5 values)
- Position groups and positions follow naming conventions
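To make the snake case rule concrete, here is a minimal sketch of such a check. It is not the validator's own implementation: the regular expression (lowercase letters and digits separated by single underscores) is an assumption about what counts as snake case, and the field names in the usage note are made up for illustration.

```python
import re

# Assumed definition: lowercase alphanumeric words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")


def is_snake_case(text: str) -> bool:
    """Return True if the whole string matches the assumed snake case pattern."""
    return SNAKE_CASE.fullmatch(text) is not None


def non_snake_keys(record, path: str = "") -> list:
    """Recursively collect paths of dictionary keys that are not snake case."""
    bad = []
    if isinstance(record, dict):
        for key, value in record.items():
            here = f"{path}.{key}" if path else key
            if not is_snake_case(key):
                bad.append(here)
            bad.extend(non_snake_keys(value, here))
    elif isinstance(record, list):
        for index, item in enumerate(record):
            bad.extend(non_snake_keys(item, f"{path}[{index}]"))
    return bad
```

For a hypothetical record such as `{"match_id": 1, "homeTeam": {"team_name": "x"}}`, `non_snake_keys` reports `homeTeam` and leaves the other keys alone.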
The validator (currently) does not check:
- Correct JSONLines line separator ('\n')
- Correct UTF-8 encoding
- Correct pitch dimensions
- British spelling
- Color codes are hex (e.g. #FFC107)
- Whether player_ids (or other ids) in meta also appear in tracking, event etc., or vice versa
- Position labels fit within the formation specifications
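Some of the unchecked items can be covered with a few lines of standard-library Python before running the validator. The sketch below is an illustration under stated assumptions, not part of the package: `is_hex_color` and `check_jsonl_bytes` are hypothetical helpers, and requiring the six-digit `#RRGGBB` form is an assumption based on the `#FFC107` example.

```python
import json
import re

# Assumption: colors use the six-digit form, as in the example #FFC107.
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def is_hex_color(value: str) -> bool:
    """Return True for strings like '#FFC107'."""
    return HEX_COLOR.fullmatch(value) is not None


def check_jsonl_bytes(raw: bytes) -> list:
    """Return a list of problems found in raw JSON Lines content."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc.reason} at byte {exc.start}"]

    problems = []
    if "\r" in text:
        problems.append("carriage return found; lines should end with '\\n' only")
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # skip blank lines, including the one after a final '\n'
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
    return problems
```

Reading the file in binary mode (`open(path, "rb").read()`) and passing the bytes to `check_jsonl_bytes` keeps the encoding and separator checks independent of the platform's default text handling.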
Current Version of Common Data Format
This validator currently relies on CDF "alpha" version 2, but also includes all logical changes not yet reflected in the text of that version, as discussed in the Changelog.
Software by Joris Bekkers
Owner
- Login: UnravelSports
- Kind: user
- Repositories: 2
- Profile: https://github.com/UnravelSports
Citation (CITATION.cff)
cff-version: 1.2.0
authors:
  - family-names: "Anzer"
    given-names: "Gabriel"
    affiliation: "RB Leipzig, Leipzig, Germany"
  - family-names: "Arnsmeyer"
    given-names: "Kilian"
    affiliation: "Deutscher Fußball-Bund (DFB), Frankfurt, Germany"
  - family-names: "Bauer"
    given-names: "Pascal"
    affiliation: "Deutscher Fußball-Bund (DFB), Frankfurt, Germany; Saarland University, Saarbrücken, Germany"
  - family-names: "Bekkers"
    given-names: "Joris"
    affiliation: "U.S. Soccer Federation, Chicago, USA; UnravelSports, Breda, Netherlands; PySport, Eindhoven, Netherlands"
  - family-names: "Brefeld"
    given-names: "Ulf"
    affiliation: "Leuphana University, Lüneburg, Germany"
  - family-names: "Davis"
    given-names: "Jesse"
    affiliation: "KU Leuven, Leuven.AI, & LISS, Heverlee, Belgium"
  - family-names: "Evans"
    given-names: "Nicolas"
    affiliation: "FIFA, Zurich, Switzerland"
  - family-names: "Kempe"
    given-names: "Matthias"
    affiliation: "University of Groningen, Groningen, Netherlands"
  - family-names: "Robertson"
    given-names: "Samuel J"
    affiliation: "FIFA, Zurich, Switzerland"
  - family-names: "Smith"
    given-names: "Joshua Wyatt"
    affiliation: "Wyatt AI Inc., Montreal, Canada; Concordia University, Montreal, Canada"
  - family-names: "Van Haaren"
    given-names: "Jan"
    affiliation: "KU Leuven, Leuven.AI, & LISS, Heverlee, Belgium; Club Brugge, Brugge, Belgium"
title: "Common Data Format (CDF)—a Standardized Format for Match-Data in Football (Soccer)"
version: 0.2.1
date-released: 2025
repository-code: "https://github.com/UnravelSports/common-data-format-validator"
keywords:
  - football
  - soccer
  - data format
  - standardization
  - match data
license: MIT
GitHub Events
Total
- Release event: 1
- Watch event: 8
- Push event: 1
- Public event: 1
- Pull request event: 1
- Create event: 1
Last Year
- Release event: 1
- Watch event: 8
- Push event: 1
- Public event: 1
- Pull request event: 1
- Create event: 1
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 4 minutes
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 4 minutes
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- UnravelSports (2)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 32 last month (PyPI)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
pypi.org: common-data-format-validator
A package for validating common data format files
- Homepage: https://github.com/unravelsports/common-data-format-validator
- Documentation: https://common-data-format-validator.readthedocs.io/
- License: mit
- Latest release: 0.0.4 (published 9 months ago)
Rankings
Maintainers (1)
Dependencies
- jsonlines ==4.0.0
- jsonschema ==4.23.0
- jsonschema-specifications ==2024.10.1
- requests ==2.32.3
- setuptools ==79.0.0
- jsonlines ==4.0.0
- jsonschema ==4.23.0
- jsonschema-specifications ==2024.10.1
- requests ==2.32.3