common-data-format-validator

JSON Schema Validation for the Soccer Common Data Format

https://github.com/unravelsports/common-data-format-validator

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.7%) to scientific vocabulary
Last synced: 6 months ago

Repository

JSON Schema Validation for the Soccer Common Data Format

Basic Info
  • Host: GitHub
  • Owner: UnravelSports
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 110 KB
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created 10 months ago · Last pushed 9 months ago
Metadata Files
Readme Changelog License Citation

README.md

⚽ Common Data Format Schema Validator

JSON and JSONLines Schema Validation for the Soccer Common Data Format.

Anzer, G., Arnsmeyer, K., Bauer, P., Bekkers, J., Brefeld, U., Davis, J., Evans, N., Kempe, M., Robertson, S. J., Smith, J. W., & Van Haaren, J. (2025). Common Data Format (CDF)—a Standardized Format for Match-Data in Football (Soccer). [Unpublished manuscript / Preprint].


Changelog

See CHANGELOG.md


How To

1. Install package

pip install common-data-format-validator

2. Create your own schema

Create your data schema according to the Common Data Format specifications for any of:

- Official Match Data
- Meta Data
- Event Data
- Tracking Data
- Skeletal Tracking Data

3. Test your schema

Once you have created your schema, you can check its validity using the SchemaValidator available for each of the above-mentioned data types.

```python
import cdf

# Example valid tracking data
validator = cdf.TrackingSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/tracking.jsonl")

# Example valid meta data
validator = cdf.MetaSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/meta.json")

# Example valid event data
validator = cdf.EventSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/event.jsonl")

# Example valid match data
validator = cdf.MatchSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/match.json")

# Example valid skeletal data
validator = cdf.SkeletalSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/skeletal.jsonl")

# Example valid video data
validator = cdf.VideoSchemaValidator()
validator.validate_schema(sample=f"cdf/files/v{cdf.VERSION}/sample/video.json")
```
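
To check your own files instead of the bundled samples, the same call can be pointed at your own paths. The following is a minimal sketch, assuming `validate_schema` accepts any file path through `sample` as in the examples above. The `validate_files` helper and the `my_data/` paths are hypothetical and not part of the package. How `validate_schema` reports failures is not described here, so the sketch does not try to catch or interpret them.

```python
from pathlib import Path


def validate_files(files, validators):
    """Validate each existing file with the validator registered for its data type.

    files:      maps a data type (e.g. "meta") to a file path
    validators: maps the same data types to objects exposing
                validate_schema(sample=...), as the cdf validators do
    Returns the data types whose file was not found and was therefore skipped.
    """
    skipped = []
    for kind, path in files.items():
        if not Path(path).is_file():
            skipped.append(kind)
            continue
        validators[kind].validate_schema(sample=str(path))
    return skipped


# Usage with the package (hypothetical paths, adjust to your own files):
#
#   import cdf
#   validate_files(
#       files={"meta": "my_data/meta.json",
#              "tracking": "my_data/tracking.jsonl"},
#       validators={"meta": cdf.MetaSchemaValidator(),
#                   "tracking": cdf.TrackingSchemaValidator()},
#   )
```

Passing the validators in as a mapping keeps the helper independent of which data types you actually produce.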


Note

The validator checks:

- All mandatory fields are provided
- Snake case is adhered to for each key and for values (except for player names, city names, venue names etc.)
- Data types are correct (e.g. boolean, integer etc.)
- Value entries for specific fields are correct (e.g. period type can only be one of 5 values)
- Position groups and positions follow naming conventions
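
To illustrate what the snake case check on keys means in practice, here is a small standalone sketch. It is not the validator's own implementation: the `SNAKE_CASE` pattern and the `non_snake_case_keys` helper are assumptions made for illustration, and the sample record uses made-up field names. It covers keys only, not values.

```python
import re

# Assumed rule: lowercase letters or digits, words separated by single underscores
SNAKE_CASE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")


def non_snake_case_keys(obj, path=""):
    """Recursively collect the paths of dictionary keys that are not snake case."""
    bad = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            here = f"{path}.{key}" if path else key
            if SNAKE_CASE.fullmatch(key) is None:
                bad.append(here)
            bad.extend(non_snake_case_keys(value, here))
    elif isinstance(obj, list):
        for i, item in enumerate(obj):
            bad.extend(non_snake_case_keys(item, f"{path}[{i}]"))
    return bad


# Made-up record with two offending keys
record = {
    "frame_id": 1,
    "ball": {"isVisible": True},
    "players": [{"player_id": "a", "Team": "home"}],
}
print(non_snake_case_keys(record))  # ['ball.isVisible', 'players[0].Team']
```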

The validator (currently) does not check:

- Correct JSONLines line separator ('\n')
- Correct UTF-8 encoding
- Correct pitch dimensions
- British spelling
- Color codes are hex (e.g. #FFC107)
- If player_ids (or other ids) in meta are in tracking, event etc. or vice versa
- Position labels fit within the formation specifications
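
Since these points are not covered, you may want to check some of them yourself before sharing data. The sketch below shows one possible way to do so using only the Python standard library. All function names are hypothetical and none of this is part of the package. The id check takes plain lists of ids because the exact field names depend on your files.

```python
import re

# Six-digit hex color such as #FFC107
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def is_hex_color(value):
    """True if value is a string like '#FFC107'."""
    return isinstance(value, str) and HEX_COLOR.fullmatch(value) is not None


def is_utf8(raw: bytes) -> bool:
    """True if the raw file bytes decode as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def uses_newline_separator(raw: bytes) -> bool:
    """True if the raw bytes contain no carriage return.

    Simplification: any raw '\r' is treated as a wrong line separator.
    """
    return b"\r" not in raw


def unknown_ids(meta_ids, data_ids):
    """Ids that appear in tracking or event data but not in meta."""
    return sorted(set(data_ids) - set(meta_ids))


print(is_hex_color("#FFC107"))                          # True
print(uses_newline_separator(b'{"a": 1}\r\n{"a": 2}'))  # False
print(unknown_ids(["p1", "p2"], ["p1", "p3"]))          # ['p3']
```

Read the file in binary mode (`open(path, "rb")`) for the encoding and separator checks, so that Python does not translate line endings before you inspect them.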


Current Version of Common Data Format

This validator currently relies on CDF "alpha" version 2, but also includes all logical changes not yet reflected in the text of that version, as discussed in the Changelog.


Software by Joris Bekkers

Owner

  • Login: UnravelSports
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
authors:
  - family-names: "Anzer"
    given-names: "Gabriel"
    affiliation: "RB Leipzig, Leipzig, Germany"
  - family-names: "Arnsmeyer"
    given-names: "Kilian"
    affiliation: "Deutscher Fußball-Bund (DFB), Frankfurt, Germany"
  - family-names: "Bauer"
    given-names: "Pascal"
    affiliation: "Deutscher Fußball-Bund (DFB), Frankfurt, Germany; Saarland University, Saarbrücken, Germany"
  - family-names: "Bekkers"
    given-names: "Joris"
    affiliation: "U.S. Soccer Federation, Chicago, USA; UnravelSports, Breda, Netherlands; PySport, Eindhoven, Netherlands"
  - family-names: "Brefeld"
    given-names: "Ulf"
    affiliation: "Leuphana University, Lüneburg, Germany"
  - family-names: "Davis"
    given-names: "Jesse"
    affiliation: "KU Leuven, Leuven.AI, & LISS, Heverlee, Belgium"
  - family-names: "Evans"
    given-names: "Nicolas"
    affiliation: "FIFA, Zurich, Switzerland"
  - family-names: "Kempe"
    given-names: "Matthias"
    affiliation: "University of Groningen, Groningen, Netherlands"
  - family-names: "Robertson"
    given-names: "Samuel J"
    affiliation: "FIFA, Zurich, Switzerland"
  - family-names: "Smith"
    given-names: "Joshua Wyatt"
    affiliation: "Wyatt AI Inc., Montreal, Canada; Concordia University, Montreal, Canada"
  - family-names: "Van Haaren"
    given-names: "Jan"
    affiliation: "KU Leuven, Leuven.AI, & LISS, Heverlee, Belgium; Club Brugge, Brugge, Belgium"
title: "Common Data Format (CDF)—a Standardized Format for Match-Data in Football (Soccer)"
version: 0.2.1
date-released: 2025
repository-code: "https://github.com/UnravelSports/common-data-format-validator"
keywords:
  - football
  - soccer
  - data format
  - standardization
  - match data
license: MIT  

GitHub Events

Total
  • Release event: 1
  • Watch event: 8
  • Push event: 1
  • Public event: 1
  • Pull request event: 1
  • Create event: 1
Last Year
  • Release event: 1
  • Watch event: 8
  • Push event: 1
  • Public event: 1
  • Pull request event: 1
  • Create event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 4 minutes
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 4 minutes
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • UnravelSports (2)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 32 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
pypi.org: common-data-format-validator

A package for validating common data format files

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 32 Last month
Rankings
Dependent packages count: 9.2%
Average: 30.6%
Dependent repos count: 52.0%
Maintainers (1)
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • jsonlines ==4.0.0
  • jsonschema ==4.23.0
  • jsonschema-specifications ==2024.10.1
  • requests ==2.32.3
  • setuptools ==79.0.0
setup.py pypi
  • jsonlines ==4.0.0
  • jsonschema ==4.23.0
  • jsonschema-specifications ==2024.10.1
  • requests ==2.32.3