https://github.com/claromes/volleystats

🏐 Command-line tool to scrape volleyball statistics from Data Project Web Competition websites

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • ○ CITATION.cff file
  • ✓ codemeta.json file
    Found codemeta.json file
  • ○ .zenodo.json file
  • ○ DOI references
  • ○ Academic publication links
  • ○ Committers with academic emails
  • ○ Institutional organization owner
  • ○ JOSS paper metadata
  • ○ Scientific vocabulary similarity
    Low similarity (8.3%) to scientific vocabulary

Keywords

data-project data-volley python scraping scrapy sports-data volleyball
Last synced: 6 months ago · JSON representation

Repository

Basic Info
Statistics
  • Stars: 15
  • Watchers: 3
  • Forks: 1
  • Open Issues: 4
  • Releases: 14
Created over 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme License

README.md

Volley Stats

Command-line tool to scrape volleyball statistics from Data Project Web Competition websites.

Volley Stats exports data on volleyball matches and competitions to CSV for any organization that publishes its results through Data Project WCM. It collects the statistics of individual matches, lists the matches of a competition, and can automatically retrieve the statistics of every match in such a list.

It also documents the URL structure of Web Competition websites, which makes the identifiers (mID, ID, PID) easier to find, and lists the acronyms of the main entities that use Data Project Management.

This tool is not affiliated with Genius Sports Italy.

Installation

Requirement

  • Python 3.8+

$ pip install volleystats

Documentation

Extracted Data

  • Competition

    • Competition ID
    • Home Team
    • Guest Team
    • Home Points
    • Guest Points
    • Date
    • Stadium
  • Match

    • Match ID
    • Match date
    • Home Team
    • Guest Team
    • Coach
    • Stadium
    • Total Points
    • Break Points
    • Win-Lost
    • Total Serves
    • Serve Errors
    • Serve Points
    • Total Receptions
    • Reception Errors
    • Positive Pass Percentage (Pos%)
    • Excellent/Perfect Pass Percentage (Exc.%)
    • Total Attacks
    • Attack Errors
    • Blocked Attack
    • Attack Points (Exc.)
    • Attack Points Percentage (Exc.%)
    • Block Points

Usage

volleystats [--help] --fed FED (--match MATCH | --comp COMP | --batch CSV_FILE_PATH) [--pid PID] [--log]

  • --fed, -f: Federation Acronym (required)
  • --match, -m: Statistics of a single match (required unless --comp or --batch is provided)
  • --comp, -c: List of matches in a competition (required unless --match or --batch is provided)
  • --pid, -p: PID of the competition (optional, only valid together with --comp)
  • --batch, -b: Path to a CSV file with Match IDs, i.e. the Competition Matches output (required unless --match or --comp is provided)
  • --log, -l: View the logging during scraping
  • --help, -h: Show help message
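
The option rules above can be illustrated with a small `argparse` sketch. This is not the tool's actual source: the function names `build_parser` and `parse_args` are hypothetical, and rejecting `--pid` without `--comp` is an assumption drawn from the option descriptions rather than confirmed behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real source)."""
    parser = argparse.ArgumentParser(prog="volleystats")
    parser.add_argument("--fed", "-f", required=True, help="Federation Acronym")

    # Exactly one of --match, --comp or --batch must be given.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--match", "-m", help="Statistics of a single match")
    mode.add_argument("--comp", "-c", help="List of matches in a competition")
    mode.add_argument("--batch", "-b", metavar="CSV_FILE_PATH",
                      help="CSV file path with Match IDs")

    parser.add_argument("--pid", "-p", help="PID of the competition")
    parser.add_argument("--log", "-l", action="store_true",
                        help="View the logging during scraping")
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: --pid is only meaningful together with --comp.
    if args.pid is not None and args.comp is None:
        parser.error("--pid can only be used together with --comp")
    return args
```

`--help`/`-h` needs no explicit declaration here because `argparse` adds it by default.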

Match

$ volleystats --fed FED --match MATCH

Examples

  • Brazilian Volleyball Confederation

    • Data Project website: https://cbv-web.dataproject.com/MatchStatistics.aspx?mID=1623
    • Federation Acronym: CBV
    • Match ID: 1623
    • Command: $ volleystats --fed cbv --match 1623
    • Output files: data/cbv-1623-22-10-28-guest-baruerivolleyballclub.csv data/cbv-1623-22-10-28-home-fluminense.csv
  • Lithuanian Volleyball Federation

    • Data Project website: https://lvf-web.dataproject.com/MatchStatistics.aspx?mID=2093
    • Federation Acronym: LVF
    • Match ID: 2093
    • Command: $ volleystats --fed lvf --match 2093
    • Output files: data/lvf-2093-2022-11-23-guest-jonavossc.csv data/lvf-2093-2022-11-23-home-svaja-viktorija-lsu.csv

Competition Matches

$ volleystats --fed FED --comp COMP

Example

  • Brazilian Volleyball Confederation
    • Data Project website: https://cbv-web.dataproject.com/CompetitionMatches.aspx?ID=18
    • Federation Acronym: CBV
    • Competition ID: 18
    • Command: $ volleystats --fed cbv --comp 18
    • Output file: data/cbv-18-2022-2023-competition-matches.csv

Competition Matches with PID

Some competitions use a PID to distinguish between seasons, such as the regular season and the playoffs. To obtain the matches of each one separately, pass the PID along with the competition ID.

$ volleystats --fed FED --comp COMP --pid PID

Examples

  • Bundesliga
    • Data Project website: https://vbl-web.dataproject.com/CompetitionMatches.aspx?ID=162&PID=173
    • Federation Acronym: VBL
    • Competition ID: 162
    • PID: 173
    • Season: Regular
    • Command: $ volleystats --fed vbl --comp 162 --pid 173
    • Output file: data/vbl-162-173-2022-2023-competition-matches.csv
    • Data Project website: https://vbl-web.dataproject.com/CompetitionMatches.aspx?ID=162&PID=174
    • Federation Acronym: VBL
    • Competition ID: 162
    • PID: 174
    • Season: Playoffs
    • Command: $ volleystats --fed vbl --comp 162 --pid 174
    • Output file: data/vbl-162-174-2023-2023-competition-matches.csv
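
The federation acronym, competition ID and PID used in these commands can be read directly from a Data Project URL. The following is a minimal sketch using only the standard library; `identifiers_from_url` is a hypothetical helper, not part of volleystats, and it assumes the hostname follows the `<Fed_Acronym>-web.dataproject.com` pattern.

```python
from urllib.parse import parse_qs, urlparse


def identifiers_from_url(url: str) -> dict:
    """Extract the federation acronym and any mID/ID/PID values from a URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Assumed hostname pattern: <Fed_Acronym>-web.dataproject.com
    ids = {"fed": host.split("-web.", 1)[0]}

    query = parse_qs(parsed.query)
    for key in ("mID", "ID", "PID"):
        if key in query:
            ids[key] = query[key][0]
    return ids


if __name__ == "__main__":
    url = "https://vbl-web.dataproject.com/CompetitionMatches.aspx?ID=162&PID=173"
    ids = identifiers_from_url(url)
    print(f"volleystats --fed {ids['fed']} --comp {ids['ID']} --pid {ids['PID']}")
```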

Matches via Competition Matches file

$ volleystats --fed FED --batch CSV_FILE_PATH

Example

  • Brazilian Volleyball Confederation
    • Data Project website: https://cbv-web.dataproject.com/MatchStatistics.aspx?mID=ID
    • Federation Acronym: CBV
    • CSV file path (output of the Competition Matches): data/cbv-18-2022-2023-competition-matches.csv
    • Command: $ volleystats --fed cbv --batch data/cbv-18-2022-2023-competition-matches.csv
    • Output files: data/cbv-1623-22-10-28-guest-baruerivolleyballclub.csv data/cbv-1623-22-10-28-home-fluminense.csv data/cbv-1618-2022-11-01-guest-energis8sãocaetano.csv data/cbv-1618-2022-11-01-home-esporteclubepinheiros.csv data/cbv-1619-2022-11-01-guest-abelmodavolei.csv data/cbv-1619-2022-11-01-home-gerdauminas.csv ...
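
A batch run produces a home and a guest file per match, so it can help to pair them up afterwards. The sketch below groups output paths by Match ID. The filename pattern `<fed>-<match_id>-<date>-<home|guest>-<team>.csv` is inferred from the examples in this README rather than taken from the tool's source, and `group_by_match` is a hypothetical helper. Files that do not fit the pattern are skipped.

```python
import re
from collections import defaultdict

# Pattern inferred from the example output files; treat it as an assumption.
FILENAME_RE = re.compile(
    r"^(?P<fed>[a-z]+)-(?P<match_id>\d+)-(?P<date>[\d-]+?)-"
    r"(?P<side>home|guest)-(?P<team>.+)\.csv$"
)


def group_by_match(paths):
    """Map each Match ID to a {'home': path, 'guest': path} dictionary."""
    matches = defaultdict(dict)
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # drop the directory part
        found = FILENAME_RE.match(name)
        if found is None:
            continue  # not a per-team match file
        matches[found["match_id"]][found["side"]] = path
    return dict(matches)


if __name__ == "__main__":
    files = [
        "data/cbv-1623-22-10-28-guest-baruerivolleyballclub.csv",
        "data/cbv-1623-22-10-28-home-fluminense.csv",
        "data/lvf-2093-2022-11-23-home-svaja-viktorija-lsu.csv",
    ]
    for match_id, sides in group_by_match(files).items():
        print(match_id, sorted(sides))
```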

Help

$ volleystats --help

Log

$ volleystats --fed FED (--match MATCH | --comp COMP | --batch CSV_FILE_PATH) --log

Output messages

```
volleystats: started
volleystats: data/cbv-1623-22-10-28-home-fluminense.csv file was created
volleystats: data/cbv-1623-22-10-28-guest-baruerivolleyballclub.csv file was created
volleystats: finished
```

Data Project Web Competition URLs structure

  • Hostname: <Fed_Acronym>-web.dataproject.com

  • Pathnames and search parameters:

    • /MainHome
    • /History?ID=<Fed_ID>
    • /CompetitionHome?ID=<Category_ID> (the category can be, for example, Women, Men, Pro or Youth)
    • /CompetitionMatches?ID=<Competition_ID>&PID=<PID> (the PID can identify, for example, the regular season or the playoffs)
    • /MatchStatistics?mID=<Match_ID>&ID=<Competition_ID>
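
The structure above can be turned into a small URL builder. This is a sketch only: `build_url` is a hypothetical helper, the `https` scheme is taken from the example links in this README, and the pathname is passed through exactly as given (the example links elsewhere in this document carry an `.aspx` suffix, which the caller would need to include).

```python
from urllib.parse import urlencode


def build_url(fed: str, pathname: str, **params) -> str:
    """Assemble a Web Competition URL from an acronym, a pathname and search parameters."""
    host = f"{fed.lower()}-web.dataproject.com"
    url = f"https://{host}/{pathname.lstrip('/')}"
    # Skip parameters that were not supplied (e.g. an optional PID).
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    print(build_url("VBL", "/CompetitionMatches", ID=162, PID=173))
    print(build_url("cbv", "/MatchStatistics", mID=1623, ID=18))
```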

Federations, Confederations and Leagues Acronyms

European Volleyball

South American Volleyball

Troubleshooting

Match files collected from batch file

Some matches yield empty files, usually named <fed_acronym>-<match_id>-guest_stats.csv and <fed_acronym>-<match_id>-home_stats.csv. This typically happens when a match was canceled or entered incorrectly and has been hidden from the competition listing: it no longer appears on the page but is still present in the HTML, so the tool picks it up and writes an empty file. Such files can be ignored and deleted.

Some match data is published only as PDF, which the tool cannot scrape.
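
Empty files left by a batch run can be located with a short script. The sketch below is not part of volleystats. It assumes the output lives in a `data` directory, as in the examples, and that "empty" means a file with no non-blank lines, or (optionally) nothing beyond a header row. The helper names are hypothetical.

```python
from pathlib import Path


def find_empty_csv(data_dir="data", header_only_is_empty=True):
    """Return CSV files with no data rows (assumed definition of 'empty')."""
    limit = 1 if header_only_is_empty else 0
    empty = []
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            non_blank = sum(1 for line in handle if line.strip())
        if non_blank <= limit:
            empty.append(path)
    return empty


def delete_files(paths):
    """Delete the given files; review the list before calling this."""
    for path in paths:
        path.unlink()


if __name__ == "__main__":
    for path in find_empty_csv():
        print("empty:", path)
```

Listing the files first and deleting them in a separate step keeps the cleanup easy to review.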

Development

$ git clone git@github.com:claromes/volleystats.git

$ cd volleystats

$ pip install -r requirements.txt

$ pip install --editable .

Author

Claromes

Owner

  • Login: claromes
  • Kind: user

GitHub Events

Total
  • Watch event: 3
  • Fork event: 1
Last Year
  • Watch event: 3
  • Fork event: 1

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 95
  • Total Committers: 1
  • Avg Commits per committer: 95.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 48
  • Committers: 1
  • Avg Commits per committer: 48.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Claromes c****s@h****m 95
Committer Domains (Top 20 + Academic)
hey.com: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 14
  • Total pull requests: 14
  • Average time to close issues: 4 months
  • Average time to close pull requests: about 8 hours
  • Total issue authors: 3
  • Total pull request authors: 1
  • Average comments per issue: 0.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 14
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • claromes (11)
  • jamiemoran2 (1)
  • raymondben (1)
Pull Request Authors
  • claromes (17)
Top Labels
Issue Labels
documentation (1) bug (1)
Pull Request Labels