https://github.com/autogluon/fev

Forecast evaluation library

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found)
✓ .zenodo.json file (found)
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (11.3%) to scientific vocabulary

Keywords

benchmarking datasets forecasting huggingface-datasets time-series time-series-forecasting timeseries

Keywords from Contributors

transformers interactive projection sequences automl genomics observability autograding hacking shellcodes

Last synced: 6 months ago

Repository

Forecast evaluation library

Basic Info

Host: GitHub
Owner: autogluon
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 267 KB

Statistics

Stars: 97
Watchers: 5
Forks: 10
Open Issues: 4
Releases: 14

Topics

benchmarking datasets forecasting huggingface-datasets time-series time-series-forecasting timeseries

Created about 1 year ago · Last pushed 6 months ago

Metadata Files

Readme Contributing License Code of conduct Notice

fev

A lightweight library that makes it easy to benchmark time series forecasting models.

Extensible: Easy to define your own forecasting tasks and benchmarks.
Reproducible: Ensures that the results obtained by different users are comparable.
Easy to use: Compatible with most popular forecasting libraries.
Minimal dependencies: Just a thin wrapper on top of 🤗datasets.

How is `fev` different from other benchmarking tools?

Existing forecasting benchmarks usually fall into one of two categories:

Standalone datasets without any supporting infrastructure. These provide no guarantees that the results obtained by different users are comparable. For example, changing the start date or duration of the forecast horizon totally changes the meaning of the scores.
Bespoke end-to-end systems that combine models, datasets and forecasting tasks. Such packages usually come with lots of dependencies and assumptions, which makes extending or integrating these libraries into existing systems difficult.
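
To illustrate why the task definition matters, here is a minimal sketch on a made-up series. The data and the `naive_mae` helper are illustrative assumptions and not part of fev. The same last-value model receives a different score when only the forecast horizon changes, so two users who pick different horizons cannot compare their numbers.

```python
def naive_mae(series: list[float], horizon: int) -> float:
    """MAE of a last-value forecast over the final `horizon` points."""
    history, future = series[:-horizon], series[-horizon:]
    forecast = [history[-1]] * horizon  # repeat the last observed value
    return sum(abs(y - f) for y, f in zip(future, forecast)) / horizon


# Hypothetical toy series with an upward trend
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

mae_h2 = naive_mae(series, horizon=2)  # evaluate on the last 2 points
mae_h4 = naive_mae(series, horizon=4)  # evaluate on the last 4 points
print(mae_h2, mae_h4)  # 1.5 2.5
```

Fixing the horizon and cutoff as part of the task, as fev does, removes this source of ambiguity.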

fev aims for the middle ground: it provides the core benchmarking functionality without introducing unnecessary constraints or bloated dependencies. The library supports point and probabilistic forecasting, different types of covariates, and all popular forecasting metrics.

Installation

pip install fev

Quickstart

Create a task from a dataset stored on Hugging Face Hub
```python
import fev

task = fev.Task(
    dataset_path="autogluon/chronos_datasets",
    dataset_config="monash_kdd_cup_2018",
    horizon=12,
)
```

Load data available as input to the forecasting model
```python
past_data, future_data = task.get_input_data()
```
- `past_data` contains the past data before the forecast horizon (item ID, past timestamps, target, all covariates).
- `future_data` contains future data that is known at prediction time (item ID, future timestamps, and known covariates).

Make predictions
```python
def naive_forecast(y: list, horizon: int) -> list:
    # Repeat the last observed value across the whole forecast horizon
    return [y[-1] for _ in range(horizon)]

predictions = []
for ts in past_data:
    predictions.append(
        {"predictions": naive_forecast(y=ts[task.target_column], horizon=task.horizon)}
    )
```

Get an evaluation summary
```python
task.evaluation_summary(predictions, model_name="naive")
# {'model_name': 'naive',
#  'dataset_name': 'chronos_datasets_monash_kdd_cup_2018',
#  'dataset_path': 'autogluon/chronos_datasets',
#  'dataset_config': 'monash_kdd_cup_2018',
#  'horizon': 12,
#  'cutoff': -12,
#  'lead_time': 1,
#  'min_context_length': 1,
#  'max_context_length': None,
#  'seasonality': 1,
#  'eval_metric': 'MASE',
#  'extra_metrics': [],
#  'quantile_levels': None,
#  'id_column': 'id',
#  'timestamp_column': 'timestamp',
#  'target_column': 'target',
#  'generate_univariate_targets_from': None,
#  'past_dynamic_columns': [],
#  'excluded_columns': [],
#  'test_error': 3.3784518866750513,
#  'training_time_s': None,
#  'inference_time_s': None,
#  'dataset_fingerprint': 'a22d13d4c1e8641c',
#  'trained_on_this_dataset': False,
#  'fev_version': '0.5.0',
#  'MASE': 3.3784518866750513}
```

The evaluation summary contains all information necessary to uniquely identify the forecasting task.
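
The summary above reports MASE with a seasonality of 1. As a rough illustration of what that number measures, the sketch below follows the textbook definition of the mean absolute scaled error for a single series. The `mase` helper and the toy data are hypothetical, and fev's own implementation may differ in details such as how scores are aggregated across series.

```python
def mase(y_past, y_true, y_pred, seasonality=1):
    """Mean absolute scaled error for one series (textbook definition)."""
    # Mean absolute error of the forecast on the held-out horizon
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    # In-sample error of the seasonal naive forecast, used as the scale
    diffs = [
        abs(y_past[i] - y_past[i - seasonality])
        for i in range(seasonality, len(y_past))
    ]
    scale = sum(diffs) / len(diffs)
    return mae / scale


y_past = [10.0, 12.0, 11.0, 13.0]  # hypothetical history
y_true = [14.0, 12.0]              # hypothetical future values
y_pred = [13.0, 13.0]              # naive forecast: repeat the last value

score = mase(y_past, y_true, y_pred)
print(round(score, 3))  # 0.6
```

Under this definition, a MASE above 1 means the forecast's average error exceeds the in-sample one-step error of the seasonal naive method.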

Multiple evaluation summaries produced by different models on different tasks can be aggregated into a single table.
```python
# Dataframes, dicts, JSON or CSV files supported
summaries = "https://raw.githubusercontent.com/autogluon/fev/refs/heads/main/benchmarks/example/results/results.csv"
fev.leaderboard(summaries)
# | model_name     |   gmean_relative_error |   avg_rank |   avg_inference_time_s |   ... |
# |:---------------|-----------------------:|-----------:|-----------------------:|------:|
# | auto_theta     |                  0.874 |      2     |                  5.501 |   ... |
# | auto_arima     |                  0.887 |      2     |                 21.799 |   ... |
# | auto_ets       |                  0.951 |      2.667 |                  0.737 |   ... |
# | seasonal_naive |                  1     |      3.333 |                  0.004 |   ... |
```
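
The leaderboard columns can be read as aggregates over tasks. The sketch below shows the arithmetic under two assumptions that the source does not state explicitly: that `gmean_relative_error` is the geometric mean, over tasks, of a model's error divided by a baseline's error (the table shows `seasonal_naive` at exactly 1, which is consistent with it being the baseline), and that `avg_rank` is the mean of per-task ranks. All numbers and model names here are made up, and this is not fev's implementation.

```python
import math

# Hypothetical per-task errors (lower is better); not taken from the fev results
errors = {
    "model_a": {"task_1": 0.8, "task_2": 1.8},
    "baseline": {"task_1": 1.0, "task_2": 2.0},
}
tasks = ["task_1", "task_2"]


def gmean_relative_error(model: str, baseline: str = "baseline") -> float:
    """Geometric mean over tasks of error(model) / error(baseline)."""
    ratios = [errors[model][t] / errors[baseline][t] for t in tasks]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def avg_rank(model: str) -> float:
    """Mean rank of the model across tasks (1 = lowest error, ties ignored)."""
    ranks = []
    for t in tasks:
        ordered = sorted(errors, key=lambda m: errors[m][t])
        ranks.append(ordered.index(model) + 1)
    return sum(ranks) / len(ranks)


print(round(gmean_relative_error("model_a"), 3), avg_rank("model_a"))
```

A geometric mean is a natural choice for ratios because it treats a halving and a doubling of the error symmetrically, and by construction the baseline always scores 1.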

Tutorials

Quickstart: Define a task and evaluate a model.
Datasets: Use fev with your own datasets.
Tasks & benchmarks: Advanced features for defining tasks and benchmarks.
Models: Evaluate your models and submit results to the leaderboard.

Examples of model implementations compatible with fev are available in examples/.

Leaderboards

We host leaderboards obtained using fev under https://huggingface.co/spaces/autogluon/fev-leaderboard.

Currently, the leaderboard includes results from Benchmark II, introduced in Chronos: Learning the Language of Time Series. We expect to extend this list in the future.

Datasets

Repositories with datasets in a format compatible with fev:
- chronos_datasets
- fev_datasets

Owner

Name: autogluon
Login: autogluon
Kind: organization

Repositories: 5
Profile: https://github.com/autogluon

GitHub Events

Total

Fork event: 10
Create event: 38
Issues event: 7
Release event: 11
Watch event: 85
Delete event: 20
Issue comment event: 11
Member event: 2
Public event: 2
Push event: 62
Pull request review comment event: 32
Pull request review event: 38
Pull request event: 39

Last Year

Fork event: 10
Create event: 38
Issues event: 7
Release event: 11
Watch event: 85
Delete event: 20
Issue comment event: 11
Member event: 2
Public event: 2
Push event: 62
Pull request review comment event: 32
Pull request review event: 38
Pull request event: 39

Committers

Last synced: 8 months ago

All Time

Total Commits: 42
Total Committers: 5
Avg Commits per committer: 8.4
Development Distribution Score (DDS): 0.143

Past Year

Commits: 42
Committers: 5
Avg Commits per committer: 8.4
Development Distribution Score (DDS): 0.143

Top Committers

Name	Email	Commits
Oleksandr Shchur	s**o@a**m	36
dependabot[bot]	4****]	2
Abdul Fatir	A**s@g**m	2
Andreas Auer	1****a	1
Amazon GitHub Automation	5****o	1
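
The committer statistics above can be reproduced from the table. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits. The page does not state the formula, but this definition matches the reported value of 0.143.

```python
# Commit counts copied from the Top Committers table above
commits = {
    "Oleksandr Shchur": 36,
    "dependabot[bot]": 2,
    "Abdul Fatir": 2,
    "Andreas Auer": 1,
    "Amazon GitHub Automation": 1,
}

total = sum(commits.values())
avg_per_committer = total / len(commits)
# Assumed definition: DDS = 1 - (commits by top committer / total commits)
dds = 1 - max(commits.values()) / total

print(total, avg_per_committer, round(dds, 3))  # 42 8.4 0.143
```

A score this close to 0 indicates that most commits come from a single contributor.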

Committer Domains (Top 20 + Academic)

amazon.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 5
Total pull requests: 49
Average time to close issues: 13 days
Average time to close pull requests: 1 day
Total issue authors: 3
Total pull request authors: 5
Average comments per issue: 0.8
Average comments per pull request: 0.22
Merged pull requests: 34
Bot issues: 0
Bot pull requests: 10

Past Year

Issues: 5
Pull requests: 49
Average time to close issues: 13 days
Average time to close pull requests: 1 day
Issue authors: 3
Pull request authors: 5
Average comments per issue: 0.8
Average comments per pull request: 0.22
Merged pull requests: 34
Bot issues: 0
Bot pull requests: 10

Top Authors

Issue Authors

shchur (3)
WenWeiTHU (1)
niskrev (1)

Pull Request Authors

shchur (32)
dependabot[bot] (10)
abdulfatir (4)
apointa (2)
AzulGarza (1)

Top Labels

Issue Labels

Pull Request Labels

dependencies (10) python (10)
