https://github.com/ant-research/easytemporalpointprocess

EasyTPP: Towards Open Benchmarking Temporal Point Processes

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    2 of 10 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.7%) to scientific vocabulary

Keywords

benchmarking machine-learning-algorithms stochastic-processes temporal-data time-series
Last synced: 5 months ago

Repository

EasyTPP: Towards Open Benchmarking Temporal Point Processes

Basic Info
Statistics
  • Stars: 307
  • Watchers: 1
  • Forks: 37
  • Open Issues: 7
  • Releases: 12
Topics
benchmarking machine-learning-algorithms stochastic-processes temporal-data time-series
Created over 2 years ago · Last pushed 7 months ago
Metadata Files
Readme

README.md

EasyTPP [ICLR 2024]

EasyTPP is an easy-to-use development and application toolkit for Temporal Point Processes (TPPs), with key features in configurability, compatibility and reproducibility. We hope this project benefits both researchers and practitioners by enabling easily customized development and open benchmarking of TPPs.

Features

  • Configurable and customizable: models are modularized and configurable, with abstract classes to support developing customized TPP models.
  • PyTorch-based implementation: EasyTPP implements state-of-the-art TPP models using PyTorch 1.7.0+, providing a clean and modern deep learning framework.
  • Reproducible: all the benchmarks can be easily reproduced.
  • Hyper-parameter optimization: an Optuna-based HPO pipeline is provided.
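
The "abstract classes" design mentioned in the first bullet can be illustrated with a minimal sketch. The names `ToyTPP`, `HomogeneousPoisson`, and `ExpHawkes` are hypothetical and are not EasyTPP classes. The sketch only assumes the standard TPP log-likelihood (the sum of log-intensities at the events minus the integral of the intensity over the observation window), approximated here with a midpoint rule, and it assumes the event times are sorted.

```python
import math
from abc import ABC, abstractmethod


class ToyTPP(ABC):
    """Hypothetical base class (not part of EasyTPP)."""

    @abstractmethod
    def intensity(self, t, history):
        """Conditional intensity at time t given the earlier event times."""

    def log_likelihood(self, events, t_end, num_grid=2000):
        # Event term: log-intensity at each event, conditioned on earlier events.
        event_term = sum(
            math.log(self.intensity(t, events[:i])) for i, t in enumerate(events)
        )
        # Compensator: midpoint-rule approximation of the integral on [0, t_end].
        dt = t_end / num_grid
        integral = 0.0
        for k in range(num_grid):
            t = (k + 0.5) * dt
            history = [s for s in events if s < t]
            integral += self.intensity(t, history) * dt
        return event_term - integral


class HomogeneousPoisson(ToyTPP):
    """Constant intensity, independent of the history."""

    def __init__(self, rate):
        self.rate = rate

    def intensity(self, t, history):
        return self.rate


class ExpHawkes(ToyTPP):
    """Self-exciting intensity with an exponential kernel."""

    def __init__(self, mu, alpha, beta):
        self.mu, self.alpha, self.beta = mu, alpha, beta

    def intensity(self, t, history):
        decay = sum(math.exp(-self.beta * (t - s)) for s in history)
        return self.mu + self.alpha * decay


events = [0.5, 1.0, 1.5]
poisson_ll = HomogeneousPoisson(rate=2.0).log_likelihood(events, t_end=2.0)
hawkes_ll = ExpHawkes(mu=2.0, alpha=0.0, beta=1.0).log_likelihood(events, t_end=2.0)
print(poisson_ll, hawkes_ll)
```

A new model only has to implement `intensity`; the shared `log_likelihood` is inherited. With `alpha=0.0` the Hawkes model reduces to the Poisson model, which gives a quick sanity check.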

Model List

We provide reference implementations of various state-of-the-art TPP papers:

| No | Publication | Model | Paper | Implementation |
|:---:|:-----------:|:-------------:|:---|:---|
| 1 | KDD'16 | RMTPP | Recurrent Marked Temporal Point Processes: Embedding Event History to Vector | PyTorch |
| 2 | NeurIPS'17 | NHP | The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process | PyTorch |
| 3 | NeurIPS'19 | FullyNN | Fully Neural Network based Model for General Temporal Point Processes | PyTorch |
| 4 | ICML'20 | SAHP | Self-Attentive Hawkes process | PyTorch |
| 5 | ICML'20 | THP | Transformer Hawkes process | PyTorch |
| 6 | ICLR'20 | IntensityFree | Intensity-Free Learning of Temporal Point Processes | PyTorch |
| 7 | ICLR'21 | ODETPP | Neural Spatio-Temporal Point Processes (simplified) | PyTorch |
| 8 | ICLR'22 | AttNHP | Transformer Embeddings of Irregularly Spaced Events and Their Participants | PyTorch |

Dataset

We preprocessed one synthetic and five real-world datasets from widely cited works; they differ in application domain and temporal statistics:

  • Synthetic: a univariate Hawkes process simulated with the Tick library.
  • Retweet (Zhou, 2013): timestamped user retweet events.
  • Taxi (Whong, 2014): timestamped taxi pick-up events.
  • StackOverflow (Leskovec, 2014): timestamped user badge reward events on StackOverflow.
  • Taobao (Xue et al., 2022): timestamped user online shopping behavior events on the Taobao platform.
  • Amazon (Xue et al., 2022): timestamped user online shopping behavior events on the Amazon platform.

At users' request, we also processed two non-anthropogenic datasets:

  • Earthquake: timestamped earthquake events over the conterminous U.S. from 1996 to 2023, processed from USGS data.
  • Volcano eruption: timestamped volcano eruption events worldwide over the past several hundred years, processed from Smithsonian Institution data.

All datasets are preprocessed into the Gatech format widely used by TPP researchers and are stored on Google Drive with public access.
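
For intuition about what a univariate Hawkes process looks like, the following is a minimal sketch of a simulator based on Ogata's thinning algorithm. It does not use the Tick library and is not the procedure used to build the Synthetic dataset. The exponential kernel and the values of `mu`, `alpha`, and `beta` are illustrative assumptions.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, t_end, seed=0):
    """Simulate event times on [0, t_end) with intensity
    lambda(t) = mu + alpha * sum_i exp(-beta * (t - t_i)), via thinning."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # value of alpha * sum_i exp(-beta * (t - t_i)) at time t
    while True:
        # Between events the intensity only decays, so its current value
        # is a valid upper bound until the next accepted event.
        lam_bar = mu + excitation
        wait = rng.expovariate(lam_bar)
        if t + wait >= t_end:
            break
        t += wait
        excitation *= math.exp(-beta * wait)  # decay to the candidate time
        lam_t = mu + excitation
        # Accept the candidate with probability lam_t / lam_bar.
        if rng.random() * lam_bar <= lam_t:
            events.append(t)
            excitation += alpha  # each accepted event raises the intensity
    return events


sample = simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, t_end=10.0, seed=42)
print(len(sample), sample[:3])
```

Setting `alpha=0.0` turns the simulator into a homogeneous Poisson process with rate `mu`, because every candidate is then accepted.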

Quick Start

Colab Tutorials

Explore the following tutorials that can be opened directly in Google Colab:

  • Tutorial 1: Dataset in EasyTPP.
  • Tutorial 2: Tensorboard in EasyTPP.
  • Tutorial 3: Training and Evaluation of TPPs.

End-to-end Example

We provide an end-to-end example for users to run a standard TPP model with EasyTPP.

Step 1. Installation

First, install the package either with pip or from the source code on GitHub.

To install the latest stable version:

```bash
pip install easy-tpp
```

To install the latest version from GitHub:

```bash
git clone https://github.com/ant-research/EasyTemporalPointProcess.git
cd EasyTemporalPointProcess
python setup.py install
```
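
One way to confirm that the installation succeeded is to query the installed distribution from Python. This is a small sketch using only the standard library. The helper name `installed_version` is hypothetical, and `easy-tpp` is the PyPI distribution name used in the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if easy-tpp is installed, otherwise None.
print(installed_version("easy-tpp"))
```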

Step 2. Prepare datasets

We need to put the datasets in a local directory before running a model and the datasets should follow a certain format. See OnlineDoc - Datasets for more details.

Suppose we use the taxi dataset in the example.
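
Before training, it can help to inspect a dataset file. The sketch below assumes a pickle layout commonly used for this kind of data: a dict with a `dim_process` entry (number of event types) and one entry per split holding a list of sequences, where each event is a dict with `time_since_start`, `time_since_last_event`, and `type_event`. These key names are assumptions and should be checked against OnlineDoc - Datasets. The helper `summarize_split` and the toy values are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def summarize_split(pkl_path, split):
    """Summarize one split of a pickled dataset in the assumed layout."""
    with open(pkl_path, "rb") as f:
        data = pickle.load(f)
    sequences = data[split]
    return {
        "num_event_types": data["dim_process"],
        "num_sequences": len(sequences),
        "num_events": sum(len(seq) for seq in sequences),
    }


# Toy data in the assumed layout; the values are made up for illustration.
toy = {
    "dim_process": 2,
    "train": [
        [
            {"time_since_start": 0.4, "time_since_last_event": 0.4, "type_event": 0},
            {"time_since_start": 1.1, "time_since_last_event": 0.7, "type_event": 1},
        ],
        [{"time_since_start": 0.9, "time_since_last_event": 0.9, "type_event": 1}],
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "train.pkl"
    with open(path, "wb") as f:
        pickle.dump(toy, f)
    summary = summarize_split(path, "train")

print(summary)
```

On a real file, the same helper would be called with a path such as `data/taxi/train.pkl` and the matching split name.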

Step 3. Train the model

Before training starts, we need to set up the config file for the pipeline. A preset config file is provided in Example Config, and the configuration details are described in OnlineDoc - Training Pipeline.

After the setup of data and config, the directory structure is as follows:

```bash

data
 |______taxi
         |____ train.pkl
         |____ dev.pkl
         |____ test.pkl

configs
 |______experiment_config.yaml

```

Then we start the training by running the following script:

```python
import argparse

from easy_tpp.config_factory import Config
from easy_tpp.runner import Runner


def main():
    parser = argparse.ArgumentParser()

    # Path to the YAML file that defines the pipeline.
    parser.add_argument('--config_dir', type=str, required=False, default='configs/experiment_config.yaml',
                        help='Dir of configuration yaml to train and evaluate the model.')

    # Name of the experiment section to load from the config file.
    parser.add_argument('--experiment_id', type=str, required=False, default='NHP_train',
                        help='Experiment id in the config file.')

    args = parser.parse_args()

    # Build the config, then the runner, and launch training and evaluation.
    config = Config.build_from_yaml_file(args.config_dir, experiment_id=args.experiment_id)

    model_runner = Runner.build_from_config(config)

    model_runner.run()


if __name__ == '__main__':
    main()
```
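
The two command-line options of the training script fall back to their defaults when omitted, so running it without arguments trains the `NHP_train` experiment from `configs/experiment_config.yaml`. The sketch below reproduces only the argument parsing, without importing EasyTPP. The helper `build_parser` and the experiment id `THP_train` are hypothetical and serve only to show an override.

```python
import argparse


def build_parser():
    """Recreate the two options of the training script."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--config_dir', type=str, required=False,
                        default='configs/experiment_config.yaml')
    parser.add_argument('--experiment_id', type=str, required=False,
                        default='NHP_train')
    return parser


# No flags: both options take their defaults.
defaults = build_parser().parse_args([])
# Override only the experiment id (hypothetical id, must exist in the config).
overridden = build_parser().parse_args(['--experiment_id', 'THP_train'])

print(defaults.config_dir, defaults.experiment_id)
print(overridden.config_dir, overridden.experiment_id)
```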

A more detailed example can be found at OnlineDoc - QuickStart.

Documentation

The classes and methods of EasyTPP are documented, so users can generate the documentation with:

```shell
cd doc
pip install -r requirements.txt
make html
```

NOTE: doc/requirements.txt is only needed for building the documentation with Sphinx. The documentation can also be generated automatically by the GitHub Actions workflow .github/workflows/docs.yml (triggered by pull requests).

The full documentation is available on the website.

Benchmark

In the examples folder, we provide a script to benchmark the TPPs, with the Taxi dataset as the input.

To run the script, first download the Taxi data following the instructions above. The config file is already set up. Then run:

```shell
cd examples
python run_retweet.py
```

License

This project is licensed under the Apache License (Version 2.0). This toolkit also contains some code modified from other repos under other open-source licenses. See the NOTICE file for more information.

Todo List

  • [x] New dataset:
    • [x] Earthquake: the source data is available in USGS.
    • [x] Volcano eruption: the source data is available in NCEI.
  • [ ] New model:
    • [ ] Meta Temporal Point Process, ICLR 2023.
    • [ ] Model-based RL via TPP, AAAI 2022.

Citation

If you find EasyTPP useful for your research or development, please cite the following paper:

```
@inproceedings{xue2024easytpp,
  title = {EasyTPP: Towards Open Benchmarking Temporal Point Processes},
  author = {Siqiao Xue and Xiaoming Shi and Zhixuan Chu and Yan Wang and Hongyan Hao and Fan Zhou and Caigao Jiang and Chen Pan and James Y. Zhang and Qingsong Wen and Jun Zhou and Hongyuan Mei},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year = {2024},
  url = {https://arxiv.org/abs/2307.08097}
}
```

Acknowledgment

The project is jointly initiated by Machine Intelligence Group, Alipay and DAMO Academy, Alibaba.

The following repositories are used in EasyTPP, either in close to original form or as an inspiration:

Owner

  • Name: Ant Research
  • Login: ant-research
  • Kind: organization

GitHub Events

Total
  • Create event: 4
  • Release event: 4
  • Issues event: 30
  • Watch event: 65
  • Issue comment event: 61
  • Push event: 25
  • Pull request event: 6
  • Fork event: 12
Last Year
  • Create event: 4
  • Release event: 4
  • Issues event: 30
  • Watch event: 65
  • Issue comment event: 61
  • Push event: 25
  • Pull request event: 6
  • Fork event: 12

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 164
  • Total Committers: 10
  • Avg Commits per committer: 16.4
  • Development Distribution Score (DDS): 0.341
Past Year
  • Commits: 37
  • Committers: 6
  • Avg Commits per committer: 6.167
  • Development Distribution Score (DDS): 0.459
Top Committers
| Name | Email | Commits |
|:---|:---|---:|
| iLampard | s****e@1****m | 108 |
| siqiao.xsq | s****q@a****m | 30 |
| Yuxin Chang | c****l@g****m | 11 |
| iLampard | s****q@a****m | 6 |
| Robin van de Water | r****r@g****m | 2 |
| Kaleb-Wang | 8****g | 2 |
| zefang-liu | l****g@g****u | 2 |
| Ant OSS | 4****s | 1 |
| ajboyd2 | a****d@c****u | 1 |
| Kostadin Cvejoski | c****i@g****m | 1 |
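
The all-time committer statistics above can be reproduced from the commit counts in the table. The sketch assumes the usual definition of the Development Distribution Score, one minus the share of commits made by the top committer. That definition is an assumption, since the page itself does not state it. The helper name is hypothetical.

```python
# Commit counts copied from the Top Committers table.
commits = [108, 30, 11, 6, 2, 2, 2, 1, 1, 1]


def development_distribution_score(commit_counts):
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    total = sum(commit_counts)
    return 1 - max(commit_counts) / total


total_commits = sum(commits)
avg_per_committer = total_commits / len(commits)
dds = development_distribution_score(commits)

print(total_commits, round(avg_per_committer, 1), round(dds, 3))
```

Under this definition the counts give 164 total commits, 16.4 commits per committer, and a DDS of 0.341, matching the all-time figures reported above.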
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 55
  • Total pull requests: 11
  • Average time to close issues: 12 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 34
  • Total pull request authors: 6
  • Average comments per issue: 2.96
  • Average comments per pull request: 0.45
  • Merged pull requests: 11
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 26
  • Pull requests: 8
  • Average time to close issues: 19 days
  • Average time to close pull requests: 1 day
  • Issue authors: 17
  • Pull request authors: 5
  • Average comments per issue: 2.54
  • Average comments per pull request: 0.38
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • HaochenWang1243 (6)
  • zhjcp (5)
  • SiriusHou (4)
  • HojunYou (3)
  • yuzhenmao (3)
  • 943fansi (2)
  • rvandewater (2)
  • ivan-chai (2)
  • yinzhu-quan (2)
  • David-Berghaus (2)
  • RunningGuo (2)
  • arunksagotra (2)
  • ElNino9495 (1)
  • liu-yang-maker (1)
  • huangx06 (1)
Pull Request Authors
  • iLampard (4)
  • zefang-liu (4)
  • ajboyd2 (3)
  • cvejoski (2)
  • yuxinc17 (2)
  • HaochenWang1243 (1)
  • rvandewater (1)
Top Labels
Issue Labels
  • good first issue (24)
  • bug (5)
  • question (2)
  • enhancement (1)
  • help wanted (1)
  • documentation (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 79 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 11
  • Total maintainers: 1
pypi.org: easy-tpp

An easy and flexible tool for neural temporal point process

  • Versions: 11
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 79 Last month
Rankings
Dependent packages count: 7.2%
Average: 24.3%
Dependent repos count: 41.3%
Maintainers (1)
Last synced: 6 months ago