smoe

Spatial Mixture-of-Experts

https://github.com/spcl/smoe

Science Score: 62.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
    Organization spcl has institutional domain (spcl.inf.ethz.ch)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.6%) to scientific vocabulary

Repository

Spatial Mixture-of-Experts

Basic Info
  • Host: GitHub
  • Owner: spcl
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Size: 48.8 KB
Statistics
  • Stars: 20
  • Watchers: 7
  • Forks: 1
  • Open Issues: 2
  • Releases: 0
Created over 3 years ago · Last pushed over 3 years ago
Metadata Files
Readme License Citation

README.md

Spatial Mixture-of-Experts

This is the official repository for Spatial Mixture-of-Experts. A Spatial Mixture-of-Experts (SMoE) layer learns the underlying location dependence of a dataset.
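As a rough illustration of what "location dependence" means here, the sketch below routes every grid location to one expert based on a per-location gate. It is a conceptual toy, not the repository's implementation: the function `smoe_forward`, the scalar experts, and the hand-written `gate_scores` are all hypothetical, and in a real SMoE layer the gate would be learned during training rather than fixed.

```python
def smoe_forward(x, gate_scores, experts):
    """Apply, at each grid location, the expert with the highest gate score.

    x           -- 2D list of input values
    gate_scores -- gate_scores[i][j][e] is the score of expert e at (i, j)
    experts     -- list of callables, one per expert
    """
    out = []
    for i, row in enumerate(x):
        out_row = []
        for j, value in enumerate(row):
            scores = gate_scores[i][j]
            # Top-1 routing: pick the index of the best-scoring expert.
            best = max(range(len(experts)), key=lambda e: scores[e])
            # Unweighted combination: the output is not scaled by the score.
            out_row.append(experts[best](value))
        out.append(out_row)
    return out


# Three toy experts (hypothetical; real experts would be learned operators).
experts = [lambda v: v, lambda v: 2.0 * v, lambda v: -v]

x = [[1.0, 2.0],
     [3.0, 4.0]]

# Fixed gate scores for illustration; each location prefers a different expert.
gate_scores = [
    [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1]],
    [[0.1, 0.2, 0.7], [0.5, 0.3, 0.2]],
]

y = smoe_forward(x, gate_scores, experts)
print(y)  # [[1.0, 4.0], [-3.0, 4.0]]
```

The point of the sketch is that the choice of expert depends only on where a value sits in the grid, so the same input value can be transformed differently at different locations.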

If you find this useful, please cite:

    @inproceedings{
      title={Spatial Mixture-of-Experts},
      author={Nikoli Dryden and Torsten Hoefler},
      booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
      year={2022}
    }

This repository currently contains code for our heat diffusion experiments and several baselines. Code for additional experiments (WeatherBench, ENS-10, etc.) will be released soon.

Location-Dependent Heat Diffusion

Please see Section 3.1 of the paper for full details.

Generating Heat Diffusion Data

Note: Due to differences in random number generation, we do not guarantee an identical dataset will be generated even with the same random seed.

We include the region map we used (heat/mask.npy). A different region map can be generated using the generate_mask.py script (see documentation therein).
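To make the role of a region map concrete, here is a minimal sketch of one heat-diffusion step in which the diffusivity of each cell is looked up from its region. It assumes an explicit finite-difference (FTCS) scheme with fixed boundary values on a toy 5x9 grid; generate_data.py may use a different discretization, boundary condition, and initial condition. The names `diffusion_step`, `region`, and `DIFFUSIVITY` are illustrative only; the three diffusivity values are the ones passed to generate_data.py in this README.

```python
def diffusion_step(u, region, diffusivity, dt=1.0, dx=1.0):
    """One explicit (FTCS) heat-diffusion step with per-cell diffusivity.

    u           -- 2D list of temperatures
    region      -- 2D list of region indices, same shape as u
    diffusivity -- diffusivity[r] is the coefficient used in region r
    """
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]  # boundary cells keep their old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Five-point stencil approximation of the Laplacian.
            laplacian = (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                         - 4.0 * u[i][j]) / dx ** 2
            alpha = diffusivity[region[i][j]]  # location-dependent coefficient
            new[i][j] = u[i][j] + alpha * dt * laplacian
    return new


DIFFUSIVITY = [0.0025, 0.025, 0.25]

# Toy region map: columns 0-2 are region 0, 3-5 region 1, 6-8 region 2.
region = [[min(j // 3, 2) for j in range(9)] for _ in range(5)]

# One unit of heat placed in the middle row of each region.
u0 = [[0.0] * 9 for _ in range(5)]
for j in (1, 4, 7):
    u0[2][j] = 1.0

u1 = diffusion_step(u0, region, DIFFUSIVITY)
print([round(u1[2][j], 4) for j in (1, 4, 7)])
```

After one step the three heat sources have decayed by very different amounts, even though they started out identical. Note that in two dimensions this explicit scheme is only stable when the diffusivity times dt / dx^2 is at most 0.25, so the largest value used here sits exactly at that limit for dt = dx = 1.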

The heat diffusion data is generated using the generate_data.py script (run it with --help for full options). To replicate our dataset, run as follows:

```
# Training data:
python generate_data.py --diffusivity 0.0025 0.025 0.25 --num-runs 1000 train
# Validation data:
python generate_data.py --diffusivity 0.0025 0.025 0.25 --num-runs 20 --seed 546981 val
# Test data:
python generate_data.py --diffusivity 0.0025 0.025 0.25 --num-runs 20 --seed 865124 test
# Move data:
mv {train.npy,val.npy,test.npy} heat
```
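If you prefer to script the three generation runs, the sketch below assembles the same command lines in Python. It only builds and prints the commands; the helper `build_command` and the `SPLITS` table are hypothetical conveniences, and every flag and value is copied from the commands above.

```python
import shlex

DIFFUSIVITIES = ["0.0025", "0.025", "0.25"]

# (output name, number of runs, seed); the training run uses the default seed.
SPLITS = [
    ("train", 1000, None),
    ("val", 20, 546981),
    ("test", 20, 865124),
]


def build_command(name, num_runs, seed=None):
    """Return the generate_data.py invocation for one split as an argv list."""
    cmd = ["python", "generate_data.py",
           "--diffusivity", *DIFFUSIVITIES,
           "--num-runs", str(num_runs)]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    cmd.append(name)
    return cmd


commands = [build_command(*split) for split in SPLITS]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run them, each list could be passed to `subprocess.run(cmd, check=True)` from the directory that contains generate_data.py, followed by moving the resulting .npy files into heat.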

Running Experiments

Experiments can be run using the train.py script (--help gives full options).

A basic run of an SMoE using our configuration is as follows:

    python train.py --output-dir out --data-path heat --data-no-norm --job-id smoe \
        --fp16 --epochs 150 --schedule plateau --plateau-epochs 15 --early-stop 30 \
        --optimizer adam --lr 0.001 --loss mse --initialization default \
        --metric mse mask-mse prcntclose mask-prcntclose --mask heat/mask.npy \
        --prcntclose 0.01 --opt-metric prcntclose1.0 --opt-metric-max \
        --prcntclose-tol-scale 0.01 0.1 1 --save-on-best --stop-on-metric-level 100 \
        --model smoe --last-layer-experts 3 --gate-type latent --unweighted-smoe \
        --rc-loss --dampen-expert-error --routing-error-quantile 0.3

Additional models can be selected using the --model argument. For example, to train a CNN:

    python train.py --output-dir out --data-path heat --data-no-norm --job-id smoe \
        --fp16 --epochs 150 --schedule plateau --plateau-epochs 15 --early-stop 30 \
        --optimizer adam --lr 0.001 --loss mse --initialization default \
        --metric mse mask-mse prcntclose mask-prcntclose --mask heat/mask.npy \
        --prcntclose 0.01 --opt-metric prcntclose1.0 --opt-metric-max \
        --prcntclose-tol-scale 0.01 0.1 1 --save-on-best --stop-on-metric-level 100 \
        --model cnn --layers conv conv --conv-filters 4
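The SMoE and CNN invocations above differ only in their trailing model-specific flags. The sketch below makes that split explicit by composing each command from a shared prefix plus a per-model suffix. The names `COMMON_ARGS`, `MODEL_ARGS`, and `train_command` are hypothetical; all flags and values are copied verbatim from the two commands above, and nothing is executed.

```python
import shlex

# Flags shared by both example runs.
COMMON_ARGS = shlex.split(
    "--output-dir out --data-path heat --data-no-norm --job-id smoe "
    "--fp16 --epochs 150 --schedule plateau --plateau-epochs 15 --early-stop 30 "
    "--optimizer adam --lr 0.001 --loss mse --initialization default "
    "--metric mse mask-mse prcntclose mask-prcntclose --mask heat/mask.npy "
    "--prcntclose 0.01 --opt-metric prcntclose1.0 --opt-metric-max "
    "--prcntclose-tol-scale 0.01 0.1 1 --save-on-best --stop-on-metric-level 100"
)

# Model-specific flags from the two examples.
MODEL_ARGS = {
    "smoe": shlex.split(
        "--model smoe --last-layer-experts 3 --gate-type latent --unweighted-smoe "
        "--rc-loss --dampen-expert-error --routing-error-quantile 0.3"
    ),
    "cnn": shlex.split("--model cnn --layers conv conv --conv-filters 4"),
}


def train_command(model):
    """Return the train.py invocation for the given example model as an argv list."""
    return ["python", "train.py", *COMMON_ARGS, *MODEL_ARGS[model]]


for name in MODEL_ARGS:
    print(shlex.join(train_command(name)))
```

Each returned list can be handed to `subprocess.run(..., check=True)`. When launching several runs this way, you may want to vary --job-id or --output-dir per model so that outputs are easy to tell apart; how train.py uses those values is described by its --help output.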

Owner

  • Name: SPCL
  • Login: spcl
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
title: "Spatial Mixture-of-Experts"
message: "If you use SMoEs, please cite as"
authors:
  - family-names: Dryden
    given-names: Nikoli
  - family-names: Hoefler
    given-names: Torsten
preferred-citation:
  title: "Spatial Mixture-of-Experts"
  year: "2022"
  type: conference-paper
  collection-title: "Advances in Neural Information Processing Systems (NeurIPS)"
  authors:
    - family-names: Dryden
      given-names: Nikoli
    - family-names: Hoefler
      given-names: Torsten

GitHub Events

Total
  • Watch event: 2
Last Year
  • Watch event: 2

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 2
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • KennyNH (1)
  • linkenghong (1)