stable-nalu

Code for Neural Arithmetic Units (ICLR) and Measuring Arithmetic Extrapolation Performance (SEDL|NeurIPS)

https://github.com/andreasmadsen/stable-nalu


Repository

Statistics
  • Stars: 147
  • Watchers: 8
  • Forks: 14
  • Open Issues: 1
  • Releases: 0
Created almost 7 years ago · Last pushed over 4 years ago

README.md

Neural Arithmetic Units

This code encompasses two publications. The ICLR paper is still in review; please respect the double-blind review process.

Figure: Hidden size results, showing the performance of our proposed NMU model.

Publications

SEDL Workshop at NeurIPS 2019

Reproduction study of the Neural Arithmetic Logic Unit (NALU). We propose improved evaluation criteria for arithmetic tasks, including a "converged at" and a "sparsity error" metric. Results will be presented at SEDL|NeurIPS 2019. – Read paper.

    @inproceedings{maep-madsen-johansen-2019,
      author={Andreas Madsen and Alexander Rosenberg Johansen},
      title={Measuring Arithmetic Extrapolation Performance},
      booktitle={Science meets Engineering of Deep Learning at 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)},
      address={Vancouver, Canada},
      journal={CoRR},
      volume={abs/1910.01888},
      month={October},
      year={2019},
      url={http://arxiv.org/abs/1910.01888},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      arxivId={1910.01888},
      eprint={1910.01888}
    }
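
The "sparsity error" metric named above can be illustrated with a small sketch. It assumes the metric is the largest distance of any weight from the nearest value in {-1, 0, 1}, that is, max over weights of min(|w|, |1 - |w||). The function name `sparsity_error` and the example matrix are hypothetical and are not taken from this repository; consult the paper for the authoritative definition.

```python
def sparsity_error(weights):
    """Return the largest distance of any weight from the set {-1, 0, 1}.

    Assumed definition: max over all weights w of min(|w|, |1 - |w||).
    `weights` is a list of rows (a plain nested list, no NumPy needed).
    """
    return max(
        min(abs(w), abs(1 - abs(w)))
        for row in weights
        for w in row
    )


# Hypothetical weight matrix: three weights are close to {-1, 0, 1},
# while 0.4 is far from every target value and dominates the metric.
W = [[0.98, 0.01],
     [-1.0, 0.4]]
print(sparsity_error(W))
```

Under this assumed definition, a value near zero means every weight has settled close to -1, 0, or 1, which is the regime where the learned arithmetic is easy to read off the weights.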

ICLR 2020 (Spotlight)

Our main contribution, which includes a theoretical analysis of the optimization challenges with the NALU. Based on these difficulties we propose several improvements. – Read paper.

    @inproceedings{mnu-madsen-johansen-2020,
      author = {Andreas Madsen and Alexander Rosenberg Johansen},
      title = {{Neural Arithmetic Units}},
      booktitle = {8th International Conference on Learning Representations, ICLR 2020},
      volume = {abs/2001.05016},
      year = {2020},
      url = {http://arxiv.org/abs/2001.05016},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      arxivId = {2001.05016},
      eprint={2001.05016}
    }
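
For orientation, here is a minimal sketch of the forward passes of the two proposed units, NAU and NMU. It assumes the NAU is a plain linear map with weights intended to lie in [-1, 1], and that the NMU computes a product over inputs of `w * x + 1 - w` with weights intended to lie in [0, 1]. The function names are hypothetical; the layers in this repository additionally handle weight clamping and regularization, which the sketch leaves out.

```python
from math import prod  # requires Python 3.8+


def nau_forward(W, x):
    """Addition/subtraction unit sketch: each output is a weighted sum.

    W[j][i] is the weight from input i to output j.
    """
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def nmu_forward(W, x):
    """Multiplication unit sketch: each output is a gated product.

    A weight of 1 includes the input in the product; a weight of 0
    replaces it with the multiplicative identity 1.
    """
    return [prod(w * xi + 1 - w for w, xi in zip(row, x)) for row in W]


x = [2.0, 3.0, 5.0]
print(nau_forward([[1.0, -1.0, 0.0]], x))  # x[0] - x[1]
print(nmu_forward([[1.0, 1.0, 0.0]], x))   # x[0] * x[1]
```

With weights exactly in {0, 1}, the NMU sketch selects a subset of inputs and multiplies them, which is why sparse, discrete weights are the desired solution.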

Install

    python3 setup.py develop

This installs the code under the name stable-nalu, along with the following dependencies if they are missing: numpy, tqdm, torch, scipy, pandas, tensorflow, torchvision, tensorboard, tensorboardX.

Experiments used in the paper

All experiment results shown in the paper can be reproduced exactly using fixed seeds. The lfs_batch_jobs directory contains bash scripts for submitting jobs to an LSF queue. The bsub command and its arguments can be replaced with python3 or an equivalent command for another queue system.

The export directory contains Python scripts that convert the TensorBoard results into CSV files, and R scripts that present those results as they appear in the paper.

Naming changes

The naming conventions in the code differ from those in the paper. The following translations can be used:

  • Linear: --layer-type linear
  • ReLU: --layer-type ReLU
  • ReLU6: --layer-type ReLU6
  • NAC-add: --layer-type NAC
  • NAC-mul: --layer-type NAC --nac-mul normal
  • NAC-sigma: --layer-type PosNAC --nac-mul normal
  • NAC-nmu: --layer-type ReRegualizedLinearPosNAC --nac-mul normal --first-layer ReRegualizedLinearNAC
  • NALU: --layer-type NALU
  • NAU: --layer-type ReRegualizedLinearNAC
  • NMU: --layer-type ReRegualizedLinearNAC --nac-mul mnac
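
The translations above can also be kept as a lookup table when scripting runs. The sketch below copies the flags verbatim from the list; the names `PAPER_TO_FLAGS` and `build_command` are hypothetical helpers, not part of this repository, and the only other flags used (`--operation`, `--seed`) are the ones shown in this README's example command.

```python
import shlex  # shlex.join requires Python 3.8+

# Paper name -> command-line flags, copied from the list above.
PAPER_TO_FLAGS = {
    "Linear": ["--layer-type", "linear"],
    "ReLU": ["--layer-type", "ReLU"],
    "ReLU6": ["--layer-type", "ReLU6"],
    "NAC-add": ["--layer-type", "NAC"],
    "NAC-mul": ["--layer-type", "NAC", "--nac-mul", "normal"],
    "NAC-sigma": ["--layer-type", "PosNAC", "--nac-mul", "normal"],
    "NAC-nmu": ["--layer-type", "ReRegualizedLinearPosNAC",
                "--nac-mul", "normal",
                "--first-layer", "ReRegualizedLinearNAC"],
    "NALU": ["--layer-type", "NALU"],
    "NAU": ["--layer-type", "ReRegualizedLinearNAC"],
    "NMU": ["--layer-type", "ReRegualizedLinearNAC", "--nac-mul", "mnac"],
}


def build_command(model, operation, seed,
                  script="experiments/simple_function_static.py"):
    """Return a shell command string for one run (hypothetical helper)."""
    args = ["python3", script, "--operation", operation,
            *PAPER_TO_FLAGS[model], "--seed", str(seed)]
    return shlex.join(args)


# One command per seed; the seed range here is illustrative only.
for seed in range(3):
    print(build_command("NMU", "mul", seed))
```

Building the argument list first and joining it with `shlex.join` keeps the quoting correct if a value ever contains spaces, and the same list can be passed directly to `subprocess.run` instead of being printed.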

Extra experiments

There are 4 experiments in total; they correspond to the experiments in the NALU paper.

    python3 experiments/simple_function_static.py --help # 4.1 (static)
    python3 experiments/sequential_mnist.py --help # 4.2

Example using the NMU on the multiplication problem:

    python3 experiments/simple_function_static.py \
        --operation mul --layer-type ReRegualizedLinearNAC --nac-mul mnac \
        --seed 0 --max-iterations 5000000 --verbose \
        --name-prefix test --remove-existing-data

The --verbose flag logs internal network measures to TensorBoard. You can open TensorBoard with:

    tensorboard --logdir tensorboard

Owner

  • Name: Andreas Madsen
  • Login: AndreasMadsen
  • Kind: user
  • Location: Copenhagen, Denmark
  • Company: MILA

Researching interpretability for Machine Learning because society needs it.

Citation (CITATION.cff)

authors:
  - given-names: Andreas
    family-names: Madsen
  - given-names: Alexander
    family-names: Rosenberg Johansen
cff-version: 1.2.0
title: Neural Arithmetic Units
message: If you use this software, please cite our Neural Arithmetic Units paper.
preferred-citation:
  title: "Neural Arithmetic Units"
  authors:
    - given-names: Andreas
      family-names: Madsen
    - given-names: Alexander
      family-names: Rosenberg Johansen
  conference:
    name: "8th International Conference on Learning Representations, ICLR 2020"
  date-published: 2016-06-11
  month: 3
  year: 2020
  type: proceedings
  url: http://arxiv.org/abs/2001.05016
repository-code: https://github.com/AndreasMadsen/stable-nalu
license: MIT

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Issues and Pull Requests

Last synced: about 1 year ago

All Time
  • Total issues: 1
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 7 days
  • Total issue authors: 1
  • Total pull request authors: 2
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • UlrikHagerRoiser (1)
Pull Request Authors
  • cookfish (1)
  • AndreasMadsen (1)