Neural Networks and the Chomsky Hierarchy
https://github.com/google-deepmind/neural_networks_chomsky_hierarchy
Science Score: 36.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 8.9%, to scientific vocabulary)
Repository
Neural Networks and the Chomsky Hierarchy
Basic Info
- Host: GitHub
- Owner: google-deepmind
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://arxiv.org/abs/2207.02098
- Size: 147 KB
Statistics
- Stars: 205
- Watchers: 11
- Forks: 23
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Neural Networks and the Chomsky Hierarchy
This repository provides an implementation of our ICLR 2023 paper Neural Networks and the Chomsky Hierarchy.
Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (2200 models, 16 tasks) to investigate whether insights from the theory of computation can predict the limits of neural network generalization in practice. We demonstrate that grouping tasks according to the Chomsky hierarchy allows us to forecast whether certain architectures will be able to generalize to out-of-distribution inputs. This includes negative results where even extensive amounts of data and training time never led to any non-trivial generalization, despite models having sufficient capacity to perfectly fit the training data.

Our results show that, for our subset of tasks, RNNs and Transformers fail to generalize on non-regular tasks, LSTMs can solve regular and counter-language tasks, and only networks augmented with structured memory (such as a stack or memory tape) can successfully generalize on context-free and context-sensitive tasks.
It is based on JAX and Haiku and contains all code, datasets, and models necessary to reproduce the paper's results.
Content
```
.
├── models
│   ├── ndstack_rnn.py      - Nondeterministic Stack-RNN (DuSell & Chiang, 2021)
│   ├── rnn.py              - RNN (Elman, 1990)
│   ├── stack_rnn.py        - Stack-RNN (Joulin & Mikolov, 2015)
│   ├── tape_rnn.py         - Tape-RNN, loosely based on Baby-NTM (Suzgun et al., 2019)
│   └── transformer.py      - Transformer (Vaswani et al., 2017)
├── tasks
│   ├── cs                  - Context-sensitive tasks
│   ├── dcf                 - Deterministic context-free tasks
│   ├── regular             - Regular tasks
│   └── task.py             - Abstract GeneralizationTask
├── experiments
│   ├── constants.py        - Training/evaluation constants
│   ├── curriculum.py       - Training curricula (over sequence lengths)
│   ├── example.py          - Example training script (RNN on the Even Pairs task)
│   ├── range_evaluation.py - Evaluation loop (over unseen sequence lengths)
│   ├── training.py         - Training loop
│   └── utils.py            - Utility functions
├── README.md
└── requirements.txt        - Dependencies
```
tasks contains all tasks, organized by their Chomsky hierarchy level (regular, dcf, cs).
They all inherit from the abstract class GeneralizationTask, defined in tasks/task.py; a minimal sketch of such a task follows below.
models contains all the models we use, written in JAX and Haiku, two open-source libraries.
experiments contains the code for training models and evaluating them on a wide range of sequence lengths.
We also include an example that trains and evaluates an RNN on the Even Pairs task.
We use optax for our optimizers.
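For illustration, here is a minimal sketch of what a task in this style could look like, using parity (a regular language) as the example. The exact abstract interface is the one defined in tasks/task.py; the method and property names below (sample_batch, input_size, output_size) are assumptions made for this sketch, not necessarily the repository's verbatim API.
```python
# Hypothetical sketch of a generalization task; the real interface is
# defined in tasks/task.py and may differ in names and details.
import jax
import jax.numpy as jnp


class ParityCheck:
  """Regular task: is the number of 1s in a binary string even or odd?"""

  def sample_batch(self, rng, batch_size, length):
    """Samples one-hot inputs (B, L, 2) and one-hot parity labels (B, 2)."""
    strings = jax.random.bernoulli(rng, 0.5, (batch_size, length))
    strings = strings.astype(jnp.int32)
    parity = jnp.sum(strings, axis=1) % 2
    return {
        'input': jax.nn.one_hot(strings, num_classes=2),
        'output': jax.nn.one_hot(parity, num_classes=2),
    }

  @property
  def input_size(self):
    return 2

  @property
  def output_size(self):
    return 2
```
A task only has to describe how to sample supervised batches at a given length; the training and evaluation code can then vary the length to probe out-of-distribution generalization.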
Installation
Clone the source code into a local directory:
```bash
git clone https://github.com/google-deepmind/neural_networks_chomsky_hierarchy.git
cd neural_networks_chomsky_hierarchy
```
Running `pip install -r requirements.txt` will install all required dependencies.
This is best done inside a conda environment.
To that end, install Anaconda, then create and activate the conda environment:
```bash
conda create --name nnch
conda activate nnch
```
Install pip and use it to install all the dependencies:
```bash
conda install pip
pip install -r requirements.txt
```
If you have a GPU available (highly recommended for fast training), you can install JAX with CUDA support:
```bash
pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
```
Note that the JAX version must correspond to the CUDA installation you wish to use (CUDA 12 in the example above).
Please see the JAX documentation for more details.
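A quick sanity check that JAX actually sees the accelerator is to list the visible devices; this uses only the public jax.devices() call (the printed device names vary across JAX versions):
```python
# List the devices JAX can see after installation.
import jax

print(jax.devices())  # e.g. [CudaDevice(id=0)] with CUDA, [CpuDevice(id=0)] on CPU-only machines
```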
Usage
Before running any code, make sure to activate the conda environment and set the PYTHONPATH:
```bash
conda activate nnch
export PYTHONPATH=$(pwd)/..
```
We provide an example training and evaluation run (an RNN on the Even Pairs task), which you can launch with:
```bash
python experiments/example.py
```
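The actual training loop lives in experiments/training.py. As a rough, self-contained illustration of the JAX/Haiku/optax pattern such a loop follows (the model, hyperparameters, and dummy data below are invented for this sketch, not taken from the repository):
```python
# Illustrative JAX/Haiku/optax training step, in the spirit of
# experiments/training.py. All sizes and settings here are made up.
import haiku as hk
import jax
import jax.numpy as jnp
import optax


def forward(inputs):
  """Unrolls a small Elman-style RNN and classifies from the last state."""
  core = hk.VanillaRNN(hidden_size=64)
  initial_state = core.initial_state(inputs.shape[0])
  outputs, _ = hk.dynamic_unroll(core, inputs, initial_state, time_major=False)
  return hk.Linear(2)(outputs[:, -1])  # logits from the final time step


model = hk.without_apply_rng(hk.transform(forward))
optimizer = optax.adam(1e-3)


def loss_fn(params, inputs, labels):
  logits = model.apply(params, inputs)
  return jnp.mean(optax.softmax_cross_entropy_with_integer_labels(logits, labels))


@jax.jit
def update(params, opt_state, inputs, labels):
  """One gradient step: compute loss and grads, then apply the optimizer."""
  loss, grads = jax.value_and_grad(loss_fn)(params, inputs, labels)
  updates, opt_state = optimizer.update(grads, opt_state)
  return optax.apply_updates(params, updates), opt_state, loss


rng = jax.random.PRNGKey(0)
dummy_inputs = jnp.zeros((8, 10, 2))  # (batch, length, input_size)
dummy_labels = jnp.zeros((8,), dtype=jnp.int32)
params = model.init(rng, dummy_inputs)
opt_state = optimizer.init(params)
params, opt_state, loss = update(params, opt_state, dummy_inputs, dummy_labels)
print(loss)
```
The split between a pure loss function and a jitted update step is the standard optax idiom: jax.value_and_grad computes the gradients, optimizer.update transforms them into parameter updates, and optax.apply_updates applies them functionally.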
Citing This Work
```bibtex
@inproceedings{deletang2023neural,
  author    = {Gr{\'{e}}goire Del{\'{e}}tang and
               Anian Ruoss and
               Jordi Grau{-}Moya and
               Tim Genewein and
               Li Kevin Wenliang and
               Elliot Catt and
               Chris Cundy and
               Marcus Hutter and
               Shane Legg and
               Joel Veness and
               Pedro A. Ortega},
  title     = {Neural Networks and the Chomsky Hierarchy},
  booktitle = {11th International Conference on Learning Representations},
  year      = {2023},
}
```
License and Disclaimer
Copyright 2022 DeepMind Technologies Limited
All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0
All other materials are licensed under the Creative Commons Attribution 4.0 International License (CC-BY). You may obtain a copy of the CC-BY license at: https://creativecommons.org/licenses/by/4.0/legalcode
Unless required by applicable law or agreed to in writing, all software and materials distributed here under the Apache 2.0 or CC-BY licenses are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses.
This is not an official Google product.
Owner
- Name: Google DeepMind
- Login: google-deepmind
- Kind: organization
- Website: https://www.deepmind.com/
- Repositories: 245
- Profile: https://github.com/google-deepmind
GitHub Events
Total
- Watch event: 20
- Fork event: 8
Last Year
- Watch event: 20
- Fork event: 8
Committers
Last synced: 10 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Anian Ruoss | a****r@g****m | 22 |
| Gregoire Deletang | g****t@g****m | 15 |
| Peter Hawkins | p****s@g****m | 1 |
| DeepMind | n****y@g****m | 1 |
| DeepMind | n****y@d****m | 1 |
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 8
- Total pull requests: 1
- Average time to close issues: 21 days
- Average time to close pull requests: about 1 month
- Total issue authors: 8
- Total pull request authors: 1
- Average comments per issue: 4.75
- Average comments per pull request: 1.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- kirk86 (1)
- FlorianDietz (1)
- shawntan (1)
- chijames (1)
- gbrlfaria (1)
- tami1082 (1)
- jabowery (1)
- Thomas-MMJ (1)
Pull Request Authors
- gbrlfaria (1)
Dependencies
- absl-py *
- dm-haiku *
- dm-tree *
- jax *
- numpy *
- optax *
- tqdm *