DeepSynth
DeepSynth: Scaling Neural Program Synthesis with Distribution-based Search - Published in JOSS (2022)
Science Score: 93.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
✓DOI references
Found 3 DOI reference(s) in README and JOSS metadata -
✓Academic publication links
Links to: arxiv.org -
○Committers with academic emails
-
○Institutional organization owner
-
✓JOSS paper metadata
Published in Journal of Open Source Software
Scientific Fields
Repository
General-purpose program synthesiser
Basic Info
- Host: GitHub
- Owner: nathanael-fijalkow
- License: mit
- Language: Slash
- Default Branch: main
- Size: 138 MB
Statistics
- Stars: 47
- Watchers: 6
- Forks: 9
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
DeepSynth
DeepSynth is a general-purpose program synthesizer in the programming by example framework: the user provides a few examples as pairs of input and output, DeepSynth finds a program matching the examples.
See the official webpage
This is the repository for the code of the paper "Scaling Neural Program Synthesis with Distribution-based Search" published in the conference proceedings of the AAAI Conference on Artificial Intelligence, AAAI'22 and selected for Oral Presentation.
The code was published in the Journal of Open Source Software.
Authors: Nathanaël Fijalkow, Guillaume Lagarde, Théo Matricon, Kevin Ellis, Pierre Ohlmann, Akarsh Potta
Overview
DeepSynth is a tool for automatically synthesizing programs from examples. It combines machine learning predictions with efficient enumeration techniques in a very generic way.
The following figure shows the pipeline.

- The first step is to define a domain-specific language (DSL), which is the programming language specifically designed to solve particular tasks.
- The second step is a compilation step: a context-free grammar (CFG) describing the set of all programs is compiled from the DSL and a number of syntactic constraints. The grammar is used to enumerate programs in an efficient way. However the number of programs grows extremely fast with size, making program synthesis very hard computationally. We believe that the path to scalability is to leverage machine learning predictions in combination with efficient enumeration techniques.
- The third step is to obtain predictions from the examples: a prediction model outputs predictions in the form of probabilities for the rules of the grammar, yielding a probabilistic context-free grammar (PCFG).
- The fourth and final step is the search: enumerating programs from the PCFG. We introduced the distribution-based search as a theoretical framework to analyse algorithms, and constructed two new algorithms: HeapSearch and SQRT Sampling.
Usage
Installation
```bash
clone this repository
git clone https://github.com/nathanael-fijalkow/DeepSynth.git
create your new env
conda create -n deep_synth "python>=3.7"
activate it
conda activate deep_synth
install pip
yes | conda install pip
install this package and the dependencies
pip install -r requirements.txt
or to do it manually
conda install -c conda-forge cython tqdm numpy matplotlib scipy conda install -c pytorch "pytorch>=1.8" pip install git+https://github.com/MaxHalford/vose
For flashfill dataset
pip install sexpdata
If you want to do the parallel experiments
pip install ray
If you run in an ValueError: numpy.ufunc size changed
pip install --upgrade numpy
You are good to go :)
To test your installation you can run the following tests:
python unittestsalgorithms.py python unittestsprograms.py python unittestspredictions.py
Only if you installed ray
python unittestsparallel.py ```
File structure
bash
./
Algorithms/ # the search algorithms + parallel pipeline
DSL/ # DSL: dreamcoder, deepcoder, flashfill
list_dataset/ # DreamCoder dataset in pickle format
Predictions/ # all files related to the ANN for prediction of the grammars
Documentation and examples
Table of contents:
- Creating a DSL
- Compiling a DSL into a CFG
- From a CFG to a PCFG
- Synthesis
- Creating a model
- Training a model
- Predictions from a model
Creating a DSL
The folder DSL contains some Domain-Specific-Languages (DSL), you can use these or create your own.
The DSL we use for the dreamcoder dataset is based on the DeepCoder paper, it is defined in DSL/deepcoder.py.
It contains two important objects: primitive_types and semantics.
The former (primitive_types) is a dictionary mapping the primitive names of the DSL to their types, and the latter (semantics) maps the primitive names to their semantics, which are either values or functions to be used when evaluating the primitives.
Note that a primitive can be simply a constant such as 0, 1, 2 (see the list.py DSL for an example of that).
Let us first discuss the type system, imported with from type_system import *.
The atomic types called PrimitiveType are INT and BOOL, you can create your own with PrimitiveType(type_name).
There are two constructors: List constructs lists by List(type), and Arrow(a, b) represents a function from a to b.
For instance, the type of + is Arrow(INT, Arrow(INT, INT)). Note that this is different from Arrow(Arrow(INT, INT), INT).
We support polymorphic types: PolymorphicType(name).
For example with t0 = PolymorphicType("t0"), t1 = PolymorphicType("t1"), the type Arrow(t0, Arrow(t1), t0)) can be instantiated into Arrow(INT, Arrow(INT), INT)) but not into a Arrow(INT, Arrow(INT), BOOL)) since both t0 must correspond to the same type.
One important point: when defining the semantics of functions, Python must be able to evaluate them one argument at a time: for instance, for addition: lambda a: lambda b: a + b, not lambda a,b: a + b.
The optional no_repetitions is a set of primitives than cannot be repeated such as SORT (indeed it is useless to sort a sorted list). This reduces the program space.
To create a DSL object we use the following:
dsl = DSL(primitive_types, semantics, no_repetitions) (no_repetitions can be None).
Compiling a DSL into a CFG
We can build a Context-Free Grammar (CFG) from a DSL, using the following:
dsl.DSL_to_CFG(type_request,
max_program_depth=4,
min_variable_depth=1,
upper_bound_type_size=10,
n_gram=2).
The only required field is type_request, it defines the type of the program you want to synthesize.
For example if the goal is to synthesize the multiplication function that the type request would be Arrow(INT, Arrow(INT, INT)).
The max_program_depth: int gives an upper bound on the depth of the programs.
The min_variable_depth: int gives a lower bound on the depth of all variables in a program.
The upper_bound_type_size: int is used when instantitating polymorphic types. The type size is defined as follows: a PrimitiveType has size 1, a List(a) has size 1 + size(a) and an Arrow(a, b) has size 1 + size(a) + size(b).
The n_gram: int chooses the granularity of the CFG, meaning how many primitives are used to choose the next primitive. The two expected values are 1 and 2.
From a CFG to a PCFG
To turn a CFG into a Probabilistic CFG (PCFG), we need to add probabilities to derivation rules.
The expected way to do that is by training a neural network, we'll discuss that later.
For testing purposes there are two easier ways to get a PCFG:
* the uniform PCFG can be obtained using cfg.CFG_to_Uniform_PCFG()
* a random PCFG with cfg.CFG_to_Random_PCFG(alpha=0.7). The parameter alpha serves as temperature: the larger the closer to uniform.
To see some programs generated by the PCFG you can run the following code:
```python for program in pcfg.sampling(): print(program)
Note that this is an infinite loop
```
Synthesis
We can now solve a programming by example task.
Let us consider the following: the type request is Arrow(INT, Arrow(INT, INT)) and the set of examples
python
examples = [
([1, 1], 5),
([10, 0], 13),
([0, 2], 1),
]
A solution program is lambda x: lambda y: x - y + 3, there could be others (more complicated).
Following the above let us assume we have constructed a DSL and a PCFG.
We will use the following function from run_experiment.py:
python
run_algorithm(is_correct_program: Callable[[Program, bool], bool],
pcfg: PCFG,
algo_index: int)
Let us explain what are the three arguments:
* is_correct_program is a function that checks if the program is correct, it can be easily created with the help of experiment_helper.py which provides the following function:
python
make_program_checker(dsl: DSL, examples) -> Callable[[Program, bool], bool]
pcfgis thePCFGalgo_indexis the index of the algorithm to use for the search. Here is the mapping:
python
0 => Heap Search
1 => SQRT
2 => Threshold
3 => Sort & Add
4 => DFS
5 => BFS
6 => A*
We can further tune three parameters in run_experiment.py:
* timeout: int = 100 is the timeout in seconds before the search is stopped
* total_number_programs: int = 1_000_000 maximum number of programs enumerated before the search is stopped. On a personal computer this is about 30sec of search for Heap Search.
* use_heap_search_cached_eval = True only when using Heap Search this enables caching of evaluations of partial programs and thus provides a much faster evaluation at the cost of additional memory.
Now, the run_experiment returns a tuple: program, search_time, evaluation_time, nb_programs, cumulative_probability, probability.
Times are in seconds. Program is None is no solution was found.
probability is the probability of the latest program enumerated if no solution was found and the probability of the solution program otherwise.
Parallelisation
There is in run_experiment.py the parallel variant run_algorithm_parallel which takes additional arguments:
splits: intthe number of splits of the grammarn_filters: int = 4the number of evaluators threads (a CPU per evaluator is required)transfer_queue_size: int = 500_000the size of the queue between enumerators and evaluatorstransfer_batch_size: int = 10the size of batches transferred from enumerators to the queue and from the queue to evaluators
The output is the same, except for some metrics which are now in list form which means they are per enumerator or per evaluator.
Up to now we did not use machine learned models. Let us get to the machine learning part.
Model Creation
The file model_loader.py contains generic functions for the two types of models: int list models and the generic models.
The former support only one type request while the latter support generic type requests.
For instance, the function build_deepcoder_generic_model creates a BigGramsPredictor.
There are a few parameters for our model:
nb_arguments_maxis the maximum number of arguments a function can havelexiconis the list of all symbols that can be encountered in examples, it helps us with the first step to create an integer encoding of the examplesembedding_output_dimensionthe embedding size of an example, then for each example they are embedded using a classic torch embeddingsize_hiddenis the sizes of the inner layers and output layers of the RNN which is run on all examples of a task sequentiallynumber_layers_RNNis the number of layers the RNN should have
Model training
The reference file is produce_network.py.
After loading a model, it generates valid tasks with their solution for the model and train the model on these tasks.
The model is saved at each epoch.
The part you might want to change according to your needs is:
```python dataset_name = "dreamcoder"
dataset_name = "deepcoder"
dataset_name = "flashfill"
Set to None for generic model
type_request = Arrow(List(INT), List(INT))
type_request = None
datasetsize: int = 10000 nbepochs: int = 1 batchsize: int = 128
TRAINING
if datasetname == "dreamcoder": curdsl, cfg, model = builddreamcoderintlistmodel() elif datasetname == "deepcoder": if typerequest is None: _, typerequests = \ deepcoderdatasetloader.loadtasks("./deepcoderdataset/T=3test.json") curdsl, cfgdict, model = builddeepcodergenericmodel(typerequests) else: curdsl, cfg, model = builddeepcoderintlistmodel() elif datasetname == "flashfill": curdsl, cfgdict, model = buildflashfillgenericmodel() else: assert False, f"Unrecognized dataset: {datasetname}"
print("Training model:", getmodelname(model), "on", datasetname) print("Type Request:", typerequest or "generic")
if typerequest: nbexamplesmax: int = 2 else: nbexamples_max: int = 5 ```
Predictions from a model
Now we have a model, let us see how to use it to construct a PCFG.
This is model dependent, but the good news is that the function task_set2dataset from experiment_helper.py does it for us.
The arguments are:
tasksa list where for each task there is a list of tuples (input, output) which are the examples of the taskmodelis the modeldslis the DSL
The output is for each task the tuple (task_name, PCFG_predicted_by_model, is_correct_program).
Reproducing the experiments from the AAAI'22 paper
For the experiments, you only need to run the produce_network.py file (editing the parameters based on the dataset or the batch size).
A .weights file should appear at the root folder.
This will train a neural network on random generated programs as described in Appendix F in the paper.
All of the files mentioned in this section are located in the root folder and follow this pattern run_*_experiments*.py.
Here is a short summary of each experiment:
run_random_PCFGsearch.pyproduce a list of all programs generated under Xsec of search time by all algorithms.run_random_PCFGsearch_parallel.pysame experiment but iwth the grammar_splitter and multiple CPUs.run_experiments_<dataset>.pytry to find solutions using an ANN to predict the grammar and for each algorithm logs the search data for the corresponding<dataset>. The suffixparallelcan also be found indicating that the algorithms are run in parallel. The semantics experiments in the paper used a trained model thatn can be obtained usingproduce_network.pyor directly in the repository. The results can be plotted usingplot_results_semantics.py.
Note that for the DreamCoder experiment in our paper, we did not use the cached evaluation of HeapSearch, this can be reproduced by setting use_heap_search_cached_eval to False in run_experiment.py.
How to download the DeepCoder dataset
First, download the archive from here (DeepCoder repo): https://storage.googleapis.com/deepcoder/dataset.tar.gz in a folder deepcoder_dataset at the root of DeepSynth.
Then you simply need to:
bash
gunzip dataset.tar.gz
tar -xf dataset.tar
You should see a few JSON files.
Report an issue
If you find any issue, please create a GitHub issue with specifics steps to reproduce the bug.
Contribute
Contributions are welcome! However, feature-wise we believe DeepSynth is in a maintenance state. Please first, create an issue with what your contribution should be about. Then you can create a pull request.
Owner
- Name: Nathanaël Fijalkow
- Login: nathanael-fijalkow
- Kind: user
- Location: Bordeaux, France
- Company: CNRS
- Website: http://games-automata-play.com/
- Repositories: 12
- Profile: https://github.com/nathanael-fijalkow
Computer science researcher working on program synthesis
JOSS Publication
DeepSynth: Scaling Neural Program Synthesis with Distribution-based Search
Authors
CNRS, LaBRI and Université de Bordeaux, France, The Alan Turing Institute of data science, United Kingdom
CNRS, LaBRI and Université de Bordeaux, France
Cornell University, United States
Tags
program synthesis programming by example neuro-symbolic methods parallel search proceduresGitHub Events
Total
- Issues event: 2
- Watch event: 13
- Issue comment event: 1
- Push event: 1
- Pull request review event: 1
- Pull request event: 4
- Fork event: 2
Last Year
- Issues event: 2
- Watch event: 13
- Issue comment event: 1
- Push event: 1
- Pull request review event: 1
- Pull request event: 4
- Fork event: 2
Committers
Last synced: 5 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Theomat | t****n@g****m | 226 |
| Nathanael Fijalkow | n****w@g****m | 50 |
| Guillaume Lagarde | g****e@g****m | 8 |
| ellisk42@gmail.com | e****2@g****m | 6 |
| Alex Bezzubov | b****z@a****g | 3 |
| Francesco Compagno | 4****h | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 2
- Total pull requests: 4
- Average time to close issues: 11 days
- Average time to close pull requests: 1 day
- Total issue authors: 2
- Total pull request authors: 4
- Average comments per issue: 3.0
- Average comments per pull request: 0.0
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 1
- Average time to close issues: 43 minutes
- Average time to close pull requests: 39 minutes
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- kataph (1)
- njuaplusplus (1)
Pull Request Authors
- kataph (2)
- bzz (1)
- arfon (1)
- ellisk42 (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- Cython ==0.29.32
- Pillow ==9.2.0
- contourpy ==1.0.5
- cycler ==0.11.0
- fonttools ==4.37.4
- kiwisolver ==1.4.4
- matplotlib ==3.6.0
- numpy ==1.23.3
- packaging ==21.3
- pyparsing ==3.0.9
- python-dateutil ==2.8.2
- scipy ==1.9.1
- six ==1.16.0
- torch >=1.8
- tqdm ==4.64.1
- typing_extensions ==4.4.0