https://github.com/google-deepmind/onetwo

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found)
✓ .zenodo.json file (found)
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (15.2%) to scientific vocabulary

Keywords from Contributors

jax deep-neural-networks distributed research

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: google-deepmind
License: apache-2.0
Language: Python
Default Branch: main
Size: 1.96 MB

Statistics

Stars: 227
Watchers: 15
Forks: 13
Open Issues: 3
Releases: 3

Created about 2 years ago · Last pushed about 1 year ago

Metadata Files

Readme Changelog Contributing License

OneTwo

TL;DR: OneTwo is a Python library designed to simplify interactions with large (language and multimodal) foundation models, primarily aimed at researchers in prompting and prompting strategies.

Foundation models are increasingly being used in complex scenarios with multiple back-and-forth interactions between the model and some traditional code (possibly generated by the model itself). This leads to the emergence of programs that combine ML models with traditional code. The goal of the OneTwo library is to enable the creation and execution of such programs. It is designed for researchers (and developers) who want to explore how to get the best out of foundation models in a situation where it is not necessarily possible to change their weights (i.e. perform fine-tuning).

Some properties of OneTwo that are particularly impactful for researcher productivity include the following:

Model-agnostic: Provides a uniform API to access different models that can easily be swapped and compared.
Flexible: Supports implementation of arbitrarily complex computation graphs involving combinations of sequential and parallel operations, including interleaving of calls to foundation models and to other tools.
Efficient: Automatically optimizes request batching and other details of model server interactions under the hood to maximize throughput, while allowing prompting strategies to be implemented straightforwardly, as if they were dealing with single requests.
Reproducible: Automatically caches requests/replies for easy stop-and-go or replay of experiments.
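
To make the "Efficient" property concrete, here is a minimal sketch in plain `asyncio` of how code written against single requests can be batched transparently. It does not use OneTwo's API: `MicroBatcher`, `fake_batch_backend`, and `strategy` are hypothetical names invented for this illustration, and the library's actual mechanism may well differ.

```python
import asyncio


class MicroBatcher:
    """Groups single requests issued concurrently into one batched backend call."""

    def __init__(self, batch_fn, max_batch_size=8):
        self._batch_fn = batch_fn          # async fn: list[str] -> list[str]
        self._max_batch_size = max_batch_size
        self._pending = []                 # (prompt, future) pairs awaiting a flush
        self._flush_task = None
        self.batch_sizes = []              # recorded only for inspection

    async def generate(self, prompt):
        """Looks like a single request from the caller's point of view."""
        future = asyncio.get_running_loop().create_future()
        self._pending.append((prompt, future))
        if self._flush_task is None:
            self._flush_task = asyncio.create_task(self._flush())
        return await future

    async def _flush(self):
        # Yield once so that other coroutines get a chance to enqueue requests.
        await asyncio.sleep(0)
        pending, self._pending = self._pending, []
        self._flush_task = None
        for start in range(0, len(pending), self._max_batch_size):
            chunk = pending[start:start + self._max_batch_size]
            self.batch_sizes.append(len(chunk))
            replies = await self._batch_fn([prompt for prompt, _ in chunk])
            for (_, future), reply in zip(chunk, replies):
                future.set_result(reply)


async def fake_batch_backend(prompts):
    """Hypothetical stand-in for a model server that accepts batched requests."""
    await asyncio.sleep(0.01)
    return [f"reply to: {p}" for p in prompts]


async def strategy(batcher, question):
    """A prompting strategy written as if it handled one request at a time."""
    return await batcher.generate(f"Q: {question}\nA:")


async def main():
    batcher = MicroBatcher(fake_batch_backend)
    questions = ["a", "b", "c", "d", "e"]
    answers = await asyncio.gather(*(strategy(batcher, q) for q in questions))
    return answers, batcher.batch_sizes


answers, batch_sizes = asyncio.run(main())
print(batch_sizes)  # sizes of the batches actually sent to the backend
```

The strategy never mentions batches; grouping happens because all five coroutines enqueue their request before the flush task runs.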

Features

Uniform API: OneTwo defines a set of primitives or so-called built-in functions representing the common ways to interact with foundation models. Since different open-source models or public APIs may have different capabilities and expose calls with different parameters, OneTwo attempts to provide a uniform way to access all of those, implementing some best-effort conversion to guarantee that a OneTwo program will run on any model.
Convenient syntax: Different syntaxes are proposed to enable writing prompts or sequences of prompts conveniently and in a modular fashion. One can use formatted strings or jinja templates or can assemble the prompt as a concatenation of (possibly custom-defined) function calls.
Experimentation: OneTwo offers functionality to run experiments on datasets and collect metrics. Results are cached automatically, so a workflow can be replayed after minor modifications without repeating the same requests to the models. The cache is also aware of sampling behaviour (e.g. when different samples are needed for a given request).
Tool use: OneTwo supports different ways of calling tools from a model, including running model-generated Python code involving tool calls in a sandbox.
Agents: OneTwo provides abstractions for defining agents, i.e. functions that perform some complex action in a step-by-step manner, while updating some internal state. These agents can be combined naturally and used within generic optimization algorithms.
Execution model: The execution model takes care of the details of orchestrating the calls to the models and tools. It offers the following functionality:
- Asynchronous execution: Thanks to the use of asynchronous execution, the actual calls to the models and tools can be performed in parallel. However, the user can define a complex workflow in an intuitive and functional manner without having to think of the details of the execution.
- Batching: Similarly, whenever the backends support multiple simultaneous requests, or batched requests, the OneTwo library will leverage this and group the requests for maximizing throughput.
- Smart Caching: In addition to making it easy to replay all or part of an experiment, the caching provided by the library handles the case of multiple random samples obtained from one model, keeping track of how many samples have been obtained for each request, so that no result is wasted and actual requests are minimized while the user is tuning their workflow.
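
The sample-tracking idea behind smart caching can be illustrated with a short, self-contained sketch. This is not OneTwo's cache implementation: `SampleCache` and `fake_sampler` are hypothetical names, and the real library may key, store, and persist entries differently.

```python
import itertools


class SampleCache:
    """Caches every sample drawn for a request and reuses them on replay."""

    def __init__(self, sample_fn):
        self._sample_fn = sample_fn   # fn: prompt -> one fresh sample
        self._samples = {}            # prompt -> list of samples drawn so far
        self.backend_calls = 0

    def get_samples(self, prompt, num_samples):
        """Returns num_samples samples, calling the backend only for missing ones."""
        cached = self._samples.setdefault(prompt, [])
        while len(cached) < num_samples:
            cached.append(self._sample_fn(prompt))
            self.backend_calls += 1
        return cached[:num_samples]


_counter = itertools.count()


def fake_sampler(prompt):
    """Hypothetical stochastic model call; each call returns a distinct reply."""
    return f"{prompt} -> sample {next(_counter)}"


cache = SampleCache(fake_sampler)
first = cache.get_samples("2+2=", 2)   # two backend calls
again = cache.get_samples("2+2=", 2)   # replay: served entirely from the cache
more = cache.get_samples("2+2=", 3)    # only the third sample hits the backend
print(cache.backend_calls)
```

Because the cache stores a list per request rather than a single reply, asking for more samples later extends the list instead of discarding earlier results.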

Quick start

Installation

You may want to install the package in a virtual environment, in which case you can create and activate one with the following commands:

```shell
python3 -m venv PATH_TO_DIRECTORY_FOR_VIRTUAL_ENV
# Activate it.
. PATH_TO_DIRECTORY_FOR_VIRTUAL_ENV/bin/activate
```

Once you no longer need it, this virtual environment can be deactivated with the following command:

```shell
deactivate
```

Install the package:

```shell
pip install git+https://github.com/google-deepmind/onetwo
```

To start using it, import it with:

```python
from onetwo import ot
```

Running unit tests

In order to run the tests, first clone the repository:

```shell
git clone https://github.com/google-deepmind/onetwo
```

Then from the cloned directory you can invoke pytest:

```shell
pytest onetwo/core
```

However, running `pytest onetwo` will not work: pytest collects all the test names without keeping their directory of origin, so names may clash. Instead, run pytest on each subdirectory separately.
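
Running pytest on each subdirectory in turn can be scripted. The following is a minimal sketch, not part of the repository: `pytest_commands` and `run_all` are hypothetical helpers, and the sketch assumes that every immediate subdirectory of `onetwo/` can be tested independently.

```python
import pathlib
import subprocess
import sys


def pytest_commands(package_dir):
    """Builds one pytest command per immediate subdirectory of package_dir."""
    subdirs = sorted(
        p for p in pathlib.Path(package_dir).iterdir()
        if p.is_dir() and not p.name.startswith((".", "_"))
    )
    return [[sys.executable, "-m", "pytest", str(p)] for p in subdirs]


def run_all(package_dir="onetwo"):
    """Runs the commands one by one; returns the directories whose tests failed."""
    failed = []
    for cmd in pytest_commands(package_dir):
        returncode = subprocess.run(cmd).returncode
        # pytest exits with code 5 when no tests were collected, which is
        # not treated as a failure here.
        if returncode not in (0, 5):
            failed.append(cmd[-1])
    return failed
```

Calling `run_all()` from the cloned directory runs each subdirectory in a separate pytest process, so test names from different directories are never collected together.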

Tutorial and Documentation

This Colab is a good starting point and demonstrates most of the features available in onetwo.

Some background on the basic concepts of the library can be found here: Basics.

Some of the frequently asked questions are discussed here: FAQ.

Citing OneTwo

To cite this repository:

```bibtex
@software{onetwo2024github,
  author = {Olivier Bousquet and Nathan Scales and Nathanael Sch{\"a}rli and Ilya Tolstikhin},
  title = {{O}ne{T}wo: {I}nteracting with {L}arge {M}odels},
  url = {https://github.com/google-deepmind/onetwo},
  version = {0.2.1},
  year = {2024},
}
```

In the above BibTeX entry, names are in alphabetical order, the version number is intended to be the one returned by `ot.__version__` (i.e., the latest version mentioned in `version.py` and in the CHANGELOG), and the year corresponds to the project's open-source release.

License

This code is licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Disclaimer

This is not an official Google product.

Owner

Name: Google DeepMind
Login: google-deepmind
Kind: organization

Website: https://www.deepmind.com/
Repositories: 245
Profile: https://github.com/google-deepmind

GitHub Events

Total

Create event: 1
Issues event: 2
Release event: 1
Watch event: 62
Issue comment event: 6
Push event: 2
Pull request event: 1
Fork event: 3

Last Year

Create event: 1
Issues event: 2
Release event: 1
Watch event: 62
Issue comment event: 6
Push event: 2
Pull request event: 1
Fork event: 3

Committers

Last synced: 10 months ago

All Time

Total Commits: 160
Total Committers: 13
Avg Commits per committer: 12.308
Development Distribution Score (DDS): 0.5

Past Year

Commits: 85
Committers: 8
Avg Commits per committer: 10.625
Development Distribution Score (DDS): 0.424

Top Committers

Name	Email	Commits
Nathan Scales	n**s@g**m	80
OneTwo Authors	n**y@g**m	43
Olivier Bousquet	o**t@g**m	12
Ilya Tolstikhin	t**n@g**m	11
Rebecca Chen	r**n@g**m	2
Parth Kothari	p**i@g**m	2
Holt Skinner	h**r@g**m	2
Fabian Pedregosa	p**a@g**m	2
Nathanael Schaerli	s**i@g**m	2
Misha Brukman	m**n@g**m	1
Dan Liebling	d**g@g**m	1
Hootan Nakhost	h**n@g**m	1
Cían Hughes	c**h@g**m	1
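
The All Time figures above can be reproduced from the commit counts in the table. The sketch below assumes that DDS is defined as one minus the top committer's share of all commits; this definition is consistent with the reported value of 0.5 but is not stated on this page.

```python
# Commit counts copied from the Top Committers table above.
commits = [80, 43, 12, 11, 2, 2, 2, 2, 2, 1, 1, 1, 1]

total_commits = sum(commits)
total_committers = len(commits)
avg_commits = round(total_commits / total_committers, 3)

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - max(commits) / total_commits

print(total_commits, total_committers, avg_commits, dds)
```

Under this definition, a DDS of 0.5 means that a single committer authored half of all commits.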

Committer Domains (Top 20 + Academic)

google.com: 13

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 5
Total pull requests: 3
Average time to close issues: 5 months
Average time to close pull requests: 11 days
Total issue authors: 4
Total pull request authors: 2
Average comments per issue: 0.8
Average comments per pull request: 1.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 4
Pull requests: 1
Average time to close issues: 7 months
Average time to close pull requests: N/A
Issue authors: 3
Pull request authors: 1
Average comments per issue: 1.0
Average comments per pull request: 1.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

holtskinner (2)
LMMilewski (2)
bimission (1)
yagneshgooglegithub (1)

Pull Request Authors

holtskinner (3)
marblejenka (2)
eltociear (1)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

pyproject.toml pypi

absl-py *
aenum *
dataclasses-json *
fastapi *
gemma @git+https://github.com/google-deepmind/gemma.git
google-generativeai *
immutabledict *
jinja2 *
numpy *
openai *
pillow *
portpicker *
pytest *
pyyaml *
termcolor *
tqdm *
typing_extensions *
uvicorn *
