https://github.com/google-research/android_world

AndroidWorld is an environment and benchmark for autonomous agents

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (12.8%) to scientific vocabulary

Keywords from Contributors

research

Last synced: 8 months ago

Repository

AndroidWorld is an environment and benchmark for autonomous agents

Basic Info

Host: GitHub
Owner: google-research
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 20.7 MB

Statistics

Stars: 451
Watchers: 7
Forks: 93
Open Issues: 27
Releases: 0

Created about 2 years ago · Last pushed 9 months ago

Metadata Files

Readme Contributing License

AndroidWorld

Website • Paper • Tasks • Leaderboard

Overview

AndroidWorld is an environment for building and benchmarking autonomous computer control agents.

It runs on a live Android emulator and contains a highly reproducible benchmark of 116 hand-crafted tasks across 20 apps, which are dynamically instantiated with randomly-generated parameters to create millions of unique task variations.
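To make the idea of dynamic instantiation concrete, here is a minimal sketch. It is not AndroidWorld's implementation: the task template, the name pool, and the `generate_params` and `instantiate` helpers are all hypothetical. It only illustrates how a seeded random generator can turn one hand-written template into many reproducible variations.

```python
import random

# Hypothetical task template; AndroidWorld defines its own task classes.
TEMPLATE = "Create a new contact for {name}. Their number is {number}."

# Hypothetical parameter pool, for illustration only.
NAMES = ["Ana Silva", "Ben Okafor", "Chen Wei", "Dana Novak"]


def generate_params(seed: int) -> dict:
    """Draws one set of task parameters from a seeded generator."""
    rng = random.Random(seed)  # Same seed -> same parameters.
    name = rng.choice(NAMES)
    number = "555" + "".join(str(rng.randint(0, 9)) for _ in range(7))
    return {"name": name, "number": number}


def instantiate(seed: int) -> str:
    """Fills the template with the parameters drawn for this seed."""
    return TEMPLATE.format(**generate_params(seed))


if __name__ == "__main__":
    for seed in range(3):
        print(instantiate(seed))
```

Because the parameters depend only on the seed, re-running with the same seed reproduces the same task instance, which is one way a benchmark can be both randomized and repeatable.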

In addition to the built-in tasks, AndroidWorld also supports the popular web benchmark MiniWoB++ from Liu et al.

Key features of AndroidWorld include:

📝 116 diverse tasks across 20 real-world apps
🎲 Dynamic task instantiation for millions of unique variations
🏆 Durable reward signals for reliable evaluation
🐳 Experimental Docker Support for simplified setup and consistent environments (as of 06/02/2025)
🌐 Open environment with access to millions of Android apps and websites
💾 Lightweight footprint (2 GB memory, 8 GB disk)
🔧 Extensible design to easily add new tasks and benchmarks
🖥️ Integration with MiniWoB++ web-based tasks

See demo videos on our website.

Installation

1. Set up the Android Emulator
   1. Download Android Studio here
   2. Create an Android Virtual Device (AVD) by following these instructions. For hardware, select Pixel 6; for System Image, select Tiramisu, API Level 33; and set the AVD name to AndroidWorldAvd. Watch the setup video.

2. Launch the Android Emulator from the command line

   Launch the emulator from the command line, not from the Android Studio UI, with the -grpc 8554 flag, which is needed for communication with the accessibility forwarding app.

   ```bash
   # Typically it's located in ~/Android/Sdk/emulator/emulator or
   # ~/Library/Android/sdk/emulator/emulator
   EMULATOR_NAME=AndroidWorldAvd # From previous step
   ~/Library/Android/sdk/emulator/emulator -avd $EMULATOR_NAME -no-snapshot -grpc 8554
   ```

3. [Optional] It's recommended to use conda, which you can download here.

   ```bash
   conda create -n android_world python=3.11.8
   conda activate android_world
   ```

4. Install AndroidWorld. Note: Python 3.11 or above is required.

   ```bash
   git clone https://github.com/google-research/android_world.git
   cd ./android_world
   pip install -r requirements.txt
   python setup.py install
   ```

5. Add model provider APIs as environment variables.

   ```bash
   # Add to .bashrc.
   export OPENAI_API_KEY=your-key
   export GCP_API_KEY=your-key
   ```

6. Install ffmpeg, if not already installed.

   ```bash
   # Linux (Ubuntu/Debian)
   sudo apt update && sudo apt install ffmpeg

   # macOS
   brew install ffmpeg
   ```
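Before moving on, it can help to confirm that the prerequisites from the installation steps are in place. The following is a minimal sketch of such a check and is not part of AndroidWorld. The helper names `find_emulator` and `missing_prerequisites` are hypothetical, and the sketch assumes the environment variables are spelled `OPENAI_API_KEY` and `GCP_API_KEY`. Pass a shorter `required_keys` tuple if you only use one provider.

```python
import os
import shutil
from pathlib import Path

# Typical emulator locations named in the installation steps.
EMULATOR_CANDIDATES = (
    "~/Android/Sdk/emulator/emulator",
    "~/Library/Android/sdk/emulator/emulator",
)


def find_emulator(candidates=EMULATOR_CANDIDATES):
    """Returns the first existing emulator path, or None if none is found."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None


def missing_prerequisites(
    env=None,
    which=shutil.which,
    required_keys=("OPENAI_API_KEY", "GCP_API_KEY"),  # Assumed spellings.
):
    """Lists human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    for key in required_keys:
        if not env.get(key):
            problems.append(f"environment variable {key} is not set")
    if which("ffmpeg") is None:
        problems.append("ffmpeg is not on PATH")
    return problems


if __name__ == "__main__":
    print("emulator:", find_emulator() or "not found in the typical locations")
    for problem in missing_prerequisites():
        print("missing:", problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.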

Quickstart

Run the minimal_task_runner.py script to see the basic mechanics of AndroidWorld components. It initializes the environment, sets up a task, and runs the default agent, M3A, on it.

```bash
python minimal_task_runner.py --task=ContactsAddContact
```

If you don't specify a task, a random task will be selected. NOTE: To try open-source apps, i.e., apps not included with the Android OS, first run the script below with --perform_emulator_setup.

Note on Model Cost: The minimal_task_runner.py script uses the legacy model gpt-4-turbo-2024-04-09 by default. This model can be expensive. For serious usage, you can switch to a more cost-effective model by modifying the model_name in the script.

Docker Support (Experimental)

AndroidWorld now offers Docker support. This allows you to run the Android environment and server within a Docker container, which can simplify setup and ensure a consistent environment.

Note: This feature is experimental and has not been extensively tested.

1. Build the Docker image: Navigate to the root directory of the android_world repository and run:

   ```bash
   docker build -t android_world:latest .
   ```

2. Run the Docker container:

   ```bash
   docker run --privileged -p 5000:5000 -it android_world:latest
   ```

   This will start the Android emulator and the FastAPI server inside the container. The server will be accessible on http://localhost:5000.

3. Interact with the environment: See the scripts/run_suite_on_docker.py script for an example client that interacts with the Android environment server running in Docker.
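The emulator can take a while to boot inside the container, so a client may want to wait until the server port accepts connections. The sketch below is an assumption-laden illustration, not part of AndroidWorld: `server_is_listening` is a hypothetical helper, and because the server's HTTP routes are not listed here, it only tests whether the TCP port from the `docker run` command is open.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 5000,
                        timeout: float = 2.0) -> bool:
    """Returns True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    state = "open" if server_is_listening() else "not reachable yet"
    print(f"Port 5000 on localhost is {state}.")
```

An open port only shows that something is listening; it does not prove the emulator has finished booting, so treat it as a first check before running a real client such as scripts/run_suite_on_docker.py.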

Note for Apple Silicon users

There are known issues with installing the required emulator package on ARM chips (Apple Silicon). To work around this when building images locally, build them for the AMD64/x86_64 instruction set by running:

```bash
docker buildx build --platform linux/amd64 -t android-emulator:latest .
```

Note that running in a Docker container like this on an Apple Silicon device is considerably slower than running the Android device and emulator natively, because you end up running an Android emulator inside a Linux emulator.

Run the benchmark

Note: Task Step Limits Update. As of 11/18/2024, the max_steps/step_budget for each task in AndroidWorld has been updated to approximately 2x the human average completion time. This adjustment ensures agents have sufficient time to complete tasks, while also reducing the overhead of running the benchmark. Here are the per-task updates.

```bash
python run.py \
  --suite_family=android_world \
  --agent_name=t3a_gpt4 \
  --perform_emulator_setup \
  --tasks=ContactsAddContact,ClockStopWatchRunning  # Optional: Just run on a subset.
```

The first time you run this script, you must install the necessary apps and set permissions by specifying --perform_emulator_setup. This is a one-time setup. It may take several minutes depending on the connection speed.

Above we specify the optional --tasks flag to run on a subset of tasks. Leave it empty to run on the entire AndroidWorld suite.

The n_task_combinations argument specifies how many parameter permutations to use for each task. For example, for an SMS task, it would correspond to different phone number/message combinations for each run.

If a run fails part-way through, you can resume it by re-running the script with the --checkpoint_dir flag pointing to the output directory from the original run.
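If you launch runs from a script, the flags described in this section can be assembled programmatically. The following is a minimal sketch: `build_run_command` is a hypothetical helper, not part of AndroidWorld, and it assumes that n_task_combinations is passed as `--n_task_combinations=<int>` and that `--checkpoint_dir` takes a path after an equals sign.

```python
from typing import List, Optional


def build_run_command(
    agent_name: str,
    tasks: Optional[List[str]] = None,
    suite_family: str = "android_world",
    perform_emulator_setup: bool = False,
    n_task_combinations: Optional[int] = None,
    checkpoint_dir: Optional[str] = None,
    python: str = "python",
) -> List[str]:
    """Builds the argument list for run.py from the flags described above."""
    cmd = [
        python,
        "run.py",
        f"--suite_family={suite_family}",
        f"--agent_name={agent_name}",
    ]
    if perform_emulator_setup:
        # Only needed the first time, to install apps and set permissions.
        cmd.append("--perform_emulator_setup")
    if tasks:
        # Omitting --tasks runs the entire suite.
        cmd.append("--tasks=" + ",".join(tasks))
    if n_task_combinations is not None:
        cmd.append(f"--n_task_combinations={n_task_combinations}")
    if checkpoint_dir is not None:
        # Point at the original run's output directory to resume it.
        cmd.append(f"--checkpoint_dir={checkpoint_dir}")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_run_command(
        "t3a_gpt4", tasks=["ContactsAddContact", "ClockStopWatchRunning"])))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root. Building a list rather than a shell string avoids quoting problems with paths that contain spaces.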

Running MiniWoB++ tasks

To run the MiniWoB++ web-based tasks in AndroidWorld, simply set --suite_family=miniwob and --perform_emulator_setup in the command above.

A key advantage of running MiniWoB++ tasks is that common input elements are rendered as native, commonly used Android UI widgets, rather than as HTML. Thus agents must learn to use universal widgets such as time- and date-pickers.

Create your own agent

In addition to the agents we provide here, you can also easily create your own agent and run the benchmark with it as follows.

1. Create an agent class that inherits from EnvironmentInteractingAgent and implement the step method. In the current workflow, the agent tries to complete a task in a for loop. In each round, the step method is called; this is where you implement your agent's logic. A typical approach is to first gather information, such as the current screenshot and the UI elements (buttons, icons, and so on), through the AndroidEnv instance within the agent, then select one of the supported actions, execute it through the AndroidEnv, and return an AgentInteractionResult. Set the done property on AgentInteractionResult to True to indicate that the task is finished.
2. Import your agent in run.py and add it to the _get_agent method, which takes your agent's name and returns an instance of it.
3. Now you can run the benchmark with your new agent using the command above with the agent_name flag changed to your agent's name.
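The observe, select, execute, and report cycle described in the first step can be sketched without the real library. Everything below is a self-contained stand-in: the `EnvironmentInteractingAgent` and `AgentInteractionResult` classes only mimic the names used above, and `FakeEnv`, its methods, `MyAgent`, and `run_episode` are hypothetical. The actual AndroidWorld classes have their own constructors and fields, so check the repository before adapting this.

```python
import dataclasses
from typing import Any, Dict, List


@dataclasses.dataclass
class AgentInteractionResult:
    """Stand-in result type; only the `done` field comes from the text above."""
    done: bool
    data: Dict[str, Any]


class EnvironmentInteractingAgent:
    """Stand-in base class; the real one wraps an AndroidEnv instance."""

    def __init__(self, env: Any):
        self.env = env

    def step(self, goal: str) -> AgentInteractionResult:
        raise NotImplementedError


class FakeEnv:
    """Hypothetical environment with a tiny observe/act interface."""

    def __init__(self) -> None:
        self.actions: List[str] = []

    def get_ui_elements(self) -> List[str]:
        # After two actions, the fake screen shows a "Save" button.
        if len(self.actions) >= 2:
            return ["Save"]
        return ["Name field", "Phone field"]

    def execute(self, action: str) -> None:
        self.actions.append(action)


class MyAgent(EnvironmentInteractingAgent):
    """Observes, picks an action, executes it, and reports whether it is done."""

    def step(self, goal: str) -> AgentInteractionResult:
        elements = self.env.get_ui_elements()  # 1. Gather information.
        action = f"click:{elements[0]}"        # 2. Select an action.
        self.env.execute(action)               # 3. Execute it.
        done = elements[0] == "Save"           # 4. Decide whether finished.
        return AgentInteractionResult(done=done, data={"action": action})


def run_episode(agent: EnvironmentInteractingAgent, goal: str,
                max_steps: int = 10) -> int:
    """Calls step() in a loop until done or the step budget runs out."""
    for steps_taken in range(1, max_steps + 1):
        if agent.step(goal).done:
            return steps_taken
    return max_steps


if __name__ == "__main__":
    print(run_episode(MyAgent(FakeEnv()), "Add a contact"))
```

A real agent would replace the trivial "click the first element" policy with a model call, but the shape of `step` (one observation, one action, one result per call) stays the same.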

Adding new tasks

Please see the guide on adding new tasks to AndroidWorld.

Citation

If you use our environment or data, please cite our paper:

@misc{rawles2024androidworlddynamicbenchmarkingenvironment,
  title={AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents},
  author={Christopher Rawles and Sarah Clinckemaillie and Yifan Chang and Jonathan Waltz and Gabrielle Lau and Marybeth Fair and Alice Li and William Bishop and Wei Li and Folawiyo Campbell-Ajala and Daniel Toyama and Robert Berry and Divya Tyamagundlu and Timothy Lillicrap and Oriana Riva},
  year={2024},
  eprint={2405.14573},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2405.14573},
}

This is not an officially supported Google product.

Owner

Name: Google Research
Login: google-research
Kind: organization
Location: Earth

Website: https://research.google
Repositories: 226
Profile: https://github.com/google-research

GitHub Events

Total

Create event: 89
Issues event: 100
Watch event: 273
Delete event: 87
Member event: 1
Issue comment event: 252
Push event: 344
Pull request review comment event: 7
Pull request review event: 9
Pull request event: 225
Fork event: 73

Last Year

Create event: 89
Issues event: 99
Watch event: 273
Delete event: 87
Member event: 1
Issue comment event: 252
Push event: 344
Pull request review comment event: 7
Pull request review event: 9
Pull request event: 225
Fork event: 72

Committers

Last synced: about 1 year ago

All Time

Total Commits: 210
Total Committers: 11
Avg Commits per committer: 19.091
Development Distribution Score (DDS): 0.638

Past Year

Commits: 200
Committers: 11
Avg Commits per committer: 18.182
Development Distribution Score (DDS): 0.645

Top Committers

Name	Email	Commits
Chris Rawles	c**s@g**m	76
The android_world Authors	n**y@g**m	72
Sarah Clinckemaillie	s**k@g**m	29
Ning Li	2**8@q**m	13
Nevan Wichers	w**n@g**m	9
Chris Rawles	c**s@g**m	5
Alice Li	l**e@g**m	2
Michael Hoisie	h**e@g**m	1
Luke Granger-Brown	l**b@g**m	1
Gabrielle Lau	g**u@g**m	1
Izzeddin Gur	i**n@g**m	1
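The all-time committer figures can be cross-checked against the table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that assumption reproduces the reported 0.638. The function names are illustrative only.

```python
# Commit counts copied from the Top Committers table, in table order.
COMMITS = [76, 72, 29, 13, 9, 5, 2, 1, 1, 1, 1]


def average_commits(commits):
    """Mean number of commits per committer."""
    return sum(commits) / len(commits)


def development_distribution_score(commits):
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    return 1 - max(commits) / sum(commits)


if __name__ == "__main__":
    print("total commits:", sum(COMMITS))
    print("avg per committer:", round(average_commits(COMMITS), 3))
    print("DDS:", round(development_distribution_score(COMMITS), 3))
```

A score near 0 would mean one committer wrote almost everything, while values closer to 1 indicate work spread across many committers. Note that the table lists Chris Rawles under two separate identities, which the score treats as two committers.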

Committer Domains (Top 20 + Academic)

google.com: 9
qq.com: 1

Issues and Pull Requests

Last synced: 9 months ago

All Time

Total issues: 72
Total pull requests: 428
Average time to close issues: 14 days
Average time to close pull requests: 10 days
Total issue authors: 49
Total pull request authors: 12
Average comments per issue: 1.49
Average comments per pull request: 0.45
Merged pull requests: 243
Bot issues: 4
Bot pull requests: 388

Past Year

Issues: 63
Pull requests: 217
Average time to close issues: 6 days
Average time to close pull requests: 7 days
Issue authors: 43
Pull request authors: 11
Average comments per issue: 1.25
Average comments per pull request: 0.57
Merged pull requests: 122
Bot issues: 2
Bot pull requests: 177


Top Authors

Issue Authors

anunay1 (5)
copybara-service[bot] (4)
cjfcsjt (4)
Nid989 (3)
LYXFOREVER (3)
NingLi670 (3)
ThakurAnunaya (3)
lgy0404 (2)
flibbertigibbet-Y (2)
RainBowLuoCS (2)
boyugou (2)
jangyicheng (1)
linyangde (1)
njucckevin (1)
mengzchen (1)

Pull Request Authors

copybara-service[bot] (386)
NingLi670 (24)
yuefengz (4)
tlc4418 (2)
A-Mahla (2)
dependabot[bot] (2)
rossamurphy (2)
xieincz (2)
chingkt (1)
celilygt (1)
overmindy (1)
lakesoi (1)

Top Labels

Issue Labels

Pull Request Labels

copybara-import (11)
dependencies (2)

Dependencies

.github/workflows/pytest.yml actions

actions/checkout v4 composite
actions/setup-python v5 composite

android_world/env/setup_device/setup.py pypi

requirements.txt pypi

IPython *
absl-py ==2.1.0
dm_env ==1.6
fuzzywuzzy ==0.18.0
google-generativeai ==0.5.1
grpcio-tools *
immutabledict ==2.0.0
jsonschema ==4.17.3
matplotlib ==3.6.1
numpy ==1.26.3
opencv-python *
pandas ==2.1.4
pydub *
pytest *
python-Levenshtein *
requests ==2.31.0
tenacity *
termcolor *

setup.py pypi
