Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: minghchen
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 5.94 MB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed 7 months ago
Metadata Files
Readme · License · Citation

README.md

WebArena: A Realistic Web Environment for Building Autonomous Agents

WebArena is a standalone, self-hostable web environment for building autonomous agents

Python 3.10 · pre-commit · Code style: black · Checked with mypy · bear-ified

Website · Paper · Leaderboard

Overview

Notice

This is the branch of WebArena used in AutoManual. We made the following corrections and improvements to the original WebArena so that LLM agents can obtain correct information:

  • After each action is executed, an additional "none_action" is executed, to ensure that the action has been completed.
  • The scroll bar information of the web page is added to the observation, to indicate the location of the current visible area.
  • When a focused element has the "hasPopup" property, its current value and the options in its popup menu are displayed. This fixes an issue in the original environment, which could not display the content of popup menus.
  • Add a "select" action for id-based actions, allowing the agent to select an option in a popup menu.
  • Add a "fill" action for id-based actions, allowing the agent to clear existing content before typing.
  • Fix an issue with fuzzy matching against a list.
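
The first change in the list above can be pictured with a small sketch. It is not the code from this branch: `ToyEnv` and `step_and_settle` are hypothetical names, and the toy environment only records which actions it received. It shows the pattern of following every real action with a no-op so that the observation handed back to the agent is taken after the page has had a chance to update.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyEnv:
    """Hypothetical stand-in for a browser environment; it only logs actions."""

    log: list[str] = field(default_factory=list)

    def step(self, action: dict[str, Any]) -> str:
        # Record the action type and return a fake observation string.
        self.log.append(action["type"])
        return f"observation after {action['type']}"


def step_and_settle(env: ToyEnv, action: dict[str, Any]) -> str:
    """Run the action, then a no-op, and return the observation from the no-op."""
    env.step(action)
    # The trailing no-op stands in for the extra "none_action" described above.
    return env.step({"type": "none_action"})


env = ToyEnv()
obs = step_and_settle(env, {"type": "click", "element_id": "42"})
print(env.log)
print(obs)
```

In this sketch the agent never sees the observation produced directly by the click; it only sees the one produced after the no-op.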

To facilitate comparison with the original branch, we added a comment starting with "# [cmh]:" to each modified piece of code.
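
One way to list those markers is a short script like the one below. This is a minimal sketch, not part of the repository: the function name `find_marked_lines` is made up for illustration, and it assumes the modified code lives in `.py` files.

```python
from pathlib import Path

# Marker string quoted from the README text above.
MARKER = "# [cmh]:"


def find_marked_lines(root: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, stripped line) for every line containing the marker."""
    hits: list[tuple[str, int, str]] = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if MARKER in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_marked_lines(".")` from the repository root and printing each tuple gives a quick overview of where this branch differs from the original.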

We also fixed some annotation issues and will continue to fix more.

| Reddit: |
| :----------------------------------------------------------- |
| taskid 28: "Tell me the count of comments that have received more downvotes than upvotes for the user who made the latest post on the Worcester forum." The answer should be 1 instead of 0. |
| taskid 407, 408: "Upvote the newest post in deeplearning subreddit" The original answer checked the hottest post instead of the latest post. |
| taskid 583: "Create a new forum named PlantsForCatParents, with a description of Cat parents & plan lovers, and include ['Cat friendly', 'Local vendors', 'Promotion', 'Toxic plants!'] in the sidebar?" It should be "plant lovers" instead of "plan lovers". |
| taskid 584: "Create a new forum named Karaoke, with a description of Place for Karaoke lovers, and include ['devices', 'setup'] in the sidebar?" The URL in the answer check should be changed to "__REDDIT__/f/Karaoke/edit" instead of "__REDDIT__/f/Karaoke". |
| taskid 603, 608: "Post my question, \"safe and budge apartment to live in nyc\", in a subreddit where I'm likely to get an answer" in which "budge apartment" should be "budget apartment". |
| taskid 644: "Post a notice on a virtual meetup for Tears of Kingdom enthusiasts on Dec 15th in the games subreddit" There is only a "gaming" subreddit in the environment, not a "games" subreddit. |
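
For readers who want to carry the wording corrections over to their own copy of the task files, the following is a rough sketch. It assumes each task is a dictionary with `task_id` and `intent` keys; those field names and the helper `apply_intent_fixes` are assumptions for illustration, not something this README specifies. Only the plain text substitutions from the table are covered, not the answer or URL changes.

```python
import copy

# Corrections taken from the table above: task id -> (wrong text, corrected text).
INTENT_FIXES = {
    583: ("plan lovers", "plant lovers"),
    603: ("budge apartment", "budget apartment"),
    608: ("budge apartment", "budget apartment"),
}


def apply_intent_fixes(tasks, fixes=INTENT_FIXES):
    """Return a corrected copy of the tasks; the input list is left untouched."""
    fixed = copy.deepcopy(tasks)
    for task in fixed:
        pair = fixes.get(task["task_id"])  # "task_id" is an assumed field name
        if pair is not None:
            wrong, right = pair
            task["intent"] = task["intent"].replace(wrong, right)
    return fixed


tasks = [
    {
        "task_id": 603,
        "intent": 'Post my question, "safe and budge apartment to live in nyc", '
        "in a subreddit where I'm likely to get an answer",
    },
    {"task_id": 1, "intent": "An unrelated task"},
]
corrected = apply_intent_fixes(tasks)
print(corrected[0]["intent"])
```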

Install

```bash
# Python 3.10+
conda create -n webarena python=3.10; conda activate webarena
pip install -r requirements.txt
playwright install
pip install -e .
```

Then set up the standalone environment. Please check out this page for details.

> [!IMPORTANT]
> After evaluating the 812 examples, reset the environment to the initial state following the instructions here.

To work around the rate limit for Reddit:

```
# Find the container id of postmill
docker container ls

# Enter the container
docker exec -it a7b6610b623c bash

# Modify the user trust level to send 15 messages every 5 minutes
psql -U postmill -d postmill
UPDATE users SET trusted = true WHERE username = 'MarvelsGrantMan136';
\q
```
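
If the environment is reset often, the same update can be scripted rather than typed into an interactive shell. The sketch below only builds the command; the helper name `build_trust_command` is hypothetical. It relies on `psql -c` to run a single statement non-interactively, and the container id is the example value from above, which will differ on your machine.

```python
import re
import shlex


def build_trust_command(container_id: str, username: str) -> list[str]:
    """Build a non-interactive `docker exec ... psql -c` command for the update above."""
    # Refuse anything that is not a plain username, since the value is embedded in SQL.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", username):
        raise ValueError(f"unexpected characters in username: {username!r}")
    sql = f"UPDATE users SET trusted = true WHERE username = '{username}';"
    return [
        "docker", "exec", container_id,
        "psql", "-U", "postmill", "-d", "postmill", "-c", sql,
    ]


cmd = build_trust_command("a7b6610b623c", "MarvelsGrantMan136")
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it; printing it first makes it easy to confirm the container id before anything runs.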

Configure the URLs for each website by setting your AWS hostname.

```bash
export AWS_HOSTNAME=""

export OPENAI_API_KEY="" # a valid OpenAI API key starts with sk-
export OPENAI_BASE_URL="" # e.g., https://api.openai.com/v1
```
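
How this branch turns `AWS_HOSTNAME` into per-site URLs is not shown in this README. As a rough illustration only, the sketch below derives them from a hostname using the ports that the upstream WebArena setup guide uses by default. The port table and the helper `site_urls` are assumptions here; check them against your own deployment.

```python
import os

# Assumed default ports from the upstream WebArena setup; adjust to your deployment.
DEFAULT_PORTS = {
    "SHOPPING": "7770",
    "SHOPPING_ADMIN": "7780/admin",
    "REDDIT": "9999",
    "GITLAB": "8023",
    "MAP": "3000",
}


def site_urls(hostname: str) -> dict[str, str]:
    """Map each site name to a URL on the given host (hypothetical helper)."""
    if not hostname:
        raise ValueError("AWS_HOSTNAME is empty")
    return {name: f"http://{hostname}:{port}" for name, port in DEFAULT_PORTS.items()}


hostname = os.environ.get("AWS_HOSTNAME", "")
if hostname:
    for name, url in site_urls(hostname).items():
        print(f"{name}={url}")
else:
    print("AWS_HOSTNAME is not set")
```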

Citation

If you use the environment or data, please cite the paper:

@article{zhou2023webarena,
  title={WebArena: A Realistic Web Environment for Building Autonomous Agents},
  author={Zhou, Shuyan and Xu, Frank F and Zhu, Hao and Zhou, Xuhui and Lo, Robert and Sridhar, Abishek and Cheng, Xianyi and Bisk, Yonatan and Fried, Daniel and Alon, Uri and others},
  journal={arXiv preprint arXiv:2307.13854},
  year={2023}
}

Owner

  • Name: Minghao Chen
  • Login: minghchen
  • Kind: user
  • Location: Zhejiang, China

Zhejiang University CAD&CG PhD student, majoring in deep learning and computer vision.

Citation (CITATION.cff)

@article{zhou2023webarena,
  title={WebArena: A Realistic Web Environment for Building Autonomous Agents},
  author={Zhou, Shuyan and Xu, Frank F and Zhu, Hao and Zhou, Xuhui and Lo, Robert and Sridhar, Abishek and Cheng, Xianyi and Bisk, Yonatan and Fried, Daniel and Alon, Uri and others},
  journal={arXiv preprint arXiv:2307.13854},
  year={2023}
}

GitHub Events

Total
  • Issues event: 2
  • Push event: 3
Last Year
  • Issues event: 2
  • Push event: 3

Dependencies

.github/workflows/pre-commit.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • pre-commit/action v3.0.0 composite
.github/workflows/tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
requirements.txt pypi
  • Pillow *
  • aiolimiter *
  • beartype ==0.12.0
  • evaluate *
  • flask *
  • gymnasium *
  • nltk *
  • openai ==1.35.3
  • playwright ==1.32.1
  • text-generation *
  • tiktoken *
  • transformers ==4.33.2
  • types-tqdm *
setup.py pypi