webarena-chat-react
fork of webarena, modified with ReAct + chat functionality
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
✓Academic publication links
Links to: arxiv.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (7.8%) to scientific vocabulary
Repository
fork of webarena, modified with ReAct + chat functionality
Basic Info
- Host: GitHub
- Owner: nicholaschenai
- License: apache-2.0
- Language: Python
- Default Branch: chat-react-public
- Size: 5.61 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Attribution
The original authors of WebArena can be found here: [Code] [Site] [Paper]
The original authors of ReAct can be found here: [Paper] [Site] [Code]
Intro
This repo is a modification of WebArena, forked from version 1c0f414dd7827f0eff5cbfe13cf17331f43e9793 Sep 18, 2023
Modification: ReAct Prompting
Thought-Action-Observation cycle which works on WebShop. Difference from default CoT: Refactored the code to accommodate ‘chat style’ messages and roles. Previously, it was quite monolithic with all the history being put in the user message, which can cause confusion to the agents as it mistakes the history for instruction. Originally wanted to have thought-action-observation as separate messages but this confuses the agent and it starts deviating from the action format, so the current format is to still group Thought and Action in the same message. Also, agent will do an observation summary to remind itself of the past, without exceeding context limit
Usage:
python
python run.py --instruction_path agent/prompts/jsons/react.json --agent_type reactprompt --test_start_idx 0 --test_end_idx 812 --model gpt-3.5-turbo --result_dir outputs/react_test
Upgrades
- Just like the original paper, take the most recent x characters to manage context len
- even so, raw observation takes up a huge chunk of the context that at best, the context can only 2 observations
- To circumvent this, agent prompted to also do an Observation Summary. Only the most recent observation is displayed in full, past observations appear in summarized form
- prompt includes stating location of element id so agent doesnt confuse whether it should be on the left or right
- Token counting (approximate, assume 4char=1token)
- prompt to hover over new tab if page has not loaded, but that causes agent to do this more often
- Safeguards in
gotoaction to allow whitelist-only sites
Owner
- Name: Nicholas Chen
- Login: nicholaschenai
- Kind: user
- Company: Artificial Intelligence Research
- Repositories: 1
- Profile: https://github.com/nicholaschenai
Citation (CITATION.cff)
@article{zhou2023webarena,
title={WebArena: A Realistic Web Environment for Building Autonomous Agents},
author={Zhou, Shuyan and Xu, Frank F and Zhu, Hao and Zhou, Xuhui and Lo, Robert and Sridhar, Abishek and Cheng, Xianyi and Bisk, Yonatan and Fried, Daniel and Alon, Uri and others},
journal={arXiv preprint arXiv:2307.13854},
year={2023}
}