webarena-autogpt

fork of webarena with autogpt implementation

https://github.com/nicholaschenai/webarena-autogpt


Repository


Basic Info
  • Host: GitHub
  • Owner: nicholaschenai
  • License: apache-2.0
  • Language: Python
  • Default Branch: public-autogpt-merged
  • Size: 5.73 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed over 2 years ago

README.md

Attribution

The original authors of WebArena can be found here: [Code] [Site] [Paper]

This uses the AutoGPT LangChain implementation in "Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions" [Code] [Paper]

Intro

This repo is a modification of WebArena, forked from commit e32b71e3f5b2463bb102457591bc06c0f2c93acf (Oct 21, 2023).

Modification: AutoGPT

Key components include tool use, chat memory, memory retrieval, and reflection. Because it is built on LangChain, it inherits LangChain's validation benefits.

Usage for 4k context:

python lc_run.py --instruction_path agent/prompts/jsons/langchain_prompt.json --agent_type lc_agent --test_start_idx 0 --test_end_idx 812 --model gpt-3.5-turbo --lc_type autogpt --max_tokens 250 --max_obs_length 950 --result_dir outputs/autogpt

Usage for 16k context:

python lc_run.py --instruction_path agent/prompts/jsons/langchain_prompt.json --agent_type lc_agent --test_start_idx 0 --test_end_idx 812 --model gpt-3.5-turbo-16k --lc_type autogpt --max_tokens 250 --max_obs_length 3000 --send_token_limit 16385 --base_plus_mem_tokens 8400 --result_dir outputs/autogpt-16k
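The two AutoGPT commands differ only in a handful of flags, so when running smaller slices of the task range it can help to generate them programmatically. The following is a minimal sketch, not part of the repo: the flag names and values are copied from the usage commands above, while `build_command`, `COMMON_ARGS`, and `PRESETS` are hypothetical names introduced here for illustration.

```python
import shlex

# Flags shared by both AutoGPT commands (values copied from the README usage).
COMMON_ARGS = {
    "instruction_path": "agent/prompts/jsons/langchain_prompt.json",
    "agent_type": "lc_agent",
    "lc_type": "autogpt",
    "max_tokens": 250,
}

# Flags that differ between the 4k and 16k context setups.
PRESETS = {
    "4k": {
        "model": "gpt-3.5-turbo",
        "max_obs_length": 950,
        "result_dir": "outputs/autogpt",
    },
    "16k": {
        "model": "gpt-3.5-turbo-16k",
        "max_obs_length": 3000,
        "send_token_limit": 16385,
        "base_plus_mem_tokens": 8400,
        "result_dir": "outputs/autogpt-16k",
    },
}


def build_command(preset, start_idx, end_idx):
    """Return the argv list for lc_run.py with the given preset and task range."""
    args = {
        **COMMON_ARGS,
        "test_start_idx": start_idx,
        "test_end_idx": end_idx,
        **PRESETS[preset],
    }
    argv = ["python", "lc_run.py"]
    for flag, value in args.items():
        argv += [f"--{flag}", str(value)]
    return argv


if __name__ == "__main__":
    # Print a shell-ready command for a small slice of the task range.
    print(shlex.join(build_command("4k", 0, 5)))
```

The resulting list can be passed to `subprocess.run` directly, which avoids shell quoting issues.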

Warning: This does not use early stopping from WebArena, so the agent can potentially repeat the same action for all 30 steps.
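To illustrate what a repeated-action guard could look like, here is a simplified sketch. It is not WebArena's early-stopping code and is not wired into this repo; `should_stop_early`, the string representation of actions, and the `repeat_threshold` default are all assumptions made for the example. Only the 30-step limit comes from the warning above.

```python
def should_stop_early(actions, max_steps=30, repeat_threshold=3):
    """Decide whether an episode should stop, given the actions taken so far.

    actions: list of action strings, oldest first (hypothetical representation).
    Returns a (stop, reason) tuple.
    """
    if len(actions) >= max_steps:
        return True, f"reached max steps ({max_steps})"

    # Stop if the most recent `repeat_threshold` actions are all identical.
    if len(actions) >= repeat_threshold:
        recent = actions[-repeat_threshold:]
        if all(action == recent[0] for action in recent):
            return True, f"same action repeated {repeat_threshold} times"

    return False, ""


if __name__ == "__main__":
    history = ["click [12]", "scroll [down]", "scroll [down]", "scroll [down]"]
    print(should_stop_early(history))
```

A check like this would run once per step, before the next model call, so that a stuck agent stops spending tokens.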

Warning: For the 16k model, the longer context and higher price per token make this expensive (~$120+).
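A rough way to budget a run before starting it is to multiply the number of model calls by the tokens per call and the per-token price. The sketch below is only an estimator under stated assumptions: `estimate_cost` is a hypothetical helper, the prices in the example are placeholders to be replaced with the provider's current rates, and the example inputs are worst-case values (every task uses all 30 steps and every call fills the 16385-token limit), so the result is an upper bound rather than a prediction.

```python
def estimate_cost(
    num_tasks,
    steps_per_task,
    prompt_tokens_per_call,
    completion_tokens_per_call,
    usd_per_1k_prompt,
    usd_per_1k_completion,
):
    """Estimate total API cost in USD, assuming one model call per step."""
    calls = num_tasks * steps_per_task
    prompt_cost = calls * prompt_tokens_per_call / 1000 * usd_per_1k_prompt
    completion_cost = calls * completion_tokens_per_call / 1000 * usd_per_1k_completion
    return prompt_cost + completion_cost


if __name__ == "__main__":
    # Task count, step limit, and token limits come from the README;
    # the two prices are PLACEHOLDERS, not quoted rates.
    upper_bound = estimate_cost(
        num_tasks=812,
        steps_per_task=30,
        prompt_tokens_per_call=16385,
        completion_tokens_per_call=250,
        usd_per_1k_prompt=0.003,
        usd_per_1k_completion=0.004,
    )
    print(f"Worst-case upper bound: ${upper_bound:,.2f}")
```

Replacing the worst-case inputs with measured averages from a small pilot run (for example, the first few tasks) gives a more realistic figure.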

Modification: LangChain Structured Tool Chat (STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION)

This repo also supports the LangChain Structured Tool Chat experiment.

Usage:

python lc_run.py --instruction_path agent/prompts/jsons/langchain_prompt.json --agent_type lc_agent --test_start_idx 0 --test_end_idx 812 --model gpt-3.5-turbo --result_dir outputs/langchain-agent
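For context, LangChain's structured chat agent asks the model to reply with a JSON blob containing an `action` key (the tool name) and an `action_input` key (the tool arguments). The sketch below shows the kind of validation such a blob allows. It is a standalone illustration, not LangChain's parser or this repo's code: `parse_action_blob` is a hypothetical helper, and the `click` tool and its `element_id` argument are invented for the example.

```python
import json


def parse_action_blob(text):
    """Parse and validate a structured-chat style action blob.

    Returns (tool_name, tool_input); raises ValueError on malformed input.
    """
    try:
        blob = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc

    if not isinstance(blob, dict):
        raise ValueError("expected a JSON object")

    missing = {"action", "action_input"} - blob.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")

    return blob["action"], blob["action_input"]


if __name__ == "__main__":
    # Hypothetical tool name and arguments, for illustration only.
    reply = '{"action": "click", "action_input": {"element_id": "42"}}'
    print(parse_action_blob(reply))
```

Rejecting malformed replies before they reach the browser environment is one way a structured format can reduce invalid actions.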

Owner

  • Name: Nicholas Chen
  • Login: nicholaschenai
  • Kind: user
  • Company: Artificial Intelligence Research

Citation (CITATION.cff)

@article{zhou2023webarena,
  title={WebArena: A Realistic Web Environment for Building Autonomous Agents},
  author={Zhou, Shuyan and Xu, Frank F and Zhu, Hao and Zhou, Xuhui and Lo, Robert and Sridhar, Abishek and Cheng, Xianyi and Bisk, Yonatan and Fried, Daniel and Alon, Uri and others},
  journal={arXiv preprint arXiv:2307.13854},
  year={2023}
}
