https://github.com/bytedance/ui-tars-desktop

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 30 committers (3.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.9%) to scientific vocabulary

Keywords

agent agent-tars browser-use computer-use gui-agent gui-operator mcp mcp-server multimodal tars ui-tars vision vlm

Keywords from Contributors

transformers developer-tools distributed qwen productivity gemma deepseek anthropic langchain speaker-encoder
Last synced: 5 months ago

Repository

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

Basic Info
  • Host: GitHub
  • Owner: bytedance
  • License: apache-2.0
  • Language: TypeScript
  • Default Branch: main
  • Homepage: https://agent-tars.com
  • Size: 176 MB
Statistics
  • Stars: 18,227
  • Watchers: 151
  • Forks: 1,730
  • Open Issues: 295
  • Releases: 36
Topics
agent agent-tars browser-use computer-use gui-agent gui-operator mcp mcp-server multimodal tars ui-tars vision vlm
Created about 1 year ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Security

README.md

Introduction

TARS* is a Multimodal AI Agent stack, currently shipping two projects: Agent TARS and UI-TARS-desktop:

| Agent TARS | UI-TARS-desktop |
| :--- | :--- |
| Agent TARS is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser and product. It primarily ships with a CLI and Web UI. It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world MCP tools. | UI-TARS Desktop is a desktop application that provides a native GUI Agent based on the UI-TARS model. It primarily ships local and remote computer operators as well as browser operators. |

News

  • [2025-06-25] - We released Agent TARS Beta and Agent TARS CLI - Introducing Agent TARS Beta, a multimodal AI agent that aims to explore a way of working that is closer to human-like task completion through rich multimodal capabilities (such as GUI Agent, Vision) and seamless integration with various real-world tools.
  • [2025-06-12] - 🎁 We are thrilled to announce the release of UI-TARS Desktop v0.2.0! This update introduces two powerful new features: Remote Computer Operator and Remote Browser Operator, both completely free. No configuration required: simply click to remotely control any computer or browser, and experience a new level of convenience and intelligence.
  • [2025-04-17] - 🎉 We're thrilled to announce the release of the new UI-TARS Desktop application v0.1.0, featuring a redesigned Agent UI. The application enhances the computer use experience, introduces new browser operation features, and supports the advanced UI-TARS-1.5 model for improved performance and precise control.
  • [2025-02-20] - 📦 Introduced the UI-TARS SDK, a powerful cross-platform toolkit for building GUI automation agents.
  • [2025-01-23] - 🚀 We updated the Cloud Deployment section in the Chinese version: GUI Model Deployment Tutorial with new information related to the ModelScope platform. You can now use the ModelScope platform for deployment.


Agent TARS

Agent TARS is a general multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser and product.

It primarily ships with a CLI and Web UI for usage. It aims to provide a workflow that is closer to human-like task completion through cutting-edge multimodal LLMs and seamless integration with various real-world MCP tools.

Showcase

Please help me book the earliest flight from San Jose to New York on September 1st and the last return flight on September 6th on Priceline

https://github.com/user-attachments/assets/772b0eef-aef7-4ab9-8cb0-9611820539d8


| Booking Hotel | Generate Chart with extra MCP Servers |
| :---: | :---: |
| Instruction: I am in Los Angeles from September 1st to September 6th, with a budget of $5,000. Please help me book a Ritz-Carlton hotel closest to the airport on booking.com and compile a transportation guide for me | Instruction: Draw me a chart of Hangzhou's weather for one month |

For more use cases, please check out #842.

Core Features

  • 🖱️ One-Click Out-of-the-box CLI - Supports both headful (Web UI) and headless (server) execution.
  • 🌐 Hybrid Browser Agent - Control browsers using GUI Agent, DOM, or a hybrid strategy.
  • 🔄 Event Stream - Protocol-driven Event Stream drives Context Engineering and Agent UI.
  • 🧰 MCP Integration - The kernel is built on MCP and also supports mounting MCP Servers to connect to real-world tools.

Quick Start

Agent TARS CLI

```bash
# Launch with npx.
npx @agent-tars/cli@latest

# Install globally (requires Node.js >= 22).
npm install @agent-tars/cli@latest -g

# Run with your preferred model provider.
agent-tars --provider volcengine --model doubao-1-5-thinking-vision-pro-250428 --apiKey your-api-key
agent-tars --provider anthropic --model claude-3-7-sonnet-latest --apiKey your-api-key
```

Visit the comprehensive Quick Start guide for detailed setup instructions.
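
When the CLI is launched from a script rather than typed by hand, the flags shown in the Quick Start can be assembled from a small config object. The sketch below is only an illustration: `AgentTarsOptions` and `buildAgentTarsArgs` are hypothetical names that are not part of `@agent-tars/cli`, and the only flags used are the three that appear in the commands above (`--provider`, `--model`, `--apiKey`).

```typescript
// Hypothetical helper, not part of @agent-tars/cli.
// It only mirrors the three flags shown in the Quick Start commands.
interface AgentTarsOptions {
  provider: string;
  model: string;
  apiKey: string;
}

// Build the argument list in the same order as the Quick Start example.
function buildAgentTarsArgs(options: AgentTarsOptions): string[] {
  return [
    "--provider", options.provider,
    "--model", options.model,
    "--apiKey", options.apiKey,
  ];
}

const args = buildAgentTarsArgs({
  provider: "anthropic",
  model: "claude-3-7-sonnet-latest",
  apiKey: "your-api-key", // placeholder, as in the Quick Start
});

// Print the equivalent shell command for inspection.
console.log(["agent-tars", ...args].join(" "));
```

Keeping the arguments as an array rather than one string avoids shell-quoting problems if the list is later handed to a process launcher such as Node's `child_process.spawn`.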

Documentation

🌟 Explore Agent TARS Universe 🌟

| Category | Resource Link | Description |
| :--- | :--- | :--- |
| 🏠 Central Hub | Website | Your gateway to Agent TARS ecosystem |
| 📚 Quick Start | Quick Start | Zero to hero in 5 minutes |
| 🚀 What's New | Blog | Discover cutting-edge features & vision |
| 🛠️ Developer Zone | Docs | Master every command & features |
| 🎯 Showcase | Examples | View use cases built by the official and community |
| 🔧 Reference | API | Complete technical reference |




UI-TARS Desktop

UI-TARS Desktop is a native GUI agent for your local computer, driven by UI-TARS and Seed-1.5-VL/1.6 series models.

📑 Paper | 🤗 Hugging Face Models | 🫨 Discord | 🤖 ModelScope | 🖥️ Desktop Application | 👓 Midscene (use in browser)

Showcase

| Instruction | Local Operator | Remote Operator |
| :---: | :---: | :---: |
| Please help me open the autosave feature of VS Code and delay AutoSave operations for 500 milliseconds in the VS Code setting. | | |

Features

  • 🤖 Natural language control powered by Vision-Language Model
  • 🖥️ Screenshot and visual recognition support
  • 🎯 Precise mouse and keyboard control
  • 💻 Cross-platform support (Windows/macOS/Browser)
  • 🔄 Real-time feedback and status display
  • 🔐 Private and secure - fully local processing

Quick Start

See Quick Start

Contributing

See CONTRIBUTING.md.

License

This project is licensed under the Apache License 2.0.

Citation

If you find our paper and code useful in your research, please consider giving a star :star: and citation :pencil:

BibTeX

@article{qin2025ui,
  title={UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
  author={Qin, Yujia and Ye, Yining and Fang, Junjie and Wang, Haoming and Liang, Shihao and Tian, Shizuo and Zhang, Junda and Li, Jiahao and Li, Yunxin and Huang, Shijue and others},
  journal={arXiv preprint arXiv:2501.12326},
  year={2025}
}

Owner

  • Name: Bytedance Inc.
  • Login: bytedance
  • Kind: organization
  • Location: Singapore

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 306
  • Total Committers: 30
  • Avg Commits per committer: 10.2
  • Development Distribution Score (DDS): 0.601
Past Year
  • Commits: 306
  • Committers: 30
  • Avg Commits per committer: 10.2
  • Development Distribution Score (DDS): 0.601
Top Committers
Name Email Commits
Charles j****1@b****m 122
ULIVZ c****i@b****m 80
skychx s****x@h****m 18
wanghaoming.whm w****m@b****m 16
heh 3****h 15
yangxingyuan 3****4 9
Willem Jiang w****g@g****m 6
ycjcl868 c****n@g****m 5
Lynx • 琅邪 l****a@g****m 3
duguangyu.d d****d@b****m 3
le0zh n****t@q****m 3
yc.ai 7****n 3
Aaron Young l****1@g****m 2
omahs 7****s 2
Willem Jiang 1****d 2
StepSecurity Bot b****t@s****o 2
Eno Cyrus g****e@g****m 2
老果冻 4****g 1
5101good 5****d@1****m 1
Ikko Eltociear Ashimine e****r@g****m 1
Prateek Garg 5****8 1
QuietlyChan q****n@f****m 1
Sangyun_LEE (이상윤) f****6@c****r 1
Shixian Sheng s****2@p****m 1
SunShineGo 5****8 1
Yiyao (Jim) Wan y****n@u****u 1
fix-echo l****8@q****m 1
quicksand q****n@g****m 1
yuyutaotao 1****o 1
zac_ma y****g@o****m 1
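
The summary figures in this section can be reproduced from the commit counts in the table. The sketch below assumes the commonly used definition of the Development Distribution Score, DDS = 1 - (commits by the top committer / total commits); the page itself does not state the formula, so treat that as an assumption. With the counts listed above it yields the same total, average and DDS that the page reports.

```typescript
// Commit counts copied from the Top Committers table (30 rows).
const commitCounts: number[] = [
  122, 80, 18, 16, 15, 9, 6, 5,
  3, 3, 3, 3,
  2, 2, 2, 2, 2,
  ...new Array<number>(13).fill(1), // the 13 committers with one commit each
];

const totalCommits = commitCounts.reduce((sum, n) => sum + n, 0);
const averagePerCommitter = totalCommits / commitCounts.length;

// Assumed definition: share of commits NOT made by the single top committer.
const topCommitterCommits = Math.max(...commitCounts);
const dds = 1 - topCommitterCommits / totalCommits;

console.log(`Total commits: ${totalCommits}`);
console.log(`Avg commits per committer: ${averagePerCommitter.toFixed(1)}`);
console.log(`DDS: ${dds.toFixed(3)}`);
```

A DDS near 0 would mean one person wrote almost every commit, while a value near 1 would mean the work is spread widely. Here the top committer accounts for 122 of the 306 commits.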

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 415
  • Total pull requests: 1,283
  • Average time to close issues: 3 days
  • Average time to close pull requests: about 20 hours
  • Total issue authors: 233
  • Total pull request authors: 48
  • Average comments per issue: 0.76
  • Average comments per pull request: 0.94
  • Merged pull requests: 975
  • Bot issues: 0
  • Bot pull requests: 36
Past Year
  • Issues: 415
  • Pull requests: 1,283
  • Average time to close issues: 3 days
  • Average time to close pull requests: about 20 hours
  • Issue authors: 233
  • Pull request authors: 48
  • Average comments per issue: 0.76
  • Average comments per pull request: 0.94
  • Merged pull requests: 975
  • Bot issues: 0
  • Bot pull requests: 36
Top Authors
Issue Authors
  • ulivz (136)
  • jjy1000 (5)
  • PepperPlatypus (4)
  • ycjcl868 (4)
  • gitboy123 (4)
  • youngjuning (3)
  • zhuzeyu22 (3)
  • wisepmlin (3)
  • zz6zz666 (3)
  • asunoiwin (3)
  • NiksaSprinting (2)
  • OlafStolle (2)
  • skychx (2)
  • SimFG (2)
  • MurphyYe (2)
Pull Request Authors
  • ulivz (676)
  • ycjcl868 (272)
  • ZhaoHeh (89)
  • skychx (60)
  • dependabot[bot] (36)
  • cjraft (21)
  • WillemJiang (14)
  • sanyuan0704 (14)
  • le0zh (10)
  • youngjuning (6)
  • lynxlangya (6)
  • helio9cn (6)
  • knoxnoe (4)
  • omahs (4)
  • step-security-bot (4)
Top Labels
Issue Labels
Bug (182) Agent TARS (143) Feature (83) UI TARS (34) bug (19) enhancement (15) Need reproduction (9) Contribution Welcome (8) Agent TARS - Agent (8) Agent TARS - CLI (7) model (7) Agent TARS - UI (7) Agent TARS - Search (6) Question (6) Good first issue (4) need context (4) Agent TARS - Web UI (4) Agent TARS - MCP (3) ui (3) good first issue (3) question (3) MCP (2) UI-TARS (2) workspace (1) high-priority (1) session-management (1) agent-core (1) architecture (1) critical (1) database (1)
Pull Request Labels
dependencies (36) javascript (30) bug (14) enhancement (9) v0.0.5 (6) github_actions (6) Agent TARS (5) Good first contribution (5) Bug (5) Feature (2) Agent TARS - Tool (2)

Packages

  • Total packages: 24
  • Total downloads:
    • npm 463,478 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 648
  • Total maintainers: 6
proxy.golang.org: github.com/bytedance/ui-tars-desktop
  • Versions: 19
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.6%
Average: 5.8%
Dependent repos count: 6.0%
Last synced: 6 months ago
proxy.golang.org: github.com/bytedance/UI-TARS-desktop
  • Versions: 19
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.6%
Average: 5.8%
Dependent repos count: 6.0%
Last synced: 6 months ago
npmjs.org: @gui-agent/operator-aio

AIO (All-in-One) operator for GUI Agent

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 212 Last month
Rankings
Stargazers count: 0.6%
Forks count: 0.8%
Average: 14.9%
Dependent repos count: 23.9%
Dependent packages count: 34.5%
Maintainers (1)
Last synced: 5 months ago
npmjs.org: @ui-tars/operator-adb

Operator Android SDK for UI-TARS

  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 599 Last month
Rankings
Stargazers count: 0.8%
Forks count: 1.1%
Downloads: 13.2%
Average: 15.2%
Dependent repos count: 24.8%
Dependent packages count: 35.9%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @gui-agent/operator-browser

Native-browser operator for GUI Agent

  • Versions: 12
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,036 Last month
Rankings
Stargazers count: 0.7%
Forks count: 0.9%
Average: 15.2%
Dependent repos count: 24.2%
Dependent packages count: 34.9%
Maintainers (1)
Last synced: 5 months ago
npmjs.org: @ui-tars/operator-browser

Native-browser operator for UI-TARS

  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 621 Last month
Rankings
Downloads: 8.8%
Average: 23.0%
Dependent repos count: 24.7%
Dependent packages count: 35.7%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @agent-infra/mcp-server-commands

An MCP server to run arbitrary commands

  • Versions: 65
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,338 Last month
Rankings
Downloads: 10.5%
Average: 23.8%
Dependent repos count: 24.9%
Dependent packages count: 36.0%
Last synced: 5 months ago
npmjs.org: @gui-agent/action-parser

Action parser SDK for general action parser

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,024 Last month
Rankings
Dependent repos count: 24.0%
Average: 29.3%
Dependent packages count: 34.6%
Maintainers (1)
Last synced: 5 months ago
npmjs.org: @hivelogic/mcp-server-browser

MCP server for browser use access

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 11 Last month
Rankings
Dependent repos count: 24.1%
Average: 29.4%
Dependent packages count: 34.8%
Maintainers (1)
Last synced: 6 months ago
npmjs.org: create-new-mcp

Create MCP server template

  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 18 Last month
Rankings
Dependent repos count: 24.5%
Average: 29.9%
Dependent packages count: 35.3%
Maintainers (1)
Last synced: 5 months ago
npmjs.org: @agent-infra/mcp-http-server

High performance HTTP Server for MCP

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 29 Last month
Rankings
Dependent repos count: 24.6%
Average: 30.1%
Dependent packages count: 35.5%
Last synced: 6 months ago
npmjs.org: mcp-http-server

High performance HTTP Server for MCP

  • Versions: 16
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 100,006 Last month
Rankings
Dependent repos count: 24.6%
Average: 30.1%
Dependent packages count: 35.5%
Maintainers (1)
Last synced: 6 months ago
npmjs.org: @agent-infra/mcp-shared

MCP shared

  • Versions: 66
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 2,889 Last month
Rankings
Dependent repos count: 24.8%
Average: 30.3%
Dependent packages count: 35.8%
Last synced: 5 months ago
npmjs.org: @agent-infra/mcp-server-filesystem

MCP server for filesystem access

  • Versions: 65
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,435 Last month
Rankings
Dependent repos count: 24.9%
Average: 30.4%
Dependent packages count: 35.9%
Last synced: 5 months ago
npmjs.org: @agent-infra/mcp-client

An MCP Client to run servers for Electron apps, support same-process approaching

  • Versions: 68
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 2,834 Last month
Rankings
Dependent repos count: 24.9%
Average: 30.4%
Dependent packages count: 35.9%
Last synced: 5 months ago
npmjs.org: @agent-infra/mcp-server-browser

MCP server for browser use access

  • Versions: 73
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 128,249 Last month
Rankings
Dependent repos count: 24.9%
Average: 30.4%
Dependent packages count: 35.9%
Last synced: 5 months ago
npmjs.org: @ui-tars/operator-browserbase

Operator Browserbase SDK for UI-TARS

  • Versions: 21
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 20 Last month
Rankings
Dependent repos count: 25.0%
Average: 30.6%
Dependent packages count: 36.1%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @ui-tars/operator-nut-js

Operator Nut JS SDK for UI-TARS

  • Versions: 32
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 647 Last month
Rankings
Dependent repos count: 25.0%
Average: 30.6%
Dependent packages count: 36.2%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @ui-tars/cli

CLI for UI-TARS

  • Versions: 33
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 41 Last month
Rankings
Dependent repos count: 25.0%
Average: 30.6%
Dependent packages count: 36.2%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @ui-tars/electron-ipc

Type-safe Electron inter-process communication for UI-TARS

  • Versions: 26
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 41 Last month
Rankings
Dependent repos count: 25.0%
Average: 30.6%
Dependent packages count: 36.2%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @ui-tars/sdk

A powerful cross-platform(ANY device/platform) toolkit for building GUI automation agents for UI-TARS

  • Versions: 32
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,784 Last month
Rankings
Dependent repos count: 25.0%
Average: 30.6%
Dependent packages count: 36.2%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @ui-tars/visualizer

Visualizer for Computer Use forked from @midscene/visualizer

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 3 Last month
Rankings
Dependent repos count: 25.1%
Average: 30.7%
Dependent packages count: 36.3%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @ui-tars/shared

Shared types for UI-TARS

  • Versions: 30
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 111,040 Last month
Rankings
Dependent repos count: 25.2%
Average: 30.8%
Dependent packages count: 36.4%
Maintainers (3)
Last synced: 6 months ago
npmjs.org: @ui-tars/action-parser

Action parser SDK for UI-TARS

  • Versions: 30
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 109,601 Last month
Rankings
Dependent repos count: 25.2%
Average: 30.8%
Dependent packages count: 36.4%
Maintainers (3)
Last synced: 6 months ago
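
The totals in the Packages summary can be cross-checked against the per-package entries listed above. The sketch below copies the version and last-month download figures from those entries and sums them. The type and variable names are illustrative only, and the two `proxy.golang.org` entries contribute versions but list no downloads.

```typescript
// [package name, versions, downloads last month], copied from the npm entries above.
type NpmEntry = [string, number, number];

const npmPackages: NpmEntry[] = [
  ["@gui-agent/operator-aio", 3, 212],
  ["@ui-tars/operator-adb", 9, 599],
  ["@gui-agent/operator-browser", 12, 1036],
  ["@ui-tars/operator-browser", 8, 621],
  ["@agent-infra/mcp-server-commands", 65, 1338],
  ["@gui-agent/action-parser", 10, 1024],
  ["@hivelogic/mcp-server-browser", 1, 11],
  ["create-new-mcp", 8, 18],
  ["@agent-infra/mcp-http-server", 1, 29],
  ["mcp-http-server", 16, 100006],
  ["@agent-infra/mcp-shared", 66, 2889],
  ["@agent-infra/mcp-server-filesystem", 65, 1435],
  ["@agent-infra/mcp-client", 68, 2834],
  ["@agent-infra/mcp-server-browser", 73, 128249],
  ["@ui-tars/operator-browserbase", 21, 20],
  ["@ui-tars/operator-nut-js", 32, 647],
  ["@ui-tars/cli", 33, 41],
  ["@ui-tars/electron-ipc", 26, 41],
  ["@ui-tars/sdk", 32, 1784],
  ["@ui-tars/visualizer", 1, 3],
  ["@ui-tars/shared", 30, 111040],
  ["@ui-tars/action-parser", 30, 109601],
];

// Version counts of the two proxy.golang.org module entries.
const goModuleVersions: number[] = [19, 19];

const totalDownloads = npmPackages.reduce((sum, [, , d]) => sum + d, 0);
const totalVersions =
  npmPackages.reduce((sum, [, v]) => sum + v, 0) +
  goModuleVersions.reduce((sum, v) => sum + v, 0);
const totalPackages = npmPackages.length + goModuleVersions.length;

console.log(`Packages: ${totalPackages}`);
console.log(`Versions: ${totalVersions}`);
console.log(`npm downloads last month: ${totalDownloads}`);
```

Most of the download volume comes from four packages: `@agent-infra/mcp-server-browser`, `@ui-tars/shared`, `@ui-tars/action-parser` and `mcp-http-server`, each with more than 100,000 downloads in the last month.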