ByteDance: UI-TARS 7B

ByteDance (Doubao)

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, we

Ultra-low cost · Multimodal vision
Text · Vision

Specifications

Vendor: ByteDance (Doubao)
Family: doubao
Context: 128K · 96k words (≈ a thesis)
Max output: 128K · 96k words (≈ a thesis) · Below average
Released: 2026-06-19
Open source: No
Tool use

Pricing across channels

per 1M tokens, USD

Channel            Type        Input  Output
OpenRouter (auto)  Aggregator  $0.10  $0.20
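
As a rough illustration of how the per-1M-token rates translate into a per-request cost, here is a minimal sketch. It assumes the OpenRouter rates listed above ($0.10 input, $0.20 output per 1M tokens) and simple linear billing; the function and constant names are hypothetical and not part of any vendor API.

```python
# Rates taken from the pricing listing above (USD per 1M tokens).
# Assumption: billing is linear in token count, with no minimums or discounts.
INPUT_USD_PER_M = 0.10
OUTPUT_USD_PER_M = 0.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 50,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(50_000, 2_000):.4f}")
```

Under these assumptions, a request with 50,000 input tokens and 2,000 output tokens comes to about $0.0054. Actual charges may differ depending on how the provider counts image inputs and rounds usage.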
