AsianFin -- Microsoft has released version 2.0 of its OmniParser framework, which can turn large language models such as OpenAI (4o/o1/o3-mini), DeepSeek (R1), Qwen (2.5VL), and Anthropic (Sonnet) into "Computer Use Agents." Compared with its predecessor, OmniParser V2 detects smaller interactive elements more accurately and runs inference faster.
Specifically, V2 was trained on a larger dataset of interactive element detection and icon feature descriptions. It also reduces the input image size used for icon descriptions, which cuts inference latency by 60% compared with the previous version.