Beijing Robotics Center Releases Pelican-VL, Touts It as Largest Open-Source Embodied Multimodal Model

Summarized by NextFin AI
  • Beijing’s Humanoid Robotics Innovation Center has fully open-sourced Pelican-VL 1.0, which it calls the most powerful open-source vision-language model for embodied intelligence to date.
  • The model ships in 7-billion and 72-billion parameter versions and, in the center’s benchmark tests, outperformed comparable GPT-5 models by 15.79%.
  • The release is expected to strengthen vision-language perception for multi-step task planning across sectors including commercial services, industry and household robotics.

Beijing’s Humanoid Robotics Innovation Center has fully open-sourced its latest vision-language model for embodied intelligence, Pelican-VL 1.0, positioning it as the most powerful open-source model of its kind to date, according to a statement published Thursday.

The model comes in 7-billion and 72-billion parameter versions and is described as the “largest open-source embodied multimodal model” currently available. Benchmark tests show Pelican-VL outperforming comparable GPT-5 models by 15.79%, while also surpassing leading domestic systems such as Alibaba’s Qwen and Shanghai AI Lab’s InternLM-XComposer, the center said.
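For readers who want to try the open-sourced checkpoints, the sketch below shows one plausible way to load a vision-language model of this class with the Hugging Face transformers library. This is an assumption-laden illustration: the repository ID "pelican-vl/Pelican-VL-7B" is a placeholder, and the article does not say where the weights are hosted or whether they ship in a transformers-compatible format.

    # Minimal loading sketch. "pelican-vl/Pelican-VL-7B" is a hypothetical
    # repository ID; the actual hosting location is not given in the article.
    import torch
    from transformers import AutoProcessor, AutoModelForVision2Seq

    model_id = "pelican-vl/Pelican-VL-7B"  # placeholder, not confirmed

    # The processor bundles the image preprocessor and tokenizer for a VLM.
    processor = AutoProcessor.from_pretrained(model_id)

    # bfloat16 keeps the 7B variant within a single modern GPU's memory;
    # device_map="auto" lets accelerate place layers on available devices.
    model = AutoModelForVision2Seq.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

The 72-billion parameter variant would need multi-GPU sharding or quantization under the same assumptions.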

The open release of Pelican-VL 1.0 is expected to significantly advance real-world applications of embodied intelligence, strengthening vision-language perception for multi-step task planning in commercial services, general and heavy industry, hazardous operations and household robotics.
