NextFin News - In a significant shift for the global artificial intelligence landscape, Bengaluru-based startup Sarvam AI has unveiled performance data showing its localized models outperforming industry titans Google Gemini and OpenAI’s ChatGPT in specialized benchmarks. As of February 8, 2026, the company’s "sovereign AI" initiative has moved from theoretical ambition to empirical success, particularly in the domains of Optical Character Recognition (OCR) and multilingual voice synthesis tailored for the Indian subcontinent.
According to LatestLY, Sarvam AI’s OCR tool, "Sarvam Vision," achieved an accuracy score of 84.3 percent on the olmOCR-Bench, surpassing Google’s Gemini 3 Pro and DeepSeek OCR v2. On the OmniDocBench v1.5, which evaluates the interpretation of complex real-world documents, the model scored 93.28 percent. The results are particularly notable for the model’s handling of technical tables and mathematical formulas, areas where general-purpose Large Language Models (LLMs) frequently falter because they lack localized training data and miss the structural nuances of Indian administrative and academic documents.
The startup, co-founded by Pratyush Kumar, has also released "Bulbul V3," a text-to-speech (TTS) model supporting 35 voices across 11 Indian languages. According to News24, Kumar, an alumnus of IIT Bombay and ETH Zurich, has positioned Sarvam AI as a cornerstone of India’s "Atmanirbhar Bharat" (Self-Reliant India) mission. The success of these models has forced a re-evaluation among global tech critics. Deedy Das, a prominent Silicon Valley commentator who previously questioned the viability of Indic-language models, acknowledged on X that Sarvam has filled a critical gap ignored by larger Western labs, offering superior performance at a more reasonable price point.
The rise of Sarvam AI represents a broader trend in the "Sovereign AI" movement, where nations seek to reduce dependency on Silicon Valley by building foundational models that reflect local linguistic, cultural, and regulatory realities. While U.S. President Trump has emphasized a "zero-tariff" trade environment with India, the digital realm remains a battlefield for data sovereignty. By training models on indigenous datasets, Sarvam avoids the "cultural flattening" often seen in Western models, which frequently treat non-English languages as secondary translation tasks rather than primary cognitive functions.
From a financial and industrial perspective, Sarvam’s strategy of vertical specialization, focusing on OCR and TTS, is a masterclass in resource efficiency. Training a general-purpose LLM to compete with GPT-5 or Gemini 2 requires billions of dollars in compute. By instead targeting the "last mile" of AI utility, such as digitizing messy, multilingual government records or providing low-latency voice interfaces for rural banking, Sarvam has created a high-moat business model. Pratik Desai, founder of KissanAI, noted that the cost-to-performance ratio of Bulbul V3 makes global alternatives like ElevenLabs impractical for the Indian market.
Looking ahead, the success of Sarvam AI is likely to trigger a wave of "National AI" projects across the Global South. As the cost of inference drops and the value of high-quality, localized data rises, the dominance of monolithic, one-size-fits-all AI models is being challenged. The next phase for Sarvam involves expanding Bulbul to 22 languages and integrating Vision into the Indian government’s digital public infrastructure (DPI). If Sarvam can maintain its lead in accuracy while scaling its API ecosystem, it could not only help secure India’s digital borders but also provide a blueprint for how mid-sized tech powers can achieve AI parity through strategic specialization.
