
In a significant milestone for India’s domestic artificial intelligence ecosystem, Bengaluru-based startup Sarvam AI has launched new models that demonstrate exceptional performance in tasks involving Indian languages and real-world document understanding.
These advancements, particularly in optical character recognition (OCR) and text-to-speech (TTS), have enabled Sarvam to outperform leading global systems such as Google Gemini and ChatGPT in several India-relevant benchmarks, earning widespread recognition from the international tech community.
Launch of Sarvam Vision and Bulbul V3: Designed for India’s Linguistic Diversity
In early February 2026, Sarvam AI released Sarvam Vision, a 3-billion-parameter vision-language model built for multilingual document intelligence, and Bulbul V3, an advanced text-to-speech model focused on high-quality Indic voice generation.
Sarvam Vision is engineered to process complex documents across English and 22 official Indian languages, including Hindi, Bengali, Tamil, Telugu, Marathi, Malayalam, Kannada, Gujarati, Punjabi, Urdu, Assamese, and others. It excels at handling real-world challenges such as diverse layouts, technical tables, mathematical formulas, nested content, scanned archives, handwriting, charts, and mixed-language documents common in India.
Bulbul V3 delivers natural, expressive, and production-ready voices in 11 Indian languages (with a roadmap to expand to all 22), offering more than 35 professional-grade voice options and supporting high-fidelity audio output up to 48 kHz. The model has been optimized for India-specific use cases, including regional accents, code-mixing, numerics, STEM content, and telephony-grade (8 kHz) applications.
Benchmark Performance: Strong Globally, Exceptional in Indic Contexts
Sarvam Vision has demonstrated competitive performance on widely accepted global benchmarks (English-only subsets):
– olmOCR-Bench: 84.3% accuracy, outperforming Gemini 3 Pro, DeepSeek OCR v2, and several other frontier OCR models.
– OmniDocBench v1.5: 93.28% overall score, particularly strong on complex layouts, technical tables, and mathematical formulas.
These English results establish that Sarvam Vision is capable of competing at the highest level even outside its core focus area.
However, the model’s most significant advantage appears in Indic-specific evaluation. Sarvam created the Sarvam Indic OCR Bench, a dedicated dataset of 20,267 samples spanning 22 scheduled Indian languages, to assess performance on regional scripts and real-world Indian documents. On this benchmark, Sarvam Vision achieved outstanding word accuracy scores, including:
– Hindi: 95.91%
– Tamil: 93.42%
– Bengali: 92.61%
– Marathi: 93.13%
– Malayalam: 91.66%
These figures surpass leading global models (including variants of Gemini, GPT, and Claude) on Indic OCR tasks, where international systems frequently struggle due to limited training data on non-Latin scripts, code-mixed content, and India-specific formatting.
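The article does not say how the Sarvam Indic OCR Bench computes word accuracy. As a rough illustration only, the sketch below assumes a common definition: one minus the word-level edit distance between the OCR output and the reference text, divided by the number of reference words and clipped at zero. The function names and the sample strings are hypothetical and are not taken from Sarvam's benchmark.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance computed over word tokens (not characters)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp_words, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # reference word missing from output
                           cur[j - 1] + 1,       # extra word in output
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Assumed definition: 1 - (word edit distance / reference length), clipped to [0, 1]."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        return 1.0 if not hyp_words else 0.0
    distance = word_edit_distance(ref_words, hyp_words)
    return max(0.0, 1.0 - distance / len(ref_words))


# Hypothetical example: one of four Hindi words is misrecognised.
reference = "भारत एक देश है"
ocr_output = "भारत एक देस है"
print(word_accuracy(reference, ocr_output))  # 0.75
```

Under this assumed definition, a score such as 95.91% for Hindi would mean that roughly 4 in every 100 reference words needed an insertion, deletion, or substitution to match. Whitespace tokenization is a simplification; a real benchmark may normalize Unicode or punctuation before scoring.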
Bulbul V3 has also received strong feedback in third-party blind listening tests across Indic languages, showing higher listener preference, lower error rates, and better stability than competitors such as ElevenLabs in telephony and voice-agent scenarios.
Expert Endorsements and Global Recognition
Silicon Valley investor and tech commentator Deedy Das publicly acknowledged Sarvam’s progress on X, stating:
> “I was wrong about Sarvam AI. Their Indic models are world-class, filling gaps big labs ignore. Value, quality, pricing all spot on.”
Pratik Desai, founder of KissanAI, endorsed Bulbul V3, noting:
> “Improvements make it ideal for Indic tasks—better and more affordable than ElevenLabs.”
Sarvam AI co-founder Pratyush Kumar shared the launch highlights on X, underscoring that the models represent sovereign AI developed and controlled entirely within India. The global developer and research community has responded with widespread discussion, praising Sarvam for addressing long-ignored challenges in Indic language processing.
Sarvam AI: Background and Strategic Importance
Founded in 2023 by Dr. Vivek Raghavan (a key architect of Aadhaar and UPI) and Dr. Pratyush Kumar (former leader at AI4Bharat, IIT Madras), Sarvam AI set out to build a full-stack generative AI platform optimized for India’s linguistic diversity and sovereign needs.
The company raised $41 million in December 2023 in one of the largest early-stage AI rounds in India (led by Lightspeed Venture Partners, Peak XV, and Khosla Ventures), later extending the round to approximately $53.8 million. In April 2025, Sarvam was selected by the Government of India (out of 67 shortlisted proposals) under the ₹10,300 crore IndiaAI Mission to develop the country’s sovereign foundational model.
In January 2026, Sarvam signed a landmark MoU with the Government of Tamil Nadu to establish India’s first full-stack Sovereign AI Park in Chennai, a ₹10,000 crore project over five years that will include massive compute clusters, secure data frameworks, research labs, innovation clusters, and an Institute for AI in Governance.
Sarvam’s product lineup now includes:
– Sarvam-1 / Sarvam 2B (Indic-first LLM)
– Sarvam Agents and Sarvam Samvaad (multilingual conversational AI)
– Sarvam Translate (22-language translation)
– Sarvam-M (reasoning + Indic model)
– Sarvam Vision (OCR & visual understanding)
– Bulbul V3 (TTS)
Key Products and Technologies
Sarvam AI has developed a comprehensive suite of generative AI tools, many of which are open-source or open-weights to encourage ecosystem collaboration.
Sarvam-1 (often referred to as Sarvam 2B): Released in 2024, this 2-billion-parameter LLM was trained on a proprietary dataset of 4 trillion tokens, optimized for 10 Indian languages with a focus on efficiency and cost-effectiveness.
Sarvam Agents and Sarvam Samvaad: Conversational AI platforms supporting 11 Indian languages across multiple channels (voice calls, WhatsApp, web, and apps). These tools provide natural dialogue, analytics, and business insights.
Bulbul V3: Launched on February 5, 2026, this advanced text-to-speech (TTS) model delivers natural, expressive, and production-ready voices across Indian languages, regional accents, and scripts. It supports high-fidelity audio (up to 48 kHz) and has been praised for outperforming global competitors in listener preference and naturalness. Developers received unlimited API access through February 28, 2026, as part of a promotional rollout.
Sarvam Vision: A multimodal model (approximately 3 billion parameters) for optical character recognition (OCR), layout understanding, and visual reasoning in 22 Indian languages. As of February 2026, it has demonstrated superior performance over models like Google Gemini, ChatGPT, and Anthropic Claude in Indic-specific OCR benchmarks.
Sarvam Translate: An open-weights translation model handling 22 Indic languages, including formal, colloquial, and mixed contexts.
Sarvam-M: A hybrid model fine-tuned for Indic languages alongside reasoning tasks such as mathematics and programming.
Additional offerings include Sarvam Audio (speech recognition) and Sarvam Dub (dubbing for Indian languages).
These products prioritize affordability, low-latency performance, and integration with Indian digital ecosystems.
By prioritizing open-weights models and deep Indic optimization, Sarvam is positioning itself as a cornerstone of India’s sovereign AI strategy, reducing dependence on foreign infrastructure while making advanced AI accessible to India’s non-English-speaking majority.
Broader Implications for India’s AI Future
This latest release not only demonstrates technical excellence but also reinforces India’s ambition to lead in context-aware, inclusive AI for emerging economies. As Sarvam prepares to showcase its full ecosystem at the India AI Impact Summit 2026 (February 16–20, New Delhi), these developments highlight the country’s growing ability to build world-class AI solutions that address local needs while competing globally.
With continued government backing, strategic infrastructure investments, and increasing international recognition, Sarvam AI is helping shape a future where India is not just a consumer of AI, but a leading creator of sovereign, culturally attuned intelligence.




