Sarvam AI, an Indian AI startup, has released OpenHathi-Hi-0.1, a Hindi language model that is free for anyone to use. The company says it is sharing the model openly to advance Hindi-language AI and to seed further innovation in Hindi technology.
OpenHathi-Hi-0.1 is built on Meta AI’s Llama 2-7B model and is presented as comparable to GPT-3.5 for Indic languages. In its blog post, the company described the challenges it faced while building the model. Chief among them was tokenization, the step that splits text into the units a large language model actually processes. Tokenization proved to be more resource-intensive for Hindi than for English, which the company attributed to the scarcity of Hindi training text. The team addressed the problem in two steps aimed at making the process more efficient and less costly.
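To give a rough sense of why Hindi can cost more to tokenize, the sketch below compares the average number of UTF-8 bytes per word in a short English sentence and a short Hindi one. It is not Sarvam AI's method and does not use the Llama 2 tokenizer; it assumes a worst case in which a tokenizer with little Hindi vocabulary falls back to roughly one token per byte. The function name and both sample sentences are illustrative choices, not taken from the company's blog.

```python
def utf8_bytes_per_word(text: str) -> float:
    """Return the average number of UTF-8 bytes per whitespace-separated word."""
    words = text.split()
    if not words:
        return 0.0
    total_bytes = sum(len(word.encode("utf-8")) for word in words)
    return total_bytes / len(words)


# Illustrative sentences (assumptions, not from the source article).
english = "India is a large country"
hindi = "भारत एक विशाल देश है"

english_cost = utf8_bytes_per_word(english)
hindi_cost = utf8_bytes_per_word(hindi)

# Devanagari code points take 3 bytes each in UTF-8, ASCII letters take 1,
# so a byte-level fallback produces far more units per Hindi word.
print(f"English: {english_cost:.1f} bytes per word")
print(f"Hindi:   {hindi_cost:.1f} bytes per word")
```

A real tokenizer will usually do better than this byte-level bound, but the gap illustrates why a vocabulary with few Hindi entries makes each Hindi sentence more expensive to process.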
To evaluate the model, the team tested it on a range of tasks, including translation and toxicity classification.
Sarvam AI has made the base model available on the Hugging Face platform. Developers can therefore fine-tune it for specific tasks and build on one another's work.
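For readers who want to try the base model, the following is a minimal sketch of loading it with the Hugging Face transformers library. The repository ID is an assumption and should be confirmed on the Hugging Face Hub; the helper name load_base_model is hypothetical. Downloading a 7B-parameter model requires substantial disk space and memory.

```python
# Assumed repository ID; confirm it on the Hugging Face Hub before use.
MODEL_ID = "sarvamai/OpenHathi-7B-Hi-v0.1-Base"


def load_base_model(model_id: str = MODEL_ID):
    """Download the tokenizer and base model, returning them as a pair."""
    # Imported inside the function so this file can be read and imported
    # even where the transformers package is not installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model
```

Calling load_base_model() returns a tokenizer and a model that can then be passed to a fine-tuning setup of the developer's choice.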
Sarvam AI's co-founders, Pratyush Kumar and Vivek Raghavan, both previously worked at AI4Bharat, and the company drew on AI4Bharat's language resources and benchmarks to train OpenHathi.
Sarvam AI, a team of about 18 people, aims to build voice-based AI systems that understand Indian speech well. The goal is technology that serves users across India and accounts for how speech varies from region to region.
Sarvam AI recently secured a multimillion-dollar Series A round led by Lightspeed Ventures, with participation from Peak XV and Khosla Ventures. The capital positions the startup to expand its efforts. In addition to OpenHathi-Hi-0.1, Sarvam AI is working on a suite of enterprise-grade models through its full-stack generative AI platform, which it plans to unveil in the near future.