The gap is real, especially for regional languages, highlights Arrowhead founder Devyani Gupta

AI should be leveraged to solve real problems and create opportunities for Bharat, Ankush Sabharwal, founder and CEO of CoRover.ai, said on Thursday at the Fortune India Startup Summit held in Bengaluru. “There’s a huge need to solve the problems for society. And there were a few problems that we kind of were living with and had already made terms with, and we were thinking we would never be able to solve... I think now we’ve got a genie called AI,” he said.
He added that companies need to understand users deeply. “You cannot be in India and try to solve [for] Bharat. You need to live that life and then figure out what problems they’re facing. And probably not only just the problem, maybe [create] some opportunities for them,” he said during a session titled ‘Bharat-led Innovation’.
Arrowhead founder Devyani Gupta, who was also on the panel, highlighted the scale of the language gap: of around 900 million internet users in India, only 5–10% speak English. “Which means we have 800 million plus people who are online but don’t know English as a language. And these are like your farmers, your rural workers, etc.,” she said.
According to Gupta, language is critical not just for access but also for trust. “Being able to speak in their language is not about reaching them, it’s actually about just providing them with the agency to access basic needs, such as healthcare, government schemes, credit schemes, etc.”
She also pointed out that even today, many users prefer voice interactions. “Interestingly, even for very low ticket-size items like motor insurance, people will still get on a call to buy motor insurance. What that means is that as a company, you’re actually leaving a lot of revenue on the table if you are not able to speak in that customer’s language,” she said.
Gupta and Sabharwal agreed that the opportunity goes beyond translation. Gupta pointed to education as a major use case. “Today, when they study, they have English textbooks that have been directly translated into, let’s say, Hindi. So, someone who’s a student in Bihar is reading a textbook that’s been directly translated. But all the context is still based on the American context. So, it’s not relatable for the people actually reading it.”
“It’s not just about translation… It’s also about can we contextualise it to the person that we’re talking to in their world and in their language,” she said.
Sabharwal spoke about how building AI tools is becoming easier and more accessible. “So, now, we have the platforms… people can now just create voice agents just by speaking. So, it has become so easy for everyone to create solutions. Now the technology creation is not just with a few of us,” he said.
He added that more people need to step in as builders. “We all should really think like the creator. Because the problems are so many,” Sabharwal said.
The speakers differed on whether a lack of data is a key challenge. “If you pick the right problem to solve, there’s no challenge of data,” Sabharwal said. He explained that many services rely on limited user intents. “If I see the dashboard, there are only 10 intents. They have to book, they have to cancel, they have to change the boarding station, check the refund, and all that. Only 10 things,” he said.
He added that domain understanding matters more. “You have to work with people who already know the domain. They have the data already,” Sabharwal said.
Gupta, however, said the gap is real, especially for regional languages. “If you look at the LLMs that exist today, for English there are trillions of tokens… but for your regional languages, you have in the single-digit billions,” she said, highlighting that companies often have to build datasets from scratch. “We actually are working with agencies to get thousands and thousands of hours of speech calls from very local Malayali speakers to be able to train our models because they just don’t exist,” Gupta said.
The issue is solvable but only with wider support, Gupta said. “Just like UPI was more of a national initiative, I think developing this corpus of data also needs to be a national initiative… leaving it to startups to do that is going to be inefficient and slow,” she added.