I walk into Chandauli, a village near Alwar, Rajasthan, carrying Lev Grossman’s Time magazine cover on Mark Zuckerberg’s plans to bring the Internet to the unconnected. To confirm that this is indeed the place the Facebook founder had visited back in September—a trip that he called “an incredible experience”—I show the cover to a couple of men lazing by the side of the main approach road to the village. “Juckerberg!” they say in unison, before walking me all the way to a pink building next to the village school.

This building, Cyber Gram, was what drew Zuck to Chandauli, some 240 km from Delhi. Housing 24 laptops and a broadband connection, it has become a case study on the government’s efforts to spread Internet literacy, with help from organisations which work on bridging the digital divide, in this case the Digital Empowerment Foundation. Fakruddin Ali Ahmed, a member of the Cyber Gram staff, tells me that in less than a year, more than 2,000 people from the village and neighbouring areas, a lot of them kids under 15, have learnt to surf the Net.

The village school, though, is Hindi-medium, making it tempting to assume that the “learning” Ahmed is referring to is much narrower than what it ideally ought to be, a throwback to the government’s practice of routinely exaggerating real literacy. The Internet, after all, is overwhelmingly in English. About 55% of all websites are estimated to be in English, though less than 5% of the world’s population speaks it as a first language. In comparison, web data firm W3Techs says Hindi, spoken by 41% of Indians as per census data, accounts for less than 0.1% of websites, trailing relatively quaint languages like Moldovan and Finnish. Kids at the Cyber Gram tell me that they get by with whatever little English they know, and when in doubt, they get on to Google Translate and figure their way out. That’s enterprising, but hardly a scalable model to unleash the transformative power of the Internet upon the unconnected hordes, as Zuckerberg, and the Indian government, want.

IT’S A TRUISM that the vernacular rules over Indian villages and a bulk of its smaller towns—the holy grail for Internet evangelists. As Rajan Anandan, Google’s India head, never tires of saying, this is the constituency that will provide the next big bump-up to Internet use in India (see ‘The Net’s Next 150 million’, Fortune India, October 2013). Every think tank bets on this demographic driving India past 500 million Internet users (just shy of 50% of the total population) in a matter of a couple of years, rivalling the medium’s growth in China (where Internet penetration is said to be a little over 46%). But such gung-ho estimates ignore a critical nuance: China has had a carefully designed, thriving language Internet—from search engines (Baidu) to social media (Sina Weibo) to e-retailers (TMall, Dangdang)—that makes it relevant and usable to pretty much anyone in that country.

In India, on the other hand, the ecosystem has been badly broken. Over the past few decades, successive governments, through agencies like the Centre for Development of Advanced Computing (C-DAC), have tried to enable Indian-language computing, but it never really caught on. Even in the mobile era, the conversation has hardly matured beyond infrastructure: making the Internet more accessible by bringing down the cost of compatible devices and data services. The first half of that equation is now firmly in place, with a glut of smartphones in the price range of $35 to $100 (Rs 2,200 to Rs 6,300). Cost of data, on the other hand, has increased, but networks are being enhanced and coverage is increasing, making access possible for more and more people.

The real pain point is the lack of a rich content experience in non-English languages, without which the smartest device is but a dumb screen for a vast number of users. Local-language consumers, traditionally served by newspapers and TV channels, have long been treated as second-class citizens in the digital age, as evidenced by primitive websites—often translations of English sites rather than original content—plagued by font-rendering issues.

A handful of players such as MadRat Games, a Bangalore-based startup which now sells board games, did make a fist of it. The company started dabbling in language-based online and mobile games in 2010 and even won digital inclusion awards for its language-based innovations, convincing handset makers like Nokia to include its content in some of its phones. But after nearly two years of grind, MadRat failed to scale the business and gave up. “Our plan was to look at digital local-language games as a consumer business, but I think it was way too early,” says CEO Rajat Dhariwal.

The vernacular is, of course, huge in every other medium. It dominates India’s media and entertainment sector—only one out of the top 10 newspapers in India is in English, according to the Indian Readership Survey 2013—and the vernacular press is a multibillion-dollar industry. The top television channels in India are in Hindi and so are the top fiction and non-fiction shows. It is difficult to find English-language radio stations. Movies from Hollywood started raking in crores after they were dubbed into local languages. Why should the Internet be any different?

That question leads to a classic chicken-or-egg quandary: While non-English publishers and content owners have themselves been lackadaisical about the Internet, advertiser apathy has given them little reason to change their stance (see ‘Rebirth of the Native’, Fortune India, October 2014).

Arvind Pani, chief executive of language-tech startup Reverie Technologies, says the latter is the real deal-breaker. English content, says Pani, is equated with upward mobility, and hence a more desirable demographic for advertisers. This prejudice is changing in print, with the growth of vernacular papers outpacing English. The Internet, on the other hand, continues to feed the bias, making scalability a challenge.

GREEN SHOOTS of change started appearing after India topped 200 million Internet users last year, prompting companies to view the language Internet as a part of the wider, hugely lucrative digital economy rather than just an access puzzle. Profitability is still in question, but a few nascent business models have begun to emerge.

The business that best exemplifies the changing mindset is Newshunt. Launched in 2009, the local-language news-aggregator app sprang into prominence after Bangalore-based Ver Se’ Innovation acquired it in 2012. It has since crossed 75 million downloads and raised millions in funding, including a $40 million round led by New York-based Falcon Edge Capital in February.
In addition to aggregating news, Newshunt acts as a marketplace for e-books and magazines in regional languages that are at times priced as low as Rs 4. Readers pay through a proprietary system where the mobile network operator either adds the amount to the phone bill or deducts it from the talktime for prepaid customers. This makes it possible for people who don’t have credit cards or Net-banking access to buy content.

Ver Se’ founder and chief executive Virendra Gupta acknowledges the complex challenge that vernacular players face compared with their English counterparts, given that they have to simultaneously invest in content as well as the technology to support it. But Gupta also blames myopic thinking across the board for past failures in this space. “[Building a] local-language business cannot be taken as an NGO-type activity,” he says. “It is a problem for businesses to solve. The mistake everyone makes is to look at the local-language user as a pay-zero user. We are trying to create a market for local-language content.” But, he adds, for the language Internet to really take off, there’ll have to be a billion-dollar investment in technology and acquiring customers. “Small companies simply can’t afford that,” he says.

That’s where American moneybags Google, Facebook, Twitter, Microsoft, and Qualcomm, who lord over the global Internet ecosystem, enter the fray. These companies can afford to fund the language Internet without short-term revenue pressures, and they have ample skin in the game since India is their largest growth market.

Under Rajan Anandan’s watch, Google took the lead with the November launch of the very un-Google-sounding Indian Language Internet Alliance (ILIA), which aims to get 500 million vernacular speakers online by 2017. The emphasis, at least initially, is on news-based content, as is evident from ILIA’s choice of partners: ABP News, Amar Ujala, NDTV, Network 18, Oneindia.com, and the Patrika group. Hindiweb, a news aggregator website, is the first launch under the ILIA. “The intention is to drive discoverability and encourage wide adoption. Right now, Hindi is the only language, but lots of companies have reached out and we are looking to partner other regional-language publications as well,” Anandan says.

Compared to Google’s language-specific approach, Facebook’s push with the Internet.org initiative, the fulcrum of Zuckerberg’s India visit, has been described as an attempt to build a global Internet freeway. Kevin D’Souza, head of growth, Facebook India, believes that partnerships at every point of the ecosystem are the way to go when it comes to making the Internet truly inclusive. In India, Internet.org’s partners include Reliance Communications, Samsung, and Qualcomm.

Giving people a reason to use the Internet, says D’Souza, would entail identifying apps that can make a difference in their lives, and then delivering them in the language of their choice. Thus, the Internet.org app in India, launched in February after Facebook piloted it in five African countries, is an eclectic mix of 38 free services, including high-traffic ones like Aaj Tak, BBC News, Cleartrip, ESPN Cricinfo, Hungama Music, IBNLive, Maharashtra Times, OLX, Times of India, and Wikipedia. The app is available in Tamil Nadu, Maharashtra, Andhra Pradesh, Gujarat, Kerala, and Telangana, and Facebook says most of its content partners support multiple languages, including Hindi, Tamil, Telugu, Malayalam, Gujarati, and Marathi. The nature of the service has ruffled some who say it will promote selective access to the Internet, thereby violating the tenets of network neutrality. Facebook’s reply so far has been to reiterate its “commitment to ... [increasing] global Internet access”.

Facebook’s Valley rival Twitter recently introduced hashtags in Indian languages including Bengali, Tamil, Telugu, Malayalam, Kannada, Gujarati, Punjabi, Marathi, Oriya, and Sanskrit. Promptly, Hindi and Tamil hashtags started trending. The company calls this a natural extension of its support for Hindi tweets, launched four years ago, and it is easy to see how it might gain traction among advertisers who could use vernacular posts to reach hitherto Internet-dark populations.

Microsoft too has been working for years on what it calls Project Bhasha with the goal “to stimulate local language computing and take IT to the masses”. The company says it has managed to put in place localised user interface as well as user assistance in 14 Indian languages for Microsoft Windows and Office. It has also developed a tool that helps vernacular users convert their non-Unicode documents to Unicode ones and vice versa. However, Alok Lall, director, Microsoft Office at Microsoft India, says driving uptake is a challenge since most users still consider English as a status symbol and a shortcut to a better life. Lall says the government’s Digital India initiative, which aims at universal digital literacy and availability of digital resources in Indian languages, could act as a force-multiplier.

Another key catalyst could be India’s e-commerce behemoths, which have a direct interest in talking to India’s burgeoning vernacular consumers. Snapdeal, the Delhi-based e-retailer valued at some $5 billion, sniffed the opportunity in the middle of last year, adding Tamil and Hindi as options for its shoppers. The company is currently investing in expanding its language-operations team.
“Our move beyond English is driven from the top,” says Ankit Khanna, its senior vice president, product. “There is real demand for it from what we have seen,” he adds. Eight months after the launch, Snapdeal claims 6% to 7% of its traffic comes from vernacular shoppers.

FOR SUCH ATTEMPTS to be sustainable, the language Internet has to first overcome pesky technological issues. As Mahesh Kulkarni, programme co-ordinator at C-DAC, who looks after its language technology group, points out, Indian languages are among the world’s most complex because of their much more sprawling alphabets compared with English. Vernacular scripts also have additional characters and accents, and their relationship with consonants needn’t always follow a logical flow as in English.
Jayanth Kolla, founder and partner at telecom consulting firm Convergence Catalyst, says that the failure of some early movers—mobile-phone makers who experimented with complicated local language keypads—set the industry back by a few years. “There was a lot of baggage in the industry,” Kolla says.

According to Ram Prakash Hanumanthappa, founder and chief executive of language-technology firm Tachyon Technologies, many of those early keyboards didn’t work because they required users to re-learn the rules of typing. Hanumanthappa, who entered this space in 2006, uses machine learning and artificial intelligence (e.g., looking for and recognising phonetic patterns) to improve user experience. Inscrutable math formulas coexist with a jumble of letters from various scripts on a whiteboard in his Bangalore office, from where Hanumanthappa has launched products like Quillpad, which helps users write in various Indian languages by typing each word’s phonetic equivalent in English. Even though the output may not be 100% accurate, Quillpad’s intelligent engine can detect phonetic patterns with reasonable accuracy.

“Cracking the business model was not easy in our initial years,” says Hanumanthappa. “Back then, device manufacturers told me to my face that people won’t need Indian languages on the Internet at all. Now things have changed and there is a lot more interest. While we have not been actively looking for funding, there have been a lot of approaches recently.”

Arvind Pani’s Reverie Technologies, which offers a suite of local-language fonts and language-rendering and transliteration services, is another company to watch out for. A number of device manufacturers, developers, and publishers, including Micromax, Panasonic, Intel, and the Jagran group, have signed up to use Reverie’s language engine. Recently, the company launched its first consumer app Swalekh, which allows users to type and read in 11 languages. “The demand is here, and it is very real,” Pani says.

Like Tachyon, Reverie too struggled in its early days, but winning Qualcomm’s QPrize in 2011 changed things. The recognition came with funding from Qualcomm Ventures as well as access to Qualcomm’s own high-profile client network. “Besides pre-integrating Reverie’s software in some of our solutions, we also introduce Reverie to OEM partners across the world,” says Avneesh Agrawal, president, Qualcomm India & South Asia.

KeyPoint Technologies, a Hyderabad-based company that looks at human-machine interaction, has also been adding languages, both Indian and foreign, to its keyboard called Adaptxt. Adaptxt supports over 100 languages and boasts more than a million downloads on Google Play. But KeyPoint CEO Sumit Goswami says that’s about as far as a small company like his can run without sustained support from publishers and the government.

HOW WILL THE evolution of technology and content consumption habits impact Indian languages? English will clearly remain a metaphor for aspiration for years to come, especially as technology permeates social strata. The real threat to languages will come from technology-induced standardisation, leading to obsolescence of dialects or the variety of usages. Reuters quotes the People’s Linguistic Survey of India 2013 to point out that “220 Indian languages have disappeared in the last 50 years and another 150 could vanish in the next half century as speakers die and their children fail to learn their ancestral tongues.” Unfortunately, technology is likely to only accelerate that.

Then there is the spectre of voice commands and of technology listening to local languages. Though the scenario appears a stretch at the moment, it could be brought to bear rapidly as mobile phones bring the Internet to the barely literate or not literate parts of the population. Qualcomm’s Agrawal concurs that beyond the first 500 million Internet users, voice inputting will take over. Thanks to Siri and Cortana, English-language voice commands have become simpler. Agrawal says efforts are on to replicate that for local languages. “Leading OS (operating system) vendors and universities globally are trying to get the experience of voice inputs for native languages as close as possible to the English experience. Google’s recent attempt with Hindi as part of its Android One ecosystem looks promising. Qualcomm is also developing solutions for voice navigation on smartphones in multiple languages,” he says.

Cue #AccheDin for Chandauli’s children.
