Just last week it added the ‘underrepresented’ language Zulu to its Voice Search application, taking another step forward in the area, the like of which you don’t see from other companies.
Personally, the tools I’ve been most appreciative of are those for writing and translating Bengali, a language I’ve been learning for a few years now.
Indian languages, with their non-romanised scripts, present various problems to online users, not to mention the coders and developers striving to make them accessible. (An in-depth, well-informed view of these challenges can be found on the blog of Richard Ishida, the W3C’s internationalisation activity lead.)
But Google has released a number of clever little tools that make writing in languages like Bengali a bit easier. These include a transliteration web page, where you can type on a standard keyboard and see your letters transformed into their Bengali equivalents.
Thus, ami am bhalo lage (I like mangoes) is magically transformed into আমি আম ভালো লাগে as you type – if you only see a row of box icons you’re missing the correct fonts.
Given that the Bengali alphabet has more letters than English, grammatical symbols not present in Western languages and many conjuncts that combine two or more letters to form a new, different letter, the options presented aren’t always totally accurate, but they remain remarkably useful.
Slightly more haphazard – though still useful – is the Bengali transliteration bookmarklet. You click on this in your browser and can then start typing in Bengali anywhere a webpage has a text field, which is just what I did two paragraphs earlier.
Finally there’s a multilingual dictionary, which can shift between nearly 30 languages, including 8 Indian languages.
The ‘digital divide’ is firmly on the agenda in Europe and the US, but language efforts like this provide important building blocks that help tackle the wider, global digital divide that’s found away from the anglophone internet.
Back to the Voice Search news this post began with. Google defines underrepresented languages, like Zulu, as those spoken by millions, but with little presence in electronic and physical media, from web pages to newspapers.
But if Zulu seems esoteric, how about Aka, Tuvan, Foe and Koro? Google.org recently teamed up with the Living Tongues Institute and National Geographic to give those endangered languages a presence on the internet for the very first time.
Other language milestones from Google this year include:
• May: Google Translate adds Azerbaijani, Armenian, Basque, Urdu and Georgian, whose combined speakers number about 100 million
• April: Transliteration in Amharic, Tigrinya, Hebrew, Oriya and Sinhalese made available, bringing the total to 22 languages from across Africa, South Asia, Eastern Europe and the Middle East
• April: Hindi comes to Google’s ‘text-to-speech’ features, where users can hear words and sentences read out loud
• May: Company announces intention to eventually get Google Goggles, its visual search function, to read non-Latin languages (such as Chinese, Hindi and Arabic)