Some older Tradiling readers may remember the Babel fish. Among other incarnations, it was the name of the first automatic web translation service, launched in 1997 by Altavista, a popular web search engine (1 BG: 1 year before Google); from 2003 to 2012 the service was available at the Yahoo! website. The name of the service derived from the "Babel fish", a fictional species in Douglas Adams's book "The Hitchhiker's Guide to the Galaxy" which could be inserted into the human ear to instantly translate spoken languages. This invention is a commonplace of science fiction, in which members of different cultures or even different species communicate with apparent ease. A tongue-in-cheek cameo of this kind of technology featured in the Dr Who episode The Christmas Invasion (2006).

An excerpt of The Christmas Invasion on the official Dr Who YouTube channel

Back down here on terra firma, we now seem to be on the cusp of viable software and devices that are capable of making the Babel fish dream a reality, as a result of extraordinary advances in language technology over recent years. The Babel fish scenario of 2022 is not, however, based on exotic fish or time-travelling police boxes, but rather on mobile apps of various kinds.

Take a look at Sightcall, for example, a multilingual speech-to-text video support app. Sightcall does not perform a complete L1 speech to L2 speech process and is not quite in real time. It is more like liaison interpreting, with a pause between each turn in the dialogue. But it is very impressive and seems to meet the needs of commercial helpdesks. (A similar service is offered by Microsoft Translator.)

Moving back to the complete Babel fish speech-to-speech scenario, let's take a look at the various processes involved.

Speech1 (audio) > Text1 (printed text) > Text2 (printed text) > Speech2 (audio)

Automatic speech recognition (transcription) / Optical Character Recognition

This is the traditional pipeline for automatic speech translation systems. Most recently, the most promising research has focused on combining these steps into a single so-called "end-to-end" approach, in which audio content is translated in one step, without separate transcription and machine translation stages. Nonetheless, in the following discussion, our purpose is not to identify the best speech-to-speech approaches but to have a view of the various inputs, processes and outputs that are of interest to users, including scenarios other than speech-to-speech translation.

Following the traditional pipeline, there may be some need for prior processing if Text1 is not available as digital text. For example, if Text1 is printed, it will need to be captured as a digital image and then processed via Optical Character Recognition (OCR). This kind of processing is offered for free by the Google Translate and Microsoft Translator apps. You can point your camera at a restaurant menu, for example, and read the descriptions of dishes in your own language.

When Text1 is available as audio, we have called it Speech1 above. In this case, prior processing involves automatic transcription in the source language. Nowadays automatic live transcription is available in a variety of popular apps. For example, the Zoom video conferencing platform offers live transcription of meetings in English.
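For readers curious how the traditional pipeline fits together in software, here is a minimal sketch in Python. All three stage functions are hypothetical stand-ins invented for illustration (a real system would call actual ASR, machine translation and text-to-speech engines); only the chaining of the stages reflects the pipeline described in this post.

```python
def automatic_speech_recognition(speech1: bytes) -> str:
    """Stand-in ASR stage: Speech1 (audio) -> Text1 (source-language text).
    Placeholder only: we pretend the 'audio' bytes decode directly to text."""
    return speech1.decode("utf-8")

def machine_translation(text1: str, source: str, target: str) -> str:
    """Stand-in MT stage: Text1 -> Text2. A toy word-for-word lexicon
    replaces a real translation engine."""
    toy_lexicon = {("en", "fr"): {"hello": "bonjour", "world": "monde"}}
    table = toy_lexicon.get((source, target), {})
    return " ".join(table.get(word, word) for word in text1.lower().split())

def text_to_speech(text2: str) -> bytes:
    """Stand-in TTS stage: Text2 -> Speech2 (audio). Placeholder encoding."""
    return text2.encode("utf-8")

def speech_to_speech(speech1: bytes, source: str = "en", target: str = "fr") -> bytes:
    """The traditional cascaded pipeline: ASR, then MT, then TTS."""
    text1 = automatic_speech_recognition(speech1)        # Speech1 -> Text1
    text2 = machine_translation(text1, source, target)   # Text1  -> Text2
    return text_to_speech(text2)                         # Text2  -> Speech2

print(speech_to_speech(b"hello world"))  # b'bonjour monde'
```

The "end-to-end" research direction mentioned above would replace this three-function cascade with a single model mapping Speech1 directly to Speech2, avoiding the intermediate text representations.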