Speech to text

When you ask Siri to text your niece or dictate a memo in Google Docs, you’re using speech-to-text technology.

These technologies take spoken words and convert them into text. Often, they link your commands to other actions, automating small tasks.

Hospital systems, businesses, and governments often need to balance the efficiency of speech-to-text against the risk of errors. If you’re not familiar with speech-to-text in the context of professional translation and transcription, here’s what you should know.

Best uses of today’s speech-to-text tech

Today’s speech-to-text software works best on single-speaker, single-language audio files. Machines make fewer errors when the audio is clear and background noise is minimal. Even when processing widely spoken languages like English, the software often mistakes words. A professional transcriber can easily catch these mistakes.

Our team at Translationz uses speech-to-text software to save valuable time when transcribing speeches, interviews, and testimonies. Instead of manually creating the first draft of a transcript, the human transcriber spends that time checking the draft for quality and accuracy.

Where human intelligence takes over

While interest in speech-to-text is booming and the quality is quickly improving, here’s where our translators and interpreters catch what machines often miss:

Code-switching: Code-switching is when multilingual interviewees switch from one language to another, most often between their native language and English. We often see code-switching in healthcare, law enforcement, and legal settings. Contemporary speech-to-text automation simply isn’t a good match for this situation.

Multiple speakers: It’s common for business meetings and interviews to have several speakers, and it’s vital to identify each speaker accurately. Speech-to-text technologies often leave out this valuable context.

Background conversations and other noise: Recordings of conferences and events often pick up side conversations or other background noise. Either can result in confusing or blank passages in a speech-to-text file.

Translations: Most people are familiar with web tools that translate small passages of one language into a second language. These machine translations can miss honorifics or cultural idioms. By contrast, when we receive a Vietnamese file, we’re careful to complete the transcription with a native speaker to catch these nuances and only then translate it into English.

Privacy: Most mass-market speech-to-text services are unsuitable for our clients’ privacy needs because clients can’t know who has the files or where the information is stored.

At Translationz, our speech-to-text process is proprietary, and we use it only with our clients’ permission. We store files appropriately and check them for accuracy before releasing them to you.

We’re here to help. If you’d like more information about how we use speech-to-text and other emerging technologies, contact us at any time.