Saturday, October 23 | Digital Marketing Journals

What Are Acoustic Models and Why Are They Needed in Speech Recognition for Kids? | by SoapBox

Speech Recognition Engineer Armin Saeb brings you the fifth installment of our “Lessons from Our Voice Engine” series, featuring high-level insights from our Engineering and Speech Tech teams on how our voice engine works.

Acoustic models (AMs) are key components of any speech recognition engine. An AM describes the statistical properties of sound events and connects the acoustic information in an audio signal with phonemes or other linguistic units. Hidden Markov models (HMMs) are one of the most common types of AM. Other acoustic models include deep neural networks (DNNs) and convolutional neural networks (CNNs).
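To make the idea of "connecting acoustic information with phonemes" a little more concrete, here is a minimal sketch of a toy HMM-style acoustic model. It is not SoapBox's voice engine: the two phoneme-like states, the single one-dimensional "feature" per audio frame, and every probability below are made-up assumptions chosen purely for illustration. The sketch scores each frame against a Gaussian per state and uses Viterbi decoding to pick the most likely state sequence.

```python
import math

# Hypothetical toy model: two phoneme-like states, each described by a
# 1-D Gaussian over a single invented acoustic feature. Real acoustic
# models work with many-dimensional features and far more states.
STATES = ["a", "s"]
MEANS = {"a": 1.0, "s": 5.0}
STDS = {"a": 1.0, "s": 1.0}
START = {"a": 0.5, "s": 0.5}
TRANS = {
    "a": {"a": 0.8, "s": 0.2},
    "s": {"a": 0.2, "s": 0.8},
}


def log_gaussian(x, mean, std):
    """Log-density of x under a 1-D Gaussian (the emission score)."""
    var = std * std
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def viterbi(frames):
    """Return the most likely state sequence for a list of feature values."""
    if not frames:
        return []

    # Best log-score of any path ending in each state after the first frame.
    scores = {
        s: math.log(START[s]) + log_gaussian(frames[0], MEANS[s], STDS[s])
        for s in STATES
    }
    backpointers = []

    for x in frames[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Pick the best previous state to transition from.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev]
                + math.log(TRANS[prev][s])
                + log_gaussian(x, MEANS[s], STDS[s])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi([0.9, 1.2, 4.8, 5.1]))
```

With these invented numbers, frames near 1.0 are labelled "a" and frames near 5.0 are labelled "s". In a hybrid system, a DNN or CNN would typically take over the job of `log_gaussian`, scoring frames against states, while the sequence search stays conceptually similar.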

AMs play an even more critical role at SoapBox than in typical, adult-focused voice engines because recognizing children’s speech is much more challenging. Kids have smaller vocal tracts, and their speech is slower and more variable than adults’. They are often in noisy environments and use a lot of spontaneous speech, imaginative words, ungrammatical phrases, and incorrect pronunciations!

At SoapBox we work hard to design and train AMs that can cater to all of this complexity and do the heavy lifting of accurately converting children’s speech to text.
