
Why Custom Language Models (CLMs) are Needed in Speech Recognition for Kids | by SoapBox | Jun, 2021


This image is an abstract representation of Custom Language Models, or CLMs. In the background is a silhouette of a child’s face. It is overlaid with a network of yellow, blue, orange, and grey circles.

Welcome back to “Lessons from Our Voice Engine,” where members of our Engineering and Speech Tech teams offer high-level insights into how our voice engine works.

Lesson 2 is from Lora Lynn Asvos, a Computational Linguist on our Speech Tech team.

What are CLMs?

CLM stands for “custom language model.” As mentioned in Lesson 1, language models are statistical models of language that can predict the next word based on the context.
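To make "predict the next word based on the context" concrete, here is a minimal sketch of a bigram model, one of the simplest statistical language models. It is an illustration only, not a description of our voice engine: the toy corpus and the function names `train_bigram` and `predict_next` are assumptions made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, context_word):
    """Return the most frequent follower of context_word, or None if unseen."""
    followers = counts.get(context_word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, invented for this illustration.
toy_corpus = [
    "the train went choo-choo",
    "the train went fast",
    "the train went choo-choo again",
]
model = train_bigram(toy_corpus)
print(predict_next(model, "went"))  # choo-choo
```

Production language models use far longer contexts and much more data, but the principle is the same: the words seen so far determine which next words are likely.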

CLMs are language models, as the name implies, but they have a little something extra. Instead of focusing generically on a given language, a CLM focuses on a specific domain of that language. This domain could be fairy tales, fables, scientific texts, cooking recipes, knitting patterns, you name it.

Even though CLMs specialize in a particular domain, they are still bolstered by general language knowledge. This allows CLMs to cope if the user goes outside the intended domain, which is particularly useful with children, who excel at saying the unexpected!

Why are CLMs important for our kid-specific voice engine?

We often get this question from clients in conjunction with, “Why is a CLM better than a generic LM?” Generic LMs cover many topics and contain lots of data. For general knowledge applications, they can be useful. However, generic LMs are trained on adult words, use cases, and sentence structures. Their strength is also their weakness. As the old adage goes, a jack-of-all-trades is a master of none. Or in this case, a jack-of-all-domains.

When a child says “the train went choo-choo,” a generic LM might interpret “choo-choo” as “to you” or “chew chew,” similar-sounding but more standard words. Children’s texts are also full of fun and unique character names, places, and objects. With a generic LM, the unique word won’t be understood, leading to a disappointing reading experience.

Since our focus is children’s speech, our CLMs are trained on kid-centric data, which means words like “choo-choo” are correctly understood. Our CLMs also allow for phrases with unique words like “the alien smork of planet Terratow” to be recognized with exceptional accuracy. This keeps the experience of reading engaging, educational, and enjoyable.
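The “choo-choo” example can be sketched with a toy model. The code below is a simplified illustration under stated assumptions, not our actual implementation: it uses word-frequency (unigram) models, blends a domain model with a general one by linear interpolation (one common way to keep general language knowledge in a specialized model), and writes “choo choo” without the hyphen to keep tokenization trivial. The two tiny corpora, the 0.8 weight, and every function name are invented for this example.

```python
import math
from collections import Counter


def unigram_probs(text, vocab, alpha=1.0):
    """Add-alpha smoothed word probabilities over a shared vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def interpolate(domain, general, weight=0.8):
    """Blend a domain model with a general model (weight is an assumption)."""
    return {w: weight * domain[w] + (1 - weight) * general[w] for w in domain}


def log_score(probs, phrase):
    """Sum of log probabilities of the words in a phrase."""
    return sum(math.log(probs[w]) for w in phrase.lower().split())


def best_phrase(probs, candidates):
    """Pick the candidate transcription the language model finds most likely."""
    return max(candidates, key=lambda p: log_score(probs, p))


# Hypothetical stand-ins for adult-oriented and kid-centric training text.
general_text = "please chew your food chew slowly talk to you later"
kids_text = "the train went choo choo the train says choo choo"
vocab = set((general_text + " " + kids_text).split())

general_lm = unigram_probs(general_text, vocab)
custom_lm = interpolate(unigram_probs(kids_text, vocab), general_lm)

candidates = ["choo choo", "chew chew"]
print("general prefers:", best_phrase(general_lm, candidates))  # chew chew
print("custom prefers:", best_phrase(custom_lm, candidates))    # choo choo
```

In a real speech recognizer the language model score is combined with an acoustic score, so the language model breaks ties between similar-sounding candidates rather than deciding alone. Because the blended model keeps some weight on the general data, words outside the kids' domain still receive a reasonable probability.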

Are you interested in natural language processing (NLP) and voice technology for kids? Check out our first “Lesson from Our Voice Engine” on NLP.
