Five things I struggle with when trying to explain NLU to managers

26.05.2021 6 Minutes

Before we dive into the topic, let me tell you a few things about me. For the past five and a half years I worked as an NLU developer for automotive voice control systems in Chinese Mandarin and Japanese for Volkswagen, Porsche, and Audi car models.

Prior to my first Natural Language Understanding (NLU) job, I had never heard of this field, and for most people this is still the case, including business leaders and managers. However, NLU is a key component of voice products and voice experiences. I like to call it the invisible engine. Without NLU there would be no voice assistants. No matter how good the voice design is, it would be like trying to make a luxury car run without an engine.


Today, I want to talk about five things that I think managers in general should know about NLU, even if they have absolutely no background in the field.

What exactly does an NLU developer do?

As I said at the beginning of the article, I am an experienced NLU developer, and even now I am pretty sure none of my family members understand what I do for a living. Not that my job is as mysterious as being a Michelin Star inspector, but the concept of someone working with languages and essentially programming a machine to understand human language is just very difficult to grasp. So far.

If I had to come up with a one-line explanation I would say: NLU is an interpreter between humans and machines.

NLU is very language-specific

This is something managers and product owners should definitely be aware of, especially if they are looking to create virtual assistants and immersive voice experiences for their customers. Not every NLP (Natural Language Processing) developer can be a good NLU developer. Sure, it is essential to have a well-developed pipeline in a language model created by an NLP developer, but that does not mean an NLP developer automatically has extensive linguistic knowledge of a specific language.

If you compare Chinese Mandarin and English, the syntax might be similar: a sentence is often constructed with a subject plus a verb. But the Chinese language is very poor in morphology, and what complicates things even more is that it contains a vast number of homophones (words that have the same pronunciation but different meanings, for instance, “pair” and “pear” in English).
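To make the homophone problem concrete, here is a minimal sketch in Python. The mini-lexicon, the respelled pronunciation key "pehr", and the function name are assumptions made up for illustration; a real system would work with a full pronunciation dictionary. It groups written words by how they sound and reports every sound that maps to more than one word.

```python
from collections import defaultdict

# Illustrative mini-lexicon (an assumption, not real project data):
# (written form, pronunciation key, rough English gloss).
LEXICON = [
    ("pair", "pehr", "two of a kind"),
    ("pear", "pehr", "a fruit"),
    ("是", "shì", "to be"),
    ("事", "shì", "matter, affair"),
    ("市", "shì", "city"),
    ("试", "shì", "to try"),
]


def homophone_groups(lexicon):
    """Return {pronunciation: [written forms]} for sounds with several words."""
    by_sound = defaultdict(list)
    for written, sound, _gloss in lexicon:
        by_sound[sound].append(written)
    return {sound: forms for sound, forms in by_sound.items() if len(forms) > 1}


groups = homophone_groups(LEXICON)
for sound, forms in groups.items():
    print(sound, "->", forms)
```

From the sound alone, the machine cannot tell which of the candidates was meant. It needs context, and knowing which context matters is where language-specific knowledge comes in.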

And on top of that, Chinese is written with a logographic (often called ideographic) script, meaning a character can represent its meaning without reflecting the pronunciation. Without characters, the sound itself does not necessarily represent the semantics, i.e. the meaning of the word. Thus, an English-speaking developer cannot simply transfer their knowledge to developing a Chinese NLU. Another example would be languages such as Japanese, where the verb suffix can change the meaning of the word.

Infographic showing a Japanese example where the verb suffix changes the meaning of the word

From an English perspective, one can argue that the stem is つけ (switch). The stem is the part of a word that is responsible for its semantic meaning. However, the direction of meaning is completely altered by the suffix られる when you go from an active to a passive verb.
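The stem and suffix split described above can be sketched in a few lines of Python. This is a toy under stated assumptions: the suffix table and the function name are invented for illustration, it only covers this one verb pattern, and it ignores that られる can also carry other readings. Real projects would rely on a proper morphological analyzer.

```python
# Toy suffix table (an assumption for illustration only).
# The longest suffix comes first so that "られる" is tried before "る".
SUFFIX_VOICE = [
    ("られる", "passive"),
    ("る", "active"),
]


def split_verb(verb):
    """Split a verb into (stem, voice) using the toy suffix table."""
    for suffix, voice in SUFFIX_VOICE:
        if verb.endswith(suffix) and len(verb) > len(suffix):
            return verb[: -len(suffix)], voice
    return verb, "unknown"


print(split_verb("つける"))      # stem つけ, active
print(split_verb("つけられる"))  # same stem つけ, passive
```

Both forms share the stem, so a system that only looks at the stem would treat them as the same request.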

The difference between NLP and NLU

Continuing with the interpreter analogy, NLP is the translation software an interpreter can use, so how you use it and what you use it for is essential. Quite often NLP developers also cover NLU development, so sometimes there is no clear-cut distinction unless the work is language-specific. Good NLU is all about understanding linguistic details and how to utilize them with the given tools.

Obviously, the more detail-oriented approach an NLU developer takes on the data, the more accurate, helpful, or meaningful the conversation with the voice assistant can get.

Lily Chuang / was Voice User Interface Architect at VUI.agency

It often surprises the user when a voice assistant can understand something the user actually did not expect to be understood. Moments like these create a sense of empathy, engagement, and sometimes even charisma.

Why clean data and intents are so important

I call unclean data and intents the “gaslighting in NLU”.

The language model itself is often not as complicated as people think. Imagine you are drawing a flower (the data) on a canvas (the machine). You mislabeled the yellow pigment as blue on the package (the intent). The canvas is passive. You are expecting to draw a yellow flower, but it turns out to be a blue one. The canvas will, of course, only show a blue flower.
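The passive canvas can be shown with a deliberately simple sketch. The word-voting classifier below is an assumption chosen for brevity, not how any particular NLU platform works, and the intent names are hypothetical. It learns which words appear under which intent, nothing more, so a wrong label in the training data comes straight back out as a wrong answer.

```python
from collections import Counter


def train(examples):
    """Count how often each word appears under each intent label."""
    model = {}
    for text, intent in examples:
        for word in text.lower().split():
            model.setdefault(word, Counter())[intent] += 1
    return model


def predict(model, text):
    """Let every known word vote for the intents it was seen with."""
    votes = Counter()
    for word in text.lower().split():
        votes.update(model.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical training data: clean labels versus one mislabeled example.
clean = [("open the window", "open_window"), ("close the window", "close_window")]
mislabeled = [("open the window", "close_window"), ("close the window", "close_window")]

print(predict(train(clean), "open the window"))
print(predict(train(mislabeled), "open the window"))
```

With the clean data the request maps to open_window. With the mislabeled data the same request maps to close_window, and the model has no way of knowing that anything is wrong.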

Or perhaps you painted a yellow flower and a yellow elephant on the canvas, and you only classify yellow as one intent. Yes, they are both yellow, but you would not say that a yellow flower and a yellow elephant are the same. It would be much more useful to classify the yellow plants and the yellow animals as different intents.

You would not teach your children confusing information, so why would you do that with your NLU engine? We should treat our data the same way: keep data and intents clean, and treat them with respect.

Lily Chuang / was Voice User Interface Architect at VUI.agency

When a product is released with confusing data and intents, it may work for a while if you only have a simple function. But once you start to expand your functionalities, it will get more complicated to debug, i.e. to find out where the mistaken data comes from. And it also gives your users an impression of poor quality and lack of sophistication.
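One cheap way to catch confusing data before release is a consistency check on the training set. The sketch below assumes the data is a list of (utterance, intent) pairs; the function name and the sample intents are hypothetical. It reports every utterance that has been labeled with more than one intent.

```python
from collections import defaultdict


def find_conflicts(examples):
    """Return utterances that were labeled with more than one intent."""
    labels = defaultdict(set)
    for text, intent in examples:
        # Normalize case and whitespace so trivial variants are compared.
        normalized = " ".join(text.lower().split())
        labels[normalized].add(intent)
    return {
        text: sorted(intents)
        for text, intents in labels.items()
        if len(intents) > 1
    }


# Hypothetical sample data with one conflicting pair.
data = [
    ("Play some music", "play_music"),
    ("play some  music", "play_radio"),
    ("Call my mother", "call_contact"),
]
print(find_conflicts(data))
```

A check like this only finds exact duplicates with different labels. Near-duplicates and overly broad intents still need a linguist's review, but running it regularly keeps the simplest mistakes from piling up as functionality grows.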

Is there a perfect NLU platform?

Unfortunately, just like everything else, perfection is just a perception and not a reality. Every NLU platform has its own pros and cons. You can overcome a platform’s cons by working with linguistic experts and using more or better training data and tuning rules, which is more practical than searching for the perfect ONE.

The NLU engine is a tool, after all. How you use it and what you use it for is far more critical. Imagine, for instance, that Alexa is a pan and Google Assistant is a pot. And let’s keep in mind that they are just two puzzle pieces out of thousands in the conversational AI world. You can use both for cooking, but if you want to cook a delicious dinner, the actual ingredients and your cooking skills are probably far more crucial than the pan or pot you use. I am sure Gordon Ramsay can still make a wonderful meal with very bad cookware.

Things that a tech manager or product owner should remember to ensure better results when building virtual assistants and voice experiences

A lot of the time, people aim for the stars and fall short. Try to keep NLU simple and clean, one step at a time. Non-ambiguous intents defined by qualified linguistic experts are always a good starting point. However, at some point, a clearly defined decision will be necessary to deal with ambiguous data.
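What a clearly defined decision for ambiguous input can look like is sketched below. Everything here is an assumption for illustration: the function name, the two thresholds, and the idea that the NLU engine returns a score per intent. The point is only that the rule is written down explicitly instead of being left to chance.

```python
def decide(scores, min_confidence=0.6, min_margin=0.15):
    """Pick an intent, ask for clarification, or fall back.

    scores: mapping of intent name to a score between 0 and 1.
    The threshold values are placeholders, not recommendations.
    """
    if not scores:
        return "fallback"
    ranked = sorted(scores.values(), reverse=True)
    best_intent = max(scores, key=scores.get)
    best = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    if best < min_confidence:
        return "fallback"   # nothing is convincing enough
    if best - runner_up < min_margin:
        return "clarify"    # two intents are too close to call
    return best_intent


print(decide({"navigate": 0.9, "call": 0.05}))
print(decide({"navigate": 0.65, "call": 0.6}))
print(decide({"navigate": 0.3, "call": 0.2}))
```

In a real product, "clarify" and "fallback" would be dedicated dialogue actions rather than plain strings, and the thresholds would be tuned on real user data.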

So, try to keep in mind the actual goal of your service or product and how you aim to achieve it with the data you have on hand. And remember to put emphasis on the linguistic domain because that is where quality voice experiences are created. Training an NLU engine is like teaching a child, though you should not forget that it is still a tool. The clearer and more meaningful the language rules you teach your NLU engine, the better it understands you. After all, nobody needs a pleasant voice assistant that lacks functionality.

To summarize, NLU is a language-specific interpreter between humans and machines.

Lily Chuang / was Voice User Interface Architect at VUI.agency

If you still want to know more about NLU and build virtual assistants and immersive voice experiences, please reach out to us. We also offer workshops that go deeper into the Conversational User Interface world. Feel free to ask us for industry-specific workshops or training sessions.
