
Have you ever asked a chatbot a specific question, only to get a “read more” link to another page? Whether you ask a broad question like “Is the stadium open?” or a specific one like “Are the indoor courts open despite the rain?”, the bot links you to a lengthy page about the venue’s operating hours and conditions.
If you have grown your FAQ bot to serve thousands of unique questions, that’s a fantastic accomplishment! However, if every intent is designed to return a single kind of response, that one response ends up serving a bucket of questions at various levels of specificity. The question carries extra details, but the bot fails to capture and utilise those entities for decision-making. In other words, the answer to the visitor’s question may be correct, but it isn’t direct enough.
If your chatbot can extract entities, you can design conditional responses. If the facts involved are fixed and unambiguous, a straightforward branch of if-else conditions will be sufficient for your chatbot to follow.
For example, suppose your chatbot is trained to recognise food types and preferences from the words in a customer’s message.
- If the guest consumes meat and asks for food recommendations, then the guest is a non-vegetarian, so fetch the relevant data from an API and recommend more meat than non-meat products.
- Else, if the guest is a vegetarian, exclude chicken, beef, mutton, pork, and seafood from the recommendations.
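As a sketch, the two rules above could be coded like this. The entity names, values, and menu items are hypothetical; in practice the candidate dishes would come from an API rather than hard-coded lists.

```python
# Assumes the chatbot's entity extraction yields a dict
# such as {"diet": "vegetarian"} for each message.

def recommend_food(entities: dict) -> list:
    """Return food recommendations based on extracted entities."""
    diet = entities.get("diet")
    if diet == "non-vegetarian":
        # Recommend more meat than non-meat products.
        return ["chicken satay", "beef burger", "garden salad"]
    elif diet == "vegetarian":
        # Exclude chicken, beef, mutton, pork, and seafood.
        return ["garden salad", "mushroom risotto", "veggie wrap"]
    # No dietary entity detected: fall back to a neutral suggestion.
    return ["chef's special"]
```

This works well while the rules stay this small; the trouble starts when they don’t.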
What if the decision rules aren’t that simple? Maybe the application logic or business flow is too complex to draw. Or perhaps the rules aren’t clear in the beginning. Suppose the rules are subjective and iterative because they are informed by individual lifestyles, survey results, experiment outcomes, or other evolving situations. In that case, a data-driven approach, continuously updating a table of examples for your chatbot, will be more efficient than redrawing and recodifying a decision flowchart.
Sounds like machine learning? You’re right. But in this article, I’d like to talk about a technique that offers quick classification (i.e. determining the correct response based on some entity checks in the intent) and handles incomplete data pretty well: decision trees.
The good news is that because a decision tree is data-driven, it opens the door for low-code user interfaces: the designer supplies a table of attributes and target values, instead of arrows and boxes of text, to achieve a set of conditional responses.
For example, a sample in a table could have these attributes (entities) and a target (intent response):
- [“Out of shampoo?”, “Raining”] → “Go to market or not?”
- [“Age group”, “Income”, “Credit rating”, etc.] → “Purchase or not?”
- [“Traffic level”, “Raining?”, “Urgency”, etc.] → “Bus or taxi?”
- [“Age group”, “Blood pressure”, “Cholesterol”, etc.] → “Drug A or B?”
- [“Age group”, “Vaccination status”, “COVID-19 test result”, “COVID-19 variant”, “Fever”, “Cough”, etc.] → “Quarantine protocol X, Y or Z?”
So, rather than having the conversation designer re-draw the rules that allow your chatbot to determine a response, they could make a list of examples with corresponding answers and let your chatbot develop its own decision tree.
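For instance, a “bus or taxi?” table could be as simple as a list of records; every attribute name and value here is illustrative:

```python
# A hypothetical table of examples for a "bus or taxi?" intent:
# each row maps extracted entities (attributes) to a target response.
commute_table = [
    {"traffic": "heavy", "weather": "rain",  "urgency": "high", "response": "taxi"},
    {"traffic": "heavy", "weather": "sunny", "urgency": "low",  "response": "taxi"},
    {"traffic": "light", "weather": "rain",  "urgency": "low",  "response": "bus"},
    {"traffic": "light", "weather": "sunny", "urgency": "high", "response": "bus"},
]
```

The conversation designer edits rows like these, and the chatbot re-infers its decision tree from the table instead of from a hand-drawn flowchart.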
Suppose you have 13 examples of how people prefer to commute to work, either by bus or by taxi, based on traffic, weather outlook, and level of urgency. Visually, we can see that no single attribute determines the choice of transport on its own, because the choice depends on the other features. Rain or shine, you could take a taxi or a bus depending on other factors.
Still, at least one attribute will give a much cleaner split between taxi and bus. To find it, we calculate the initial entropy, a measure of “bad surprise”, and then look for the attribute with the highest gain ratio. A high gain ratio means a large normalised difference between the parent entropy and the attribute’s total weighted entropy. In other words, the attribute with the highest gain ratio produces the cleanest possible split and the least “bad surprise”.
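As a sketch, the entropy and gain-ratio calculations look like this; the six-row table is a toy stand-in for the 13 examples, with illustrative attribute names and values:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy: the expected "bad surprise" in a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, attribute, target="response"):
    """C4.5-style gain ratio of splitting `rows` on `attribute`."""
    parent = entropy([r[target] for r in rows])
    # Total weighted entropy of the subclusters produced by the split.
    weighted = sum(
        len(subset) / len(rows) * entropy([r[target] for r in subset])
        for v in {r[attribute] for r in rows}
        for subset in [[r for r in rows if r[attribute] == v]]
    )
    # Split info normalises the gain: it is the entropy of the attribute
    # column itself, penalising attributes with many distinct values.
    split_info = entropy([r[attribute] for r in rows])
    return (parent - weighted) / split_info if split_info else 0.0

# Toy stand-in data (not the article's actual 13 examples).
rows = [
    {"traffic": "heavy", "weather": "rain",  "urgency": "high", "response": "taxi"},
    {"traffic": "heavy", "weather": "sunny", "urgency": "high", "response": "taxi"},
    {"traffic": "heavy", "weather": "sunny", "urgency": "low",  "response": "taxi"},
    {"traffic": "light", "weather": "rain",  "urgency": "low",  "response": "bus"},
    {"traffic": "light", "weather": "sunny", "urgency": "low",  "response": "bus"},
    {"traffic": "light", "weather": "rain",  "urgency": "high", "response": "bus"},
]
```

On this toy table, `gain_ratio(rows, "traffic")` comes out to 1.0 (a perfectly clean split), while “weather” and “urgency” score far lower, so “traffic” would become the root rule.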
In the example above, the “traffic” attribute produces the biggest gain ratio, so it becomes the first rule in the decision tree.
Next, within each subcluster, we repeat the gain-ratio calculation to identify which remaining attribute (“weather” or “urgency”) produces the higher gain ratio. That attribute forms the next decision rule under the subcluster.
The process repeats until there are no more attributes to consider, or until a predetermined threshold is reached to prevent overfitting (i.e. to avoid branching on outliers).
Eventually, a decision tree is formed. When a user asks the chatbot for advice, it can now quickly suggest a bus ride or a cab, depending on the traffic conditions, weather and level of urgency.
By providing a table of examples rather than writing if-else rules, we let the chatbot infer a decision tree from a recommendation table. At any point in time, the examples, including the attributes and responses, can be changed on the fly, so users get an updated intent response driven by up-to-date training data rather than up-to-date application code.
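Pulling these ideas together, a minimal sketch of inferring a tree from a table and answering from it might look like the following. It is pure Python with an illustrative four-row table, and it uses plain information gain rather than gain ratio for brevity; a production version would add the overfitting threshold discussed above.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_tree(rows, attributes, target="response"):
    """Recursively infer a decision tree from a table of examples."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority response

    def gain(a):
        return entropy(labels) - sum(
            sum(1 for r in rows if r[a] == v) / len(rows)
            * entropy([r[target] for r in rows if r[a] == v])
            for v in {r[a] for r in rows}
        )

    best = max(attributes, key=gain)  # attribute with the cleanest split
    rest = [a for a in attributes if a != best]
    return (best, {
        v: build_tree([r for r in rows if r[best] == v], rest, target)
        for v in {r[best] for r in rows}
    })

def classify(tree, entities):
    """Walk the tree using the entities extracted from a message."""
    while isinstance(tree, tuple):  # internal node: (attribute, branches)
        attribute, branches = tree
        tree = branches[entities[attribute]]  # assumes the value was seen in training
    return tree

# Illustrative table; editing these rows and re-running build_tree
# updates the chatbot's responses without touching application code.
examples = [
    {"traffic": "heavy", "weather": "rain",  "urgency": "high", "response": "taxi"},
    {"traffic": "heavy", "weather": "sunny", "urgency": "low",  "response": "taxi"},
    {"traffic": "light", "weather": "rain",  "urgency": "high", "response": "bus"},
    {"traffic": "light", "weather": "sunny", "urgency": "low",  "response": "bus"},
]
tree = build_tree(examples, ["traffic", "weather", "urgency"])
```

With this table, the inferred tree splits on “traffic” first, so `classify(tree, {"traffic": "heavy", "weather": "rain", "urgency": "low"})` returns `"taxi"`.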
In my next article, I’ll run through code examples using a decision tree algorithm and training data to fulfil an intent.




