
How to maintain chatbot regression tests with minimum effort | by Gergye Szabolcs | Jun, 2021


Gergye Szabolcs

One of the biggest and least loved challenges in software development is writing test cases and maintaining them. Chatbot development is no different. At Botium we don’t write the regression tests, we generate them. This article shows how we do it with the least amount of effort.

To reach the best coverage, you have to define every possible conversation in your conversation model. Implementing and maintaining these manually is time consuming and often tedious, and human mistakes creep in easily even with a fairly simple chatbot. We have implemented a Crawler tool that helps you do this in a very simple way.

The Crawler detects buttons and quick replies and builds conversations along them. It traverses the conversation tree with a slightly customized depth-first algorithm. Each time the Crawler reaches an open-ended question (that is, no button or quick reply is found), the conversation ends, the path is marked as visited, and a new conversation starts from the beginning (from the ‘conversation start message’) to keep the conversation context intact. (The Crawler starts conversations with the messages defined in the ‘conversation start messages’ parameter.) When all paths in the conversation tree have been visited, the session ends and you get all the possible conversations as the result, which gives you a full regression test. Let’s see how it works in practice in Botium Box.
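The traversal described above can be illustrated with a minimal Python sketch. It is not the Botium implementation: the toy bot `BOT`, the helper `replay`, and the `max_steps` guard are hypothetical names chosen for illustration. The sketch only shows the idea of walking the button tree depth first while restarting every conversation from the beginning.

```python
from typing import Dict, List, Tuple

# Hypothetical toy bot: state -> (bot text, {button label: next state}).
# A state without buttons stands for an open-ended question.
BOT: Dict[str, Tuple[str, Dict[str, str]]] = {
    "start": ("Hi! How can I help?", {"Balance": "balance", "Book appointment": "date"}),
    "balance": ("Which account?", {"Checking": "checking", "Savings": "savings"}),
    "checking": ("Your checking balance is shown in the app.", {}),
    "savings": ("Your savings balance is shown in the app.", {}),
    "date": ("Which date would be best for you?", {}),
}

Transcript = List[Tuple[str, str]]


def replay(path: List[str]) -> Tuple[Transcript, List[str]]:
    """Start a fresh conversation, click the buttons in `path`, and return
    the transcript plus the buttons offered by the last bot message."""
    state = "start"
    transcript: Transcript = [("bot", BOT[state][0])]
    for label in path:
        state = BOT[state][1][label]
        transcript.append(("me", label))
        transcript.append(("bot", BOT[state][0]))
    return transcript, list(BOT[state][1])


def crawl(max_steps: int = 10) -> List[Transcript]:
    """Depth-first walk over button paths; each path is replayed from the start."""
    conversations: List[Transcript] = []
    stack: List[List[str]] = [[]]  # button paths still to explore
    while stack:
        path = stack.pop()
        transcript, buttons = replay(path)
        if not buttons or len(path) >= max_steps:
            # Open-ended question or depth limit: this conversation is finished.
            conversations.append(transcript)
            continue
        for label in reversed(buttons):
            stack.append(path + [label])
    return conversations


if __name__ == "__main__":
    for conversation in crawl():
        print(" / ".join(text for _, text in conversation))
```

Replaying each path from the first message costs extra bot calls, but it means every conversation runs in a clean context, which is the reason the article gives for restarting.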

For better understanding, I use a very simple banking chatbot as an example, which mixes buttons and open-ended questions.

With the quick start you can define a Crawler project in three simple steps. First, choose an existing chatbot or register a new one. Second, configure some basic settings of the conversation crawler. Third, either save the Crawler project or start the first Crawler session immediately.

After you finish the registration, you are redirected to the dashboard of your Crawler project. Here you can see the previous Crawler sessions and the current execution settings.

During a Crawler session, one parallel process is started for each of the ‘Conversation start messages’ defined in the execution settings. These processes detect all possible conversations along buttons and quick replies, as already described in the Crawler concept.

The example banking chatbot has bot-initiated conversations. The Crawler can detect the buttons and quick replies in the welcome messages as well, so in this case we can leave the ‘Conversation start messages’ field empty.
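As a rough illustration of one job per start message, here is a small Python sketch. The functions `crawl_job` and `run_session` are hypothetical, and the behaviour for an empty list (a single job that begins from the bot’s welcome message) is an assumption based on the description above, not a documented detail of Botium Box.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def crawl_job(start_message: Optional[str]) -> str:
    # Stand-in for a real crawl: it only reports where the job would begin.
    if start_message is None:
        return "crawl from the bot's welcome message"
    return f"crawl from user message {start_message!r}"


def run_session(start_messages: List[str]) -> List[str]:
    # Assumption: with no start messages, one job crawls from the welcome message.
    jobs: List[Optional[str]] = list(start_messages) or [None]
    # One worker per start message, so all jobs can run in parallel.
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        return list(pool.map(crawl_job, jobs))


if __name__ == "__main__":
    print(run_session(["hello", "help"]))
    print(run_session([]))
```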


And here is the biggest value of the Crawler: the generated conversations. This chatbot is pretty small and simple, so in this case just five conversations are generated. For a more complex chatbot, hundreds of test cases can be found, which would be an enormous amount of work to write manually.

The other feature, as useful as the generated conversations, is the flowchart. It shows the whole detected conversation tree in visual form, giving you a clear overall picture of your chatbot.

As you can see at the bottom of the flowchart in the previous section, there are ‘open-ended questions’ such as ‘Which date would be best for you? We need 24 hours …’. From the Crawler’s point of view the conversation stops here, but with human interaction it could continue. We have a solution for this problem as well.

For ‘open-ended questions’ you can define multiple user answers. In the next Crawler session these answers are treated as if they were buttons. After adding some user responses at the end of the unfinished conversations, the flowchart became much bigger and the number of generated conversations doubled.
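One way to picture this is a lookup that supplies the configured answers whenever a bot message offers no buttons. The sketch below is illustrative only: `options_for` is a hypothetical helper, and keying the answers by the exact bot message text is an assumption.

```python
from typing import Dict, List


def options_for(bot_message: str, buttons: List[str],
                user_answers: Dict[str, List[str]]) -> List[str]:
    """Return the branches a crawler could follow after `bot_message`."""
    if buttons:
        return buttons
    # Open-ended question: fall back to the answers configured for this
    # bot message, so they are followed exactly like buttons.
    return user_answers.get(bot_message, [])


if __name__ == "__main__":
    answers = {"Which date would be best for you?": ["Tomorrow", "Next Monday"]}
    print(options_for("Which date would be best for you?", [], answers))
    print(options_for("Anything else?", [], answers))
```

An open-ended question without configured answers still yields no branches, so the conversation ends there as before.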

As you can see, with a few minutes of easy work we generated ten conversations for this bot. For a fully button-based bot you have even less work: just press the start button and wait a few minutes.

But what can we do with these conversations? They are, in other words, test scripts. By clicking ‘Copy Test Scripts into Test Set’ you can copy them into a new or an existing test set.

A test set is a collection of test cases which can be added to a test project. At this point a regression test with pretty good coverage is ready for this bot.
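To give a feel for what such a test script can look like as text, the following Python sketch renders a conversation with `#me` and `#bot` sections, loosely modeled on the layout of BotiumScript text files. The helper `to_script` is hypothetical, and the exact output the Crawler produces may differ.

```python
from typing import List, Tuple


def to_script(name: str, turns: List[Tuple[str, str]]) -> str:
    """Render (sender, text) turns as a plain-text test script."""
    lines = [name]  # first line: the name of the test case
    for sender, text in turns:
        # Each turn becomes a blank line, a '#me' or '#bot' marker, then the text.
        lines += ["", f"#{sender}", text]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_script("balance", [
        ("bot", "Hi! How can I help?"),
        ("me", "Balance"),
        ("bot", "Which account?"),
    ]))
```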

  • Conversation start messages
    The messages the Crawler starts the conversations with. One parallel job is started for each start message.
  • Maximum conversation steps
    The maximum depth of a conversation in the conversation flow. (One step is a user-bot message pair.) When the configured depth is reached, the conversation is stopped and marked as successfully ended.
  • Number of welcome messages
    Some chatbots initiate the conversation without user interaction. In this case you have to specify how many welcome messages the bot sends.
    If the bot has welcome messages and you don’t specify any start message, the Crawler looks for quick replies and buttons in the welcome messages and starts the conversations along them.
  • Wait for prompt
    Many chatbots answer with multiple bot messages. In this case you can define a timeout for how long the Crawler waits for bot messages after each simulated user message.
  • Exit criteria
    With a complex chatbot we often want to test only a certain part of the conversation tree. In this case you can define exit criteria to exclude some parts of the tree.
    If the text of any quick reply or button matches any of the exit criteria, the conversation is stopped there and marked as successfully ended.
  • Merge utterances
    All text messages are saved as utterances. The Crawler can recognize non-unique utterances and merge them into one utterance.
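The settings above can be pictured as a small configuration object plus two helper checks. This is a sketch under assumptions, not Botium’s code: the field names, the default values, the millisecond unit, the case-insensitive substring matching for exit criteria, and the `UTT_n` naming are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CrawlerSettings:
    # Hypothetical field names and defaults mirroring the listed settings.
    start_messages: List[str] = field(default_factory=list)
    max_steps: int = 10
    welcome_messages: int = 0
    wait_for_prompt_ms: int = 0  # assumed unit: milliseconds
    exit_criteria: List[str] = field(default_factory=list)
    merge_utterances: bool = True


def should_stop(buttons: List[str], steps_done: int, settings: CrawlerSettings) -> bool:
    """Stop when the depth limit is reached or a button matches an exit criterion."""
    if steps_done >= settings.max_steps:
        return True
    # Assumption: a criterion matches if it occurs in the button text, ignoring case.
    return any(criterion.lower() in button.lower()
               for button in buttons
               for criterion in settings.exit_criteria)


def merge(texts: List[str]) -> Dict[str, str]:
    """Map each distinct text to one utterance name, so repeats share an utterance."""
    merged: Dict[str, str] = {}
    for text in texts:
        if text not in merged:
            merged[text] = f"UTT_{len(merged) + 1}"
    return merged


if __name__ == "__main__":
    settings = CrawlerSettings(exit_criteria=["agent"])
    print(should_stop(["Talk to an agent", "Balance"], 1, settings))
    print(merge(["Hi!", "Which account?", "Hi!"]))
```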

Botium Crawler is a very powerful tool for creating regression tests. It can generate all happy-path test cases without user interaction for a fully button-based chatbot, and with minimal user interaction for a partially button-based one. With the flowchart you can get an overview of your chatbot’s conversation tree and detect cycles.
It’s a brand new tool, so there is still a lot of room for improvement, and we already have many ideas. For example, we would like to introduce different tree traversal algorithms to improve performance, so that you can choose the best-fitting algorithm for your chatbot. Furthermore, you will be able to add RegExp as exit criteria, we are planning to make the open-ended question feature handier, and so on.
Without proper tools you will be lost. The Crawler feature is part of our flagship product Botium Box, which helps you on your path to successful chatbot testing.
