Recently, Rasa CTO and co-founder Alan Nichol spoke to Kane Simms, CEO of VUX World, about how Rasa helped one of the largest financial institutions in the US level up their chatbot into a successful, expectation-exceeding virtual assistant.
We want to highlight some of the best moments in the webinar by digging into the patterns that lead to inefficient AI assistants, changes that make your bot more effective, and must-have tips for creating conversational AI that works for everyone.
The trouble with some AI assistants
We’ve all been there. Your mom bought you a new kitchen gadget for Christmas, and, lo and behold, it’s defective.
When you get to the company's website, you are forwarded to a virtual assistant to help process your exchange. But no matter what you type into the text box, the assistant keeps pushing you back to its general help menu. You just need to return a defective product! What’s with the digital maze?
This issue tends to happen because companies launch AI assistants before they have been tested with real-world data. The result is that customers are miffed, product teams are at a loss, and businesses are left with an even peskier customer experience headache than before.
Your AI assistant shouldn’t have bad days. That’s where Rasa comes in.
A case study
A notable financial institution built an AI assistant that they expected would drastically improve efficiency and customer service. The company was confident the assistant would perform well–it had been extensively tested and had achieved impressive performance scores.
Once the assistant went live, users reported terrible AI customer experiences and general dissatisfaction. So, what happened?
It comes down to how the team tested their Natural Language Understanding (NLU). NLU is the science of computers interpreting what humans say. It transforms human language into a format that computers can use to complete a user’s request by utilizing intents (the user’s aim), entities (data within speech), and training data (examples of how users express their intents used to train the NLU model).
When analyzing the effectiveness of an NLU model, the F1 score (the harmonic mean of precision and recall) shows how accurately the model classifies a set of labeled examples against the specified intents. The score is only as reliable as the data behind it, but many companies skip collecting varied data that reflects how people actually speak.
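To make the metric concrete, here is a minimal sketch of how a per-intent F1 score can be computed from a list of true intents and a list of predicted intents. It uses only the Python standard library; the function names and the sample intents (`return_item`, `check_balance`) are made up for illustration and are not part of Rasa's API.

```python
from collections import Counter


def f1_per_intent(true_intents, predicted_intents):
    """Return {intent: F1} for two parallel lists of intent labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_intents, predicted_intents):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this intent
        else:
            fp[guess] += 1   # the guessed intent was wrongly predicted
            fn[truth] += 1   # the true intent was missed

    scores = {}
    for intent in set(true_intents) | set(predicted_intents):
        predicted_total = tp[intent] + fp[intent]
        actual_total = tp[intent] + fn[intent]
        precision = tp[intent] / predicted_total if predicted_total else 0.0
        recall = tp[intent] / actual_total if actual_total else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall
        scores[intent] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(true_intents, predicted_intents):
    """Unweighted average of the per-intent F1 scores."""
    scores = f1_per_intent(true_intents, predicted_intents)
    return sum(scores.values()) / len(scores)


# Hypothetical labels, for illustration only
truth = ["return_item", "return_item", "check_balance", "check_balance"]
guesses = ["return_item", "check_balance", "check_balance", "check_balance"]
scores = f1_per_intent(truth, guesses)
overall = macro_f1(truth, guesses)
```

The number says nothing about where the labeled examples came from. If they were written by the same team that wrote the training data, a high score mostly confirms the team's own assumptions.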
This company had used synthesized training data, based on the team’s assumptions about how users would speak, to train their NLU model. When they tested the model against that same kind of data, they received a high F1 score and believed that meant their assistant launch would be successful. Instead, they created conversational flows for their bot that real people could not follow.
The gap between training data and human behavior fools companies into thinking they're improving their AI assistants when they’re actually making them inefficient. So, how can you prevent that from happening?
Patterns to avoid when building a digital assistant
- Imposing your organizational chart on your customers
Too often, when searching for solutions, customers are tossed from person to person and team to team. This only frustrates them and makes the customer experience more complicated than it needs to be.
- Designing conversations using your intuition alone
Intuition is no substitute for real human conversation when building NLU models. Effective NLU training can only be achieved through contact with reality and human interaction.
- Cross-validating your training data instead of measuring actual performance
Your F1 score will improve if you cut out untidy or imperfect data from your testing, but that won’t improve your product! To make your assistant fit to interact with customers successfully, you need to expose it to a wide array of real-world examples.
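The gap described in the last pattern can be illustrated with a small sketch. It scores the same classifier on a tidy, team-written test set and on messy messages of the kind real users send. Everything here is an assumption for illustration: `toy_classifier` is a keyword-matching stand-in for a trained NLU model, and both test sets are invented. It uses plain accuracy on a held-out set rather than cross-validation, to keep the example short.

```python
def accuracy(classify, examples):
    """Share of (text, intent) pairs that the classifier labels correctly."""
    correct = sum(1 for text, intent in examples if classify(text) == intent)
    return correct / len(examples)


def toy_classifier(text):
    """Hypothetical stand-in for an NLU model trained on synthetic phrasing."""
    lowered = text.lower()
    if "return" in lowered or "refund" in lowered:
        return "return_item"
    return "other"


# Test data written by the team, mirroring their own assumptions
synthetic_test = [
    ("I would like to return an item", "return_item"),
    ("Please refund my purchase", "return_item"),
    ("What are your opening hours", "other"),
]

# Invented examples of how real users might phrase the same goal
real_test = [
    ("this blender is broken, want my money back", "return_item"),
    ("it arrived defective, can i send it back", "return_item"),
    ("refund pls", "return_item"),
    ("hi", "other"),
]

synthetic_acc = accuracy(toy_classifier, synthetic_test)
real_acc = accuracy(toy_classifier, real_test)
print(f"synthetic: {synthetic_acc:.2f}, real: {real_acc:.2f}")
```

In this toy setup the classifier looks perfect on the synthetic set and gets only half of the real messages right. Removing the two "untidy" messages from `real_test` would raise the score without making the assistant any better at handling them.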
Conversation-driven development as the solution
Instead of creating user intents yourself, populating them with synthetic training data, and testing your NLU model performance with that data, Rasa supports conversation-driven development as the best approach to training and testing NLU. You’ll get real production data just by listening to your users and exposing your assistant to conversations that people have on a daily basis.
It’s simple: if you listen to your customers and iterate on your language model based on what they tell you, you can better align your assistant to their needs. This approach will produce a more seamless and evergreen conversational AI product.
The impacts of a strong feedback loop
When the company employed conversation-driven development, it was able to improve, monitor, and scale its product. Teams centered their workflow around a feedback loop of inspecting KPIs by user goal, identifying where UX could be improved, and iterating on process, design, and data.
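As a rough sketch of what "inspecting KPIs by user goal" could look like in practice, the snippet below groups conversation records by goal and computes a completion rate for each. The record format (`goal` and `completed` fields) and the sample data are assumptions made for this example; they do not come from the case study or from any Rasa export format.

```python
from collections import defaultdict


def completion_rate_by_goal(conversations):
    """Return {goal: share of conversations where the user finished the task}."""
    totals = defaultdict(int)
    completed = defaultdict(int)
    for convo in conversations:
        totals[convo["goal"]] += 1
        if convo["completed"]:
            completed[convo["goal"]] += 1
    return {goal: completed[goal] / totals[goal] for goal in totals}


# Hypothetical conversation log, for illustration only
log = [
    {"goal": "return_item", "completed": True},
    {"goal": "return_item", "completed": False},
    {"goal": "check_balance", "completed": True},
]

rates = completion_rate_by_goal(log)
# Goals with the lowest rate are candidates for the next design iteration
worst_goal = min(rates, key=rates.get)
```

Tracking the same breakdown after each iteration gives the team a simple way to see whether a change to process, design, or data actually helped the users it was aimed at.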
The feedback loop also spotlighted broader changes implemented by the team:
- NLU evaluation that reflected production performance
- A stronger process of conversational analysis
- A more rigorous conversation design process that included more iterations with test users
- More robust cross-functional collaboration
- Clear ownership of intent taxonomy
Four final tips to improve your AI assistant
- Find real data
You can source real training data from live chat transcripts, call recordings, call transcripts, emails, social media posts, search queries, and more. Anywhere you converse with your customer is fair game.
- Don’t tidy the data
Make sure you keep that real data as it is. Even if some words or phrases seem odd or out of place to you, that doesn’t mean people won’t use them. Language is never black and white in the real world, so expose your AI assistant to its many varieties.
- Don’t cut data
Don’t remove training data to accommodate your NLU; you should always be striving for your NLU to accommodate your customers. Challenging your NLU with new data will force it to improve categorization and, by extension, customer experience on the whole.
- Define your North Star
Set goals and metrics, and check in on them regularly so you can measure improvement. There’s no use trying to move around in the dark!
Balancing AI assistant research, design, and iteration can be overwhelming, but that precise combination drives a successful product that delivers results. At Rasa, we understand that language is intricate and ever-changing, and we feel lucky to play a small but important role in our mission to make conversational AI attainable, secure, and dependable for everyone.