Research topics

These are just some of the topics we're working on.
The very curious can look directly at our active branches on GitHub.

Shipped in Rasa 2.2

Incremental Training

We've added support for incremental training of our model pipelines. This allows you to fine-tune an existing model after adding new training examples instead of training a new model from scratch. How to do this effectively is an ongoing area of research, but our first results look promising.
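
As a minimal sketch, fine-tuning can be kicked off from Python by calling the Rasa 2.2 CLI (the model path and epoch fraction below are placeholders, not recommended values):

```python
# Fine-tune an existing model via the Rasa 2.2 CLI instead of retraining
# from scratch. Assumes rasa >= 2.2 is installed and on the PATH.
import subprocess

subprocess.run(
    [
        "rasa", "train",
        "--finetune", "models/20201201-baseline.tar.gz",  # hypothetical base model
        "--epoch-fraction", "0.2",  # train for a fraction of the configured epochs
    ],
    check=True,
)
```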

"Whatlies" in Word Embeddings

Whatlies is an open source toolkit for visually inspecting word and sentence embeddings. The project offers a unified and extensible API with current support for a range of popular embedding backends, including spaCy, TensorFlow Hub, Hugging Face Transformers, Gensim, fastText, and BytePair embeddings.
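
A minimal usage sketch, assuming whatlies and spaCy's en_core_web_md model are installed:

```python
# Visually compare word embeddings with whatlies, using spaCy as the backend.
from whatlies.language import SpacyLanguage

lang = SpacyLanguage("en_core_web_md")
words = ["cat", "dog", "fish", "king", "queen", "man", "woman"]
emb = lang[words]  # an EmbeddingSet of the selected words

# Project every word onto the "man" and "woman" embedding axes.
emb.plot_interactive(x_axis="man", y_axis="woman")
```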

Shipped in Rasa 1.8

Dual Intent and Entity Transformer (DIET)

DIET is a new state-of-the-art NLU architecture that jointly predicts intents and entities. It outperforms fine-tuning BERT and is 6x faster to train. You can use DIET together with BERT and other pre-trained language models in a plug-and-play fashion.

Explainer Video
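
As a sketch, a Rasa 1.8 config.yml that plugs BERT features into DIET could look like the following (the epoch count is illustrative, not a tuned default):

```python
# Write a minimal Rasa 1.8 NLU pipeline pairing BERT features with DIET.
from pathlib import Path

config_yml = """
language: en
pipeline:
  - name: HFTransformersNLP        # load a pre-trained language model
    model_name: bert
  - name: LanguageModelTokenizer
  - name: LanguageModelFeaturizer  # dense features from the language model
  - name: DIETClassifier           # joint intent + entity prediction
    epochs: 100
"""
Path("config.yml").write_text(config_yml)
```
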
Shipped in Rasa 1.3

Dialogue Transformers

What's the best data structure for dialogue memory: a stack? A graph? A flat list? Self-attention gives you great flexibility without a complex memory structure.

Read the paper
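
As a toy illustration of the idea (not the shipped policy code), each turn in a flat list of turn embeddings can attend over everything that came before it:

```python
# Illustrative single-head self-attention over dialogue turns, numpy only.
import numpy as np

def attend_over_history(turns: np.ndarray) -> np.ndarray:
    """turns: (n_turns, dim) array, one embedding per dialogue turn."""
    n, d = turns.shape
    scores = turns @ turns.T / np.sqrt(d)           # pairwise relevance
    mask = np.tril(np.ones((n, n), dtype=bool))     # attend only to earlier turns
    scores = np.where(mask, scores, -1e9)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the history
    return weights @ turns                          # context vector per turn

history = np.random.randn(5, 16)   # five turns, 16-dim embeddings
context = attend_over_history(history)
```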

Transfer Learning Across Dialogue Tasks

You've built an assistant and it can already help users with a few things. Now you're adding new functionality. How can your assistant re-use the dialogue elements it already knows about in this new context?

Compressing Transformer Language Models

Large-scale language models like BERT, GPT-2, and XLNet show excellent performance on a number of NLU tasks but are very resource-intensive. Can we compress these models to get something that's almost as accurate but much faster?

Read about Quantizing BERT
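
One technique in this family is post-training dynamic quantization; here is a sketch with PyTorch and Hugging Face Transformers (an example of the approach, not our exact setup):

```python
# Shrink BERT with post-training dynamic quantization in PyTorch.
import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
quantized = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},   # quantize the Linear layers ...
    dtype=torch.qint8,   # ... down to 8-bit integer weights
)
```
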
Shipped in Rasa 1.2

Entity Resolution using Knowledge Bases

Combining a dialogue system with a knowledge base allows developers to encode domain knowledge in a scalable way and integrate it with statistical NLU and dialogue models. It also helps your assistant understand messages like "the second one" or "which of those is cheaper?".
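
A minimal custom action sketch using the Rasa SDK's knowledge base helpers (the data.json file of, say, restaurants with a price attribute is hypothetical):

```python
# Resolve mentions like "the second one" against stored objects.
from rasa_sdk.knowledge_base.storage import InMemoryKnowledgeBase
from rasa_sdk.knowledge_base.actions import ActionQueryKnowledgeBase

class MyKnowledgeBaseAction(ActionQueryKnowledgeBase):
    def __init__(self):
        # data.json holds the domain objects the assistant can talk about.
        super().__init__(InMemoryKnowledgeBase("data.json"))
```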

Shipped in Rasa NLU 0.12

Supervised Word Embeddings

Pre-trained word embeddings like word2vec and GloVe are a great way to build a simple text classifier. But learning supervised embeddings for your specific task helps you deal with jargon and out-of-vocabulary words. This is now our default intent classification model.
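
A minimal pipeline sketch using the Rasa 1.x component names (the shipped supervised_embeddings template adds a few more featurizers):

```python
# A minimal supervised-embeddings pipeline for intent classification.
supervised_embeddings_yml = """
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer       # sparse bag-of-words features
  - name: EmbeddingIntentClassifier    # learns embeddings from your own data
"""
```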

Shipped in Rasa 1.3

Mixing Single and Multi-turn Dialogue

Dialogue elements like small talk and FAQs are single-turn interactions. New retrieval-based models in Rasa can handle all of these simple responses in a single action. This means your dialogue policy becomes much simpler and you need fewer training stories.

Read the blog post
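
A sketch of the pipeline addition (the faq retrieval intent is an illustrative name):

```python
# Add a ResponseSelector so single-turn FAQs are answered by one retrieval
# action instead of many stories. Available since Rasa 1.3.
response_selector_yml = """
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: EmbeddingIntentClassifier
  - name: ResponseSelector         # picks the right canned response
    retrieval_intent: faq          # handles all faq/* sub-intents
"""
```
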
Work in Progress

Conversational Embeddings

Most language models and word embeddings are trained on prose and don't know anything about the rules of conversation. How can we build embeddings that understand the difference between purposeful dialogue and chit-chat, and can detect non-sequiturs?

Talks and Meetups

We regularly host external speakers at our #botsBerlin meetup to talk about their research.

Academic Collaborations

Research

Our open source tools are used in research projects and papers at a growing number of leading institutions. In addition, TU Munich published a study benchmarking our NLU system.

Partnerships

We’re a partner of the UKRI Centre for Doctoral Training in Natural Language Processing at the University of Edinburgh. We are also sponsors of SigDIAL 2020.

Education

Every year we supervise a few MSc students and take on some interns to work on research. If you are using Rasa in a course, get in touch and we can share materials for use in lectures and group projects.