The initial "LLMs can do anything" hype has worn off. Engineers are now on the hook to ship real applications. If you're done playing whac-a-mole, tweaking your temperature, adding yet another prompting technique to your swamp of text, and crossing your fingers and hoping that your application actually works this time, you might want to try the CALM Developer Edition.
LLM agents figure out their business logic on the fly every time a user interacts with them. Someday this approach could give us a personal concierge that can navigate the whole internet to do anything for us. But business AI agents aren't like that at all, and as an architecture for a production app, I can't really imagine anything worse.
- Docs
- Example chatbot implementation
- A developer sharing their experience
Here's the value CALM (Conversational AI with Language Models) adds if you're building an LLM agent:
- A simple, declarative way to define your business logic, and a logical engine that executes it deterministically (see the flow sketch after this list).
- The ability to break your logic into small reusable pieces and make your application easier to maintain.
- A guarantee that users can't override your business logic through prompt injection.
- Conversation as a first-class citizen. Collecting information from users, input validation, disambiguation, context switching, corrections, and digressions all work out of the box using customizable patterns (see the pattern sketch below).
- Easy debugging. In CALM, the LLM either gives the correct output or it doesn't. You don't need to fall back to fuzzy text matching or trust another LLM to "score" your answers. If a conversation doesn't go the way you expect, you can see exactly why and where the problem is. And you can track user journeys step by step.
- Nuanced understanding. CALM doesn't rely on intents and can understand complex instructions, negation, and pragmatics.
- No hallucinations! CALM defaults to using templated, human-authored answers, and it's up to you when and where to allow generation (see the response sketch below). And if fifty thousand people ask for your opening hours every week, you don't have to pay (and wait) to regenerate the same tokens every time. A mature app should only generate when it's actually saying something new.
- Speed! Your LLM only has to generate a handful of tokens. There's no loop that recursively calls an LLM, and no "chaining" of instructions to multiple LLMs.
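
To make the declarative-logic point concrete, here is a minimal sketch of a CALM flow. The flow, slot, and action names (transfer_money, recipient, amount, action_transfer) are illustrative, not from a real project; the overall shape (flows, description, steps with collect and action) follows the flow format in the CALM docs.

```yaml
# Minimal CALM flow sketch; all names here are illustrative.
flows:
  transfer_money:
    description: Send money from the user's account to another person.
    steps:
      - collect: recipient           # ask for (or reuse) the recipient's name
      - collect: amount              # ask for the amount to send
      - action: action_transfer      # custom action that performs the transfer
      - action: utter_transfer_done  # templated confirmation, nothing generated
```

Because the engine walks these steps deterministically, the LLM's only job is to translate each user message into commands like "start transfer_money" or "set amount", which is also why prompt injection can't rewrite the logic itself.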
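The repair behaviors in the "first-class citizen" bullet are driven by patterns: built-in flows with reserved names that ship with defaults you can replace. A hedged sketch, assuming the reserved pattern_completed id from the CALM docs and an illustrative utter_anything_else response:

```yaml
# Overriding a built-in conversation pattern (sketch).
# A flow defined with a reserved pattern id replaces the default behavior.
flows:
  pattern_completed:
    description: All active flows have finished.
    steps:
      - action: utter_anything_else  # illustrative templated response
```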
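The "no hallucinations" default, in turn, is just ordinary Rasa response templates in the domain file; nothing below is generated at runtime, and the names and wording are illustrative:

```yaml
# Templated, human-authored responses (domain file sketch).
responses:
  utter_opening_hours:
    - text: "We're open Monday to Friday, 9am to 6pm."
  utter_transfer_done:
    - text: "Done! I've sent {amount} to {recipient}."  # slots fill the template
```

Since utter_opening_hours is served verbatim, the fifty-thousand-requests-a-week case involves no token generation at all.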
We have a lot of big improvements shipping in the next couple of months based on early-adopter feedback, and we always like to hear from teams that are pushing the limits of Rasa. Please share any feedback you have with us.