
The Guardrails Your LLM Needs: Reliable Agent-Based Systems
Large Language Models (LLMs) are a type of AI system trained to understand and generate human language. They’re powerful — but also unpredictable. Without clear structure, they can misinterpret context, invent facts, or act opaquely. That’s why, at Mercedes-Benz.io, we’re exploring how to design systems that don’t just use LLMs but govern them. What follows is how we’re doing that — and why it matters.
Artificial Intelligence systems are often celebrated for their potential, their ability to generate, summarize, predict, and learn. But in practice, LLMs still pose major challenges: gaps in contextual understanding, hallucinated outputs, and opaque decision-making paths.
At Mercedes-Benz.io, we’ve been experimenting with new ways to tackle these issues by designing agentic systems: structured, traceable, and explicitly governed AI pipelines, built with the Koog framework by JetBrains.
This article is a walkthrough of how we got here, what Koog enables, and why we believe building agent-based AI systems is the right move for reliability, scalability, and real-world use.
What’s the Problem with Using LLMs “Out of the Box”?
In theory, LLMs offer a flexible, general-purpose interface to knowledge and language. But in real-world applications, flexibility can become unpredictability.
There are four recurring problems:
- Contextual understanding gaps: LLMs may understand syntax, but they often fail to track task-specific context across complex workflows.
- Hallucinations: LLMs generate confident but false statements, which can’t be trusted in critical scenarios like scheduling, legal advice, or system decisions.
- Lack of validation: Without external validation or constraints, outputs can’t be guaranteed to reflect real-world logic or business rules.
- Unstructured behavior: LLMs behave like unbounded functions, and without structure, we have no meaningful way to govern or trace their decisions.
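To make the validation gap concrete, here is a minimal sketch (plain Python with hypothetical names, not Koog code) of the guardrail idea: an LLM's proposed action is checked against explicit business rules before anything executes.

```python
from datetime import date

# Hypothetical business rules for a booking action (illustrative only).
VERIFIED_DEALERS = {"dealer-berlin", "dealer-lisbon"}

def validate_booking(action: dict, today: date) -> list[str]:
    """Return a list of rule violations; an empty list means the action may proceed."""
    errors = []
    if action.get("dealer") not in VERIFIED_DEALERS:
        errors.append("dealer is not in the verified dealer list")
    when = action.get("date")
    if when is None or when <= today:
        errors.append("appointment date must be in the future")
    return errors

# An LLM may confidently propose an invalid action; the guardrail catches it.
proposed = {"dealer": "dealer-madeup", "date": date(2020, 1, 1)}
print(validate_booking(proposed, today=date(2025, 1, 1)))
```

The point is not the specific rules but where they live: outside the model, in code we can inspect and test.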
We needed a different approach.
Building a System with Intent
That’s where the Agentic approach steps in. Instead of relying on a single LLM prompt, we define the system as a collection of agents, each with its own Strategies, Tools, and roles. Together, these agents form a coordinated reasoning system.
This is what Koog helps us build: structured AI workflows, designed with modularity, traceability, and human-defined logic control.
In short: we don’t just ask the model what to do. We define how it should think.
Introducing Koog: A Framework for Agent-Based AI
Koog is a JetBrains framework that lets us design and run agentic AI systems reliably.
Each system is composed of three main layers:
- Strategies: High-level reasoning paths. These define the workflows, validations, and rules that ultimately serve the main intent behind our business needs.
- Tools: Interfaces the agent can call: APIs, validators, data fetchers, etc.
- Models: The actual LLMs that, following our strict instructions, can execute Tools and generate natural, fluent human language (to interact with real people) or structured machine output (to interact directly with systems).
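The division of labor between the three layers can be sketched in a few lines (illustrative Python with hypothetical names, not Koog's actual Kotlin API): the Strategy drives the flow, calls Tools for verified facts, and uses the Model only for language.

```python
from typing import Callable

# Model: wraps an LLM; here a stub that just echoes the prompt
# (a real one would call a provider such as OpenAI or Ollama).
Model = Callable[[str], str]

# Tool: a typed interface the agent may call; it returns verified data, never guesses.
def fetch_vehicle_info(vin: str) -> dict:
    # Hypothetical lookup against a trusted backend.
    return {"vin": vin, "model": "EQS", "next_service_km": 10_000}

def run_strategy(model: Model, vin: str) -> str:
    """Strategy: a fixed, human-defined reasoning path.
    Facts come from Tools; wording comes from the Model."""
    vehicle = fetch_vehicle_info(vin)    # tool call: grounded data
    prompt = f"Summarize for the customer: {vehicle}"
    return model(prompt)                 # model call: language only

stub_model: Model = lambda prompt: f"[LLM] {prompt}"
print(run_strategy(stub_model, "WDD1234567890"))
```

Because the Model never invents the vehicle data, a hallucination can at worst misphrase a fact, not fabricate one.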

Koog supports multiple LLM providers, including OpenAI, Anthropic, Google, and Ollama, giving us flexibility without vendor lock-in (we can even plug in our own). It also offers:
- Traceability for tracking AI processes.
- History compression, to avoid sending unnecessary data to LLMs.
- Built-in memory capabilities that let AI agents store, retrieve, and use information across conversations.
- MCP (Model Context Protocol) integration.
- A2A (Agent-to-Agent), a standardized protocol for communication between AI agents.
And much more.
A Use Case: Booking a Vehicle Service Maintenance
Let’s take a real Aftersales use case:
Scheduling vehicle service maintenance at a Mercedes-Benz dealer
On the surface, it sounds simple. But it requires reliable access to:
- Accurate vehicle data.
- A verified list of dealers.
- Realistic appointment availability.
- Calendar validation for both parties.
An unstructured LLM might guess or invent data. But with Koog, we ensure that every action is traceable and follows a pre-defined ruleset, and that Models can’t fabricate answers without hitting constraints.
Our system decomposes the problem across multiple agents/strategies:
- Understand the user's request.
- Fetch verified vehicle info.
- Query available dealers.
- Match calendars for availability.
- Run booking creation validations.
- Compose a summary of the booking to be performed.
- Request user confirmation to proceed with the booking.
- Submit the request.
Here is a very basic example of an Agent strategy:

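As an illustration only (plain Python with hypothetical step names; Koog's real strategies are written in a Kotlin DSL, see its documentation), here is a strategy over the booking steps that only advances while each step succeeds:

```python
# Each step returns (ok, state); the strategy halts the moment a step fails.
def understand_request(state):
    state["intent"] = "book_service"
    return True, state

def fetch_vehicle(state):
    # Hypothetical verified lookup; failing here stops the whole flow.
    state["vehicle"] = {"vin": state["vin"], "model": "EQS"}
    return True, state

def query_dealers(state):
    state["dealers"] = []          # no dealers found in this run
    return bool(state["dealers"]), state

STEPS = [understand_request, fetch_vehicle, query_dealers]

def run(state):
    for step in STEPS:
        ok, state = step(state)
        if not ok:                 # guardrail: never move forward without what we need
            return f"stopped at {step.__name__}", state
    return "booked", state

status, _ = run({"vin": "WDD1234567890"})
print(status)  # the flow stops at query_dealers because no dealer was found
```

The same shape extends to the remaining steps (calendar matching, validation, confirmation, submission); every transition is explicit and inspectable.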
We decide at every point in the workflow whether we can move forward, and we only do so once we have what we need.
Why Modularity and Transparency Matter
One of the key advantages of Koog is its modular design. We can extend the system by adding new Strategies, swapping in different Models, or integrating new Tools — without rewriting everything from scratch.
It also allows us to enforce transparency across the AI lifecycle. We know what the system is doing, when, and why. We can inspect intermediate reasoning, tweak behavior, and avoid black-box decisions, which is essential for trust.
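Swapping a component then looks roughly like this (a hypothetical sketch, not Koog's API): the strategy depends only on a small interface, so a different provider drops in without touching the workflow.

```python
from typing import Protocol

class Model(Protocol):
    def complete(self, prompt: str) -> str: ...

# Two interchangeable providers (stubs standing in for real clients).
class ProviderA:
    def complete(self, prompt: str) -> str:
        return f"A:{prompt}"

class ProviderB:
    def complete(self, prompt: str) -> str:
        return f"B:{prompt}"

def summarize(model: Model, text: str) -> str:
    # The strategy code never names a concrete provider.
    return model.complete(f"Summarize: {text}")

print(summarize(ProviderA(), "booking confirmed"))
print(summarize(ProviderB(), "booking confirmed"))
```

The same pattern applies to Tools and Strategies: new capabilities are added behind interfaces rather than by rewriting the pipeline.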
Before Koog, we couldn’t predict what the system would do. Now we enforce guardrails that make its behavior predictable.
Final Thoughts
If you want reliability from AI, you can’t just hope the model will behave. You have to design behavior.
That’s what the Agentic Approach gives us, not just outputs, but structure. Not just intelligence, but accountability.
Koog is still young and evolving, but it has already performed well in our experiments, and we’re exploring how far it can scale. The foundations are there: a system that doesn’t just generate; it reasons with purpose, structure, and control.
For more technical details, visit Koog’s official documentation: https://docs.koog.ai/