Semitora.

21 June 2026

Guardrails are not an AI policy. They are a technical layer you must test

When a company says “we have AI guardrails”, it usually means one of three different things — and that confusion is exactly where a false sense of safety comes from. An AI policy, legal compliance and technical safeguards are three separate layers. Guardrails are the last one, the most concrete. And they are most often treated like a switch you flip and never test.

Three things the market confuses

A policy says “we don’t disclose personal data”. A guardrail is the thing that actually detects and masks a national ID number in the model’s response — or fails to, if it is misconfigured. The difference between declaring and enforcing lives right here.

What guardrails really are (using AWS Bedrock Guardrails)

To avoid hand-waving, look at a concrete implementation. On AWS we use Amazon Bedrock Guardrails, which provide several independent layers of control:

Crucially, these safeguards run on input and output, and through the ApplyGuardrail API you can apply them independently of the model — including to models outside Bedrock (self-hosted, third-party). So guardrails are not “one vendor’s feature”; they are a separate layer you design.

Turning guardrails on is the start, not the finish

Here is the part most teams skip. Enabling a filter does not mean it behaves the way you assume.

This is not us arguing against the vendor — it is the approach AWS itself documents as best practice: run the guardrail in detect mode on representative traffic, start with a high filter strength, find the false positives, tune. Then monitor it in production with metrics (CloudWatch). In other words: a guardrail is something you measure, not merely switch on.

Without tests and evaluation, guardrails are decoration

If a guardrail has to be tuned, then you need to know whether it is tuned well. That means evaluation, not declaration.

A guardrail with no test set and no metric is a configuration you assume works. In security, assumption is not the state you want to be in.

Guardrails are a process, not a project — they need an owner

The last and most overlooked element: an owner. The model changes, attacks evolve, the inputs change. A guardrail that was good six months ago may now under-block or over-block. Someone has to watch it, react to the metrics and decide when thresholds change.

This is the same requirement the AI Act places on high-risk systems as human oversight (Art. 14): a real person who understands the system’s output and can stop it. Guardrails are the technical side of that same problem. Without an assigned owner, every alert belongs to no one, and the “safety layer” slowly turns into decoration.

How we approach this in practice

In mojApteczka — the production AI system we built — safeguards are not a declaration but something we measure on a validation set. AI extraction of data from a medicine package reaches 100% accuracy on the validation set (n=200), and answers are grounded in sources rather than invented. No magic — the same discipline: define what you expect, test it on data, measure, tune.

In AWS projects we implement guardrails as a native Amazon Bedrock layer, not a bolted-on extra — and treat them exactly the same way: as something to measure, not to declare.

What’s next

If you run (or are building) an AI system and don’t know whether its guardrails actually work — get in touch. Safeguards are also part of AI Act compliance and of hallucination control in RAG on company documents. In every one of those cases the rule is the same: you have to test the technical layer before you call it safety.