UX case study · AIMSLO

Protected the 20% of monthly revenue otherwise lost to SLA breaches for a B2B SaaS provider.

Swaayata fixes SLA breaches autonomously. I designed the product and its human-in-the-loop.

Role

Lead Product Designer

Goal

Make every autonomous action visible, reversible, and explainable.

Enterprise SaaS · 2024 · 6 months · concept to build-ready

01 · The agent doesn’t just alert. It acts, and shows its working in the live feed.

01 / Overview

It turns a signed promise into a live target, then runs the agent that keeps it.

Two terms run through the whole product. An SLA is the promise a company signs. An SLO is the measurable target it has to become. Swaayata reads the first into the second.

Take one. “The payment system will respond in under two seconds, 99.9% of the time” is a sentence in a contract, signed by people who will never get paged. Swaayata reads it into a live target, checked every second, that an agent can act on: p99_latency < 2000ms, success_rate ≥ 99.9%.
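As a rough illustration of what that target looks like once it is something a machine can check, here is a minimal Python sketch. The class and function names are hypothetical, not Swaayata's own, and the nearest-rank percentile is one assumed way of computing p99 among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloTarget:
    """Measurable target read from a contract sentence (hypothetical shape)."""
    p99_latency_ms: float = 2000.0  # "respond in under two seconds"
    success_rate: float = 0.999     # "99.9% of the time"


def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n), in integer math
    return ordered[rank - 1]


def evaluate(target, latencies_ms, successes, total):
    """Check one observation window against both halves of the target."""
    return {
        "p99_latency_ok": p99(latencies_ms) < target.p99_latency_ms,
        "success_rate_ok": successes / total >= target.success_rate,
    }


# One illustrative window: a single slow request is enough to break p99 here.
window = evaluate(SloTarget(), [120, 340, 180, 2600], successes=9995, total=10000)
breaching = not all(window.values())
```

Keeping the two conditions separate in the result means a breach can be attributed to latency or to success rate, rather than reported as a single pass or fail.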

Then, on every target it tracks, the agent

Acts

When performance slips, it takes corrective action, not just another alert.

Autonomy

Learns

Each human override becomes a training signal. The model sharpens through RLHF.

Feedback

Governs

Every action is visible and reversible. You step in only when you want to: oversight by choice, not a gate.

Accountability

02 / The problem

A missed target isn’t a bug. It’s lost revenue.

An SLA is a financial contract, not a dashboard reading. Every target carries a credit clause the company has already agreed to pay, so a missed target isn’t counted in downtime. It’s counted in dollars:

What one missed month costs

$267K · monthly fee, on a $3.2M-a-year account

20% · the credit clause, already in the contract

= revenue lost, automatically

≈ $53K revenue lost

Gone the moment a target slips. No approval, no appeal.

And it fires on every breach. Miss enough months and the account that took a nine-month sale to win starts shopping.
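The arithmetic behind those figures is short enough to sketch. The function below is illustrative only: it assumes the credit is a flat share of one month's fee and that each breached month triggers it once, which is a simplification of however the real contract words its clause.

```python
def credit_exposure(annual_contract_value, credit_rate, breached_months=1):
    """Return (monthly fee, credit owed) for a number of breached months."""
    monthly_fee = annual_contract_value / 12
    credit_owed = monthly_fee * credit_rate * breached_months
    return monthly_fee, credit_owed


# Figures from the case study: a $3.2M-a-year account with a 20% credit clause.
fee, one_month = credit_exposure(3_200_000, 0.20)
# Assumption for illustration: three breached months in a row.
_, three_months = credit_exposure(3_200_000, 0.20, breached_months=3)

print(f"monthly fee   ${fee / 1000:.0f}K")
print(f"one breach    ${one_month / 1000:.0f}K")
print(f"three months  ${three_months / 1000:.0f}K")
```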

03 / The approach

Most of an autonomous system runs with no one watching.

So I didn’t design most of it. I designed the three places a human is watching, where an engineer decides whether to let the agent run at all. The rest, it could keep.

See it.

“Are we actually meeting our commitments?”

The system answers in colour, before a single number is read.

Stop it.

“What is it doing to production, right now?”

Every decision the agent makes is visible, and one click from reversed.

Understand it.

“Why did it make that call?”

The reasoning chain is the artifact, not just the conclusion.

04 / Design decisions

Three surfaces. The calls behind them.

Each one is a place an engineer has to trust the agent. Each became one decision, shown on the real product.

Swaayata dashboard: SLO met, not met and incident counts with trends, SLO compliance and service health.

Decision

The dashboard

Colour first. Numbers second.

1 · The move
Status in colour, before any number is read.
2 · Hierarchy
Two levels: headline, then drill-down.

“How do you show the health of dozens of SLA promises at once, without overwhelming an engineer?”

What changed

Rejected

Reproducing dashboards at a higher level

Every metric with equal weight

Numbers as the primary signal

Designed

Colour as the primary signal for SLA health

Status first, detail on demand

Two levels: headline, then drill-down

Reasoning

Engineers already have every metric they could want. What they lack is a single answer to the question their manager asks every morning: are we meeting our commitments?

So the top of the screen answers that in colour, before a number is read. Detail waits below, never competing with the headline.

Decision

Human in the loop

Show every decision. Make every correction count.

1 · Visible
Every decision logged, with the why.
2 · Reversible
Override, one click, always there.

“How do you make an autonomous AI feel safe enough that an engineer lets it touch production?”

What changed

Rejected

The agent acts silently

Approve or reject, no context

Override with no feedback

Designed

A full decision log, with the why

Override at any point, always visible

Every correction feeds the model

Reasoning

The agent restarts containers, runs failover, reallocates memory, on its own. Terrifying to hand a VP unless every decision is visible, reviewable and reversible.

So the agent shows its work, before and after it acts, with override one click away. Every correction is training data, and I made that visible so the work feels worth doing.
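One way to picture the data behind that surface is a decision log in which every entry carries its reason and can be marked as overridden. The Python sketch below is an assumption about shape, not the product's schema: the names are hypothetical, the override only records the correction rather than rolling anything back, and the training step itself is out of scope.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Decision:
    action: str                          # what the agent did
    reason: str                          # the "why" shown next to it
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    overridden_by: Optional[str] = None  # set when an engineer reverses it
    override_note: Optional[str] = None


class DecisionLog:
    def __init__(self):
        self.entries = []

    def record(self, action, reason):
        """Log a decision together with its reasoning."""
        decision = Decision(action=action, reason=reason)
        self.entries.append(decision)
        return decision

    def override(self, decision, engineer, note=""):
        """Mark a decision as reversed by a person (rollback itself not shown)."""
        decision.overridden_by = engineer
        decision.override_note = note

    def feedback_signals(self):
        """Turn the log into labelled examples a training loop could consume."""
        return [
            {
                "action": d.action,
                "reason": d.reason,
                "label": "overridden" if d.overridden_by else "accepted",
                "note": d.override_note,
            }
            for d in self.entries
        ]


log = DecisionLog()
log.record("restart container", "memory above limit")
failover = log.record("run failover", "primary unhealthy")
log.override(failover, "engineer-1", "primary was in planned maintenance")
signals = log.feedback_signals()
```

The note field matters most here: an override without one still produces a label, but gives the model nothing to learn the reason from.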

Tried first

✎ From the notebook

rlhf only works when engineers engage carefully. at 3am with something on fire they hit override and move on, not thinking about teaching the model. so the loop only improves in calm conditions, which is exactly when you need it least. i designed for the ideal. the hard edge case is still open.

Entity reasoning graph: SLO, component and metric nodes, with breaching nodes in red.

Decision

The reasoning graph

Show the chain, not just the conclusion.

1 · Diagnosis
Red marks where the chain breaks.
2 · The chain
Promise to target to component to metric.

“How do you show the reasoning behind an AI decision, so an engineer can judge whether it was right?”

What changed

Rejected

A text log of what happened

A flat list of components

Making engineers run queries

Designed

A visual graph of the reasoning chain

Node shape encodes entity type

Colour shows breach state at a glance

Reasoning

“Agent restarted Container X” tells you what happened, not why. So I made the reasoning itself the artifact: a promise connects to a target, to a component, to a metric.

Red nodes show exactly where the chain is breaking, and an engineer reads the diagnosis without opening a log or writing a query.
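The chain can be sketched as a small tree that is walked from the promise down to the metrics. This is a minimal Python illustration under assumptions: the node fields, the service names and the metric names are all made up for the example, and the real graph may not be a strict tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str                 # "promise", "target", "component" or "metric"
    breaching: bool = False   # rendered red in the graph
    children: list = field(default_factory=list)


def breach_chains(node, trail=()):
    """Yield every root-to-leaf chain that ends in a breaching leaf node."""
    trail = trail + (node.name,)
    if not node.children:
        if node.breaching:
            yield trail
        return
    for child in node.children:
        yield from breach_chains(child, trail)


# Hypothetical graph: one promise, one target, two components, three metrics.
graph = Node("Payments SLA", "promise", breaching=True, children=[
    Node("p99_latency < 2000ms", "target", breaching=True, children=[
        Node("payment-api", "component", breaching=True, children=[
            Node("p99_latency_ms", "metric", breaching=True),
            Node("cpu_utilisation", "metric"),
        ]),
        Node("payment-db", "component", children=[
            Node("query_time_ms", "metric"),
        ]),
    ]),
])

chains = list(breach_chains(graph))
```

Each chain that comes back is the path an engineer would otherwise trace by eye along the red nodes.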

Colour carries the answer before any label is read. That consistency is the design language of the product.

05 / Outcome

What the product is built to deliver.

<5 min from breach to corrective action, down from the 41 minutes a human took to respond

Response

70% of standard, repeating incidents the agent is built to resolve without a human touching production

Autonomy

20% of the monthly invoice at risk on every missed target, caught before the credit clause triggers

Revenue

The agent takes the heavy, predictable work. The team keeps the judgment.

06 / What I’d do differently

An honest note

The whole product is built on the agent being sure. The case I never designed for is the one it has never seen before.

Today it does one of two things, both wrong: it acts confidently on a situation it doesn’t understand, or it escalates with no context and leaves someone to reconstruct it at 3am. The honest version needs an explicit “I’m not confident here” state, where the agent surfaces its own uncertainty and hands off cleanly, with enough context for a person to actually act.
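A minimal sketch of what that third state might look like, in Python. Everything here is an assumption for illustration: the names are hypothetical, and both the 0.8 threshold and the idea of a single confidence score are placeholders for whatever signal the agent could actually provide.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    ACT = "act"
    HAND_OFF = "hand_off"  # the explicit "I'm not confident here" state


@dataclass
class Handoff:
    summary: str       # what the agent thinks is happening
    evidence: list     # what it looked at, so nobody reconstructs it at 3am
    confidence: float  # how sure it was when it stopped


def decide(confidence, summary, evidence, threshold=0.8):
    """Act when confident; otherwise hand off with context attached."""
    if confidence >= threshold:
        return Outcome.ACT, None
    return Outcome.HAND_OFF, Handoff(summary, list(evidence), confidence)


outcome, handoff = decide(
    confidence=0.35,
    summary="latency rising on payment-api, pattern not seen before",
    evidence=["p99_latency_ms above target", "no matching past incident"],
)
```

The point of the sketch is the return shape: a hand-off is never bare, it always carries the summary and the evidence with it.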

✎ Margin note · to self

every screen i designed assumes the agent knows what it’s doing, the dashboard, the override, the reasoning graph, all of it. none of them have a state for “i’m not sure.”

that absence is the next thing i’d design.
