UX case study · AIMSLO

Protected 20% of monthly revenue lost to SLA breaches for a B2B SaaS provider.

Swaayata fixes SLA breaches autonomously. I designed the product and its human-in-the-loop controls.

Role
Lead Product Designer
Goal
Make every autonomous action visible, reversible, and explainable.
Enterprise SaaS · 2024 · 6 months · concept to build-ready
Swaayata real-time operations dashboard: SLO status, incidents, service health, and live metrics in one view.
The agent doesn’t just alert. It acts, and shows its working in the live feed.
01 / Overview

It turns a signed promise into a live target, then runs the agent that keeps it.

Two terms run through the whole product. An SLA is the promise a company signs. An SLO is the measurable target it has to become. Swaayata reads the first into the second.

Take one. “The payment system will respond in under two seconds, 99.9% of the time” is a sentence in a contract, signed by people who will never get paged. Swaayata reads it into a live target that is checked every second and that an agent can act on: p99_latency < 2000ms, success_rate ≥ 99.9%.
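A minimal sketch of what "reading a promise into a target" could look like in code. The `SloTarget` type, the metric names, and the sample readings are illustrative assumptions, not Swaayata's actual data model.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SloTarget:
    """One measurable target derived from a sentence in a contract."""
    metric: str                               # hypothetical metric name
    threshold: float
    is_met: Callable[[float, float], bool]    # (observed, threshold) -> bool


# Hypothetical encoding of the payment-system promise quoted above.
PAYMENT_SLO: List[SloTarget] = [
    SloTarget("p99_latency_ms", 2000.0, operator.lt),    # p99_latency < 2000ms
    SloTarget("success_rate_pct", 99.9, operator.ge),    # success_rate >= 99.9%
]


def breached_targets(targets: List[SloTarget], observed: Dict[str, float]) -> List[str]:
    """Return the metric names whose latest reading misses its target."""
    return [t.metric for t in targets if not t.is_met(observed[t.metric], t.threshold)]


healthy = {"p99_latency_ms": 1450.0, "success_rate_pct": 99.95}
slipping = {"p99_latency_ms": 2300.0, "success_rate_pct": 99.95}

print(breached_targets(PAYMENT_SLO, healthy))    # []
print(breached_targets(PAYMENT_SLO, slipping))   # ['p99_latency_ms']
```

A non-empty result is the point at which an agent would have something concrete to act on, rather than a sentence to interpret.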

Then, on every target it tracks, the agent:

01
Acts

When performance slips, it takes corrective action, not just another alert.

Autonomy
02
Learns

Each human override becomes a training signal. The model sharpens through RLHF.

Feedback
03
Governs

Every action is visible and reversible. You step in only when you want to: oversight by choice, not a gate.

Accountability
02 / The problem

A missed target isn’t a bug. It’s lost revenue.

An SLA is a financial contract, not a dashboard reading. Every target carries a credit clause the company has already agreed to pay, so a missed target isn’t counted in downtime. It’s counted in dollars:

What one missed month costs
$267K monthly fee, on a $3.2M a year account
× 20% credit clause, already in the contract
= about $53K revenue lost, automatically
Gone the moment a target slips. No approval, no appeal.
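The arithmetic behind those figures, as a small sketch. It assumes the credit applies to the full monthly fee and that one missed target triggers the whole clause; the function name is illustrative.

```python
ANNUAL_CONTRACT_VALUE = 3_200_000   # USD, the example account
CREDIT_RATE = 0.20                  # the credit clause


def monthly_credit(annual_value: float, credit_rate: float) -> float:
    """Credit owed for one month in which a target is missed."""
    monthly_fee = annual_value / 12
    return monthly_fee * credit_rate


monthly_fee = ANNUAL_CONTRACT_VALUE / 12
credit = monthly_credit(ANNUAL_CONTRACT_VALUE, CREDIT_RATE)

print(f"monthly fee: ${monthly_fee / 1000:.0f}K")   # monthly fee: $267K
print(f"credit owed: ${credit / 1000:.0f}K")        # credit owed: $53K
```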

And it fires on every breach. Miss enough months and the account that took a nine-month sale to win starts shopping.

03 / The approach

Most of an autonomous system runs with no one watching.

So I didn’t design most of it. I designed the three places a human is watching, where an engineer decides whether to let the agent run at all. The rest, it could keep.

See it.

Are we actually meeting our commitments?

The system answers in colour, before a single number is read.

Stop it.

What is it doing to production, right now?

Every decision the agent makes is visible, and one click from reversed.

Understand it.

Why did it make that call?

The reasoning chain is the artifact, not just the conclusion.

04 / Design decisions

Three surfaces. The calls behind them.

Each one is a place an engineer has to trust the agent. Each became one decision, shown on the real product.

Swaayata dashboard: SLO met, not met and incident counts with trends, SLO compliance and service health.
Decision
01
The dashboard

Colour first. Numbers second.

  1. The move

    Status in colour, before any number is read.

  2. Hierarchy

    Two levels: headline, then drill-down.

How do you show the health of dozens of SLA promises at once, without overwhelming an engineer?

What changed
Rejected
Reproducing dashboards at a higher level
Every metric with equal weight
Numbers as the primary signal
Designed
Colour as the primary signal for SLA health
Status first, detail on demand
Two levels: headline, then drill-down
Reasoning

Engineers already have every metric they could want. What they lack is a single answer to the question their manager asks every morning: are we meeting our commitments?

So the top of the screen answers that in colour, before a number is read. Detail waits below, never competing with the headline.
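One way the two levels could be modelled, as a rough sketch. The three-state scale (including an amber "at risk" state), the worst-status roll-up rule, and the SLO names are all assumptions for illustration; the case study only specifies colour first, detail on demand.

```python
from enum import IntEnum
from typing import Dict, List


class Status(IntEnum):
    GREEN = 0   # meeting the target
    AMBER = 1   # at risk (assumed intermediate state)
    RED = 2     # breaching


def headline(statuses: Dict[str, Status]) -> Status:
    """Level one: a single colour, the worst status across all SLOs."""
    return max(statuses.values(), default=Status.GREEN)


def drill_down(statuses: Dict[str, Status]) -> List[str]:
    """Level two: SLO names, worst first, shown only on demand."""
    return sorted(statuses, key=lambda name: (-statuses[name], name))


slos = {
    "checkout-latency": Status.GREEN,
    "payments-success": Status.RED,
    "search-latency": Status.AMBER,
}

print(headline(slos).name)   # RED
print(drill_down(slos))      # ['payments-success', 'search-latency', 'checkout-latency']
```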

Human in the Loop: decision log with confidence levels, SLO compliance, AI versus human ratio, and an override panel.
Decision
02
Human in the loop

Show every decision. Make every correction count.

  1. Visible

    Every decision logged, with the why.

  2. Reversible

    Override, one click, always there.

How do you make an autonomous AI feel safe enough that an engineer lets it touch production?

What changed
Rejected
The agent acts silently
Approve or reject, no context
Override with no feedback
Designed
A full decision log, with the why
Override at any point, always visible
Every correction feeds the model
Reasoning

The agent restarts containers, runs failover, reallocates memory, on its own. Terrifying to hand a VP unless every decision is visible, reviewable and reversible.

So the agent shows its work, before and after it acts, with override one click away. Every correction is training data, and I made that visible so the work feels worth doing.
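A minimal sketch of a decision log in which an override is also a training signal. Every name here (`Decision`, `DecisionLog`, `feedback_queue`) is hypothetical, and the sketch only records the override; actually rolling back the action in production is out of scope.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Decision:
    action: str                           # what the agent did
    reason: str                           # the "why", shown alongside it
    confidence: float                     # 0.0 to 1.0
    overridden_by: Optional[str] = None
    correction_note: Optional[str] = None


@dataclass
class DecisionLog:
    entries: List[Decision] = field(default_factory=list)
    feedback_queue: List[dict] = field(default_factory=list)   # assumed training feed

    def record(self, decision: Decision) -> Decision:
        self.entries.append(decision)
        return decision

    def override(self, decision: Decision, engineer: str, note: str = "") -> None:
        """Mark a decision as reversed and keep the correction as a signal."""
        decision.overridden_by = engineer
        decision.correction_note = note
        self.feedback_queue.append(
            {"action": decision.action, "reason": decision.reason,
             "label": "rejected", "note": note}
        )

    def ai_share(self) -> float:
        """Share of logged decisions that stood without a human override."""
        if not self.entries:
            return 1.0
        kept = sum(1 for d in self.entries if d.overridden_by is None)
        return kept / len(self.entries)


log = DecisionLog()
log.record(Decision("restart container", "memory above limit", 0.93))
failover = log.record(Decision("run failover", "primary latency breaching", 0.71))
log.override(failover, engineer="on-call", note="planned maintenance, not an outage")

print(log.ai_share())             # 0.5
print(len(log.feedback_queue))    # 1
```

Keeping the reason on the record itself is what lets the same entry serve the log, the override panel, and the feedback queue.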

Tried first
From the notebook

rlhf only works when engineers engage carefully. at 3am with something on fire they hit override and move on, not thinking about teaching the model. so the loop only improves in calm conditions, which is exactly when you need it least. i designed for the ideal. the hard edge case is still open.

Entity reasoning graph: SLO, component and metric nodes, with breaching nodes in red.
Decision
03
The reasoning graph

Show the chain, not just the conclusion.

  1. Diagnosis

    Red marks where the chain breaks.

  2. The chain

    Promise to target to component to metric.

How do you show the reasoning behind an AI decision, so an engineer can judge whether it was right?

What changed
Rejected
A text log of what happened
A flat list of components
Making engineers run queries
Designed
A visual graph of the reasoning chain
Node shape encodes entity type
Colour shows breach state at a glance
Reasoning

“Agent restarted Container X” tells you what happened, not why. So I made the reasoning itself the artifact: a promise connects to a target, to a component, to a metric.

Red nodes show exactly where the chain is breaking, and an engineer reads the diagnosis without opening a log or writing a query.

Colour carries the answer before any label is read. That consistency is the design language of the product.
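The chain from promise to metric can be sketched as a small graph walk. The node names, the `breaching` flag, and the rule of following the first breaching child are assumptions for illustration, not the product's actual model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    name: str
    kind: str                 # "promise", "target", "component" or "metric"
    breaching: bool = False   # rendered red when True


# Hypothetical chain for the payment-system example.
NODES: Dict[str, Node] = {n.name: n for n in [
    Node("payments SLA", "promise", True),
    Node("p99_latency < 2000ms", "target", True),
    Node("success_rate >= 99.9%", "target", False),
    Node("payment-api", "component", True),
    Node("payment-db", "component", False),
    Node("heap_used_pct", "metric", True),
]}

EDGES: Dict[str, List[str]] = {
    "payments SLA": ["p99_latency < 2000ms", "success_rate >= 99.9%"],
    "p99_latency < 2000ms": ["payment-api", "payment-db"],
    "payment-api": ["heap_used_pct"],
}


def breach_path(start: str) -> List[str]:
    """Follow breaching nodes from the promise down to the failing metric."""
    path: List[str] = []
    current = start
    while current is not None and NODES[current].breaching:
        path.append(current)
        children = EDGES.get(current, [])
        current = next((c for c in children if NODES[c].breaching), None)
    return path


print(" -> ".join(breach_path("payments SLA")))
# payments SLA -> p99_latency < 2000ms -> payment-api -> heap_used_pct
```

This version follows a single red path; a graph with several breaching branches would need to return all of them.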

05 / Outcome

What the product is built to deliver.

<5 min
From breach to corrective action, down from the 41 minutes a human took to respond.
Response

70%
Of standard, repeating incidents the agent is built to resolve without a human touching production.
Autonomy

20%
Of the monthly invoice at risk on every missed target, caught before the credit clause triggers.
Revenue

The agent takes the heavy, predictable work. The team keeps the judgment.

06 / What I'd do differently

An honest note

The whole product is built on the agent being sure. The case I never designed for is the one it has never seen before.

Today it does one of two things, both wrong: it acts confidently on a situation it doesn’t understand, or it escalates with no context and leaves someone to reconstruct it at 3am. The honest version needs an explicit “I’m not confident here” state, where the agent surfaces its own uncertainty and hands off cleanly, with enough context for a person to actually act.
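A rough sketch of what that third state might look like as logic. The confidence floor, the `seen_before` flag, and the `Handoff` shape are all hypothetical; none of this exists in the product as designed.

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_FLOOR = 0.8   # assumed threshold, would need tuning


@dataclass
class Handoff:
    state: str            # "act" or "not_confident"
    summary: str
    evidence: List[str]   # context a person needs in order to act


def decide(action: str, confidence: float, seen_before: bool,
           evidence: List[str]) -> Handoff:
    """Act only on familiar, high-confidence situations; otherwise hand off with context."""
    if seen_before and confidence >= CONFIDENCE_FLOOR:
        return Handoff("act", f"Running: {action}", evidence)
    return Handoff(
        "not_confident",
        f"Proposed but not run: {action} (confidence {confidence:.0%})",
        evidence,
    )


result = decide(
    "restart container",
    confidence=0.42,
    seen_before=False,
    evidence=["heap_used_pct rising for 12 min", "no matching past incident"],
)
print(result.state)     # not_confident
print(result.summary)   # Proposed but not run: restart container (confidence 42%)
```

The handoff carries the same evidence either way, so escalation never arrives without context.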

Margin note · to self

every screen i designed assumes the agent knows what it’s doing, the dashboard, the override, the reasoning graph, all of it. none of them have a state for “i’m not sure.”

that absence is the next thing i’d design.
