Replace brittle RPA with event-driven agents

Screen-scraping bots amplify technical debt. Event-driven agents anchored to system events and idempotent APIs cut error rates and survive schema changes. If your automation cannot explain its actions with a ledger, it is a liability — not an asset.

The Fortune 500 RPA story has aged badly. Robotic process automation (RPA) tools were sold as a low-code path to automating back-office work. In many of the deployments we have audited, they have instead produced a portfolio of fragile screen-scraping scripts that break when an upstream UI changes something as small as a button color, run without an audit trail any internal auditor will accept, and quietly proliferate beyond the CIO's view.

The renewal cycle for the licenses many CIOs signed in 2023 is coming due in late 2026 and early 2027. The natural budget window to swap them out is open now, not in the panic of the renewal week. The replacement is not a different RPA vendor. It is a different model of automation entirely — event-driven agents anchored to system events and idempotent APIs, with an audit trail that survives both SOX scrutiny and the agent's own evolution.

This is the entry point for Vardr's enterprise practice, not just our government work. The pattern that makes agentic systems defensible at OIG is the same pattern that makes them defensible at PCAOB.

What RPA actually did wrong

RPA tools sold a flexibility story: you could automate any process without changing the underlying systems. The flexibility was real and the cost was paid downstream.

They depended on the UI as the integration layer. A button moves. The script breaks. The button gets renamed in a localization update. The script reports success while doing nothing. The team that maintains the script discovers the failure three weeks later, after the downstream process has been silently corrupted.

They had no idempotency contract. A script that times out and retries can perform the same action twice. A script that processes a queue can lose items. Duplicate invoices, double-paid bills, partial state changes — these are the symptoms internal audit sees and the symptoms the engineering team has to explain.

They produced action logs, not audit trails. A typical RPA log says "process completed successfully" or "clicked submit button on line 47." That is not an audit trail. It is operational telemetry. When internal audit asks why a particular invoice was approved, the RPA log does not contain the answer.

They proliferated outside CIO governance. A business unit could license RPA without a full IT review, build a hundred scripts, and accumulate operational risk that does not surface until something breaks at scale. The 2025 audit cycle started exposing these portfolios; the 2026 cycle is going to be harder.

The combination is what produces the renewal-cycle reckoning. The licenses are expensive, the scripts are unmaintainable, and the audit posture is indefensible. The right move is not a different RPA vendor. It is a categorically different model.

Domain events, not click sequences

The right unit of automation is not "the steps a human would do." It is "the event in the business that needs to happen." A customer cancels a subscription. An invoice arrives. A regulatory filing comes due. A vendor's quarterly review hits the calendar.

These are domain events. They are nouns in the business. They are also, in any reasonably modern system, available as actual events — Salesforce object change events, ERP message-bus emissions, finance-system webhooks, document-management notifications. The events exist. The automation can subscribe to them.

When the automation is built around the event rather than around the UI sequence, four things change:

The integration is stable. The event has a schema. The schema is versioned. When the schema changes, the change is announced and the consumer can be updated explicitly. There is no equivalent of "the button moved."

Idempotency is feasible. Each event has an identifier. The automation can be safely re-run on the same event without producing a duplicate effect. Retries become safe. Recovery from a failed run becomes a matter of replaying the events that failed.

Audit becomes a property of the system, not a layer on top. Each event produces an action. The action produces a state change. The state change is recorded. The chain (event → action → state change) is the audit trail, and it is queryable in a way the RPA log is not.

The automation can be tested. The event is a structured input. The action is a function. Testing the action against the event is unit-testable. The current RPA pattern, where the test environment is "run it against a sandbox and see if it breaks," is replaceable.
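A minimal Python sketch of the idempotency and testability points above. The event shape (`InvoiceReceived`), the in-memory `Ledger`, and `handle_invoice` are hypothetical names chosen for illustration. A real deployment would presumably back the ledger with a durable store, enforce uniqueness on the event identifier, and record the effect and the identifier in one transaction.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InvoiceReceived:
    """Hypothetical domain event; field names are illustrative."""
    event_id: str
    invoice_number: str
    amount_cents: int


@dataclass
class Ledger:
    """In-memory stand-in for a durable store keyed by event identifier."""
    processed: dict = field(default_factory=dict)   # event_id -> action taken
    payments: list = field(default_factory=list)    # side effects, one per event

    def apply_once(self, event_id: str, action: str, effect) -> bool:
        """Run `effect` only if this event has not been handled before."""
        if event_id in self.processed:
            return False          # replay: no second side effect
        effect()
        self.processed[event_id] = action
        return True


def handle_invoice(event: InvoiceReceived, ledger: Ledger) -> bool:
    """Deterministic handler: schedule exactly one payment per event."""
    return ledger.apply_once(
        event.event_id,
        "schedule_payment",
        lambda: ledger.payments.append((event.invoice_number, event.amount_cents)),
    )


def test_replay_is_safe() -> None:
    """The handler is a plain function, so a retry scenario is a unit test."""
    ledger = Ledger()
    event = InvoiceReceived("evt-001", "INV-1042", 125_000)
    assert handle_invoice(event, ledger) is True    # first delivery acts
    assert handle_invoice(event, ledger) is False   # retry is a no-op
    assert ledger.payments == [("INV-1042", 125_000)]


test_replay_is_safe()
```

Keying the check on the event identifier, not on the invoice number or a timestamp, is what lets a failed run be recovered by replaying the same events.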

The agent as the orchestrator

The shift to event-driven automation does not require an agent. A handler subscribed to an event queue can do the work. The agent enters when the work requires judgment: categorizing an exception, drafting a response, or deciding which of several next steps to take.

The pattern that works:

The event arrives. A handler receives it. The handler does the structured, deterministic work — extracting the relevant fields, calling the necessary read APIs, applying any rules that are unambiguous. When the work reaches a step that requires judgment, the handler invokes an agent with a constrained tool surface (read-only data access, narrow write tools) and an evidence-pack-style output requirement. The agent produces a draft action with the evidence. A human reviews it. The reviewed action is the write that mutates state.
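The flow described above can be sketched in Python as follows. Every name here (`DraftAction`, `ReviewedAction`, `handle_mismatch`, `stub_agent`, the event fields) is hypothetical, and `stub_agent` is a canned stand-in, not a call to any real agent framework. The point of the sketch is the boundary: the write path accepts only an action that carries a human reviewer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DraftAction:
    """What the agent is allowed to produce: a proposal plus its evidence."""
    event_id: str
    proposed_action: str
    evidence: tuple


@dataclass(frozen=True)
class ReviewedAction:
    """The only object the write path accepts; it always names a reviewer."""
    draft: DraftAction
    reviewer: str
    decision: str          # "accept", "amend", or "reject"
    final_action: str


def handle_mismatch(event: dict, agent: Callable[[dict], DraftAction]) -> Optional[DraftAction]:
    """Deterministic step first; call the agent only when judgment is needed."""
    difference = event["invoice_cents"] - event["po_cents"]
    if difference == 0:
        return None        # unambiguous rule: nothing to decide
    context = {"event_id": event["event_id"], "difference_cents": difference}
    return agent(context)  # agent sees read-only context and returns a draft


def stub_agent(context: dict) -> DraftAction:
    """Stand-in for a real agent call; returns a canned draft for illustration."""
    return DraftAction(
        event_id=context["event_id"],
        proposed_action="hold_for_vendor_query",
        evidence=(f"invoice minus PO = {context['difference_cents']} cents",),
    )


def review(draft: DraftAction, reviewer: str, decision: str,
           amended_action: Optional[str] = None) -> ReviewedAction:
    """Human sign-off turns a draft into something the write path will accept."""
    if decision not in {"accept", "amend", "reject"}:
        raise ValueError(f"unknown decision: {decision}")
    if decision == "amend" and not amended_action:
        raise ValueError("an amended decision needs an amended action")
    final = {"accept": draft.proposed_action,
             "amend": amended_action,
             "reject": "no_action"}[decision]
    return ReviewedAction(draft, reviewer, decision, final)


def write_to_system_of_record(action: ReviewedAction, store: list) -> None:
    """Write path: takes a ReviewedAction, never a bare DraftAction."""
    if not isinstance(action, ReviewedAction):
        raise TypeError("only human-reviewed actions may mutate state")
    if action.decision != "reject":
        store.append((action.draft.event_id, action.final_action, action.reviewer))


# Usage: an invoice that exceeds its purchase order by 1,000 cents.
store = []
event = {"event_id": "evt-002", "invoice_cents": 101_000, "po_cents": 100_000}
draft = handle_mismatch(event, stub_agent)
if draft is not None:
    write_to_system_of_record(review(draft, "j.doe", "accept"), store)
```

In this sketch the agent function has no access to `store` at all, so the separation between drafting and writing is structural and does not depend on the agent behaving well.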

This is the read/write/determine model from the authority boundary piece, applied at the enterprise rather than the government scope. The line between agent action and human action is enforced at the API surface. The agent never directly mutates a determination that the auditor will ask about; it prepares the determination, the human signs off, and the human's action is what hits the system of record.

For internal-audit and SOX-relevant processes, this is the architecture that produces a defensible posture. The agent's role is documented. The human's role is documented. The chain is auditable. The PCAOB-relevant evidence is queryable on demand.

Exception handling as the first high-value workload

The most common question we get from enterprise CIOs evaluating a move off RPA is "where do we start?" The answer is usually exception handling.

Exception workloads — invoices that don't match purchase orders, customer credits that need manual approval, vendor records that fail validation, contracts that need renewal-clause review — share three properties that make them ideal first targets:

The volume is high enough to justify the engineering investment.
The unit cost per exception (in human time, in delayed revenue, in customer dissatisfaction) is high enough that the ROI is visible within a quarter.
The work involves judgment, which is where the agent adds value beyond what a simple rules engine would.

A typical engagement we have run starts with three exception classes pulled from the past quarter's volume. For each class, we measure the current per-exception time (often in the 15-to-40-minute range when a human has to gather context, evaluate alternatives, and document the decision), build an event-triggered handler that pre-assembles the evidence pack and produces a draft action, and present it to the human in a single-screen review surface. The per-exception time drops to a few minutes for the simple cases — and crucially, the human spends those few minutes on judgment, not on context-gathering.

The 2026 PCAOB guidance update tightened IT-dependency evidence requirements on opaque automations. An exception workload handled by a documented, evidence-producing agent is in much better posture against that guidance than the same workload handled by an RPA script no one has touched in eighteen months.

What replaces a script

When we migrate a specific RPA script to the event-driven pattern, the artifact looks like this:

A subscription to the relevant event from the system of record. The integration is named, the schema is documented, and the schema version is pinned.

A handler that processes the event. The handler is a function. It is tested. It is version-controlled. For each execution it produces a structured log entry that includes the event identifier, the handler version, the timestamp, the action taken, and the post-action state reported by each relevant write tool.

An agent invocation, if the handler reaches a judgment step. The agent has a constrained tool surface, an evidence-pack output contract, and a draft-action output. The agent never mutates state directly.

A human-review surface that presents the draft action and evidence in a one-click accept/amend/reject UI. The reviewer's action is what writes to the system of record.

An audit query interface that can answer questions like "show me every event of type X processed in the past 30 days, with the agent's draft and the human's final action." The interface exists as a queryable data product, not as an export job.
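As a rough illustration of the structured log and the audit query together, here is a sketch using Python's built-in `sqlite3` module. The table name, column names, event type, and sample rows are all assumptions made for the example, and the columns are a subset of what a production log would carry. Timestamps are stored as UTC ISO 8601 strings at second precision so that string comparison matches chronological order.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical log shape: one row per handled event.
SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_log (
    event_id        TEXT PRIMARY KEY,   -- doubles as the idempotency key
    event_type      TEXT NOT NULL,
    handler_version TEXT NOT NULL,
    processed_at    TEXT NOT NULL,      -- ISO 8601, UTC
    agent_draft     TEXT,               -- NULL when no judgment step ran
    reviewer        TEXT,
    final_action    TEXT NOT NULL
)
"""


def record(conn, event_id, event_type, handler_version, processed_at,
           agent_draft, reviewer, final_action):
    """Append one row per handled event: event -> draft -> human action."""
    conn.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, ?, ?)",
        (event_id, event_type, handler_version,
         processed_at.isoformat(timespec="seconds"),
         agent_draft, reviewer, final_action),
    )


def events_of_type(conn, event_type, since):
    """Every event of one type since a cutoff, with draft and final action."""
    rows = conn.execute(
        "SELECT event_id, agent_draft, reviewer, final_action "
        "FROM audit_log WHERE event_type = ? AND processed_at >= ? "
        "ORDER BY processed_at",
        (event_type, since.isoformat(timespec="seconds")),
    )
    return rows.fetchall()


# Usage with made-up data: one recent event, one outside the 30-day window.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
now = datetime(2026, 3, 1, tzinfo=timezone.utc)
record(conn, "evt-001", "invoice.mismatch", "1.4.0", now - timedelta(days=3),
       "hold_for_vendor_query", "j.doe", "hold_for_vendor_query")
record(conn, "evt-002", "invoice.mismatch", "1.4.0", now - timedelta(days=45),
       "pay_po_amount", "a.lee", "no_action")
recent = events_of_type(conn, "invoice.mismatch", now - timedelta(days=30))
```

Here `recent` holds only the `evt-001` row. Because the draft and the reviewer's final action sit in the same row, a case where the human overrode the agent (as in `evt-002`) is visible without joining against a separate export.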

The whole thing is more code than the RPA script. It is also defensible at audit, durable against UI changes, and replaceable piece by piece when the agent or the event schema evolves.

The CFO and CIO conversations

The case for the migration is different per audience.

For the CFO, the case is the audit posture and the risk of an opaque-automation finding under the updated PCAOB guidance. The cost of a finding is non-trivial; the cost of remediation under audit pressure is much higher than the cost of remediation now.

For the CIO, the case is the renewal cycle. The 2023 RPA licenses are coming due in 2026 and 2027. The renewal will be expensive. The conversation with the procurement team has to start now, with a clear technical replacement story, not in the renewal-week panic.

For the line-of-business owner, the case is the work itself. The current RPA scripts break, and the owner gets blamed for the downstream effects. The event-driven replacement breaks less, surfaces failures clearly, and produces an audit trail their internal-audit colleagues will accept.

These are three different conversations and they need to happen in parallel. We have seen migrations fail when only one of the three was aligned.

What to do Monday

Inventory your top ten RPA scripts by transaction volume. For each, identify whether the underlying system of record emits the relevant business event as a structured event (most modern Salesforce, NetSuite, SAP, Workday, and ServiceNow integrations do).

For two scripts where the event is available, write the migration scope on a single page: the event, the target APIs (preferring API calls over UI scraping), the idempotency key, the audit-log shape, and the human-review point if judgment is required.
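One way to keep that single page honest is to write it as a small structured record that can be checked for gaps. The sketch below is only a suggestion: `MigrationScope`, its field names, the script name, and the API paths are hypothetical placeholders, not references to any particular system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MigrationScope:
    """One-page scope for moving a single RPA script to the event-driven pattern."""
    script_name: str
    event: str                        # business event emitted by the system of record
    target_apis: tuple                # API calls that replace the UI steps
    idempotency_key: str              # event field that makes retries safe
    audit_log_fields: tuple           # shape of the per-execution record
    human_review_point: Optional[str] = None   # None when no judgment is required

    def gaps(self) -> list:
        """Return the items still missing before the scope is ready to build."""
        missing = []
        if not self.event:
            missing.append("event")
        if not self.target_apis:
            missing.append("target_apis")
        if not self.idempotency_key:
            missing.append("idempotency_key")
        if "event_id" not in self.audit_log_fields:
            missing.append("audit_log_fields must include event_id")
        return missing


# Illustrative values only; the script name and API paths are made up.
scope = MigrationScope(
    script_name="invoice-match-bot",
    event="invoice.received",
    target_apis=("GET /purchase-orders/{id}", "POST /invoice-holds"),
    idempotency_key="event_id",
    audit_log_fields=("event_id", "handler_version", "timestamp", "action"),
    human_review_point="amount mismatch above tolerance",
)
```

Calling `scope.gaps()` on a complete scope returns an empty list; a scope with no idempotency key or no event identifier in its audit log is flagged before any handler code is written.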

If your internal audit team has flagged any RPA scripts in the past year, those are the first ones to migrate. The audit posture is the visible problem; the brittleness is the hidden one.

Where Vardr fits

We do these migrations as the entry point into the enterprise practice — picking the right first three exception workloads, building the event-driven pattern against the relevant system of record, and producing the audit-ready handler-plus-agent architecture that replaces the RPA script. Kevin brings the data-platform and backend depth to specify the event integration, idempotency, and audit data product. Amlan brings the agent-workflow engineering for the judgment steps and the evidence-pack output that satisfies internal audit. The deliverable is a working migration on the first workload within six weeks — and a replicable pattern for the rest of the RPA portfolio after that.
