Commands vs Events: The Mental Model I Needed While Refactoring Tensorify
How splitting a roughly 1.5-second request path into commands and events cut Tensorify's response time to about 600 ms, and the architectural thinking that made it possible.
When I was refactoring Tensorify, I hit a wall.
Tensorify is supposed to let users create backend APIs. That means response time matters. If an end user calls an API built on top of the platform, the system cannot feel slow, heavy, or indirect.
But my request-to-response path had become extremely slow — roughly 1.5 seconds before the refactor.
At first, I treated it like a normal performance problem. Maybe I needed to optimize queries. Maybe I needed better caching. Maybe I needed to clean up slow service calls.
Those things mattered, but they did not fully explain the problem.
The real issue was not one slow function. It was the shape of the request path itself.
A request would enter the platform, and too many internal operations had to finish before the response could return. The system looked modular in code, but at runtime it behaved like one long blocking chain. One service waited on another. That service waited on another internal operation. A side effect that had nothing to do with the immediate response could still delay the user.
That is when the architecture started to feel wrong.
Some people would call this a distributed monolith. I did not care much about the label at first. What mattered was the feeling: the system was split into parts, but the request still moved through it like everything was glued together.
That was the point where I started asking whether event-driven architecture could help.
I opened Cursor and asked a simple question:
Would following event-driven architecture reduce latency?
At the time, I did not properly understand what event-driven architecture meant. I had a vague idea that it involved Redis somehow. I thought maybe it was just about writing a key-value pair into Redis and letting another service read from it.
That was wrong.
The better answer was not simply yes or no. Event-driven architecture only helps latency when the slow part of the system is non-essential work blocking the response. If the required work itself is slow, a queue will not save you. If the core query is slow, a stream will not magically make it fast.
But if the system is slow because too much work is happening synchronously before the response, then event-driven thinking becomes useful.
It forces a better question:
What actually needs to happen before the response, and what can happen after?
That question became the center of the refactor.
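To make the "it depends on why the system is slow" point concrete, here is a minimal sketch. The step names and durations are made-up illustrative numbers, not measurements from Tensorify, and `blocking_ms` is a hypothetical helper. It only adds up how long the user waits when side effects are kept in the request path versus deferred.

```python
# Each step: (name, duration_ms, required_before_response).
# All durations are made-up illustrative numbers, not Tensorify measurements.
SIDE_EFFECT_HEAVY = [
    ("validate", 10, True),
    ("core_logic", 90, True),
    ("webhook_delivery", 200, False),
    ("analytics_write", 100, False),
]

SLOW_CORE_QUERY = [
    ("validate", 10, True),
    ("core_logic", 340, True),
    ("analytics_write", 50, False),
]


def blocking_ms(steps, defer_side_effects):
    """Sum the durations the user waits for before the response."""
    return sum(
        ms
        for _name, ms, required in steps
        if required or not defer_side_effects
    )


if __name__ == "__main__":
    for label, steps in [("side-effect heavy", SIDE_EFFECT_HEAVY),
                         ("slow core query", SLOW_CORE_QUERY)]:
        before = blocking_ms(steps, defer_side_effects=False)
        after = blocking_ms(steps, defer_side_effects=True)
        print(f"{label}: {before} ms -> {after} ms")
```

With these invented numbers, deferring side effects takes the first path from 400 ms to 100 ms, but the second only from 400 ms to 350 ms, because the required work itself is the slow part.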
The request path is the product
A request path is the chain of work that must finish before the user receives a response.
For a normal backend API, that might include validating the request, reading from a database, running business logic, writing state, and returning a response. For Tensorify, the request path matters even more because the product itself is about helping users create backend APIs.
If Tensorify adds unnecessary overhead, every API built on top of it feels slower. That is not just an implementation problem. It becomes part of the product experience.
The uncomfortable part was realizing that some of the work in my request path did not actually deserve to be there.
Some operations were required for correctness. They had to happen before the response. But other operations were side effects. They were important, but the user did not need to wait for them.
That difference sounds obvious in hindsight. It was not obvious while I was inside the system.
When latency is bad, it is tempting to search for “the slow thing.” A slow query. A slow service call. A missing cache. Sometimes that is the right approach. But in this case, the problem was not only the cost of each operation. It was the fact that the user was waiting for too many operations in the first place.
That was the real shift.
I stopped asking:
How do I make this whole chain faster?
And started asking:
Why is this whole chain blocking the user?
Before and after
The old shape of the system looked something like this:
User request
↓
Validate request
↓
Core logic
↓
Metadata sync
↓
Webhook delivery
↓
Analytics write
↓
Internal service update
↓
Response
Some of those steps matter. The problem is that they all sit in the same synchronous path.
The better shape looked more like this:
User request
↓
Validate request
↓
Core logic
↓
Required state write
↓
Response
After response:
├── Broker → webhooks / retries / background jobs / cache warming
└── Stream → analytics / billing / logs / audit trail / activity history
The refactor was not about making the system fancy. It was about making the request path honest.
If something was required to return a correct response, it stayed in the synchronous path.
If something was work that still needed to happen but did not need to happen while the user was waiting, it became a command.
If something was a fact that other parts of the system needed to observe, it became an event.
That is where the commands vs events distinction finally clicked.
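The three rules above can be sketched as a single handler. This is a simplified illustration, not Tensorify's actual code: `Outbox`, `handle_request`, the dict-based `store`, and the message shapes are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Outbox:
    """Hypothetical holding area for messages that leave the request path."""
    commands: list = field(default_factory=list)  # instructions: "please do this"
    events: list = field(default_factory=list)    # facts: "this happened"


def handle_request(payload: dict, store: dict, outbox: Outbox) -> dict:
    # Synchronous path: everything needed to return a correct response.
    route = payload.get("route")
    if not route:
        return {"status": 400, "error": "route is required"}

    result = {"route": route, "ok": True}  # stand-in for the core logic
    store[route] = result                  # required state write

    # Work that must still happen, but not while the user waits: a command.
    outbox.commands.append({"type": "SendWebhook", "route": route})

    # A fact other parts of the system may want to observe: an event.
    outbox.events.append({"type": "RequestCompleted", "route": route})

    return {"status": 200, "body": result}
```

Appending to the outbox is itself a small synchronous step. The point is that the handler only records what needs to happen later instead of doing it inline.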
The work that became commands
The first category I pulled out of the request path was work that still needed to happen, but did not need to happen immediately.
That kind of message is a command.
A command says:
Please do this.
In Tensorify, a command could be:
Send this webhook.
Retry this failed integration.
Warm this cache.
Run this background workflow.
Sync this non-critical metadata.
These are not facts about the past. They are instructions for work that should happen somewhere else.
That is where a message broker fits.
A broker answers:
Who should do this work?
If the message was an instruction, and if that instruction did not need to block the response, it could leave the request path.
A broker does not remove work. It moves work.
The system still has to send the webhook, retry the integration, or warm the cache. But those operations no longer need to sit between the incoming request and the outgoing response.
That is how a broker can reduce perceived latency: not by making every operation faster, but by making the user-facing path smaller.
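As a rough sketch of "a broker moves work", the example below uses Python's standard `queue.Queue` as a stand-in for a real message broker. The handler names, command shapes, and URL are assumptions made for illustration. The worker is started only after the response is built so the ordering is easy to see; in a real system it would run concurrently.

```python
import queue
import threading


def send_webhook(cmd, log):
    log.append("webhook -> " + cmd["url"])


def warm_cache(cmd, log):
    log.append("cache -> " + cmd["key"])


# Each command type is routed to exactly one handler.
HANDLERS = {"SendWebhook": send_webhook, "WarmCache": warm_cache}


def worker(commands, log):
    """Pull commands off the queue until a None sentinel arrives."""
    while True:
        cmd = commands.get()
        if cmd is None:
            break
        HANDLERS[cmd["type"]](cmd, log)


def handle_request(commands):
    """Enqueue the follow-up work and return without doing it."""
    commands.put({"type": "SendWebhook", "url": "https://example.com/hook"})
    commands.put({"type": "WarmCache", "key": "route:/users"})
    return {"status": 200}


def demo():
    commands = queue.Queue()
    log = []
    response = handle_request(commands)
    log_at_response = list(log)  # no command has been executed yet

    t = threading.Thread(target=worker, args=(commands, log))
    t.start()
    commands.put(None)  # sentinel: stop after draining the queue
    t.join()
    return response, log_at_response, log
```

The response is produced while the log is still empty. Both commands are executed afterwards, in the order they were enqueued.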
The facts that became events
The second category was not work at all.
It was history.
Some things in the system did not need to become commands. They needed to be recorded as facts.
An event says:
This happened.
In Tensorify, an event could be:
RouteCreated
RequestCompleted
WorkflowFailed
DeploymentFinished
WebhookSent
A RequestCompleted event might be useful to analytics, billing, monitoring, dashboards, audit history, or debugging tools. But none of those consumers should block the original request.
That is where a stream fits.
A stream answers:
What happened, and who needs to know about it?
A stream gives the system a durable timeline. Multiple consumers can read the same event independently. A consumer can fall behind and catch up later. In some systems, events can be replayed to rebuild state or investigate what happened.
That mattered because I did not only need to move work out of the request path. I also needed a cleaner way for the rest of the platform to react to what had already happened.
Commands were for work.
Events were for facts.
That sounds small, but it changed the shape of the system.
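The stream properties described above (a durable timeline, independent consumers, catching up later, replay) can be illustrated with a toy in-memory log. `EventLog` and its methods are hypothetical and stand in for a real streaming system; nothing here is durable across restarts.

```python
class EventLog:
    """Toy append-only log with a separate read offset per consumer."""

    def __init__(self):
        self._events = []
        self._offsets = {}  # consumer name -> index of next unread event

    def append(self, event_type, data):
        """Record a fact. Existing entries are never modified."""
        offset = len(self._events)
        self._events.append({"offset": offset, "type": event_type, "data": data})
        return offset

    def poll(self, consumer):
        """Return events this consumer has not seen yet and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:]
        self._offsets[consumer] = len(self._events)
        return batch

    def replay(self):
        """Return the full history, e.g. to rebuild state or investigate."""
        return list(self._events)


if __name__ == "__main__":
    log = EventLog()
    log.append("RouteCreated", {"route": "/users"})
    log.append("RequestCompleted", {"route": "/users"})
    print([e["type"] for e in log.poll("analytics")])
    log.append("WebhookSent", {"route": "/users"})
    # "billing" has never polled, so it catches up on everything at once.
    print([e["type"] for e in log.poll("billing")])
```

Unlike a command, which is handled once by one worker, the same event is visible to every consumer, and each consumer tracks its own position.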
The result
After the refactor, the response time dropped from roughly 1.5 seconds to around 600 milliseconds.
Seeing that number drop was the moment the refactor stopped feeling theoretical. The architecture change had turned into something the product could actually feel.
The improvement did not come from “using event-driven architecture” as a magic pattern. It came from making the request path smaller.
The system still had to do the work. Webhooks still had to be sent. Metadata still had to be synced. Events still had to be recorded. But those responsibilities no longer sat directly between the incoming request and the outgoing response.
That was the real win.
The user-facing path became more focused:
request → validate → required logic → required state write → response
Everything else had to justify why it deserved to block the user.
The mistake I almost made
One mistake I almost made was treating asynchronous work as a place to hide every slow operation.
Once you see latency drop by moving work out of the path, it becomes tempting to move everything.
That would have been wrong.
Some work can move out of the request path. Some work cannot. If a state write is required for correctness, moving it to a broker does not make the system better. It only makes the system harder to reason about.
For example, if Tensorify needs to persist the route configuration before saying the API is ready, that write belongs in the synchronous path. Moving it to a background worker might reduce the apparent response time, but it would create a correctness problem. The user could receive a successful response before the system is actually ready.
That is not latency optimization.
That is lying to the user.
This is the part of event-driven architecture that I think is easy to miss when learning it from the outside. Async work feels like an escape hatch. Slow thing? Put it in a queue. Expensive thing? Move it to a worker. Blocking thing? Make it event-driven.
But architecture is not about hiding latency. It is about deciding where responsibility belongs.
If the user needs a guarantee before the response, that guarantee has to be real. If the system says “your API is ready,” then the state required to make that true needs to exist already.
The real goal is not to make everything async. The real goal is to decide which work is required for correctness and which work is a side effect.
That distinction matters more than the tool.
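The route-configuration example can be sketched as two contrasting handlers. Both are hypothetical illustrations with invented names, and a dict stands in for the real persistence layer. The only difference is whether the required write happens before the success response.

```python
def create_route_sync(store, name, config):
    """Persist first, then report success: the claim is true when it is made."""
    store[name] = config
    return {"status": "ready", "route": name}


def create_route_deferred(pending, name, config):
    """Queue the write and report success anyway: faster, but the claim is not yet true."""
    pending.append((name, config))
    return {"status": "ready", "route": name}


def is_actually_ready(store, name):
    """What a follow-up call to the new API would observe."""
    return name in store


if __name__ == "__main__":
    store, pending = {}, []

    create_route_sync(store, "/users", {"method": "GET"})
    print(is_actually_ready(store, "/users"))

    create_route_deferred(pending, "/orders", {"method": "GET"})
    print(is_actually_ready(store, "/orders"))
```

Both functions return the same response, but only the synchronous one can back it up. Until a worker drains `pending`, the deferred version has told the user something that is not true.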
Conclusion
I started with a naive question:
Will event-driven architecture reduce latency?
The better answer is:
It depends on why the system is slow.
Not “use Redis.”
Not “use Kafka.”
Not “make everything async.”
The lesson was simpler:
Commands are instructions. Events are facts.
That mental model made the architecture easier to reason about.