
Middleware Pipeline PR: check out the pull request that introduced this architecture.
a `try/except` block there. It was all sprinkled across the methods of our project. It worked, but it was fragile.
We knew we were financing our future with technical debt, and it was time to pay up. We had to find a way to decouple our core feature logic from the operational wiring that held it all together. This is the story of how we did just that by building a middleware pipeline from the ground up.
From Tangled Logic to a Clean Pipeline
The breakthrough came when we stopped thinking about a request as a single, monolithic function call and started thinking of it as a message flowing through a pipeline. Each stage in the pipeline gets a chance to inspect the message, add data to it, change it, or even stop it dead in its tracks. Then, after the core work is done, the response flows back through those same stages in reverse. This “middleware” pattern isn’t new, but applying it to our system was game-changing. It let us untangle the operational logic (e.g., “how long did this take?”) from the business logic (e.g., “call this tool”). We could finally build these concerns as independent, reusable, and testable components.

Our design goals were born from past pain:

- No, you don’t have to rewrite your code: The new system had to be a drop-in. Developers using our existing `ClientSession` shouldn’t have to change a single line of their code.
- Make the right thing easy (and type-safe): Writing new middleware should be straightforward, with full auto-complete and static analysis support. No more guessing what’s in the `context` object.
- Be both specific and general: We needed a way to write middleware that runs on every single request, but also middleware that only targets a specific method like `tools/call`.
The Core Components
Our architecture boils down to four key components that work in concert.

1. The `MiddlewareContext`: Our Universal Passport

For a message to travel through the pipeline, it needs a passport, a standard document carrying all its vital info. That’s our `MiddlewareContext`. It’s a generic dataclass that holds a request’s unique ID, the RPC method name, and, crucially, its strongly-typed parameters.
By making the context `Generic[T]`, we give the type checker the information it needs. If a middleware gets a context where `method == "tools/call"`, a developer knows that `context.params` is a `CallToolRequestParams` object. This has saved us from countless `AttributeError` bugs at runtime.
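A minimal sketch of what such a context can look like. The field names here (`request_id`, `started_at`, `metadata`) are illustrative, not the exact fields from our codebase:

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Generic, TypeVar

T = TypeVar("T")  # the strongly-typed params for one specific RPC method


@dataclass
class MiddlewareContext(Generic[T]):
    """The 'passport' carried by one request through the pipeline (sketch)."""

    method: str  # JSON-RPC method name, e.g. "tools/call"
    params: T    # typed parameters; T narrows per method
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    started_at: float = field(default_factory=time.monotonic)
    # Scratch space where middleware can stash data for later stages.
    metadata: dict = field(default_factory=dict)
```

Because `params` is `T`, a `MiddlewareContext[CallToolRequestParams]` gives the type checker enough to flag a bad attribute access before it ever runs.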
2. The `Middleware` Base Class: A Template for Behavior

This is the contract for anyone wanting to plug into the pipeline. Instead of forcing developers into a massive `if/elif/else` chain to check the request type, we use a bit of dispatch magic to route the context to the right handler. Want to run logic on every request? Implement `on_request`. Need to add a cache check only for `read_resource` calls? Just implement `on_read_resource`. The base class handles the routing; you just write the logic.
3. The `MiddlewareManager`: The Pipeline Conductor

This is the brain of the operation. It holds the list of registered middleware and stitches the pipeline together for each request. Its `process_request` method uses a neat bit of functional composition to wrap the original function call, layer by layer. The whole chain runs inside a `try...finally` block. This is how we ensure that we capture timing, results, and errors for every single request in a structured way. Centralized error handling was a massive win for our on-call engineers.
4. The `CallbackClientSession`: The Invisible Adapter

This was the secret sauce for our “zero-intrusion” goal. How do you rewire a factory without the workers noticing? You build an adapter that looks and feels exactly like the old tool.

The `CallbackClientSession` is a wrapper around the original `ClientSession`. It exposes the exact same methods (`call_tool`, `read_resource`), so from the user’s perspective, nothing has changed. But on the inside, it’s not executing the call. It’s packaging the arguments into a `MiddlewareContext` and handing it off to the `MiddlewareManager`.