ASP.NET Core Request Timeout: Enterprise Decision Guide

Every production API eventually encounters an endpoint that takes too long. A downstream service hangs, a database query blocks, or a third-party call stalls — and suddenly your thread pool is exhausted, your response times spike, and your SLA is in flames. How your team handles ASP.NET Core request timeout strategy at the architecture level is what separates brittle services from resilient ones.

There are three distinct mechanisms available in the .NET ecosystem for controlling how long a request is allowed to run: the built-in RequestTimeouts middleware (introduced in ASP.NET Core 8), propagating a scoped CancellationToken through every I/O call, and Polly's resilience pipeline with a dedicated timeout strategy. Each operates at a different layer, enforces different guarantees, and carries different operational trade-offs. Choosing the wrong one — or ignoring the question entirely — is one of the most common gaps in enterprise API design. The complete patterns and a working implementation are available on Patreon — with production-ready source code that maps directly to what enterprise teams actually ship.

Understanding how these mechanisms fit together is also a core part of building resilient ASP.NET Core APIs at scale. Chapter 10 of the Zero to Production course covers Polly resilience pipelines alongside rate limiting — walking through retry, circuit breaker, and timeout policies inside a complete production API codebase where the interactions between strategies are already wired together.

What Problem Are We Actually Solving?

Before comparing mechanisms, it's worth being precise about the failure modes timeout strategy is meant to address.

Slow downstream calls. A database query that normally takes 20ms suddenly takes 45 seconds. Without a timeout, the request holds its database connection (and, in synchronous code, a blocked thread) for the full duration, degrading throughput for every other request.

Client abandonment without cleanup. A mobile client drops its connection after 5 seconds of waiting, but the server continues processing — consuming CPU, memory, and database connections for a response nobody will ever receive.

Cascading saturation. A single slow upstream dependency causes requests to queue up, HTTP connection pool slots to exhaust, and eventually the entire service to stop accepting new requests. This is how a missing timeout on one endpoint becomes an outage across the service.

The right timeout strategy depends on which of these failure modes you're primarily protecting against — and that distinction drives the choice between middleware, token propagation, and Polly.

The Three Mechanisms Compared

RequestTimeouts Middleware (.NET 8+)

ASP.NET Core 8 introduced a first-class RequestTimeouts middleware via Microsoft.AspNetCore.Http.Timeouts. It integrates directly into the middleware pipeline and enforces per-endpoint or global timeout policies by signalling HttpContext.RequestAborted when the configured duration elapses.

What it does well:

Declarative, zero-boilerplate timeout enforcement per endpoint or route group
Works at the HTTP pipeline level — no changes required deep inside service or repository code
Policy-based: you define named timeout policies once and apply them via [RequestTimeout("policy-name")] or WithRequestTimeout(...) on endpoint groups
Integrates naturally with the existing CancellationToken flow — when a timeout fires, HttpContext.RequestAborted is already cancelled, which propagates to any awaited I/O that respects it

What it does not do:

It does not forcibly abort the request — it signals cancellation and expects the application code to honour it. If your EF Core queries, HttpClient calls, or service methods don't accept or check a CancellationToken, the middleware fires but nothing actually stops
It does not provide retry logic, circuit breaking, or fallback behaviour — it's purely a timeout boundary, not a resilience strategy
It's an ASP.NET Core 8+ feature. Teams on .NET 6/7 cannot use it without upgrading

When to reach for it: Use RequestTimeouts middleware when you need a declarative, pipeline-level boundary on how long any given endpoint is allowed to run. It's the correct first line of defence for API surface-level timeouts — the outer ring of your timeout strategy.
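As a minimal sketch of the policy model described above, the following Program.cs registers a default policy and one named policy, then applies the named policy to a single endpoint. The policy name "reporting", the route, and the durations are illustrative assumptions rather than values from a specific codebase.

```csharp
using Microsoft.AspNetCore.Http.Timeouts;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRequestTimeouts(options =>
{
    // Applies to every endpoint that does not name a policy of its own.
    options.DefaultPolicy = new RequestTimeoutPolicy
    {
        Timeout = TimeSpan.FromSeconds(30)
    };

    // Hypothetical named policy for long-running report endpoints.
    options.AddPolicy("reporting", TimeSpan.FromSeconds(120));
});

var app = builder.Build();

app.UseRequestTimeouts();

// The CancellationToken parameter is bound to HttpContext.RequestAborted,
// which is cancelled when the "reporting" policy elapses or the client disconnects.
app.MapGet("/reports/{id:int}", async (int id, CancellationToken cancellationToken) =>
{
    // Stand-in for real I/O; it stops early only because it observes the token.
    await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
    return Results.Ok(new { id });
})
.WithRequestTimeout("reporting");

app.Run();
```

Note that the handler only benefits from the policy because it passes the token to the awaited call; removing the token from Task.Delay would let the work run to completion regardless of the timeout.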

CancellationToken Propagation

A CancellationToken passed through every I/O call — EF Core queries, HttpClient requests, Redis calls, external SDK methods — is not itself a timeout mechanism. It's the propagation layer that makes timeout enforcement actually work. If you're building a robust understanding of CancellationToken usage in enterprise ASP.NET Core APIs, that context is worth reading alongside this guide.

The critical distinction: middleware and Polly both signal cancellation. But unless your code is passing that token downstream and your I/O calls are watching it, the signal goes nowhere.

// Every EF Core query should respect the token passed from the controller
var products = await _context.Products
    .AsNoTracking()
    .Where(p => p.IsActive)
    .ToListAsync(cancellationToken);

This is the pattern. Simple, but teams frequently skip it — especially in older codebases where tokens were retrofitted as an afterthought.

The two sources of cancellation your token should handle:

HttpContext.RequestAborted — fired when the client disconnects or when RequestTimeouts middleware elapses. This is the source to use in controllers and service methods called from the HTTP pipeline.
A linked token from Polly — when Polly's timeout strategy fires, it cancels a linked source. If your code has been given Polly's token (usually through ResilienceContext), it will participate in Polly-level timeouts too.

The discipline required: Consistent token propagation is a codebase-wide convention, not a one-time configuration. Teams need a clear rule: every method that performs I/O accepts a CancellationToken parameter and passes it to every call it makes. Without this discipline, RequestTimeouts middleware and Polly are both partially blind.
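The convention can be sketched for an outbound HTTP call as follows. CatalogService, Product, and the relative URL are hypothetical names used only for illustration; the point is that the token arrives as a parameter and is handed to every I/O call the method makes.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical DTO for illustration.
public sealed record Product(int Id, string Name);

// Hypothetical service; the HttpClient is assumed to have a BaseAddress configured.
public sealed class CatalogService
{
    private readonly HttpClient _http;

    public CatalogService(HttpClient http) => _http = http;

    // The token is a required parameter, so callers cannot forget to supply one.
    public async Task<IReadOnlyList<Product>> GetActiveProductsAsync(
        CancellationToken cancellationToken)
    {
        // Passing the token lets a timeout or client disconnect abort the HTTP call.
        var products = await _http.GetFromJsonAsync<List<Product>>(
            "products?active=true", cancellationToken);

        return products ?? new List<Product>();
    }
}
```

Making the parameter required (no default value) is one way to enforce the rule at compile time; whether that is practical depends on how much existing code would need to change.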

Polly Timeout Strategy

Polly's timeout strategy (Polly v8, surfaced for HttpClient through the Microsoft.Extensions.Http.Resilience package) operates at the outbound call level: the point where your code calls a dependency, not where a request enters your API. This makes it the right tool for protecting against slow downstream services, not slow endpoints in general.

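builder.Services.AddHttpClient<ICatalogClient, CatalogClient>()
    .AddStandardResilienceHandler(options =>
    {
        // Per-attempt limit: each outbound try is cancelled after 10 seconds.
        // The overall limit across retries is options.TotalRequestTimeout.
        options.AttemptTimeout.Timeout = TimeSpan.FromSeconds(10);
    });

AddStandardResilienceHandler() attaches a pre-configured Polly pipeline to the HttpClient being registered (here, a typed client). The pipeline bundles a rate limiter, a total request timeout, retry, a circuit breaker, and a per-attempt timeout. The per-attempt timeout fires when a single outbound HTTP attempt exceeds the configured duration, independent of the inbound request timeout at the API surface.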

What Polly adds that middleware cannot:

Retry with backoff and jitter — if a downstream call times out, Polly can retry with exponential backoff before surfacing the failure
Circuit breaker — after a threshold of timeouts or failures, Polly opens the circuit and fast-fails subsequent calls without hitting the dependency at all
Hedging (Polly v8+) — dispatch a second parallel attempt if the first hasn't returned within a threshold, taking whichever completes first
Composable pipelines — stack timeout + retry + circuit breaker as a single, tested resilience unit per dependency

What Polly cannot do alone: Polly's timeout operates on the outbound leg only. It does not enforce an overall budget for the entire inbound request. A request that makes three external calls — each individually within Polly's per-call timeout — can still run for far longer than your endpoint-level SLA allows. You need RequestTimeouts middleware to enforce the overall budget.
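For dependencies that are not reached through HttpClient, a comparable pipeline can be built directly with Polly v8. The sketch below assumes the Polly NuGet package is referenced; the 10-second attempt timeout, two retries, 1-second base delay, and the simulated dependency call are illustrative assumptions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Retry;
using Polly.Timeout;

// Retry is added first, so it wraps the timeout: each attempt gets its own 10-second limit.
ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        // Retry only when an attempt was cut off by the timeout strategy.
        ShouldHandle = new PredicateBuilder().Handle<TimeoutRejectedException>(),
        MaxRetryAttempts = 2,
        BackoffType = DelayBackoffType.Exponential,
        UseJitter = true,
        Delay = TimeSpan.FromSeconds(1),
    })
    .AddTimeout(TimeSpan.FromSeconds(10))
    .Build();

// Stand-in for HttpContext.RequestAborted inside a real endpoint.
using var requestCts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

string result = await pipeline.ExecuteAsync(
    async token =>
    {
        // 'token' is supplied by Polly: it is cancelled when the attempt times out
        // or when the outer request token is cancelled.
        await Task.Delay(TimeSpan.FromMilliseconds(50), token); // hypothetical dependency call
        return "ok";
    },
    requestCts.Token);

Console.WriteLine(result);
```

The callback must use the token Polly passes in, not the outer one; otherwise the attempt timeout is signalled but the work it is supposed to stop never observes it.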

The Layered Model: How They Work Together

The most robust enterprise approach treats these three mechanisms as complementary layers rather than alternatives:

| Layer | Mechanism | What It Enforces |
| --- | --- | --- |
| Inbound request budget | `RequestTimeouts` middleware | Overall time allowed for an endpoint to respond |
| Outbound call resilience | Polly timeout + retry + circuit breaker | Time allowed for each downstream call, with fallback |
| I/O participation | `CancellationToken` propagation | Ensures all I/O respects the cancellation signals above |

A practical layered policy for a standard enterprise API endpoint:

RequestTimeouts policy: 30 seconds — the absolute maximum the endpoint is allowed to run before returning a 504 to the client
Polly HttpClient timeout: 10 seconds per outbound call, with 2 retries using exponential backoff
EF Core queries: all accept cancellationToken — so if the 30-second budget is hit, in-flight queries are cancelled cleanly

This composition means slow outbound calls get retried and circuit-broken. If retries still don't resolve the issue within the overall budget, the middleware signals cancellation and the API returns 504. No thread is held indefinitely.
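It helps to check how the outbound worst case compares with the inbound budget. The arithmetic below uses the figures from the policy above (10 seconds per attempt, 2 retries, 30-second inbound budget) and assumes a 1-second base delay that doubles on each retry, with jitter ignored; the base delay is an assumption, not a value from the policy.

```csharp
using System;

var perAttemptTimeout = TimeSpan.FromSeconds(10);
var inboundBudget = TimeSpan.FromSeconds(30);
const int retries = 2;
var baseDelay = TimeSpan.FromSeconds(1); // assumed backoff base, jitter ignored

var worstCase = TimeSpan.Zero;
for (var attempt = 0; attempt <= retries; attempt++)
{
    // Every attempt can run up to the per-attempt timeout.
    worstCase += perAttemptTimeout;

    // A backoff delay follows each attempt except the last: 1s, then 2s.
    if (attempt < retries)
    {
        worstCase += baseDelay * Math.Pow(2, attempt);
    }
}

Console.WriteLine($"{worstCase.TotalSeconds}s");  // prints "33s"
Console.WriteLine(worstCase > inboundBudget);     // prints "True"
```

Under these assumptions a single fully failing dependency call would need 33 seconds, so the 30-second inbound budget fires before the last attempt can finish. That is the intended safety net, but it also suggests sizing per-attempt timeouts and retry counts so that the common failure path completes inside the inbound budget.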

When to Use Each Strategy

Use RequestTimeouts Middleware When:

You need a declarative, configuration-driven timeout boundary per endpoint or route group
You're running .NET 8 or later
The primary concern is enforcing an overall SLA for the HTTP response, regardless of what's happening internally
You want audit-friendly, centralised timeout policy names ("standard-api", "file-upload", "reporting") rather than scattered CancellationTokenSource boilerplate

Use CancellationToken Propagation When:

Always — this is not an alternative, it's a prerequisite for the others to work
Retrofitting timeout support into an existing codebase: start by threading the token into EF Core queries and HttpClient calls before adding middleware or Polly
Supporting client disconnect detection: HttpContext.RequestAborted naturally fires when the client drops the connection, not just on timeouts — valuable for expensive reporting endpoints

Use Polly Timeout + Resilience Pipeline When:

You have outbound HttpClient calls to external or internal services
You need retry logic, circuit breaking, or hedging alongside timeout enforcement
You're building a microservices topology where your API depends on other services and needs per-dependency resilience budgets
You're targeting .NET 6 or 7 and cannot use the RequestTimeouts middleware — Polly covers you at the outbound layer

Use All Three When:

You're building a production enterprise API that calls external services and needs both inbound SLA enforcement and outbound resilience
You have multiple downstream dependencies with different latency profiles (database fast, third-party payment slow, ML inference variable) — each gets its own Polly policy, but all operate within the shared inbound budget

Anti-Patterns to Avoid

Timeout without token propagation. Adding RequestTimeouts middleware and calling it done, without ensuring EF Core, HttpClient, and other I/O actually receive and honour the CancellationToken. The middleware fires — but nothing stops.

Per-dependency Polly timeouts without a global budget. Three downstream calls each with a 10-second Polly timeout, but no inbound request timeout. A sufficiently slow sequence of calls can keep a single request running for 30+ seconds (longer once retries are counted) with nothing enforcing an overall limit.

Catching OperationCanceledException and continuing. When a timeout fires, it surfaces as OperationCanceledException. Catching it and continuing the request — rather than short-circuiting and returning 504 — defeats the purpose entirely and risks sending a partial or inconsistent response.

Using Thread.Sleep or blocking sync calls inside async endpoints. No timeout mechanism can cancel a synchronously blocked thread. Ensure all I/O is genuinely async before relying on cancellation-based timeout enforcement.

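Testing timeouts with a debugger attached. The RequestTimeouts middleware does not trigger while a debugger is attached, which matches the behaviour of Kestrel's own timeouts. A timeout that never fires during a debugging session therefore tells you nothing about whether it is configured correctly; test timeouts by running the app without the debugger. Separately, teams sometimes add environment-specific [DisableRequestTimeout] attributes inconsistently, creating a gap where timeouts are enforced in staging but not in production, or vice versa.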

Decision Matrix

| Scenario | Recommended Approach |
| --- | --- |
| Enforce SLA on all HTTP endpoints | `RequestTimeouts` middleware with named policies |
| Protect outbound `HttpClient` calls | Polly `AddStandardResilienceHandler` or custom pipeline |
| Support client disconnect detection | `HttpContext.RequestAborted` via `CancellationToken` propagation |
| .NET 6 / .NET 7 (no RequestTimeouts) | Polly + `CancellationToken` propagation |
| Reporting/export endpoints needing longer budgets | Named timeout policy with per-endpoint override |
| All enterprise APIs in production | All three layers composed |

How Does This Map to the Course?

The full implementation — RequestTimeouts middleware with policy naming, Polly AddStandardResilienceHandler wired to a typed HttpClient, and EF Core query cancellation — is covered in Chapter 10 of the Zero to Production course. The chapter shows each strategy applied to the same production API, so you can see how the timeout budget, Polly pipeline, and CancellationToken interact rather than studying them in isolation.

💻 A minimal but complete working example — with endpoint-level policies, cancellation-token handling, and Swagger testing — is available on GitHub: dotnet-request-timeout-middleware


FAQ

What is the difference between RequestTimeouts middleware and Polly timeout in ASP.NET Core?
RequestTimeouts middleware enforces an inbound request budget — how long an endpoint is allowed to run before returning a timeout response to the client. Polly timeout operates at the outbound call level — how long a single HttpClient call or dependency invocation is allowed to take before Polly steps in. They protect different boundaries and are best used together.

Does RequestTimeouts middleware automatically abort the HTTP request when the timeout fires?
No. When the configured timeout elapses, the middleware signals cancellation via HttpContext.RequestAborted. It does not forcibly terminate the thread or abort the TCP connection. The application code must observe and respect the CancellationToken — if I/O calls don't accept a token, the timeout fires silently without actually stopping anything.

Should I use HttpContext.RequestAborted or create a separate CancellationTokenSource for timeouts?
For inbound request timeouts, use HttpContext.RequestAborted — it's the canonical token that fires on both client disconnect and RequestTimeouts middleware expiry. Create a separate linked CancellationTokenSource only when you need a shorter budget than the overall request timeout, for example a specific expensive query that should be constrained tighter than the endpoint-level policy.
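A linked source for a tighter budget might look like the sketch below. The 30-second outer source stands in for HttpContext.RequestAborted, and the 200-millisecond budget and the delayed task are placeholders for a real query; all three are assumptions for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for HttpContext.RequestAborted inside a real endpoint.
using var requestCts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

// The linked source is cancelled when EITHER the request token fires
// OR its own, shorter budget elapses.
using var queryCts = CancellationTokenSource.CreateLinkedTokenSource(requestCts.Token);
queryCts.CancelAfter(TimeSpan.FromMilliseconds(200));

var queryTimedOut = false;
try
{
    // Placeholder for an expensive query that accepts a token.
    await Task.Delay(TimeSpan.FromSeconds(5), queryCts.Token);
}
catch (OperationCanceledException) when (!requestCts.IsCancellationRequested)
{
    // Only the tighter budget elapsed; the overall request is still live,
    // so the caller can choose a fallback instead of failing the request.
    queryTimedOut = true;
}

Console.WriteLine($"Query timed out early: {queryTimedOut}");
```

The exception filter distinguishes the two causes: if the outer request token was cancelled, the exception is left to propagate so the request ends rather than continuing with work nobody is waiting for.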

Is CancellationToken propagation mandatory for timeout strategies to work?
Functionally, yes. Both RequestTimeouts middleware and Polly's timeout strategy signal cancellation — but if your EF Core queries, Redis calls, HttpClient invocations, and service methods don't accept and pass through a CancellationToken, the signal goes unobserved. The I/O continues running even after the timeout fires.

What HTTP status code should ASP.NET Core return when a request timeout fires?
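HTTP 504 Gateway Timeout is the conventional choice. RFC 9110 (which obsoletes RFC 7231) defines 504 for a gateway or proxy that did not receive a timely response from an upstream server, and the RequestTimeouts middleware adopts it as its default: if the app does not handle the cancellation and produce its own response, the middleware returns 504. To change the status code or body, set TimeoutStatusCode and WriteTimeoutResponse on the policy. For Polly-level timeouts on outbound calls, the calling code should catch TimeoutRejectedException (or OperationCanceledException when the caller's own token was cancelled) and map it to an appropriate 504 or 503 response depending on your retry and circuit breaker state.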
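A policy with an explicit status code and response body could be sketched as follows. The route, the message text, and the 30-second duration are illustrative assumptions.

```csharp
using Microsoft.AspNetCore.Http.Timeouts;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRequestTimeouts(options =>
{
    options.DefaultPolicy = new RequestTimeoutPolicy
    {
        Timeout = TimeSpan.FromSeconds(30),

        // Status code used when the timeout fires and no response has started.
        TimeoutStatusCode = StatusCodes.Status504GatewayTimeout,

        // Optional body written alongside the status code.
        WriteTimeoutResponse = async context =>
        {
            context.Response.ContentType = "text/plain";
            await context.Response.WriteAsync("The request exceeded its time budget.");
        },
    };
});

var app = builder.Build();

app.UseRequestTimeouts();

// Hypothetical slow endpoint: the delay observes the token, so the timeout
// surfaces as a cancellation that the middleware turns into the 504 above.
app.MapGet("/slow", async (CancellationToken cancellationToken) =>
{
    await Task.Delay(TimeSpan.FromMinutes(1), cancellationToken);
    return Results.Ok();
});

app.Run();
```

If the endpoint has already started writing its response when the timeout fires, the status code can no longer be changed, which is one more reason to observe the token early in long-running handlers.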

Can I configure different timeout durations per endpoint using RequestTimeouts middleware?
Yes. Named timeout policies allow per-endpoint or per-route-group configuration. Define policies with different durations in AddRequestTimeouts() (for example "fast-read" at 5 seconds, "standard-api" at 30 seconds, "file-upload" at 120 seconds) and apply them with [RequestTimeout("policy-name")] on controllers and actions, or with WithRequestTimeout("policy-name") on minimal API endpoints and route groups. This is one of the primary advantages of middleware-based timeouts over scattered CancellationTokenSource boilerplate.

Does Polly's AddStandardResilienceHandler include a timeout strategy by default?
Yes. AddStandardResilienceHandler() in Microsoft.Extensions.Http.Resilience configures a default pipeline made up of a rate limiter, a total request timeout, retry with exponential backoff, a circuit breaker, and a per-attempt timeout. The default total timeout is 30 seconds with a 10-second per-attempt timeout. These defaults are overridable via the HttpStandardResilienceOptions configuration object (TotalRequestTimeout and AttemptTimeout respectively).

Want to explore the full implementation? The annotated, production-ready source with all three layers wired together is on Patreon. For a guided walkthrough inside a complete ASP.NET Core API, Chapter 10 of the Zero to Production course covers everything in context.
