ASP.NET Core Slow First Request: Root Cause and Fix

Every .NET team eventually runs into it: the application is fast during load testing, but the first real user request after a deployment - or after a quiet overnight period - takes five, ten, sometimes thirty seconds. Support tickets arrive. The team assumes the server is overloaded. But the real cause is almost always something else.

Slow first requests in ASP.NET Core production environments are caused by a predictable set of root causes, and each one has a clear fix. If you want to go deeper with annotated, production-ready source code that maps to what enterprise teams actually ship, Patreon has full implementations - including warm-up patterns, hosted service startup hooks, and diagnostics you can run immediately.

The worst part of this problem is not the latency itself - it is the unpredictability. A request that normally completes in 50ms suddenly takes 8,000ms, the health check misses its window, and Kubernetes restarts the pod. That restart triggers another cold start. The cycle repeats.

This article covers every root cause behind cold start latency in ASP.NET Core, how to diagnose which one is hitting your system, and the specific fix for each. Chapter 12 of the ASP.NET Core Web API: Zero to Production course handles IHostedService startup sequencing, background service initialization, and the Outbox processor warm-up inside a complete production codebase, which can make this pattern click immediately.

What Makes the First Request Slow?

"First request" latency is not one problem. It is five different problems that happen to share the same symptom. Treating them all the same leads to fixes that solve nothing.

The root causes fall into these categories:

JIT compilation delay - .NET's Just-In-Time compiler hasn't seen this code path yet
Dependency injection graph instantiation - Singleton services get created on first access, not at startup
External warm-up dependencies - Database connection pools, external HTTP clients, caches, and configuration providers that initialise lazily
Middleware pipeline evaluation - Some middleware components build internal data structures on first invocation
Application Idle / IIS App Pool recycling - IIS shuts down idle processes; the next request restarts from zero

Understanding which category is hitting your system changes the diagnostic approach entirely.

Why the JIT Compiler Delays the First Request

.NET compiles IL (Intermediate Language) to native machine code at runtime, on first use. Every method in your application is compiled the first time it is called. In a large enterprise API with hundreds of controllers, services, validators, and middleware components, the cumulative JIT cost can add hundreds of milliseconds - or more - to the first request.

This is distinct from C++ or Go, which compile to native code ahead of time. The tradeoff is faster iterative development and cross-platform portability, but slower cold starts.

How to Diagnose JIT Delay

Add a log timestamp at the very start of your first controller action and compare it against the request arrival time logged by middleware earlier in the pipeline. If the gap between the middleware timestamp and the first controller log is measured in seconds, JIT is likely involved.
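Log gaps only suggest JIT involvement. The runtime also exposes its own JIT counters, which give a more direct signal. The following is a minimal sketch, assuming .NET 6 or later where System.Runtime.JitInfo is available; JitDelta and JitProbe are illustrative names, not part of the article's codebase.

```csharp
using System;
using System.Runtime;
using System.Threading.Tasks;

// Illustrative helper (hypothetical names): reports how much JIT work
// happened in the process while a piece of code ran. Requires .NET 6+.
public readonly record struct JitDelta(long Methods, TimeSpan Time);

public static class JitProbe
{
    public static async Task<JitDelta> MeasureAsync(Func<Task> work)
    {
        // Both counters are process-wide and cumulative since startup.
        long methodsBefore = JitInfo.GetCompiledMethodCount();
        TimeSpan timeBefore = JitInfo.GetCompilationTime();

        await work();

        return new JitDelta(
            JitInfo.GetCompiledMethodCount() - methodsBefore,
            JitInfo.GetCompilationTime() - timeBefore);
    }
}
```

One way to use it is to wrap next(context) in an inline middleware and log the delta for the first request and for a later one. A large method count and compilation time on the first request, and near zero afterwards, points at JIT. Because the counters are process-wide, concurrent requests and background recompilation also contribute, so treat the numbers as indicative rather than exact.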

The more definitive test is to enable ReadyToRun compilation:

<PublishReadyToRun>true</PublishReadyToRun>

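Publish with ReadyToRun enabled (this requires a runtime identifier at publish time, for example dotnet publish -c Release -r linux-x64) and measure startup latency again. ReadyToRun compiles your assemblies' IL to native code during publish. The runtime uses that code directly and falls back to JIT only for code that could not be precompiled, such as some generic instantiations. For most enterprise APIs, this alone can cut cold-start JIT cost by roughly 30 - 60%.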

The Fix: ReadyToRun + Tiered Compilation

In your .csproj:

<PublishReadyToRun>true</PublishReadyToRun>
<TieredCompilation>true</TieredCompilation>

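Tiered compilation has been on by default since .NET Core 3.0, so the second line only makes the setting explicit. With it, the JIT produces quick, lightly optimised code the first time a method is called (Tier 0) and recompiles hot methods with full optimisation in the background (Tier 1) without blocking the request. ReadyToRun code is handled the same way: it runs immediately and hot methods are re-jitted at Tier 1 later. This means the first request pays a smaller upfront cost, and subsequent requests get the fully optimised version.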

For scenarios where absolute first-request latency matters - serverless, burst-traffic APIs - also consider Native AOT (Ahead-Of-Time compilation), which compiles everything ahead of deployment. The tradeoff is significantly longer publish times and incompatibility with reflection-heavy libraries. For most long-running API hosts, ReadyToRun with Tiered Compilation is the right choice.

Why Dependency Injection Causes Cold Starts

Singleton services in ASP.NET Core are created on first resolution, not when the container is built. That means the first request that touches a large singleton - one that wraps a caching layer, an HTTP client pool, a database context factory, or an external SDK - pays the full construction cost.

In a straightforward application, this is milliseconds. In an enterprise API with deep service graphs, each singleton may instantiate dozens of transitive dependencies. The cumulative construction cost is often the dominant factor in first-request latency.
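The deferred construction described above is easy to confirm in isolation. This is a small sketch using Microsoft.Extensions.DependencyInjection outside of a web host; HeavySingleton and LazySingletonDemo are hypothetical stand-ins for a real expensive service.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-in for an expensive singleton.
public sealed class HeavySingleton
{
    public static int Constructed; // counts constructor calls for the demo
    public HeavySingleton() => Constructed++;
}

public static class LazySingletonDemo
{
    public static (int AfterBuild, int AfterResolve) Run()
    {
        HeavySingleton.Constructed = 0;

        var services = new ServiceCollection();
        services.AddSingleton<HeavySingleton>();

        // Building the provider registers the service but does not create it.
        using var provider = services.BuildServiceProvider();
        int afterBuild = HeavySingleton.Constructed;

        // The first resolution constructs the instance; the second reuses it.
        _ = provider.GetRequiredService<HeavySingleton>();
        _ = provider.GetRequiredService<HeavySingleton>();
        int afterResolve = HeavySingleton.Constructed;

        return (afterBuild, afterResolve);
    }
}
```

Run() returns (0, 1): nothing is constructed when the provider is built, and exactly one instance is created on first resolution. In a web application that first resolution usually happens inside the first request that needs the service.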

How to Diagnose DI Graph Instantiation Delay

Time builder.Build() in Program.cs to check whether construction cost is paid at startup or deferred to the first request:

// Needs: using System.Diagnostics; Stopwatch.GetElapsedTime requires .NET 7 or later.
var startupStart = Stopwatch.GetTimestamp();
var app = builder.Build();
var elapsed = Stopwatch.GetElapsedTime(startupStart);
app.Logger.LogInformation("DI container built in {ElapsedMs}ms", elapsed.TotalMilliseconds);

If Build() completes fast but the first request is slow, construction is deferred. If Build() itself is slow, you have a startup-time DI issue - a different problem.

The Fix: Force Singleton Construction at Startup

Use IServiceProvider.GetRequiredService<T>() on your critical singletons immediately after Build():

var app = builder.Build();

// Force construction before first request
_ = app.Services.GetRequiredService<IMyHeavySingleton>();
_ = app.Services.GetRequiredService<IExternalClientPool>();

This is deliberately simple and effective. The construction cost is paid only once either way - the problem is that it lands on a user request if you don't control when it happens.

For a more structured approach, implement IHostedService and resolve your critical services inside StartAsync:

public class WarmupService : IHostedService
{
    private readonly IServiceProvider _services;

    public WarmupService(IServiceProvider services) => _services = services;

    public Task StartAsync(CancellationToken cancellationToken)
    {
        _ = _services.GetRequiredService<IMyHeavySingleton>();
        return Task.CompletedTask;
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}

builder.Services.AddHostedService<WarmupService>();

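By default, the host awaits StartAsync on each hosted service, in registration order, before the server begins accepting requests. This guarantees your singletons are warm before any user traffic lands. Keep the work in StartAsync bounded, because a slow warm-up delays application startup by the same amount.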

Why Database and External Service Connections Delay the First Request

Connection pools - whether ADO.NET, EF Core, Redis, or HTTP client pools - do not pre-establish connections unless you explicitly tell them to. The first request that touches the database creates the first connection from scratch, including TCP handshake, TLS negotiation, authentication, and pool expansion.

On a database with strict TLS and certificate validation (standard in production), this can add 500ms - 2,000ms to the first request. On a Redis instance behind a VNet, the penalty is similar.

How to Diagnose Connection Pool Cold Start

Use dotnet-counters to observe the connection pool counters. This assumes Microsoft.Data.SqlClient, which publishes them through the Microsoft.Data.SqlClient.EventSource provider:

dotnet-counters monitor --process-id <pid> --counters Microsoft.Data.SqlClient.EventSource

Look for the number-of-active-connection-pools and active-hard-connections counters. If they jump from 0 to 1 exactly when your slow first request occurs, the pool cold start is the cause.

The Fix: Pre-Warm Connections in a Hosted Service

A hosted service that opens and immediately closes a connection forces the pool to establish its first connection during startup, not during user traffic:

public class DatabaseWarmupService : IHostedService
{
    private readonly IDbContextFactory<AppDbContext> _contextFactory;

    public DatabaseWarmupService(IDbContextFactory<AppDbContext> factory)
        => _contextFactory = factory;

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        await using var context = await _contextFactory.CreateDbContextAsync(cancellationToken);
        _ = await context.Database.CanConnectAsync(cancellationToken);
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}

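CanConnectAsync() opens a real connection and verifies the database is reachable. It returns false rather than throwing when the database cannot be reached, so log the result if you want failures to be visible. With pooling enabled, the provider returns the connection to the pool after the check, so the first user request reuses it instead of paying for the TCP handshake, TLS negotiation, and authentication.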

Apply the same approach to Redis and typed HttpClient instances - make one connection during startup to force pool initialisation.
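For an HttpClient created through IHttpClientFactory, the same idea could look like the sketch below. The client name "payments" and the "/health" path are hypothetical, and the code assumes the named client was registered with a BaseAddress; adapt both to your own downstream service.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

// Sketch: makes one cheap call during startup so the named client's
// handler has an established connection before user traffic arrives.
public sealed class HttpClientWarmupService : IHostedService
{
    private readonly IHttpClientFactory _factory;
    private readonly ILogger<HttpClientWarmupService> _logger;

    public HttpClientWarmupService(
        IHttpClientFactory factory, ILogger<HttpClientWarmupService> logger)
    {
        _factory = factory;
        _logger = logger;
    }

    public async Task StartAsync(CancellationToken cancellationToken)
    {
        // Bound the wait so a slow downstream cannot stall startup indefinitely.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(TimeSpan.FromSeconds(5));

        try
        {
            var client = _factory.CreateClient("payments"); // hypothetical name
            using var request = new HttpRequestMessage(HttpMethod.Head, "/health");
            using var response = await client.SendAsync(request, cts.Token);
            _logger.LogInformation("HTTP warm-up returned {StatusCode}", (int)response.StatusCode);
        }
        catch (Exception ex) when (
            (ex is HttpRequestException or OperationCanceledException)
            && !cancellationToken.IsCancellationRequested)
        {
            // Warm-up is best effort: log and let the application start anyway.
            _logger.LogWarning(ex, "HTTP warm-up failed; the first request will pay the connection cost");
        }
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}
```

Register it with builder.Services.AddHostedService<HttpClientWarmupService>(). The design choice here is to treat warm-up as best effort, so a temporarily unavailable dependency does not block startup. Note that IHttpClientFactory rotates handlers (every two minutes by default), so the warmed connection mainly helps requests that arrive shortly after startup.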

Why IIS and App Pool Recycling Causes Idle-Period Slowness

If your production environment uses IIS (on-premises or Azure App Service on Windows), App Pool idle shutdown and recycling is a separate and common source of slow-first-request behaviour. By default, IIS shuts down an application pool's worker process after 20 minutes of idle time, and separately recycles it every 1,740 minutes (29 hours). In both cases the next request pays for a full application start.

This is completely different from JIT or DI cold start - the process is entirely dead and must be relaunched. The slow request is paying for OS process creation, .NET runtime startup, DI container construction, and connection pool warm-up, all at once.

How to Diagnose IIS App Pool Recycling

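Check the Windows System Event Log for events from the WAS (Windows Process Activation Service) source. Event ID 5186 records a worker process that was shut down due to inactivity. Recycle events such as 5074 (scheduled time limit reached) and 5079 (recycle requested by an administrator) appear there too when recycle logging is enabled for the pool. Correlate the timestamp with your slow-request log entries.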

On Azure App Service, the Platform Logs blade shows IIS Application Recycle events in the Activity Log.

The Fix: Disable Idle Timeout and Enable Application Initialisation

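The idle timeout is an application pool setting, so it lives in applicationHost.config (or IIS Manager → Application Pools → Advanced Settings), not in the site's web.config. Set it to zero on the pool's processModel element to disable idle shutdown: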

<processModel idleTimeout="00:00:00" />

Enable the IIS Application Initialisation module to pre-warm the application automatically after every recycle. In web.config, under system.webServer:

<applicationInitialization doAppInitAfterRestart="true">
    <add initializationPage="/health/live" />
</applicationInitialization>

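The initialisation module sends a synthetic request to the specified URL after every application start, so the application is already warm before IIS routes real user traffic to it. For the warm-up to run when the pool starts rather than on the first request, also set startMode="AlwaysRunning" on the application pool and preloadEnabled="true" on the application.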

On Azure App Service, enable "Always On" in Configuration → General Settings. This prevents the platform from pausing your application during idle periods entirely.

Why Kubernetes Readiness Probes and Pod Scheduling Add Latency

In Kubernetes deployments, a newly scheduled pod serves traffic the moment it passes its readiness probe. If the readiness probe passes before warm-up is complete - because the probe only checks HTTP 200 from the process, not whether the DI graph is fully initialised - real requests land on a partially warm application.

This is often misread as a slow-first-request problem, but the actual cause is a readiness probe that signals too early.

How to Diagnose a Premature Readiness Signal

Add a startup probe whose failureThreshold × periodSeconds window is longer than your typical startup time:

startupProbe:
  httpGet:
    path: /health/ready
    port: 8080
  failureThreshold: 30
  periodSeconds: 2

With failureThreshold 30 and periodSeconds 2, this gives the application up to 60 seconds to start before Kubernetes kills and restarts the container. Readiness and liveness probes do not run until the startup probe succeeds. The readiness probe should only pass when the application is genuinely ready to serve traffic - meaning all warm-up hosted services have completed StartAsync.

In ASP.NET Core, add a custom readiness check that tracks whether warm-up is complete:

builder.Services.AddHealthChecks()
    .AddCheck<WarmupHealthCheck>("warmup-complete");

The WarmupHealthCheck returns Unhealthy until the WarmupService has finished. The /health/ready endpoint returns a non-200 status until warm-up completes, so Kubernetes does not route traffic until the application is genuinely ready.
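The article does not show WarmupHealthCheck itself, so the following is one possible shape for it, not the author's implementation. WarmupState is a hypothetical singleton flag that the warm-up code sets when it finishes; the "ready" tag and the endpoint wiring in the trailing comments are likewise assumptions.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical shared flag, registered as a singleton.
// The warm-up code calls MarkCompleted() once it has finished.
public sealed class WarmupState
{
    private volatile bool _completed;
    public bool IsCompleted => _completed;
    public void MarkCompleted() => _completed = true;
}

// Reports Unhealthy until the warm-up flag has been set.
public sealed class WarmupHealthCheck : IHealthCheck
{
    private readonly WarmupState _state;

    public WarmupHealthCheck(WarmupState state) => _state = state;

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
        => Task.FromResult(_state.IsCompleted
            ? HealthCheckResult.Healthy("Warm-up complete")
            : HealthCheckResult.Unhealthy("Warm-up still running"));
}

// Wiring sketch for Program.cs (assumed layout):
//   builder.Services.AddSingleton<WarmupState>();
//   builder.Services.AddHealthChecks()
//       .AddCheck<WarmupHealthCheck>("warmup-complete", tags: new[] { "ready" });
//   app.MapHealthChecks("/health/ready",
//       new HealthCheckOptions { Predicate = check => check.Tags.Contains("ready") });
//   app.MapHealthChecks("/health/live",
//       new HealthCheckOptions { Predicate = _ => false }); // runs no checks: process is up
```

An Unhealthy result maps to HTTP 503 by default, which is what keeps the pod out of rotation. The flag matters most when warm-up runs in the background after the server has started listening; warm-up that runs to completion inside StartAsync already delays the server, so the endpoint is simply unreachable until it finishes.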

Which Cause Should You Fix First?

Is the Slow Request Happening After a Quiet Period?

Check whether the slow request pattern is time-correlated:

Pattern	Most Likely Cause
Slow after 20+ minutes of no traffic	IIS App Pool recycling
Slow after every pod restart	JIT + DI + connection pool
Slow only after scaling events	Pod scheduling + premature readiness
Slow after every deployment, then fast	DI cold start or connection pool
Random spikes at moderate traffic	Cache stampede or GC pressure (different problem)

Start diagnostics from this table. Match the pattern before applying any fix.

The Right Warm-Up Strategy for Each Hosting Model

Hosting Model	Primary Fix
IIS / Azure App Service (Windows)	Disable idle timeout + Application Initialisation module
Azure App Service (Linux)	Always On setting + warm-up hosted service
Docker / Kubernetes	ReadyToRun publish + startup probe + readiness health check
AWS Lambda / Azure Functions	Provisioned concurrency / Always Ready instances
Self-hosted Kestrel	Warm-up hosted service + ReadyToRun

Preventing Recurring Cold Starts in CI/CD

The most overlooked part of the cold-start fix is making sure warm-up regressions don't creep back in. Add a startup performance test to your CI pipeline that measures the time from process start to first successful health check response. Launch the application in the background, then run this immediately so the timer covers startup:

time curl --retry 10 --retry-delay 1 --retry-connrefused http://localhost:5000/health/live

Set a threshold - something like 3 seconds for most APIs - and fail the build if startup exceeds it. This catches heavy singleton registration, slow module initialisation, or accidental synchronous I/O in startup code before it ships.
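If the one-second retry granularity of curl is too coarse for a 3-second budget, the same measurement can be taken from a small test program. This is a sketch under assumptions: StartupTimer is a hypothetical helper, the URL and assembly name in the comments are placeholders, and Stopwatch.GetElapsedTime requires .NET 7 or later.

```csharp
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

public static class StartupTimer
{
    // Polls the URL until it returns a success status or the timeout elapses.
    // Returns the elapsed time, or null if the endpoint never became healthy.
    public static async Task<TimeSpan?> WaitForHealthyAsync(
        Uri url, TimeSpan timeout, HttpClient? client = null)
    {
        client ??= new HttpClient { Timeout = TimeSpan.FromSeconds(1) };
        long start = Stopwatch.GetTimestamp();

        while (Stopwatch.GetElapsedTime(start) < timeout)
        {
            try
            {
                using var response = await client.GetAsync(url);
                if (response.IsSuccessStatusCode)
                    return Stopwatch.GetElapsedTime(start);
            }
            catch (HttpRequestException) { /* server not listening yet */ }
            catch (TaskCanceledException) { /* per-request timeout hit */ }

            await Task.Delay(100);
        }

        return null;
    }
}

// Usage sketch in a CI test (names are placeholders):
//   using var api = Process.Start("dotnet", "MyApi.dll");
//   var elapsed = await StartupTimer.WaitForHealthyAsync(
//       new Uri("http://localhost:5000/health/live"), TimeSpan.FromSeconds(30));
//   api.Kill(entireProcessTree: true);
//   // Fail the build when elapsed is null or exceeds the budget.
```

Starting the timer immediately after launching the process keeps process creation and runtime startup inside the measurement, which is the number the threshold is meant to guard.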

Track the number over time in your observability platform. A gradual increase is a signal that new code added to startup is accumulating cost.

Internal Links

For the complete source code demonstrating background services, hosted service startup sequencing, and warm-up hooks, the dotnet-background-services-hostedservice repository on GitHub has everything wired together in a production-structured project.

FAQ

What causes the first request in ASP.NET Core to be slow in production? The most common causes are JIT compilation delay (the runtime hasn't compiled those code paths yet), lazy DI singleton construction (heavy services built on first access), connection pool cold start (database and HTTP connections established on first use), IIS App Pool recycling (process restarted after idle timeout), and Kubernetes pods receiving traffic before warm-up completes.

How do I fix slow first request in ASP.NET Core hosted on IIS? Set the App Pool idle timeout to zero to prevent idle shutdown, enable the IIS Application Initialisation module to pre-warm the application after every restart, and point it at your /health/live endpoint. Also publish with PublishReadyToRun enabled to reduce JIT startup cost.

What is ReadyToRun compilation in .NET and does it help with cold starts? ReadyToRun (R2R) compiles your assemblies' IL to native code during the publish step, so the JIT compiler has less work to do at runtime. It can reduce cold-start JIT cost by roughly 30 - 60% for enterprise APIs. Enable it with <PublishReadyToRun>true</PublishReadyToRun> in your project file and publish for a specific runtime identifier.

How can I pre-warm the database connection pool in ASP.NET Core? Implement an IHostedService that calls CanConnectAsync() on your DbContext inside StartAsync(). The runtime calls all hosted services before routing traffic, so the connection pool is established before any user request arrives. The same pattern works for Redis and typed HttpClient instances.

Why does my Kubernetes pod serve slow first requests even after fixing JIT and DI? If your readiness probe passes before your warm-up hosted services finish, Kubernetes routes real traffic to a partially warm pod. Add a custom IHealthCheck that returns Unhealthy until your warm-up sequence is complete, and wire it into your /health/ready endpoint. Kubernetes will only start routing once the endpoint returns healthy.

What is the difference between a startup probe and a readiness probe in Kubernetes for ASP.NET Core? A startup probe governs whether the container has started at all - it blocks readiness and liveness probes from running until it passes. A readiness probe governs whether the container should receive traffic. For ASP.NET Core, use a startup probe with generous failureThreshold settings to allow time for full warm-up, then rely on the readiness probe to signal when the application is genuinely ready.

Does Native AOT solve the slow first request problem in ASP.NET Core? Native AOT (Ahead-Of-Time compilation) eliminates JIT delay entirely by compiling everything to native code at publish time. It produces faster startup and lower memory usage, but is incompatible with many reflection-heavy libraries. For most long-running ASP.NET Core APIs, ReadyToRun is the pragmatic choice; Native AOT suits serverless or CLI scenarios where startup time dominates.

How do I prevent cold starts from returning after new deployments? Add a startup performance test to your CI pipeline that measures time from process start to first successful health check response. Set a threshold and fail builds that exceed it. This catches slow singleton registration or accidental synchronous I/O in startup code before it reaches production.

About the Author

Celin Daniel is Co-founder of Coding Droplets with 13+ years of hands-on experience building, shipping, and operating .NET and ASP.NET Core systems in production. The guidance here comes from real projects and production incidents, not theory.
