# ASP.NET Core Performance Optimization Interview Questions for Senior .NET Developers (2026)

Senior .NET developer interviews now place performance optimization front and centre. Whether you are preparing for a staff engineer role, a principal developer position, or a technical lead interview, interviewers expect you to move beyond syntax and explain *why* certain design choices produce faster, more resilient ASP.NET Core applications. This guide covers the performance interview questions that appear most frequently in senior .NET rounds, grouped by difficulty, with the direct and precise answers interviewers are listening for.

---

Performance questions reward real war stories over textbook answers. The full set of production-tuned examples is on [Patreon](https://www.patreon.com/CodingDroplets), with source you can profile and adapt to your own workload.

---

## Basic Performance Interview Questions

### What Is the Kestrel Web Server and How Does It Affect Throughput in ASP.NET Core?

Kestrel is the cross-platform, high-performance HTTP server built into ASP.NET Core. Its default transport is a managed socket implementation (the original libuv transport stopped being the default in ASP.NET Core 2.1 and has since been retired), and it relies on pooled buffers and reusable `SocketAsyncEventArgs` objects to keep per-request allocations low. Because Kestrel runs in-process with your application, there is no interprocess communication overhead between the server and your code. In raw requests-per-second benchmarks, a directly exposed Kestrel endpoint generally outperforms the same application behind IIS Express or an NGINX proxy, because every proxy hop adds work. For enterprise deployments, you typically still place Kestrel behind a reverse proxy such as NGINX or IIS, but Kestrel does the heavy lifting for request processing. Key tuning levers include `KestrelServerOptions.Limits`, the thread pool minimum via `ThreadPool.SetMinThreads`, and connection-level keep-alive settings.
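As a rough illustration of the `KestrelServerOptions.Limits` lever mentioned above, here is a minimal `Program.cs` sketch. It assumes the default web project template with implicit usings, and the numbers are placeholders rather than recommendations; real values should come from load testing.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;

var builder = WebApplication.CreateBuilder(args);

builder.WebHost.ConfigureKestrel(options =>
{
    // Illustrative values only; derive real limits from load-test results.
    options.Limits.MaxConcurrentConnections = 10_000;
    options.Limits.KeepAliveTimeout = TimeSpan.FromSeconds(60);
    options.Limits.RequestHeadersTimeout = TimeSpan.FromSeconds(15);
    options.Limits.MaxRequestBodySize = 10 * 1024 * 1024; // 10 MB
});

var app = builder.Build();
app.MapGet("/", () => "ok");
app.Run();
```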

### What Is the Difference Between `IMemoryCache` and `IDistributedCache` in ASP.NET Core?

`IMemoryCache` stores data in the local process heap. It is fast because it avoids network round-trips, but it does not survive process restarts and is not shared across multiple instances in a load-balanced or Kubernetes deployment. `IDistributedCache` abstracts a shared external store - typically Redis or SQL Server - that all instances of your application can read and write. The trade-off is network latency versus data consistency. In enterprise deployments with multiple pods or servers, relying on `IMemoryCache` alone means each instance holds its own copy, so instances can serve stale or inconsistent data and each one pays the cache-miss cost separately; `IDistributedCache` backed by Redis is usually the better choice for shared data. .NET 9 introduced `HybridCache`, which layers an in-process cache in front of `IDistributedCache` to give you in-process speed on local hits and a shared second-level store across instances. It also has built-in stampede protection: concurrent `GetOrCreateAsync` calls for the same key are coalesced into a single factory invocation.
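A minimal sketch of the `HybridCache` pattern, assuming the `Microsoft.Extensions.Caching.Hybrid` package and the default web template. `LoadProductAsync` and `Product` are hypothetical stand-ins for a real data source. With no `IDistributedCache` registered, this runs with the in-process layer only.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.Caching.Hybrid;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHybridCache(options =>
{
    options.DefaultEntryOptions = new HybridCacheEntryOptions
    {
        Expiration = TimeSpan.FromMinutes(5),          // overall entry lifetime
        LocalCacheExpiration = TimeSpan.FromMinutes(1) // in-process layer lifetime
    };
});

var app = builder.Build();

// Concurrent requests for the same key share one factory call.
app.MapGet("/products/{id:int}", async (int id, HybridCache cache, CancellationToken ct) =>
    await cache.GetOrCreateAsync(
        $"product:{id}",
        async token => await LoadProductAsync(id, token),
        cancellationToken: ct));

app.Run();

// Hypothetical slow lookup standing in for a database call.
static async Task<Product> LoadProductAsync(int id, CancellationToken ct)
{
    await Task.Delay(50, ct);
    return new Product(id, $"Product {id}");
}

record Product(int Id, string Name);
```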

### What Is Response Compression and When Should You Disable It in ASP.NET Core?

Response compression middleware reduces payload size by applying Gzip or Brotli encoding before sending the response to the client. This is beneficial for text-heavy payloads such as JSON, HTML, and XML because it reduces bytes-over-the-wire and can significantly cut latency on slow connections. However, you should disable or bypass response compression for already-compressed formats - images (JPEG, PNG, WebP), video, and binary streams - because compressing them again wastes CPU cycles and often increases payload size. In HTTPS environments you should also be aware of CRIME/BREACH attack vectors when compressing secrets alongside user-controlled data. The general rule: enable compression for JSON API responses, disable it for binary content, and let reverse proxies handle it when offloading TLS termination.
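The rule above could be wired up roughly as follows. This is a sketch assuming the default web template; the extra MIME type and the compression level are illustrative choices. Note that `EnableForHttps` is off by default precisely because of the CRIME/BREACH concern.

```csharp
using System.IO.Compression;
using System.Linq;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.ResponseCompression;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddResponseCompression(options =>
{
    // Opt in deliberately; review CRIME/BREACH exposure first.
    options.EnableForHttps = true;
    options.Providers.Add<BrotliCompressionProvider>();
    options.Providers.Add<GzipCompressionProvider>();
    // The defaults cover text formats such as JSON and HTML; images and
    // video are not in the list, so they pass through uncompressed.
    options.MimeTypes = ResponseCompressionDefaults.MimeTypes
        .Concat(new[] { "application/problem+json" });
});

builder.Services.Configure<BrotliCompressionProviderOptions>(o =>
    o.Level = CompressionLevel.Fastest);

var app = builder.Build();
app.UseResponseCompression();
app.MapGet("/data", () => Enumerable.Range(1, 500).Select(i => new { Id = i, Name = $"Item {i}" }));
app.Run();
```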

### What Does `AsNoTracking()` Do in EF Core and When Should You Use It?

`AsNoTracking()` tells EF Core to skip the change tracker for a query. By default, every entity EF Core materialises is registered with the `DbContext` change tracker so EF can detect mutations and generate `UPDATE` statements. Change tracking has non-trivial memory and CPU overhead, especially when materialising hundreds or thousands of entities per request. For read-only queries - dashboards, reports, API GET endpoints that do not need to write back - calling `AsNoTracking()` eliminates that overhead entirely. The rule of thumb: use `AsNoTracking()` on every read path in ASP.NET Core API handlers unless you intentionally plan to update and save the entity in the same request scope.
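A small sketch of a read-only query, assuming the `Microsoft.EntityFrameworkCore` package. `AppDbContext` and `Order` are hypothetical types invented for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class Order
{
    public int Id { get; set; }
    public string Customer { get; set; } = "";
    public decimal Total { get; set; }
}

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }
    public DbSet<Order> Orders => Set<Order>();
}

public static class OrderQueries
{
    // Read path: the returned entities are not registered with the change tracker.
    public static Task<List<Order>> GetOrdersAsync(
        AppDbContext db, string customer, CancellationToken ct = default) =>
        db.Orders
          .AsNoTracking()
          .Where(o => o.Customer == customer)
          .OrderBy(o => o.Id)
          .ToListAsync(ct);
}
```

Queries that project into a DTO or anonymous type do not return entity instances, so there is nothing for the change tracker to hold and `AsNoTracking()` adds little there.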

### What Is Minimal API in ASP.NET Core and Why Can It Be Faster Than Controller-Based APIs?

Minimal API, introduced in .NET 6 and significantly improved in .NET 8 and 10, maps HTTP endpoints directly to delegates or handler methods without going through `Controller` base classes, `ActionDescriptor`, `IActionInvoker`, or `ModelStateDictionary` validation. The middleware pipeline is the same in both models; what Minimal API skips is the MVC action invocation pipeline (controller activation, the filter pipeline, and MVC model binding). In benchmarks, minimal API endpoints typically show lower overhead per request than equivalent controller-based endpoints because less framework code runs between routing and your handler. For microservices or high-throughput endpoints with simple input/output shapes, Minimal API is the better default. For complex enterprise scenarios with rich model binding, action filters, and view rendering, traditional controllers remain appropriate.
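For reference, a complete minimal API endpoint can be as small as the sketch below (default web template assumed; `OrderDto` is a hypothetical type). The handler is a plain delegate, with route values bound directly to its parameters.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Route parameter binds straight to the lambda argument; no controller involved.
app.MapGet("/orders/{id:int}", (int id) =>
    id > 0
        ? Results.Ok(new OrderDto(id, "shipped"))
        : Results.NotFound());

app.Run();

record OrderDto(int Id, string Status);
```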

---

## Intermediate Performance Interview Questions

### How Does `IAsyncEnumerable<T>` Improve Streaming Performance in ASP.NET Core APIs?

`IAsyncEnumerable<T>` enables a producer-consumer model where the server streams results to the client incrementally rather than materialising the entire dataset into memory before serialising. In an ASP.NET Core Minimal API or controller action that returns `IAsyncEnumerable<T>`, the JSON serialiser (`System.Text.Json`) writes each item to the response stream as it is produced. This means time-to-first-byte is much lower for large datasets, memory pressure on the server is reduced (you do not buffer the entire result set), and clients can begin consuming data sooner. It is especially valuable for large exports, EF Core query results over large tables, and event-stream APIs. Two caveats apply: the client must be able to consume chunked/streamed JSON, and once streaming begins the status code and headers have already been sent, so a failure midway through cannot be turned into a clean error response.
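A minimal streaming endpoint might look like this. It is a sketch assuming the default web template; `ReadingsAsync` and `Reading` are hypothetical, and the `Task.Delay` simulates a slow producer.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Items are serialised to the response as they are yielded.
app.MapGet("/readings", (CancellationToken ct) => ReadingsAsync(1000, ct));

app.Run();

static async IAsyncEnumerable<Reading> ReadingsAsync(
    int count, [EnumeratorCancellation] CancellationToken ct = default)
{
    for (var i = 0; i < count; i++)
    {
        await Task.Delay(1, ct); // stand-in for per-row I/O; stops if the client disconnects
        yield return new Reading(i, i * 0.5);
    }
}

record Reading(int Sequence, double Value);
```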

### What Is Output Caching in ASP.NET Core and How Does It Differ From Response Caching?

Response caching is driven by HTTP cache-control headers. The `[ResponseCache]` attribute sets those headers so that the client (browser) and intermediate proxies (CDNs, NGINX) can cache responses, and the optional Response Caching Middleware can also cache on the server, but only within HTTP caching rules: a client that sends `Cache-Control: no-cache` bypasses it, and the server still processes the request whenever a cache declines to store the response or the entry has expired. Output caching (introduced in ASP.NET Core 7) is server-side caching of the full response, held in memory by default. When a cached response exists for a matching request, ASP.NET Core short-circuits the rest of the pipeline and returns the cached response without ever reaching your endpoint logic. Output caching is controlled entirely by your application, supports custom policies and tag-based invalidation, and does not depend on HTTP cache headers sent by the client. For high-read API endpoints, output caching delivers better throughput because it eliminates request processing for repeated identical queries.
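The sketch below shows a named output-cache policy with tag-based invalidation, assuming the default web template. The policy name, tag, and 30-second expiry are illustrative.

```csharp
using System;
using System.Threading;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.OutputCaching;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOutputCache(options =>
{
    options.AddPolicy("catalog", policy =>
        policy.Expire(TimeSpan.FromSeconds(30)).Tag("catalog"));
});

var app = builder.Build();
app.UseOutputCache();

// Served from the cache for up to 30 seconds after the first request.
app.MapGet("/catalog", () => new { GeneratedAt = DateTimeOffset.UtcNow })
   .CacheOutput("catalog");

// Evicts every cached response tagged "catalog", for example after a write.
app.MapPost("/catalog/invalidate", async (IOutputCacheStore store, CancellationToken ct) =>
{
    await store.EvictByTagAsync("catalog", ct);
    return Results.NoContent();
});

app.Run();
```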

### What Is a `ThreadPool` Starvation Scenario in ASP.NET Core and How Do You Diagnose It?

Thread pool starvation occurs when all available thread pool threads are blocked on synchronous I/O or synchronous waits (`.Result`, `.Wait()`, `Thread.Sleep()`), and new incoming requests cannot be scheduled because no threads are free to process them. Symptoms include increasing request queue length, rising P99 latency, and eventual HTTP 503 errors under load even though CPU is not saturated. Diagnosis: use `dotnet-counters` to watch `ThreadPool Queue Length`, `ThreadPool Thread Count`, and `ThreadPool Completed Work Item Count`. A queue that keeps growing while the thread count creeps slowly upward past your `ThreadPool.GetMinThreads()` value (the pool injects extra threads only gradually) and CPU stays low is a starvation signal. The fix is to eliminate all synchronous blocking in the async call chain - use `await` throughout, never call `.Result` or `.GetAwaiter().GetResult()` on the hot path, and avoid `Task.Run` wrappers around I/O as a false fix.
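To make the anti-pattern concrete, here is a small sketch contrasting a sync-over-async call with its awaited equivalent. `InventoryClient` and `FetchStockAsync` are hypothetical; the delay stands in for a database or HTTP call.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class InventoryClient
{
    // Hypothetical I/O-bound operation.
    public static async Task<int> FetchStockAsync(int sku, CancellationToken ct = default)
    {
        await Task.Delay(20, ct);
        return sku * 2;
    }

    // Anti-pattern: the calling thread-pool thread is parked until the I/O
    // completes, so under load every in-flight request pins one thread.
    public static int GetStockBlocking(int sku) =>
        FetchStockAsync(sku).GetAwaiter().GetResult();

    // Preferred: the thread goes back to the pool while the I/O is pending.
    public static async Task<int> GetStockAsync(int sku, CancellationToken ct = default) =>
        await FetchStockAsync(sku, ct);
}
```

Both methods return the same value; the difference only shows up under concurrency, which is why starvation tends to appear in load tests rather than unit tests.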

### How Does Rate Limiting in ASP.NET Core Protect Performance Under Load?

ASP.NET Core's built-in rate limiting middleware (`Microsoft.AspNetCore.RateLimiting`, GA in .NET 7) implements four algorithms: Fixed Window, Sliding Window, Token Bucket, and Concurrency Limiter. Rate limiting protects your application's performance by shedding excess load before it exhausts resources - thread pool threads, database connections, or downstream API quotas. Without rate limiting, a sudden traffic spike can cause cascading failures across the entire application. The Concurrency Limiter is particularly useful for protecting expensive endpoints: it caps the number of in-flight requests to a specific handler, queuing or rejecting overflow rather than letting them all compete for the same database connections. For enterprise APIs, applying per-user or per-client rate limits prevents a single abusive caller from degrading service for all other users. You can find a detailed rate limiting implementation in the [Coding Droplets GitHub repo](https://github.com/codingdroplets/dotnet-rate-limiting-api).
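A configuration sketch combining a concurrency limiter for an expensive endpoint with a per-client fixed window, assuming the default web template. The policy names, limits, and the choice of remote IP as the partition key are illustrative assumptions; an authenticated client ID is often a better key.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // At most 8 requests in flight; up to 16 more wait in a queue.
    options.AddConcurrencyLimiter("reports", o =>
    {
        o.PermitLimit = 8;
        o.QueueLimit = 16;
        o.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });

    // One fixed window per client, keyed here by remote IP address.
    options.AddPolicy("per-client", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));
});

var app = builder.Build();
app.UseRateLimiter();

app.MapGet("/reports", () => "expensive report").RequireRateLimiting("reports");
app.MapGet("/search", () => "results").RequireRateLimiting("per-client");

app.Run();
```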

### What Are Compiled Queries in EF Core and When Do They Deliver Meaningful Gains?

The first time EF Core executes a given LINQ query shape, it translates the expression tree to SQL, compiles the result, and caches it. On every later execution it still has to compute a cache key from the expression tree and look the compiled query up. For simple queries, this overhead is small. For complex queries with many joins, filters, and projections, the lookup cost can be measurable, especially at high request rates. `EF.CompileQuery()` and `EF.CompileAsyncQuery()` compile the query once into a delegate and reuse it on every subsequent call, removing the expression-tree processing from the hot path. The payoff is most significant for queries that run hundreds or thousands of times per second. The trade-off is that compiled queries lose the ability to build expressions dynamically - parameters must be passed in at call time, and query shape must be fixed at compilation time. Use compiled queries on hot read paths; leave dynamic queries uncompiled.
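A sketch of a compiled query held in a static field, assuming the `Microsoft.EntityFrameworkCore` package. `ShopDbContext` and `Invoice` are hypothetical types used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Invoice
{
    public int Id { get; set; }
    public string Customer { get; set; } = "";
    public decimal Total { get; set; }
}

public class ShopDbContext : DbContext
{
    public ShopDbContext(DbContextOptions<ShopDbContext> options) : base(options) { }
    public DbSet<Invoice> Invoices => Set<Invoice>();
}

public static class CompiledQueries
{
    // Built once; the query shape is fixed and only the parameter varies.
    public static readonly Func<ShopDbContext, string, IAsyncEnumerable<Invoice>> InvoicesByCustomer =
        EF.CompileAsyncQuery((ShopDbContext db, string customer) =>
            db.Invoices
              .AsNoTracking()
              .Where(i => i.Customer == customer));
}
```

A caller would then write `await foreach (var invoice in CompiledQueries.InvoicesByCustomer(db, "acme")) { ... }` on the hot path.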

### What Is the Role of `PipeReader` and `PipeWriter` in High-Throughput ASP.NET Core APIs?

`System.IO.Pipelines` provides a high-performance, allocation-minimising API for reading and writing streams of data. With `Stream`, the caller has to supply and manage its own buffers and typically copies data between them while parsing; `PipeReader` instead hands you memory segments rented from a pooled `MemoryPool<byte>`, handles buffer management and back-pressure, and lets you parse incoming data in place. Kestrel is built on pipelines internally, across HTTP/1.1, HTTP/2, and HTTP/3. For scenarios where you need to parse a large request body, process binary protocols, or implement custom serialisation without `Stream`-based overhead, dropping down to `PipeReader` can eliminate significant GC pressure. It is an advanced API primarily relevant when profiling shows `MemoryStream` or `Stream.ReadAsync` buffer allocations as a hot path in your application.
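The read loop typically follows the shape sketched below: read, parse what is complete, then tell the pipe how much was consumed and examined. `LineCounter` is a hypothetical helper that counts newline-terminated lines. In an ASP.NET Core handler, the reader could come from `HttpContext.Request.BodyReader`.

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Threading;
using System.Threading.Tasks;

public static class LineCounter
{
    public static async Task<int> CountLinesAsync(PipeReader reader, CancellationToken ct = default)
    {
        var lines = 0;
        while (true)
        {
            ReadResult result = await reader.ReadAsync(ct);
            ReadOnlySequence<byte> buffer = result.Buffer;

            // Consume every complete line currently in the buffer.
            while (TryReadLine(ref buffer, out _))
                lines++;

            // Consumed up to buffer.Start; examined everything up to buffer.End,
            // so the next ReadAsync waits for new data instead of spinning.
            reader.AdvanceTo(buffer.Start, buffer.End);

            if (result.IsCompleted)
                break;
        }

        await reader.CompleteAsync();
        return lines;
    }

    private static bool TryReadLine(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> line)
    {
        SequencePosition? newline = buffer.PositionOf((byte)'\n');
        if (newline is null)
        {
            line = default;
            return false;
        }

        line = buffer.Slice(0, newline.Value);                   // a view, not a copy
        buffer = buffer.Slice(buffer.GetPosition(1, newline.Value)); // skip past '\n'
        return true;
    }
}
```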

---

## Advanced Performance Interview Questions

### How Do You Profile and Diagnose High GC Pressure in an ASP.NET Core Application?

Excessive garbage collection pauses are among the most insidious performance problems in .NET because they can cause latency spikes without saturating CPU or I/O. Diagnosis follows a structured path: first, watch `dotnet-counters` for `GC Heap Size`, `Gen 0 Collection Count`, `Gen 1 Collection Count`, and `Gen 2 Collection Count`. A rising Gen 2 count under steady-state load indicates large objects or long-lived allocations escaping Gen 0. Next, capture an allocation trace using `dotnet-trace collect --profile gc-verbose` and analyse it in PerfView or `speedscope`. Common culprits in ASP.NET Core include: unnecessary `string.Format` or string concatenation in hot paths (use `StringBuilder` or `string.Create`), boxing value types in non-generic collections or `object`-typed APIs, `MemoryStream` copies, frequent `async` state machine heap allocations (mitigated by `ValueTask`), and large object heap allocations from arrays of 85,000 bytes or more. Fixing high GC pressure means adopting `ArrayPool<T>`, `MemoryPool<T>`, `Span<T>`, `stackalloc` for small buffers, and `ObjectPool<T>` for expensive-to-construct objects.
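As one example of the pooling techniques listed above, the sketch below rents a read buffer from `ArrayPool<byte>` instead of allocating a fresh array per call. `Checksum.SumBytesAsync` is a hypothetical helper; the 16 KB size is an arbitrary choice.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class Checksum
{
    public static async Task<long> SumBytesAsync(Stream source, CancellationToken ct = default)
    {
        // Rent may hand back a larger array than requested; that is fine here.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(16 * 1024);
        try
        {
            long sum = 0;
            int read;
            while ((read = await source.ReadAsync(buffer.AsMemory(), ct)) > 0)
            {
                // Only the first 'read' bytes are valid on each pass.
                for (var i = 0; i < read; i++)
                    sum += buffer[i];
            }
            return sum;
        }
        finally
        {
            // Always return the buffer, and never touch it afterwards.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```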

### What Is NativeAOT in .NET 10 and What Are the Performance Trade-offs for ASP.NET Core APIs?

NativeAOT (Native Ahead-of-Time compilation) compiles a .NET application entirely to native machine code at publish time, eliminating the JIT compiler from the runtime path. The benefits for ASP.NET Core APIs are: significantly faster startup time (tens of milliseconds instead of hundreds), a smaller memory footprint (no JIT compiler or JIT-generated code at run time), and improved suitability for containerised microservices and serverless functions where cold start time is critical. The trade-offs are real: NativeAOT requires the whole application to be trimmable, so it does not support dynamic code loading or `System.Reflection.Emit`, and reflection over types the compiler cannot see statically (for example `Activator.CreateInstance` with unknown types) can fail at run time. Libraries that rely on reflection-based serialisation, such as `Newtonsoft.Json`, are not NativeAOT-compatible; `System.Text.Json` is compatible only when used with its source generator. ASP.NET Core Minimal API with `System.Text.Json` source generation is the recommended architecture for NativeAOT-published services. The correctness discipline required makes NativeAOT most appropriate for new greenfield services, not for migrating reflection-heavy existing applications.
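A sketch of the Minimal API plus source-generation combination, modelled on the shape of the `webapiaot` project template. It assumes `<PublishAot>true</PublishAot>` is set in the project file; `Todo` and `AppJsonContext` are illustrative names.

```csharp
using System.Text.Json.Serialization;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateSlimBuilder(args);

// Register source-generated JSON metadata so no runtime reflection is needed.
builder.Services.ConfigureHttpJsonOptions(options =>
{
    options.SerializerOptions.TypeInfoResolverChain.Insert(0, AppJsonContext.Default);
});

var app = builder.Build();
app.MapGet("/todos/{id:int}", (int id) => new Todo(id, "Write benchmarks", false));
app.Run();

public record Todo(int Id, string Title, bool IsComplete);

// The source generator emits serialisation code for each listed type at build time.
[JsonSerializable(typeof(Todo))]
internal partial class AppJsonContext : JsonSerializerContext
{
}
```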

### How Does HTTP/3 and QUIC Affect Performance in ASP.NET Core, and How Do You Enable It?

HTTP/3 runs over QUIC (Quick UDP Internet Connections), a UDP-based transport protocol that avoids separate TCP and TLS handshakes and removes the TCP-level head-of-line blocking that still affects HTTP/2 when packets are lost. In high-latency network environments (mobile, intercontinental, lossy Wi-Fi), HTTP/3 significantly reduces connection establishment time because QUIC combines the transport and TLS handshake into a single round-trip (and the protocol's 0-RTT resumption lets a returning client send data without waiting for a new handshake). Kestrel has offered HTTP/3 as a preview feature since .NET 6, as a fully supported feature since .NET 7, and enabled by default since .NET 8. Enabling it explicitly requires setting `ListenOptions.Protocols = HttpProtocols.Http1AndHttp2AndHttp3` in `KestrelServerOptions`, serving over HTTPS (HTTP/3 requires TLS), and ensuring the server runs on a platform with QUIC support (Windows with MsQuic, or Linux with libmsquic). HTTP/3 does not universally improve performance for server-to-server calls on low-latency private networks - the benefit is most pronounced for last-mile client connections.
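Enabling HTTP/3 on a specific endpoint could look like the sketch below (default web template assumed; port 5001 is arbitrary, and `UseHttps()` with no arguments relies on a development or configured default certificate).

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Server.Kestrel.Core;

var builder = WebApplication.CreateBuilder(args);

builder.WebHost.ConfigureKestrel(options =>
{
    options.ListenAnyIP(5001, listenOptions =>
    {
        // Keep HTTP/1.1 and HTTP/2 so clients without QUIC can still connect.
        listenOptions.Protocols = HttpProtocols.Http1AndHttp2AndHttp3;
        listenOptions.UseHttps(); // HTTP/3 requires TLS
    });
});

var app = builder.Build();
app.MapGet("/", (Microsoft.AspNetCore.Http.HttpContext ctx) => $"Protocol: {ctx.Request.Protocol}");
app.Run();
```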

### What Is the `ValueTask<T>` vs `Task<T>` Trade-off in ASP.NET Core Performance?

`Task<T>` is a reference type, so returning one generally means allocating a heap object to represent the asynchronous operation, even when the operation completes synchronously (as in a cache hit); the runtime caches completed tasks only for a handful of common values. `ValueTask<T>` is a struct that can represent a synchronously completed result without allocating, making it zero-allocation in the hot path when the result is immediately available. The trade-off is that `ValueTask<T>` is single-awaitable - you cannot `await` it twice, store it for later reuse, or pass it to `Task.WhenAll` without first converting it with `.AsTask()`. Misuse of `ValueTask` (double-awaiting, caching, `.Result` access before completion) causes hard-to-diagnose bugs. The guidance from the .NET performance team: use `Task<T>` as the default for interface contracts and public APIs; use `ValueTask<T>` in sealed hot-path implementations where profiling shows the allocation cost of `Task<T>` is measurable - for example, in a caching layer where 90% of calls are synchronous cache hits. Do not apply `ValueTask<T>` speculatively without profiling evidence.
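The caching-layer case described above might be sketched like this. `PriceCache` is a hypothetical class; the loader delegate stands in for a real asynchronous lookup.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class PriceCache
{
    private readonly ConcurrentDictionary<string, decimal> _cache = new();
    private readonly Func<string, Task<decimal>> _loader;

    public PriceCache(Func<string, Task<decimal>> loader) => _loader = loader;

    public ValueTask<decimal> GetPriceAsync(string sku)
    {
        // Hit: wrap the value directly, no Task object is allocated.
        if (_cache.TryGetValue(sku, out var price))
            return new ValueTask<decimal>(price);

        // Miss: fall back to a real Task for the asynchronous load.
        return new ValueTask<decimal>(LoadAndCacheAsync(sku));
    }

    private async Task<decimal> LoadAndCacheAsync(string sku)
    {
        var price = await _loader(sku);
        _cache[sku] = price;
        return price;
    }
}
```

Callers should `await` the returned `ValueTask<decimal>` exactly once and not keep it around.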

### How Do You Implement Connection Pooling for HttpClient in ASP.NET Core to Avoid Socket Exhaustion?

Using `new HttpClient()` in each request handler creates a new connection pool and a new set of TCP sockets. When the client is disposed, its underlying `HttpMessageHandler` is disposed too and the connections are closed, but each closed TCP connection lingers in the `TIME_WAIT` state (up to 240 seconds by default on Windows), so its local port cannot be reused immediately. Under load, this exhausts the local ephemeral port range and surfaces as a `SocketException` (address already in use). The solution is `IHttpClientFactory`, which manages named or typed `HttpClient` instances with shared, pooled `HttpMessageHandler` instances that are recycled on a configurable interval (default: 2 minutes) so that DNS changes are picked up. `IHttpClientFactory` integrates with `Polly` for retry, circuit breaker, and timeout policies. In ASP.NET Core services, register typed clients with `services.AddHttpClient<T>()` and inject them via the constructor - never `new HttpClient()` per request in production code. You can find a request correlation implementation pattern in the [Coding Droplets GitHub repo](https://github.com/codingdroplets/dotnet-request-correlation-middleware).
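A typed-client registration sketch, assuming the default web template. `CatalogClient`, `CatalogItem`, and the `catalog.example.com` address are hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddHttpClient<CatalogClient>(client =>
    {
        client.BaseAddress = new Uri("https://catalog.example.com/"); // placeholder host
        client.Timeout = TimeSpan.FromSeconds(10);
    })
    // Handlers are pooled and rotated on this interval (2 minutes is the default).
    .SetHandlerLifetime(TimeSpan.FromMinutes(2));

var app = builder.Build();

// The typed client is resolved from DI; its HttpClient shares pooled handlers.
app.MapGet("/items/{id:int}", (int id, CatalogClient catalog, CancellationToken ct) =>
    catalog.GetItemAsync(id, ct));

app.Run();

public sealed class CatalogClient
{
    private readonly HttpClient _http;
    public CatalogClient(HttpClient http) => _http = http;

    public Task<CatalogItem?> GetItemAsync(int id, CancellationToken ct) =>
        _http.GetFromJsonAsync<CatalogItem>($"items/{id}", ct);
}

public record CatalogItem(int Id, string Name);
```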

### What Are `Span<T>` and `Memory<T>` and Why Are They Critical for Performance-Sensitive .NET Code?

`Span<T>` is a stack-only `ref struct` that represents a contiguous region of memory - whether in the managed heap, the stack, or native memory - without copying it. Operations such as slicing a substring, parsing a byte buffer, or splitting a string can be performed using `Span<T>` with zero allocation, whereas `string.Substring()` allocates a new string for every non-empty slice. `Memory<T>` is the heap-compatible counterpart of `Span<T>` for use in `async` methods, where stack-only ref structs cannot cross `await` boundaries. In ASP.NET Core, `Span<T>` and `Memory<T>` appear throughout the framework - request body parsing, header parsing, URL routing, and JSON serialisation all use these types internally to reduce allocations in the hot request path. Senior developers are expected to understand when to use `Span<T>` over `string` slicing, when to prefer `Memory<T>` in async contexts, and how to use `MemoryMarshal` for advanced interop scenarios.
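As a small illustration of allocation-free slicing, the sketch below sums comma-separated integers without creating any substrings. `CsvSum` is a hypothetical helper, and it assumes well-formed input (it throws on malformed fields).

```csharp
using System;
using System.Globalization;

public static class CsvSum
{
    public static int SumCsv(ReadOnlySpan<char> line)
    {
        var total = 0;
        while (!line.IsEmpty)
        {
            int comma = line.IndexOf(',');

            // Slice is a view over the original characters, not a copy.
            ReadOnlySpan<char> field = comma < 0 ? line : line[..comma];
            total += int.Parse(field, NumberStyles.Integer, CultureInfo.InvariantCulture);

            // Move past the comma, or finish if this was the last field.
            line = comma < 0 ? ReadOnlySpan<char>.Empty : line[(comma + 1)..];
        }
        return total;
    }
}
```

A `string` converts implicitly to `ReadOnlySpan<char>`, so `CsvSum.SumCsv("10,20,12")` works without any extra call.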

### How Does the Request Timeout Middleware in ASP.NET Core 8+ Improve Resilience and Performance?

`RequestTimeoutMiddleware`, built into ASP.NET Core 8 without a third-party library, associates a `CancellationToken` with each request and cancels it if the configured timeout expires. This is critical for performance under load because without a timeout, a slow upstream database query or third-party API call can keep a request - along with its connection, memory, and any database connection it holds - alive indefinitely, and if the code blocks synchronously it pins a thread-pool thread as well. Enough of these and resource exhaustion or thread-pool starvation cascades into full application degradation. By adding `app.UseRequestTimeouts()` and configuring timeouts per endpoint or globally, you put an upper bound on how long any single slow request can hold resources beyond your defined SLA. The middleware integrates with `HttpContext.RequestAborted` - cancellation is cooperative, so any downstream `async` code that observes the `CancellationToken` will stop cleanly, while code that ignores it keeps running. You can explore a full implementation at the [Coding Droplets request timeout repo](https://github.com/codingdroplets/dotnet-request-timeout-middleware).
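A configuration sketch with a global default and a stricter named policy, assuming ASP.NET Core 8 or later and the default web template. The policy name and durations are illustrative.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http.Timeouts;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRequestTimeouts(options =>
{
    // Applies to endpoints that do not name a policy.
    options.DefaultPolicy = new RequestTimeoutPolicy { Timeout = TimeSpan.FromSeconds(10) };
    options.AddPolicy("fast", TimeSpan.FromSeconds(2));
});

var app = builder.Build();
app.UseRequestTimeouts();

// The bound token is HttpContext.RequestAborted, which the middleware cancels
// when the 2-second policy expires; Task.Delay observes it and stops.
app.MapGet("/slow", async (CancellationToken ct) =>
{
    await Task.Delay(TimeSpan.FromSeconds(30), ct);
    return "done";
}).WithRequestTimeout("fast");

app.Run();
```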

---

## What Interviewers Are Really Testing

When a senior interviewer asks performance questions, they are rarely fishing for memorised API names. They want to see three things: the ability to reason from first principles about where time is spent (CPU, I/O, GC, network), the discipline to profile before optimising, and the judgment to know which trade-offs are worth making in production. The best answers are structured as: *identify the bottleneck class → explain the mechanism → name the tooling to measure it → describe the fix and its trade-offs*. That mental model - not memorised trivia - is what earns senior .NET roles in 2026.

For tutorials and production-ready implementations covering these topics, visit [Coding Droplets](https://codingdroplets.com/) or explore the full code repository on [GitHub](http://github.com/codingdroplets/).

* * *

## About the Author

**Celin Daniel** is Co-founder of Coding Droplets with 13+ years of hands-on experience building, shipping, and operating .NET and ASP.NET Core systems in production. The guidance here comes from real projects and production incidents, not theory.

- Website: [codingdroplets.com](https://codingdroplets.com/)
- GitHub: [github.com/codingdroplets](http://github.com/codingdroplets/)
- YouTube: [youtube.com/@CodingDroplets](https://www.youtube.com/@CodingDroplets)