ASP.NET Core Performance Optimization Interview Questions for Senior .NET Developers (2026)
A complete guide to ASP.NET Core performance questions asked in senior .NET developer interviews, covering Kestrel, caching, GC, NativeAOT, HTTP/3, and more.

Senior .NET developer interviews now place performance optimization front and centre. Whether you are preparing for a staff engineer role, a principal developer position, or a technical lead interview, interviewers expect you to move beyond syntax and explain why certain design choices produce faster, more resilient ASP.NET Core applications. This guide covers the performance interview questions that appear most frequently in senior .NET rounds, grouped by difficulty, with the direct and precise answers interviewers are listening for.
Want implementation-ready .NET source code you can drop straight into your project? Join Coding Droplets on Patreon for exclusive tutorials, premium code samples, and early access to new content: https://www.patreon.com/CodingDroplets
Basic Performance Interview Questions
What Is the Kestrel Web Server and How Does It Affect Throughput in ASP.NET Core?
Kestrel is the cross-platform, high-performance HTTP server built into ASP.NET Core. It originally ran on the libuv event loop, but since .NET Core 2.1 the default transport is managed sockets (libuv support was removed entirely in .NET 5), using a pool of SocketAsyncEventArgs objects to minimise per-request allocations. Because Kestrel runs in-process, there is no interprocess communication overhead. In high-throughput scenarios, Kestrel consistently outperforms IIS and NGINX-proxied setups in raw requests-per-second benchmarks. For enterprise deployments, you typically place Kestrel behind a reverse proxy such as NGINX or IIS, but Kestrel does the heavy lifting for request processing. Key tuning levers include KestrelServerOptions.Limits, thread count via ThreadPool.SetMinThreads, and connection-level keep-alive settings.
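A minimal sketch of those tuning levers in Program.cs; the specific limit values below are illustrative placeholders, not recommendations, and should be tuned against your own load tests:

```csharp
var builder = WebApplication.CreateBuilder(args);

builder.WebHost.ConfigureKestrel(options =>
{
    // Cap concurrent connections so a traffic spike degrades gracefully
    // instead of exhausting memory.
    options.Limits.MaxConcurrentConnections = 10_000;
    options.Limits.MaxConcurrentUpgradedConnections = 1_000;

    // Reject slow-loris-style clients that trickle request bytes.
    options.Limits.MinRequestBodyDataRate =
        new MinDataRate(bytesPerSecond: 240, gracePeriod: TimeSpan.FromSeconds(5));

    // How long an idle keep-alive connection is held open.
    options.Limits.KeepAliveTimeout = TimeSpan.FromMinutes(2);
});

var app = builder.Build();
app.Run();
```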
What Is the Difference Between IMemoryCache and IDistributedCache in ASP.NET Core?
IMemoryCache stores data in the local process heap. It is fast because it avoids network round-trips, but it does not survive process restarts and is not shared across multiple instances in a load-balanced or Kubernetes deployment. IDistributedCache abstracts a shared external store, typically Redis or SQL Server, that all instances of your application can read and write. The trade-off is network latency versus data consistency. In enterprise deployments with multiple pods or servers, IMemoryCache creates cache stampede and stale-data risks; IDistributedCache backed by Redis is the correct choice. .NET 9 introduced HybridCache, which layers IMemoryCache in front of IDistributedCache to give you in-process speed on cache hits and cross-instance consistency on misses, with built-in stampede protection via GetOrCreateAsync lock coalescing.
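A sketch of the HybridCache pattern, assuming the Microsoft.Extensions.Caching.Hybrid package and a Redis backing store; ProductDb and GetProductAsync are hypothetical stand-ins for your data layer:

```csharp
// Registration: HybridCache uses the registered IDistributedCache (Redis here)
// as its secondary tier behind the in-process cache.
builder.Services.AddStackExchangeRedisCache(o => o.Configuration = "localhost:6379");
builder.Services.AddHybridCache();

app.MapGet("/products/{id}", async (int id, HybridCache cache, ProductDb db) =>
    await cache.GetOrCreateAsync(
        $"product:{id}",
        // The factory runs once per key even under concurrent misses
        // (built-in stampede protection).
        async ct => await db.GetProductAsync(id, ct)));
```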
What Is Response Compression and When Should You Disable It in ASP.NET Core?
Response compression middleware reduces payload size by applying Gzip or Brotli encoding before sending the response to the client. This is beneficial for text-heavy payloads such as JSON, HTML, and XML because it reduces bytes-over-the-wire and can significantly cut latency on slow connections. However, you should disable or bypass response compression for already-compressed formats, such as images (JPEG, PNG, WebP), video, and binary streams, because compressing them again wastes CPU cycles and often increases payload size. In HTTPS environments you should also be aware of CRIME/BREACH attack vectors when compressing secrets alongside user-controlled data. The general rule: enable compression for JSON API responses, disable it for binary content, and let reverse proxies handle it when offloading TLS termination.
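A hedged configuration sketch showing those defaults; note that the built-in MIME type defaults already cover text/JSON and exclude images:

```csharp
builder.Services.AddResponseCompression(options =>
{
    // Leave HTTPS compression off unless you have assessed BREACH-style risks
    // for responses that mix secrets with user-controlled input.
    options.EnableForHttps = false;
    options.Providers.Add<BrotliCompressionProvider>(); // preferred when the client supports it
    options.Providers.Add<GzipCompressionProvider>();
    options.MimeTypes = ResponseCompressionDefaults.MimeTypes; // text, JSON, XML; not images/video
});

var app = builder.Build();
app.UseResponseCompression(); // place early, before middleware that writes responses
```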
What Does AsNoTracking() Do in EF Core and When Should You Use It?
AsNoTracking() tells EF Core to skip the change tracker for a query. By default, every entity EF Core materialises is registered with the DbContext change tracker so EF can detect mutations and generate UPDATE statements. Change tracking has non-trivial memory and CPU overhead, especially when materialising hundreds or thousands of entities per request. For read-only queries, such as dashboards, reports, and API GET endpoints that do not need to write back, calling AsNoTracking() eliminates that overhead entirely. The rule of thumb: use AsNoTracking() on every read path in ASP.NET Core API handlers unless you intentionally plan to update and save the entity in the same request scope.
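A sketch of a read-only endpoint; AppDbContext, Orders, and OrderStatus are hypothetical names standing in for your model:

```csharp
app.MapGet("/orders", async (AppDbContext db) =>
    await db.Orders
        .AsNoTracking()                          // skip change tracking on this read-only path
        .Where(o => o.Status == OrderStatus.Open)
        .Select(o => new { o.Id, o.Total })      // projecting to a DTO also avoids
        .ToListAsync());                         // materialising full tracked entities
```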
What Is Minimal API in ASP.NET Core and Why Can It Be Faster Than Controller-Based APIs?
Minimal API, introduced in .NET 6 and significantly improved in .NET 8 and 10, maps HTTP endpoints directly to delegates or handler methods without routing through Controller base classes, ActionDescriptor, IActionInvoker, or ModelStateDictionary validation. This eliminates the full MVC middleware stack for simple endpoints. In benchmarks, minimal API endpoints consistently show lower overhead per request than equivalent controller-based endpoints because fewer middleware components execute in the pipeline. For microservices or high-throughput endpoints with simple input/output shapes, Minimal API is the better default. For complex enterprise scenarios with rich model binding, action filters, and view rendering, traditional controllers remain appropriate.
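A minimal sketch of the shape being described; the delegate is invoked directly by endpoint routing, with no controller activation, action descriptors, or filter pipeline in between:

```csharp
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Each endpoint is just a delegate bound to a route.
app.MapGet("/health", () => Results.Ok(new { status = "up" }));
app.MapPost("/orders", (OrderDto dto) => Results.Created($"/orders/{dto.Id}", dto));

app.Run();

// Hypothetical request/response shape for illustration.
public record OrderDto(int Id, decimal Total);
```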
Intermediate Performance Interview Questions
How Does IAsyncEnumerable<T> Improve Streaming Performance in ASP.NET Core APIs?
IAsyncEnumerable<T> enables a producer-consumer model where the server streams results to the client incrementally rather than materialising the entire dataset into memory before serializing. In an ASP.NET Core Minimal API or controller action that returns IAsyncEnumerable<T>, the JSON serialiser (System.Text.Json) writes each item to the response stream as it is produced. This means time-to-first-byte is dramatically lower for large datasets, memory pressure on the server is significantly reduced (you do not buffer the entire result set), and clients can begin consuming data sooner. It is especially valuable for paginated exports, EF Core query results over large tables, and event-stream APIs. The key requirement is that the response must not have been started (no headers sent), and you need a client that can consume chunked/streamed JSON.
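A streaming-endpoint sketch, assuming an EF Core context (AppDbContext and OrderRow are hypothetical); returning the IAsyncEnumerable directly lets System.Text.Json write each item as it is produced:

```csharp
app.MapGet("/export", (AppDbContext db) => StreamRows(db));

static async IAsyncEnumerable<OrderRow> StreamRows(AppDbContext db)
{
    // AsAsyncEnumerable keeps the database reader open and yields row by row,
    // so the full result set is never buffered in server memory.
    await foreach (var row in db.Orders
                       .AsNoTracking()
                       .Select(o => new OrderRow(o.Id, o.Total))
                       .AsAsyncEnumerable())
    {
        yield return row;
    }
}

public record OrderRow(int Id, decimal Total);
```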
What Is Output Caching in ASP.NET Core and How Does It Differ From Response Caching?
Response caching works by instructing the client (browser) and intermediate proxies (CDNs, NGINX) to cache responses via HTTP cache-control headers. It is completely client-side and proxy-side; the server still processes subsequent requests if a proxy decides not to cache or the cache has expired. Output caching (introduced in ASP.NET Core 7) is server-side in-memory caching of the full response bytes. When a cached response exists for a matching request, ASP.NET Core short-circuits the entire pipeline and returns the cached response without ever reaching your endpoint logic. Output caching is controlled entirely by your application, supports custom eviction policies, tag-based invalidation, and does not depend on HTTP cache headers. For high-read API endpoints, output caching delivers superior throughput because it eliminates request processing for repeated identical queries.
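An output-caching sketch; GetProductsAsync is a hypothetical handler, and the policy values are illustrative:

```csharp
builder.Services.AddOutputCache(options =>
{
    options.AddPolicy("Products", policy => policy
        .Expire(TimeSpan.FromSeconds(30))
        .SetVaryByQuery("page")   // cache one entry per page value
        .Tag("products"));        // enables tag-based invalidation
});

var app = builder.Build();
app.UseOutputCache();

app.MapGet("/products", GetProductsAsync).CacheOutput("Products");

// After a write, evict every cached entry tagged "products":
// await outputCacheStore.EvictByTagAsync("products", cancellationToken);
```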
What Is a ThreadPool Starvation Scenario in ASP.NET Core and How Do You Diagnose It?
Thread pool starvation occurs when all available thread pool threads are blocked on synchronous I/O or synchronous waits (.Result, .Wait(), Thread.Sleep()), and new incoming requests cannot be scheduled because no threads are free to process them. Symptoms include increasing request queue length, rising P99 latency, and eventual HTTP 503 errors under load even though CPU is not saturated. Diagnosis: use dotnet-counters to watch ThreadPool Queue Length, ThreadPool Completed Work Item Count, and ThreadPool Thread Count. A queue that grows while the thread count plateaus near your ThreadPool.GetMinThreads() value is a starvation signal. The fix is to eliminate all synchronous blocking in the async call chain: use await throughout, never call .Result or .GetAwaiter().GetResult() on the hot path, and avoid Task.Run wrappers around I/O as a false fix.
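A minimal illustration of the anti-pattern and the fix; Task.Delay stands in for real I/O such as a database or HTTP call:

```csharp
using System;
using System.Threading.Tasks;

// Anti-pattern: blocking a thread-pool thread until an async operation finishes.
static string FetchBlocking() => FetchAsync().Result; // ties up a thread and risks deadlock

// Fix: stay asynchronous end to end so the thread returns to the pool while waiting.
static async Task<string> FetchNonBlockingAsync() => await FetchAsync();

static async Task<string> FetchAsync()
{
    await Task.Delay(10); // stands in for real I/O (database or HTTP call)
    return "payload";
}
```

Under a small load the blocking version merely wastes a thread per call; at scale, every blocked thread is one fewer thread available to serve new requests.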
How Does Rate Limiting in ASP.NET Core Protect Performance Under Load?
ASP.NET Core's built-in rate limiting middleware (Microsoft.AspNetCore.RateLimiting, GA in .NET 7) implements four algorithms: Fixed Window, Sliding Window, Token Bucket, and Concurrency Limiter. Rate limiting protects your application's performance by shedding excess load before it exhausts resources such as thread pool threads, database connections, or downstream API quotas. Without rate limiting, a sudden traffic spike can cause cascading failures across the entire application. The Concurrency Limiter is particularly useful for protecting expensive endpoints: it caps the number of in-flight requests to a specific handler, queuing or rejecting overflow rather than letting them all compete for the same database connections. For enterprise APIs, applying per-user or per-client rate limits prevents a single abusive caller from degrading service for all other users. You can find a detailed rate limiting implementation in the Coding Droplets GitHub repo.
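A concurrency-limiter sketch for an expensive endpoint; the policy name, limits, and GenerateReportAsync handler are hypothetical:

```csharp
using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Cap in-flight requests to the expensive handler; queue a small overflow
    // instead of letting every request compete for database connections.
    options.AddConcurrencyLimiter("reports", o =>
    {
        o.PermitLimit = 20;
        o.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        o.QueueLimit = 50;
    });
});

var app = builder.Build();
app.UseRateLimiter();

app.MapGet("/reports", GenerateReportAsync).RequireRateLimiting("reports");
```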
What Are Compiled Queries in EF Core and When Do They Deliver Meaningful Gains?
Every time EF Core executes a LINQ query, it translates the expression tree to SQL, compiles the result, and caches it. For simple queries, this overhead is small. For complex queries with many joins, filters, and projections, the translation and compilation cost can be measurable, especially at high request rates. EF.CompileQuery() and EF.CompileAsyncQuery() pre-compile the query once and reuse the compiled delegate on every subsequent call, eliminating the translation overhead from the hot path. The payoff is most significant for queries that run hundreds or thousands of times per second. The trade-off is that compiled queries lose the ability to build expressions dynamically: parameters must be passed in at call time, and the query shape must be fixed at compilation time. Use compiled queries on hot read paths; leave dynamic queries uncompiled.
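A compiled-query sketch; AppDbContext and Order are hypothetical names for your context and entity:

```csharp
public static class OrderQueries
{
    // Compiled once, at static initialisation; the delegate is reused on every call,
    // so no expression-tree translation happens on the hot path.
    private static readonly Func<AppDbContext, int, Task<Order?>> GetOrderById =
        EF.CompileAsyncQuery((AppDbContext db, int id) =>
            db.Orders.AsNoTracking().FirstOrDefault(o => o.Id == id));

    // Parameters are supplied at call time; the query shape is fixed.
    public static Task<Order?> FindAsync(AppDbContext db, int id) => GetOrderById(db, id);
}
```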
What Is the Role of PipeReader and PipeWriter in High-Throughput ASP.NET Core APIs?
System.IO.Pipelines provides a high-performance, allocation-minimising API for reading and writing streams of data. Unlike typical Stream code, which allocates and copies intermediate byte arrays on every read, PipeReader works with memory segments from a pooled MemoryPool<byte>, avoids copies, and supports zero-copy parsing of incoming data. ASP.NET Core's HTTP/2 and HTTP/3 implementations use pipelines internally. For scenarios where you need to parse a large request body, process binary protocols, or implement custom serialisation without Stream-based overhead, dropping down to PipeReader can eliminate significant GC pressure. It is an advanced API primarily relevant when profiling shows MemoryStream or Stream.ReadAsync allocations as a hot path in your application.
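A sketch of the canonical PipeReader consumption loop, parsing newline-delimited messages in place; ProcessLine is a hypothetical handler for each parsed message:

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

static async Task ConsumeAsync(PipeReader reader)
{
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;

        // Parse complete lines directly out of the pooled buffer: no byte[] per message.
        while (TryReadLine(ref buffer, out ReadOnlySequence<byte> line))
            ProcessLine(line);

        // Report what was consumed (parsed) and what was examined (everything),
        // so the pipe knows when more data is required.
        reader.AdvanceTo(buffer.Start, buffer.End);

        if (result.IsCompleted) break;
    }
    await reader.CompleteAsync();
}

static bool TryReadLine(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> line)
{
    SequencePosition? newline = buffer.PositionOf((byte)'\n');
    if (newline is null) { line = default; return false; }

    line = buffer.Slice(0, newline.Value);
    buffer = buffer.Slice(buffer.GetPosition(1, newline.Value)); // skip past the '\n'
    return true;
}

static void ProcessLine(ReadOnlySequence<byte> line) { /* hypothetical per-message handler */ }
```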
Advanced Performance Interview Questions
How Do You Profile and Diagnose High GC Pressure in an ASP.NET Core Application?
Excessive garbage collection pauses are among the most insidious performance problems in .NET because they can cause latency spikes without saturating CPU or I/O. Diagnosis follows a structured path: first, watch dotnet-counters for GC Heap Size, Gen 0 Collection Count, Gen 1 Collection Count, and Gen 2 Collection Count. A rising Gen 2 count under steady-state load indicates large objects or long-lived allocations escaping Gen 0. Next, capture an allocation trace using dotnet-trace collect --profile gc-verbose and analyse it in PerfView or speedscope. Common culprits in ASP.NET Core include: unnecessary string.Format or string concatenation in hot paths (use StringBuilder or string.Create), boxing value types in generic collections, MemoryStream copies, frequent async state machine heap allocations (mitigated by ValueTask), and large object heap allocations from arrays at or over 85,000 bytes. Fixing high GC pressure means adopting ArrayPool<T>, MemoryPool<T>, Span<T>, stackalloc for small buffers, and ObjectPool<T> for expensive-to-construct objects.
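A minimal ArrayPool<T> sketch of the rent/return discipline; the summing work is a trivial stand-in for real per-request buffer processing:

```csharp
using System;
using System.Buffers;

// Rent a reusable buffer from the shared pool instead of allocating per call.
static int SumChunk(ReadOnlySpan<byte> source)
{
    byte[] buffer = ArrayPool<byte>.Shared.Rent(source.Length); // may be larger than requested
    try
    {
        source.CopyTo(buffer);
        int sum = 0;
        for (int i = 0; i < source.Length; i++) sum += buffer[i]; // only use source.Length bytes
        return sum;
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer); // always return, or pooling degrades to allocation
    }
}
```

Note the two sharp edges: the rented array can be larger than requested, so always track the logical length, and a buffer that is never returned is simply a fresh allocation with extra steps.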
What Is NativeAOT in .NET 10 and What Are the Performance Trade-offs for ASP.NET Core APIs?
NativeAOT (Native Ahead-of-Time compilation) compiles a .NET application entirely to native machine code at publish time, eliminating the JIT compiler from the runtime path. The benefits for ASP.NET Core APIs are: significantly faster startup time (tens of milliseconds instead of hundreds), lower steady-state memory footprint (no JIT metadata overhead), and improved suitability for containerised microservices and serverless functions where cold start time is critical. The trade-offs are real: NativeAOT is incompatible with runtime reflection, dynamic code loading, and assemblies that use System.Reflection.Emit or Activator.CreateInstance with unknown types. Libraries that rely on reflection-based serialisation, such as older Newtonsoft.Json configurations, are not NativeAOT-compatible without source generators. ASP.NET Core Minimal API with System.Text.Json source generation is the recommended architecture for NativeAOT-published services. The correctness discipline required makes NativeAOT most appropriate for new greenfield services, not for migrating reflection-heavy existing applications.
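A sketch of the recommended NativeAOT-friendly serialisation setup: a source-generated System.Text.Json context wired into Minimal API, so no reflection is needed at runtime. ProductDto and AppJsonContext are hypothetical names:

```csharp
using System.Text.Json.Serialization;

// The source generator emits (de)serialisation code for these types at compile time.
[JsonSerializable(typeof(ProductDto))]
[JsonSerializable(typeof(ProductDto[]))]
internal partial class AppJsonContext : JsonSerializerContext { }

public record ProductDto(int Id, string Name);

// In Program.cs: point the Minimal API serialiser at the generated context.
builder.Services.ConfigureHttpJsonOptions(options =>
    options.SerializerOptions.TypeInfoResolverChain.Insert(0, AppJsonContext.Default));
```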
How Does HTTP/3 and QUIC Affect Performance in ASP.NET Core, and How Do You Enable It?
HTTP/3 runs over QUIC (originally "Quick UDP Internet Connections"), a transport protocol that eliminates the TCP handshake and the head-of-line blocking that affect HTTP/1.1 and HTTP/2. In high-latency network environments (mobile, intercontinental, lossy Wi-Fi), HTTP/3 significantly reduces connection establishment time because QUIC combines the transport and TLS handshakes into a single round-trip (0-RTT resumption allows subsequent connections to skip the handshake entirely). Kestrel shipped HTTP/3 support as a preview in .NET 6, fully supported it in .NET 7, and enables it by default on HTTPS endpoints in .NET 8. Configuring it explicitly requires setting ListenOptions.Protocols = HttpProtocols.Http1AndHttp2AndHttp3 in KestrelServerOptions and ensuring the server runs on a platform with QUIC support (Windows with MsQuic, or Linux with libmsquic). HTTP/3 does not universally improve performance for server-to-server calls on low-latency private networks; the benefit is most pronounced for last-mile client connections.
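A configuration sketch; the port is illustrative, and QUIC requires TLS, so UseHttps is mandatory on the endpoint:

```csharp
builder.WebHost.ConfigureKestrel(options =>
{
    options.ListenAnyIP(443, listenOptions =>
    {
        // Advertise all three protocols; HTTP/1.1 and HTTP/2 clients connect over TCP,
        // and HTTP/3-capable clients discover QUIC via the Alt-Svc response header.
        listenOptions.Protocols = HttpProtocols.Http1AndHttp2AndHttp3;
        listenOptions.UseHttps(); // QUIC requires TLS 1.3
    });
});
```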
What Is the ValueTask<T> vs Task<T> Trade-off in ASP.NET Core Performance?
Task<T> always allocates a heap object to represent the asynchronous operation, even when the operation completes synchronously (as in a cache hit). ValueTask<T> is a struct that can represent a synchronously completed result without allocating, making it zero-allocation in the hot path when the result is immediately available. The trade-off is that ValueTask<T> is single-awaitable: you cannot await it twice, store it in a list, or pass it to Task.WhenAll. Misuse of ValueTask (double-awaiting, caching, .Result access) causes hard-to-diagnose bugs. The guidance from the .NET performance team: use Task<T> as the default for interface contracts and public APIs; use ValueTask<T> in sealed hot-path implementations where profiling shows the allocation cost of Task<T> is measurable, for example in a caching layer where 90% of calls are synchronous cache hits. Do not apply ValueTask<T> speculatively without profiling evidence.
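A minimal sketch of the caching-layer pattern described above, written with local functions for brevity; Task.Delay stands in for the real data source:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var prices = new ConcurrentDictionary<string, decimal>();

// Hot path returns ValueTask<decimal>: a cache hit completes synchronously
// with no Task allocation; only a miss falls back to a real Task.
ValueTask<decimal> GetPriceAsync(string sku)
{
    if (prices.TryGetValue(sku, out var hit))
        return new ValueTask<decimal>(hit);       // synchronous result, zero allocation
    return new ValueTask<decimal>(LoadAsync(sku)); // genuinely async path
}

async Task<decimal> LoadAsync(string sku)
{
    await Task.Delay(10);                          // stands in for a database call
    var price = 9.99m;
    prices[sku] = price;
    return price;
}
```

Each returned ValueTask must be awaited exactly once; if a caller needs to store or fan out the operation, convert it with AsTask() first.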
How Do You Implement Connection Pooling for HttpClient in ASP.NET Core to Avoid Socket Exhaustion?
Using new HttpClient() in each request handler creates a new connection pool and a new set of TCP sockets. When the client is disposed, its underlying HttpMessageHandler is disposed too, but the closed TCP connections linger in the TIME_WAIT state for up to 240 seconds, preventing port reuse. Under load, this exhausts the local ephemeral port range and causes SocketException: Address already in use. The solution is IHttpClientFactory, which manages named or typed HttpClient instances with shared, pooled HttpMessageHandler instances that are recycled on a configurable interval (default: 2 minutes) to respect DNS TTL changes. IHttpClientFactory integrates with Polly for retry, circuit breaker, and timeout policies. In ASP.NET Core services, register typed clients with services.AddHttpClient<T>() and inject them via the constructor; never new HttpClient() in production code. You can find a request correlation implementation pattern in the Coding Droplets GitHub repo.
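A typed-client sketch; CatalogClient and the base address are hypothetical, and the handler lifetime shown is the factory's default made explicit:

```csharp
// Registration: all injected CatalogClient instances share a pooled handler chain.
builder.Services.AddHttpClient<CatalogClient>(client =>
{
    client.BaseAddress = new Uri("https://catalog.internal.example");
    client.Timeout = TimeSpan.FromSeconds(10);
})
.SetHandlerLifetime(TimeSpan.FromMinutes(2)); // recycle handlers so DNS changes are honoured

// The typed client receives a factory-managed HttpClient via constructor injection.
public class CatalogClient(HttpClient http)
{
    public Task<string> GetProductJsonAsync(int id) =>
        http.GetStringAsync($"/products/{id}");
}
```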
What Is the Span<T> and Memory<T> API and Why Is It Critical for Performance-Sensitive .NET Code?
Span<T> is a stack-only ref struct that represents a contiguous region of memory, whether on the managed heap, the stack, or in native memory, without copying it. Operations such as slicing a substring, parsing a byte buffer, or splitting a string can be performed using Span<T> with zero allocation, compared to string.Substring(), which always allocates a new string. Memory<T> is the heap-compatible counterpart of Span<T> for use in async methods, where stack-only ref structs cannot cross await boundaries. In ASP.NET Core, Span<T> and Memory<T> appear throughout the framework: request body parsing, header parsing, URL routing, and JSON serialisation all use these types internally to eliminate allocations in the hot request path. Senior developers are expected to understand when to use Span<T> over string slicing, when to prefer Memory<T> in async contexts, and how to use MemoryMarshal for advanced interop scenarios.
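A small illustration of allocation-free slicing, parsing a hypothetical "key=value" setting; the only allocation is the final key string returned to the caller:

```csharp
using System;

// Slicing a span yields a view over the original characters, not a copy.
static (string Key, int Value) ParsePair(ReadOnlySpan<char> input)
{
    int eq = input.IndexOf('=');
    ReadOnlySpan<char> key = input[..eq];          // view, no substring allocated
    ReadOnlySpan<char> value = input[(eq + 1)..];  // view, no substring allocated

    // int.Parse has a ReadOnlySpan<char> overload, so no string is created for the number.
    return (key.ToString(), int.Parse(value));
}
```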
How Does the Request Timeout Middleware in ASP.NET Core 8+ Improve Resilience and Performance?
RequestTimeoutMiddleware, built into ASP.NET Core 8 without a third-party library, associates a CancellationToken with each request and cancels it if the configured timeout expires. This is critical for performance under load because without a timeout, a slow upstream database query or third-party API call can hold a thread-pool thread indefinitely. Thread-pool starvation then cascades into full application degradation. By adding app.UseRequestTimeouts() and configuring timeouts per endpoint or globally, you guarantee that no single slow request monopolises resources beyond your defined SLA. The middleware integrates with HttpContext.RequestAborted: any downstream async code that respects CancellationToken will be terminated cleanly. You can explore a full implementation at the Coding Droplets request timeout repo.
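A configuration sketch; the timeout values, policy name, and GenerateReportAsync handler are illustrative placeholders:

```csharp
builder.Services.AddRequestTimeouts(options =>
{
    // Global default applied to every endpoint without its own policy.
    options.DefaultPolicy = new RequestTimeoutPolicy
    {
        Timeout = TimeSpan.FromSeconds(10),
        TimeoutStatusCode = StatusCodes.Status504GatewayTimeout
    };
    // A named policy for endpoints with a known-slower SLA.
    options.AddPolicy("slow-reports", TimeSpan.FromSeconds(60));
});

var app = builder.Build();
app.UseRequestTimeouts();

app.MapGet("/reports", GenerateReportAsync).WithRequestTimeout("slow-reports");
```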
Prefer a one-time tip? Buy us a coffee; every bit helps keep the content coming!
What Interviewers Are Really Testing
When a senior interviewer asks performance questions, they are rarely fishing for memorised API names. They want to see three things: the ability to reason from first principles about where time is spent (CPU, I/O, GC, network), the discipline to profile before optimising, and the judgment to know which trade-offs are worth making in production. The best answers are structured as: identify the bottleneck class, explain the mechanism, name the tooling to measure it, then describe the fix and its trade-offs. That mental model, not memorised trivia, is what earns senior .NET roles in 2026.
For tutorials and production-ready implementations covering these topics, visit Coding Droplets or explore the full code repository on GitHub.




