What's New in .NET 10 AI Integration: Microsoft.Extensions.AI and IChatClient – What Every ASP.NET Core Team Should Know
A practical guide for enterprise .NET teams adopting the new AI abstraction layer in 2026
The arrival of Microsoft.Extensions.AI as a first-class library in .NET 10 marks one of the most significant shifts in how .NET teams will wire AI capabilities into their applications. Rather than coupling your ASP.NET Core services directly to a specific AI vendor SDK, the library introduces a unified abstraction layer – led by the IChatClient and IEmbeddingGenerator interfaces – that treats AI providers the same way the .NET ecosystem treats logging, caching, and configuration. If your team is evaluating where AI fits in your architecture or wondering whether this library is production-ready, this breakdown will give you the honest picture.
Want implementation-ready .NET source code you can drop straight into your project? Join Coding Droplets on Patreon for exclusive tutorials, premium code samples, and early access to new content. https://www.patreon.com/CodingDroplets
What Is Microsoft.Extensions.AI?
Microsoft.Extensions.AI is a set of core abstractions and utilities that give .NET applications a consistent, vendor-agnostic way to interact with AI services. It ships as two NuGet packages: Microsoft.Extensions.AI.Abstractions, which defines the interfaces, and Microsoft.Extensions.AI, which adds DI integration, middleware pipeline support, OpenTelemetry wiring, and automatic function-calling tooling.
The central interface is IChatClient. Any LLM provider – OpenAI, Azure OpenAI, GitHub Models, Ollama, or a custom local model – can implement IChatClient and plug into your application without changing your consuming code. This is the same philosophy that made ILogger, IMemoryCache, and IDistributedCache so durable: code against the abstraction, swap the implementation.
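To make the swap-the-implementation idea concrete, here is a minimal sketch. The OllamaChatClient type, the localhost endpoint, and the model name are illustrative assumptions (a local-model provider package, not part of the core abstractions) – any other IChatClient implementation registers the same way.

```csharp
using Microsoft.Extensions.AI;

var builder = WebApplication.CreateBuilder(args);

// Register any IChatClient implementation; consumers never see the vendor type.
builder.Services.AddChatClient(
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3"));

var app = builder.Build();

// The endpoint depends only on the abstraction, so the provider can change
// at the registration site without touching this code.
app.MapGet("/ask", async (IChatClient chat, string q) =>
    (await chat.GetResponseAsync(q)).Text);

app.Run();
```

Swapping to Azure OpenAI, GitHub Models, or a custom provider means changing only the single AddChatClient line.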
In .NET 10, this library graduated from preview to a supported release, and it now integrates cleanly into the ASP.NET Core DI container, the Blazor Web App template, and the OpenTelemetry pipeline.
The Core Interfaces You Need to Know
IChatClient: The Heart of the Abstraction
IChatClient defines a client responsible for interacting with any AI service that provides chat or completion capabilities. It exposes two primary methods: GetResponseAsync, which returns a complete response in a single call, and GetStreamingResponseAsync, which yields incremental streaming updates as an IAsyncEnumerable. (Earlier previews named these CompleteAsync and CompleteStreamingAsync.) Both methods accept multi-modal content – text, images, and audio – through the ChatMessage type.
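A short sketch of both call shapes, assuming client is any IChatClient implementation resolved from DI:

```csharp
using Microsoft.Extensions.AI;

static async Task DemoAsync(IChatClient client)
{
    // One complete response in a single call.
    ChatResponse response = await client.GetResponseAsync(
        new ChatMessage(ChatRole.User, "Summarise this release in one sentence."));
    Console.WriteLine(response.Text);

    // Incremental streaming via IAsyncEnumerable<ChatResponseUpdate>.
    await foreach (ChatResponseUpdate update in
        client.GetStreamingResponseAsync("Tell me a short story."))
    {
        Console.Write(update.Text);
    }
}
```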
What makes this valuable at the enterprise level is composability. You register your concrete AI provider, then chain middleware around it for caching, rate limiting, telemetry, or content filtering – the same pattern you use with HTTP clients and their delegating handlers today.
IEmbeddingGenerator: Unified Embedding Support
The IEmbeddingGenerator<TInput, TEmbedding> interface handles generating vector embeddings from text or other inputs. This matters for semantic search, Retrieval-Augmented Generation (RAG) pipelines, and similarity-based recommendation systems. Like IChatClient, the interface is provider-agnostic – use Azure OpenAI, Ollama, or any custom provider without rewriting your pipeline logic.
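A sketch of the common case – embedding a piece of text for a semantic search or RAG index – assuming the generator was registered via DI with the usual string-to-float-vector type arguments:

```csharp
using Microsoft.Extensions.AI;

static async Task<float[]> EmbedAsync(
    IEmbeddingGenerator<string, Embedding<float>> generator, string text)
{
    // GenerateAsync accepts a batch of inputs and returns one embedding per input.
    GeneratedEmbeddings<Embedding<float>> embeddings =
        await generator.GenerateAsync(new[] { text });

    // The vector is what you store in your vector index or compare by cosine similarity.
    return embeddings[0].Vector.ToArray();
}
```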
IImageGenerator: The Experimental Addition
The IImageGenerator interface was also introduced in this release, though it carries an experimental tag. It handles text-to-image generation through a consistent GenerateAsync method and composable options for image size and format. Enterprise teams should treat this as a preview signal – it is available, but not yet recommended for production without additional stability evaluation.
How the Middleware Pipeline Works
One of the most compelling aspects of Microsoft.Extensions.AI is that it mirrors the ASP.NET Core middleware model. Using ChatClientBuilder, you construct a pipeline of delegating handlers that wrap your underlying provider. Out of the box, Microsoft ships handlers for OpenTelemetry tracing (UseOpenTelemetry), response caching (UseDistributedCache), and logging. Your team can add custom middleware – audit logging, PII redaction, cost tracking, or prompt injection detection – without touching the core AI call.
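A pipeline composition sketch, assuming innerClient is your concrete provider client and a distributed cache is already configured:

```csharp
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Caching.Distributed;

static IChatClient BuildPipeline(IChatClient innerClient, IDistributedCache cache)
{
    return new ChatClientBuilder(innerClient)
        .UseDistributedCache(cache)   // serve repeated identical requests from cache
        .UseOpenTelemetry()           // emit a tracing span per completion
        .Build();                     // innermost handler is the provider itself
}
```

Order matters the same way it does in ASP.NET Core middleware: here the cache is checked before a span-producing call ever reaches the provider.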
This architectural decision is important for enterprise teams because it separates the cross-cutting concerns of AI usage (observability, policy enforcement, cost governance) from the business logic that uses AI. The same pattern that keeps your HTTP logic clean now keeps your AI calls clean.
What Changed for ASP.NET Core Specifically
Dependency Injection Integration
IChatClient and IEmbeddingGenerator register through the standard IServiceCollection. You call AddChatClient or AddEmbeddingGenerator in your service registration and resolve them via constructor injection the same way you resolve ILogger or IMemoryCache. No static factory methods, no thread-safety surprises from shared instances, no vendor SDK cluttering your service registrations.
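As a registration sketch: AddChatClient returns a ChatClientBuilder, so middleware chains directly at the registration site. The endpoint URL and deployment name are placeholders; AzureOpenAIClient and DefaultAzureCredential come from the Azure SDK packages, and AsIChatClient from the corresponding Microsoft.Extensions.AI adapter package.

```csharp
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;

// Registration: chain middleware on the returned builder.
builder.Services.AddChatClient(sp =>
        new AzureOpenAIClient(
                new Uri("https://example.openai.azure.com"),
                new DefaultAzureCredential())
            .GetChatClient("gpt-4o")       // deployment name is a placeholder
            .AsIChatClient())
    .UseDistributedCache()
    .UseOpenTelemetry();

// Consumption: plain constructor injection, no vendor types in sight.
public sealed class SummaryService(IChatClient chat)
{
    public async Task<string> SummariseAsync(string text) =>
        (await chat.GetResponseAsync($"Summarise: {text}")).Text;
}
```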
This matters operationally: you can swap your underlying AI provider in a single configuration change without touching any code that consumes the abstraction.
OpenTelemetry and Diagnostics Integration
The UseOpenTelemetry extension wires AI calls into your existing distributed tracing pipeline. Chat completions emit spans with token usage, model metadata, and latency. For teams already running OpenTelemetry – which has been a first-class citizen in ASP.NET Core since .NET 8 – this means AI observability slots into Grafana, Jaeger, Azure Monitor, or any OTLP-compatible backend with no additional instrumentation effort.
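A wiring sketch, assuming the standard OpenTelemetry .NET hosting packages are installed. "MyApp.AI" is an arbitrary source name we choose for illustration; the only requirement is that it matches on both sides.

```csharp
using Microsoft.Extensions.AI;
using OpenTelemetry.Trace;

// innerClient is your concrete provider client.
builder.Services.AddChatClient(sp => innerClient)
    .UseOpenTelemetry(sourceName: "MyApp.AI");

builder.Services.AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation()
        .AddSource("MyApp.AI")      // pick up the AI completion spans
        .AddOtlpExporter());        // Grafana, Jaeger, Azure Monitor, etc.
```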
This is the gap that most enterprise AI integrations built before .NET 10 had to fill manually, often inconsistently.
Automatic Function Tool Invocation
Microsoft.Extensions.AI includes built-in support for automatic function calling, where the AI model can request that a defined function be executed and the result fed back into the conversation. This is registered as middleware on the ChatClientBuilder. Enterprise teams building agentic workflows – where the model drives orchestration, calls internal APIs, or queries data sources – benefit from a standardised invocation model rather than building bespoke loop logic for each provider's function calling format.
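A sketch of the pattern: GetOrderStatus is a hypothetical internal function exposed as a tool, and innerClient is your concrete provider client. The UseFunctionInvocation middleware runs the requested tool and loops the result back into the conversation automatically.

```csharp
using System.ComponentModel;
using Microsoft.Extensions.AI;

// A hypothetical internal function; the descriptions are surfaced to the model.
[Description("Gets the current status of an order.")]
static string GetOrderStatus([Description("The order id")] int orderId) =>
    orderId == 42 ? "Shipped" : "Processing";

IChatClient client = new ChatClientBuilder(innerClient)
    .UseFunctionInvocation()   // executes requested tools, feeds results back
    .Build();

var response = await client.GetResponseAsync(
    "What is the status of order 42?",
    new ChatOptions { Tools = [AIFunctionFactory.Create(GetOrderStatus)] });
```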
Is This the Same as Semantic Kernel?
This is the most common question teams ask when evaluating Microsoft.Extensions.AI. The answer is that they serve different levels of the stack.
Microsoft.Extensions.AI provides the low-level, unopinionated plumbing – interfaces, DI wiring, middleware, and basic tooling. It is deliberately minimal and composable.
Semantic Kernel is a higher-level orchestration framework that builds on these abstractions. It handles agent orchestration, memory management, planner pipelines, and complex multi-step AI workflows. Semantic Kernel now implements IChatClient internally, which means teams investing in it get a consistent contract at both levels.
If your team needs basic AI integration in an ASP.NET Core API – chat completion, embeddings, function calling – Microsoft.Extensions.AI is often sufficient and preferable for its simplicity. If you are building complex agentic systems with memory, multi-agent coordination, or dynamic planning, Semantic Kernel adds the orchestration layer you need on top.
What to Adopt Now Versus What to Watch
Adopt Now
- Register IChatClient through DI for any new AI integration work. Do not write code that depends directly on OpenAIClient or AzureOpenAIClient at the consuming layer.
- Wire OpenTelemetry early. AI calls are expensive in both latency and cost – having traces from day one prevents guesswork when performance issues appear in production.
- Use ChatClientBuilder middleware for response caching. Many AI workloads are deterministic for identical inputs. Caching reduces cost and latency without sacrificing correctness.
- Consider prompt logging middleware in lower environments. Understanding what your application actually sends to AI models is critical for compliance, debugging, and cost control.
Watch, Not Yet Adopt
- IImageGenerator remains experimental. Track the release notes but avoid production dependencies until it reaches stable.
- The automatic function tool invocation middleware is production-capable but requires careful prompt engineering to avoid runaway tool calls. Establish token and iteration budgets before enabling it in latency-sensitive services.
- The IChatClient implementation for Ollama-based local models is maturing fast. Teams with data residency constraints should revisit this quarterly – local inference via Microsoft.Extensions.AI is close to being a first-class option for regulated workloads.
Migration Impact for Existing AI Integrations
Teams that integrated directly with the Azure OpenAI SDK or the OpenAI .NET SDK before Microsoft.Extensions.AI reached stable have a straightforward migration path. Both SDKs now expose AsIChatClient() extension methods that wrap the existing client in the standard abstraction. This means you can wrap your existing SDK client, register it as IChatClient, and begin adopting the middleware pipeline without a full rewrite. The migration can be done incrementally, service by service.
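A migration sketch for the OpenAI SDK case (the Azure OpenAI path is analogous). The model name is a placeholder and apiKey is assumed to come from your configuration:

```csharp
using Microsoft.Extensions.AI;
using OpenAI;

// Before: services injected OpenAIClient (or its ChatClient) directly.
// After: wrap the same client and register the abstraction instead.
IChatClient chatClient = new OpenAIClient(apiKey)
    .GetChatClient("gpt-4o-mini")
    .AsIChatClient();

builder.Services.AddChatClient(chatClient);
```

Consuming services then switch their constructor dependency from the SDK type to IChatClient, one service at a time.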
Is What's New Enough to Justify Upgrading to .NET 10?
Microsoft.Extensions.AI is part of a broader case for .NET 10, which is an LTS release supported through November 2028. The AI integration story is one of several reasons platform teams are accelerating upgrades, alongside the JIT and GC performance improvements covered in .NET 10 Runtime Performance: JIT, GC, and NativeAOT Changes Enterprise Teams Should Know and the new ASP.NET Core 10 API features covered in What's New in ASP.NET Core 10: API Features Every Enterprise .NET Team Should Adopt.
If your team is building AI-enabled features in 2026, the abstraction quality and DI integration in Microsoft.Extensions.AI are a strong argument for being on .NET 10 rather than continuing to patch raw provider SDK calls into applications on earlier versions.
Prefer a one-time tip? Buy us a coffee – every bit helps keep the content coming!
Frequently Asked Questions
What is Microsoft.Extensions.AI in .NET 10?
Microsoft.Extensions.AI is a set of official .NET libraries that provide unified, vendor-agnostic abstractions – primarily IChatClient and IEmbeddingGenerator – for integrating AI services into .NET applications. It allows ASP.NET Core teams to register AI clients via DI, compose middleware pipelines for telemetry and caching, and swap providers without changing consuming code.
How does IChatClient differ from calling the OpenAI SDK directly?
Calling the OpenAI SDK directly couples your code to a specific vendor. IChatClient is a provider-neutral interface that any AI SDK can implement. Your services depend on IChatClient; the concrete implementation – OpenAI, Azure OpenAI, Ollama, or a custom provider – is resolved from DI at startup. This makes testing, provider migration, and middleware composition significantly simpler.
Is Microsoft.Extensions.AI production-ready in .NET 10?
The core interfaces (IChatClient and IEmbeddingGenerator) and the DI/middleware pipeline are stable and production-ready as of .NET 10. The IImageGenerator interface is still marked experimental. The OpenTelemetry middleware and caching middleware are also stable and recommended for production use.
Does Microsoft.Extensions.AI replace Semantic Kernel?
No. They operate at different levels of abstraction. Microsoft.Extensions.AI provides the low-level AI client plumbing. Semantic Kernel is a higher-level orchestration framework for agents, memory, and multi-step AI workflows. Semantic Kernel now implements IChatClient internally, so the two layers complement each other rather than compete.
How do I migrate existing OpenAI or Azure OpenAI SDK usage to IChatClient?
Both the OpenAI .NET SDK and the Azure OpenAI SDK provide AsIChatClient() extension methods. Wrap your existing SDK client with this method, register the result as IChatClient in DI, and update your consuming services to depend on IChatClient instead of the concrete SDK type. The migration can be done incrementally and does not require rewriting your AI logic.
Can I use IChatClient with local models like Ollama?
Yes. The Ollama .NET client library implements IChatClient, making it possible to point your application at a local Ollama instance rather than a cloud API. This is particularly useful for development environments, testing, and data-residency-constrained workloads. Cloud and local providers are swappable through DI configuration alone.
What observability does Microsoft.Extensions.AI provide out of the box?
Using the UseOpenTelemetry extension on ChatClientBuilder, AI completions emit OpenTelemetry spans that include model name, token usage, and latency. These traces flow into any OTLP-compatible backend – Grafana, Jaeger, Azure Monitor, or Datadog – without additional instrumentation. This is the most operationally significant feature for teams running AI in production.





