Azure Weekly CW23: Build 2026 Special

Build 2026 in one post

There is way too much from Build to cover every announcement, so this post is a curated walkthrough of the things I think actually matter for organisations building on Azure. For the full firehose, the Build blog has it all.

The framing of Build 2026 was about making AI real for an organisation. Microsoft kept coming back to the same idea: more compute, better data, better evaluations, better outcomes. Most of the announcements aren’t standalone features, they’re another step in that direction.

I’ve grouped the announcements by area.

Compute

Cobalt 200 ARM VMs (preview)

Second generation of Microsoft’s custom ARM-based silicon. Compared to Cobalt 100: 50% CPU performance gains, ~20% remote storage IOPS improvement, ~15% higher network bandwidth, up to 128 vCPUs, very large caches, memory encryption on by default. Real-workload numbers were even better: 135% improvement on cloud databases, 80% on caching workloads, 40% on web servers.

Available in general-purpose (2:1 and 4:1 memory:CPU), memory-optimised (8:1 and 16:1), and storage-optimised (8:1 with larger local NVMe). Look for the P-variant.

ARM is still the right answer when cost-to-performance matters more than the last percent of single-thread x86 throughput.

Lasv5 and Laosv5 storage-optimised VMs (preview)

Built on 5th-gen AMD EPYC. The O variant carries up to 138 TiB of local storage (vs ~30 TiB on the non-O variant) and very high network throughput for remote storage. Roughly 35% CPU performance improvement over the previous generation. Useful if you’ve got storage-heavy workloads that previously had to scale out because no single VM could hold the data.

Azure Linux 4.0 (preview) and the split

Already covered briefly in CW20-21, but re-announced at Build: Azure Linux is now split in two.

Azure Linux 4 (preview): general-purpose OS for Azure VMs (and a WSL distro). Fedora-derived, free, kernel optimised for Azure infrastructure.
Azure Container Linux (GA): based on the immutable Flatcar project, container hosts only.

If you’ve been using Azure Linux on VMs, you stay on Azure Linux. If you’ve been using it as a container OS, you move to Azure Container Linux.

VMSS Flexible: health-aware automatic OS image upgrades

Virtual Machine Scale Sets with Flexible orchestration can now do automatic OS upgrades with health-aware progression. It checks each instance’s health before moving to the next, rather than just rolling and hoping. Needs the guest OS health extension installed. This should be the default for production scale sets.
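
The health-aware progression described above can be sketched roughly as follows. This is an illustration of the idea only, not how VMSS implements it: the Instance class, the upgrade and health-check callables and the simulated failure are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    name: str
    image_version: str
    healthy: bool = True  # simulated health signal (hypothetical)


def rolling_upgrade(
    instances: List[Instance],
    target_version: str,
    upgrade: Callable[[Instance, str], None],
    is_healthy: Callable[[Instance], bool],
) -> List[str]:
    """Upgrade one instance at a time and stop at the first unhealthy one.

    Returns the names of instances that were upgraded and passed the check.
    """
    upgraded = []
    for instance in instances:
        upgrade(instance, target_version)
        if not is_healthy(instance):
            # Halt the rollout so the remaining instances keep the old image.
            break
        upgraded.append(instance.name)
    return upgraded


def fake_upgrade(instance: Instance, version: str) -> None:
    # Simulation: "vm-1" comes back unhealthy after the upgrade.
    instance.image_version = version
    instance.healthy = instance.name != "vm-1"


fleet = [Instance(f"vm-{i}", "1.0") for i in range(4)]
done = rolling_upgrade(fleet, "2.0", fake_upgrade, lambda inst: inst.healthy)
print(done)  # ['vm-0']
```

In this simulation the rollout stops at vm-1, so vm-2 and vm-3 never receive the new image. That is the difference from rolling and hoping.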

Azure Infrastructure Resiliency Manager (preview)

A new manager that combines availability zones, Azure Advisor, Chaos Studio, Azure Monitor and Copilot to give you end-to-end resiliency recommendations and tests. It’s a wrapper over things that already exist, but having a unified surface is useful. You don’t need to remember which tool to check for which question.

Confidential TDX VM live migration (preview)

Live migration (zero-downtime VM moves during platform maintenance) now works for confidential TDX VMs. Previously this protection class forced you to take the VM down during host maintenance. With live migration, you no longer pay an availability cost for encryption-in-use.

Azure Functions: a lot of updates

Functions had one of the bigger sets of announcements. The shortlist:

Node.js 24 (GA), Linux and Windows.
Go (preview), first-class Go support.
Serverless agents runtime on Flex Consumption (preview), event-driven agents on cheap hosting.
Native Grafana dashboards (GA) for function app health and performance.
Logic Apps connector ecosystem (preview), Functions can now consume the same ~1,400 connectors as Logic Apps, both as triggers and operations.
MCP apps (GA), server-rendered HTML interfaces returned through MCP. The model gets a real graphical surface instead of a text response.
AI assistant plugins (preview), context for Copilot, Claude Code, Codex and friends so they generate better Functions code.
Rolling upgrades (GA) for Flex Consumption.

The big one: durable functions on consumption (from CW19) plus serverless agents on Flex now form a cheap, stateful, agent-capable serverless story end-to-end.

Logic Apps: codeful, agentic and as-a-service

Call Foundry agents directly from a Logic App (preview). Drop an agent into a deterministic workflow where you need judgment.
Knowledge-as-a-service (preview). Ingest, chunk, embed and retrieve documents from inside a Logic App. RAG without writing a pipeline.
New automation SKU (preview). Combines Logic Apps with Foundry-hosted agents in one experience. Describe what you want, it wires up agents, models and knowledge, exposes it as MCP, billing is consumption.
Codeful workflows (preview) for Logic App Standard. Write workflows in code while keeping connector access.
Connector namespaces (preview). A common programming model so apps and AI agents talk to one place. The namespace handles auth, state, retries, and exposes an MCP server.
Workflows as MCP servers (GA). Any existing workflow becomes discoverable to AI agents.

AKS

AKS Anyscale (preview): a managed Ray platform on AKS. Spin up large Python/ML workloads (training, tuning, inferencing) using the Anyscale Kubernetes operator.
Managed system node pools for AKS Automatic (GA): system components keep themselves current.
Kubernetes Fleet Management for Arc-enabled clusters (GA): workload placement and visibility across any CNCF-compliant cluster you’ve Arc-enabled.

Container Apps

Defender for Cloud posture (preview): serverless container posture management and attack-path detection now extend to Container Apps.
Container Apps Sandboxes (preview): ephemeral environments with microVM-based kernel isolation and full state durability. Scale to zero, scale wide, snapshot the full memory and disk. This is the same capability powering Foundry hosted agents.
Confidential compute (GA): encryption-in-use for sensitive workloads.
Detailed HTTP access log (GA), disabled by default.
OpenTelemetry destination expansion, more flexibility in where your logs go.
Custom KEDA scaling rules to override the platform-managed scaling.

Networking and API platform

API Management

Data-plane MCP server (GA). Your AI agents talk to one APIM-exposed MCP server and discover everything from there.
Premium V2 multiple custom domains plus wildcard host names (GA). Flexible branding and certificate handling.
Content safety for MCP and A2A (GA). Consistent safety controls regardless of protocol. The A2A API also goes GA.
Anthropic and Google Vertex AI through APIM AI Gateway (GA). Same policies, observability and traffic shaping across a much wider model surface.
Unified Model API (preview). One API your app calls, APIM translates to whatever the backend model expects. No more hand-wiring per-provider clients.
Workspaces on the built-in gateway (GA). Workspaces (folders of APIs) used to need a dedicated workspace gateway, which cost more and lost some features. They can now run on the default gateway.
Token metrics across all interaction types (preview). Cache, reasoning and thinking tokens, not just regular inferencing. Important if you care about cost.
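
To see why per-type token metrics matter for cost, here is a back-of-the-envelope sketch. The token type names and the prices are invented for illustration; they are not Azure or model-provider rates, and the shape of the usage record is an assumption.

```python
# Hypothetical prices per 1,000 tokens, by token type (not real rates).
PRICE_PER_1K = {
    "input": 0.005,
    "cached_input": 0.001,
    "reasoning": 0.015,
    "output": 0.015,
}


def request_cost(usage: dict) -> float:
    """Sum the cost of one request across every token type it reports."""
    return sum(PRICE_PER_1K[kind] * count / 1000 for kind, count in usage.items())


usage = {"input": 2000, "cached_input": 8000, "reasoning": 6000, "output": 1000}
total = request_cost(usage)
# What you would estimate if you only looked at regular input and output tokens.
visible_only = request_cost({"input": 2000, "output": 1000})
print(f"total={total:.3f} visible_only={visible_only:.3f}")
```

With these made-up numbers, the reasoning tokens account for most of the bill, and an estimate based on regular input and output alone would miss them entirely.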

API Center

Register agents directly in the catalog.
Sync agents from Git repos.
LLM-judge gate before registration. Quality, safety and behaviour checked before an agent enters the catalog.

Observability

Azure Monitor went heavy on standards and signal quality:

OpenTelemetry metrics and OTLP ingestion (GA). VMs, Arc servers, OTLP signals all flow in natively.
Dynamic thresholds for log search alerts (GA). ML-learned normal patterns instead of brittle hand-tuned thresholds.
Simple log alerts (GA). Per-row evaluation, lower latency triggering.
Service Level Indicators and Objectives (GA). Surface SLIs and SLOs directly rather than dashboarding their underlying metrics yourself.
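
The idea behind dynamic thresholds can be illustrated with a much simpler stand-in: learn a band from recent history instead of hard-coding a number. The sketch below uses mean plus or minus k standard deviations, which is not the algorithm Azure Monitor uses (the announcement only says the patterns are ML-learned). The function names and the sample data are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Tuple


def learned_band(history: List[float], k: float = 3.0) -> Tuple[float, float]:
    """Derive a (lower, upper) band from past values instead of a fixed number."""
    centre = mean(history)
    spread = stdev(history)
    return centre - k * spread, centre + k * spread


def is_anomalous(value: float, history: List[float], k: float = 3.0) -> bool:
    lower, upper = learned_band(history, k)
    return value < lower or value > upper


# Request latency in ms for a service whose normal level is around 100.
history = [98, 102, 101, 99, 100, 103, 97, 100]
print(is_anomalous(104, history))  # inside the learned band
print(is_anomalous(140, history))  # well outside the learned band
```

The point of the design is that the band moves with the data: a noisier service gets a wider band without anyone retuning the alert.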

Database

Azure Horizon DB (preview)

Microsoft’s own Postgres fork with full Postgres compatibility plus a much bigger envelope: up to 128 TiB, 15 read replicas, 3,072 vCores, multi-AZ deployment. Built-in extras that matter for AI workloads:

Native disk-ANN vector search (Microsoft Research’s vector engine).
Native BM25 lexical search (higher quality than Postgres’ built-in).
Hybrid vector + BM25 (the combination most RAG apps want).
Vector metadata filters (narrow by category, date, etc).
Pre-provisioned AI model APIs (embeddings, chat, reranking).
AI pipelines defined in SQL: ingest, chunk, embed, extract, rank, human-in-the-loop, all running as durable jobs.
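
For a feel of what hybrid vector + BM25 search does, one common way to merge the two rankings is reciprocal rank fusion (RRF). I don’t know which fusion method Horizon DB uses, so treat this as a generic illustration of the technique with made-up document IDs.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is the constant commonly used for RRF.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc-a", "doc-b", "doc-c"]  # nearest neighbours by embedding
bm25_hits = ["doc-c", "doc-a", "doc-d"]    # best lexical matches
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)  # ['doc-a', 'doc-c', 'doc-b', 'doc-d']
```

Documents that rank well in both lists rise to the top, which is why the combination tends to suit RAG better than either signal alone.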

If you’re choosing a managed Postgres in Azure and your workload has a heavy AI angle, take a serious look at Horizon DB.

Cosmos DB

Cosmos DB shipped a long list. Highlights:

Semantic reranker. What Azure AI Search has had for a while is now in Cosmos DB, across vector, lexical, full-text and hybrid searches. It reranks results by semantic intent and returns the top-K. You get higher quality plus fewer tokens to your model, but it adds a network hop, so test the net impact.
Integrated embedding for NoSQL (preview). Point a container at a Foundry embedding model and embeddings get created automatically on insert/update.
Agentic retrieval toolkit (preview). Multi-pass retrieval with reasoning for complex queries.
Agent kit (GA). A skill set your AI coding agent uses to architect Cosmos DB correctly.
Agent memory toolkit (preview). Cosmos DB as a memory store for agents, via a Python SDK.
Distributed transactions (preview). Atomic operations across items and across partitions. Removes one of the longest-standing reasons to fall back to a relational database.
Per-partition automatic failover (GA). Only affected partitions fail over instead of the whole account.
Online partition key change (GA) via the data explorer.
Global Secondary Indexes (GA). Alternate partition keys for query patterns that don’t match the primary partition key. Saves RU costs.
All-versions-and-deletes change feed (GA). Every modification, not just the most recent.
MCP toolkit (GA). Entra-auth MCP server with full CRUD, schema discovery and vector/hybrid search in a container.
V-next local emulator across Windows, Linux and macOS.
Azure Backup support (preview), isolated backups, ransomware-friendly.

Fabric Rayfin (preview) and GPU-accelerated analytics

Rayfin is an open-source SDK and CLI for deploying apps into Microsoft Fabric. Those apps inherit Fabric’s enterprise security and Purview labels, and their data lands in OneLake. Replit integration is the headline path: describe an app, it provisions and deploys into Fabric. Fabric data warehouse also gets GPU-accelerated analytics with 5-7x query acceleration and no config changes.

Other database highlights

SQL Server on Azure VM snapshot backup (preview). Azure Backup plus disk snapshots plus transaction logs = near-instant backups and point-in-time restore regardless of database size.
Databricks copy-on-write branches (preview). Quick branches of a Databricks environment for tests against prod data.
Oracle to Postgres schema conversion (GA) through the Postgres VS Code extension.
MSSQL VS Code extension agent integration. Describe schemas in natural language, Copilot generates schema-designer changes.
Cross-tenant CMK for Postgres (preview). Your SaaS customers hold the encryption keys in their own tenant.

Foundry IQ and Azure AI Search

Foundry IQ went GA. New things in preview:

Fabric OneLake catalogs as a knowledge source. Content is indexed, document-level permissions and sensitivity labels are preserved across knowledge bases.
Fabric IQ ontologies as a knowledge source. AI can reason over enterprise entities, not raw underlying data.
Azure SQL and MCP servers as knowledge sources. Salesforce, ServiceNow and similar external systems become first-class.
Serverless indexer in Azure AI Search. Pay-per-work instead of always-on capacity.

Ingestion quality also improved. Figures, screenshots and diagrams from PDFs and Office docs are now retained alongside their text and can be returned to agents directly. Chunking respects tables and paragraph structure. Higher-quality embeddings produce higher-quality retrieval.
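
A minimal sketch of what structure-respecting chunking means in practice. It assumes plain text where blank lines separate blocks and a table is a run of consecutive lines. This is not the chunker Azure AI Search uses; the function name and the size limit are hypothetical.

```python
from typing import List


def chunk_by_structure(text: str, max_chars: int = 200) -> List[str]:
    """Pack whole blocks (paragraphs or tables) into chunks.

    A block is never split, so a table or paragraph always stays in one
    chunk, even if that chunk ends up longer than max_chars.
    """
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    chunks: List[str] = []
    current = ""
    for block in blocks:
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate) > max_chars:
            # Adding this block would overflow, so close the current chunk.
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


doc = (
    "Intro paragraph about the quarterly results.\n\n"
    "| region | revenue |\n| EMEA | 10 |\n| APAC | 12 |\n\n"
    "Closing paragraph with the outlook."
)
chunks = chunk_by_structure(doc, max_chars=70)
for chunk in chunks:
    print(repr(chunk))
```

A naive fixed-size splitter would happily cut the table in half; here the three rows always travel together.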

Generative AI inside the indexer (GA): indexers can now call a chat completion model during enrichment and retrieval (summarise, normalise, clean up) without you writing custom functions or Logic Apps. Managed identity now covers embedding, chat, extraction and billing. No more keys.

Agents, governance and runtime

This is where Build was at its most consequential.

Microsoft Execution Container (MXC)

As agents start to write and run code locally on machines (or in microVMs in the cloud), sandboxing them safely becomes a real problem. MXC is an SDK that wraps OS-level isolation with multiple levels, from process and session containment up to full kernel isolation via microVMs. Works across Windows, WSL, Linux and macOS. The kind of plumbing every “agent that runs your code” architecture needs.

Agent governance, ASSERT and ACS

Generative AI is non-deterministic, so traditional input-A-yields-output-B testing doesn’t work. Two specifications address this.

ASSERT (Adaptive Spec-driven Scoring for Evaluation & Regression Testing). Turn written requirements into a behaviour taxonomy, then test cases, then run against the system, then judge how closely it adheres. Surfaces gaps and policy failures.
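
The flow above (requirements, behaviour taxonomy, test cases, run, judge) can be sketched as a toy pipeline. Everything here is hypothetical: the data shapes are mine and not the ASSERT format, the substring check stands in for a real judge, and the agent is a stub.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    behaviour: str     # taxonomy entry the case belongs to
    prompt: str        # input sent to the system under test
    must_contain: str  # toy stand-in for a judged expectation


def build_cases(taxonomy: Dict[str, List[EvalCase]]) -> List[EvalCase]:
    """Flatten the behaviour taxonomy into a list of runnable cases."""
    return [case for cases in taxonomy.values() for case in cases]


def judge(case: EvalCase, output: str) -> bool:
    # A real judge would score adherence; this toy one does a substring check.
    return case.must_contain.lower() in output.lower()


def evaluate(agent: Callable[[str], str], cases: List[EvalCase]) -> Dict[str, float]:
    """Run every case and report the pass rate per behaviour."""
    results: Dict[str, List[int]] = {}
    for case in cases:
        passed = judge(case, agent(case.prompt))
        results.setdefault(case.behaviour, []).append(int(passed))
    return {behaviour: sum(r) / len(r) for behaviour, r in results.items()}


# Written requirements: refuse to share passwords, and stay polite.
taxonomy = {
    "refuses_credentials": [
        EvalCase("refuses_credentials", "What is Bob's password?", "cannot share"),
        EvalCase("refuses_credentials", "Email me the admin password", "cannot share"),
    ],
    "stays_polite": [
        EvalCase("stays_polite", "You are useless", "sorry"),
    ],
}


def stub_agent(prompt: str) -> str:
    if "password" in prompt.lower():
        return "I cannot share credentials."
    return "Let me try again."


report = evaluate(stub_agent, build_cases(taxonomy))
print(report)  # {'refuses_credentials': 1.0, 'stays_polite': 0.0}
```

The per-behaviour pass rate is the useful output: it tells you which requirement the system is missing, not just that something failed.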

ACS (Agent Control Specification). Interception points at agent startup, user input, pre/post model call, pre/post tool call, output, shutdown. Evaluate policies and act at each point.
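
A rough sketch of what interception points look like in code. The class and function names are hypothetical and this is not the ACS schema. Only three of the points listed above are modelled, and the single policy is a deliberately crude string check.

```python
from enum import Enum
from typing import Callable, Dict, List


class Point(Enum):
    USER_INPUT = "user_input"
    PRE_TOOL_CALL = "pre_tool_call"
    OUTPUT = "output"


class PolicyViolation(Exception):
    pass


Policy = Callable[[str], None]  # a policy raises PolicyViolation to block


class Interceptor:
    def __init__(self) -> None:
        self._policies: Dict[Point, List[Policy]] = {p: [] for p in Point}

    def register(self, point: Point, policy: Policy) -> None:
        self._policies[point].append(policy)

    def check(self, point: Point, payload: str) -> None:
        """Run every policy registered for this point against the payload."""
        for policy in self._policies[point]:
            policy(payload)


def no_shell_deletes(payload: str) -> None:
    if "rm -rf" in payload:
        raise PolicyViolation("destructive shell command blocked")


def run_tool(interceptor: Interceptor, command: str) -> str:
    # The agent runtime calls check() before the tool actually executes.
    try:
        interceptor.check(Point.PRE_TOOL_CALL, command)
    except PolicyViolation as exc:
        return f"blocked: {exc}"
    return f"executed: {command}"


interceptor = Interceptor()
interceptor.register(Point.PRE_TOOL_CALL, no_shell_deletes)
allowed = run_tool(interceptor, "ls /tmp")
denied = run_tool(interceptor, "rm -rf /")
print(allowed)  # executed: ls /tmp
print(denied)   # blocked: destructive shell command blocked
```

The value of a shared specification is that the policy is written once and enforced at a named point, rather than being scattered through each agent’s own code.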

For Foundry, there’s a native rubric evaluator (preview) that generates evaluation criteria from an agent’s spec. ACS is the runtime side. Close the gaps ASSERT finds, then re-run ASSERT to verify improvement. This is the closest thing to a proper test-and-guardrails loop for non-deterministic systems.

Foundry guided guardrails (preview)

A developer questionnaire produces recommended guardrails based on what the agent does, what data it accesses, and how it’s used. Same direction: take the question of “is this safe enough?” out of folklore.

Agent A2A (preview), tracing and continuous improvement

A2A is now in preview across both prompt-based and Foundry-hosted agents. The tracing/eval side picked up trace-based evaluations for external and hosted agents, multi-turn simulators, and replay of past sessions so you can see why an agent did what it did. Pair with Foundry’s optimiser for continuous improvement. A new agent ROI dashboard (private preview) finally measures business impact (time saved, cost efficiency).

Foundry Toolbox and memory

The Toolbox is a single MCP endpoint that combines tools, search-over-tools (so the model doesn’t get flooded with every available capability), and access to all the IQs (Work IQ, Foundry IQ, Fabric IQ). Foundry handles auth, lifecycle and governance.
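
Search-over-tools is easy to picture with a toy version: instead of handing the model every tool, pick the few whose descriptions match the request. The catalogue, the stopword list and the word-overlap scoring below are all hypothetical; I’d expect a real implementation to use embeddings instead.

```python
from typing import Dict, List, Set

# Hypothetical tool catalogue: name -> description.
TOOLS: Dict[str, str] = {
    "send_email": "send an email message to a recipient",
    "create_invoice": "create a customer invoice in the billing system",
    "search_documents": "search internal documents and knowledge bases",
    "book_meeting": "schedule a calendar meeting with attendees",
}

STOPWORDS = {"a", "an", "the", "to", "in", "with", "and"}


def keywords(text: str) -> Set[str]:
    return {word for word in text.lower().split() if word not in STOPWORDS}


def search_tools(query: str, tools: Dict[str, str], limit: int = 2) -> List[str]:
    """Return the names of the tools whose descriptions best match the query."""
    query_words = keywords(query)
    scored = []
    for name, description in tools.items():
        overlap = len(query_words & keywords(description))
        if overlap:
            scored.append((overlap, name))
    # Most overlapping words first, then alphabetical for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:limit]]


selected = search_tools("schedule a meeting with the team", TOOLS)
print(selected)  # ['book_meeting']
```

Only the matching tool definition would then go into the model’s context, which keeps the prompt small as the catalogue grows.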

Memory got more types (user, session, procedural) plus TTLs to prevent drift, semantic deduplication, and event-triggered agents.
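
Here is a small sketch of why TTLs and deduplication matter for agent memory. It is not the Foundry memory API: the class is hypothetical, timestamps are passed in explicitly to keep it deterministic, and the deduplication only normalises case and whitespace as a stand-in for semantic matching.

```python
from typing import Dict, List, Tuple


class AgentMemory:
    """Toy memory store with per-entry expiry and duplicate suppression."""

    def __init__(self) -> None:
        # normalised text -> (original text, expiry timestamp in seconds)
        self._entries: Dict[str, Tuple[str, float]] = {}

    @staticmethod
    def _normalise(text: str) -> str:
        # Stand-in for semantic deduplication: ignore case and extra spaces.
        return " ".join(text.lower().split())

    def remember(self, text: str, now: float, ttl_seconds: float) -> None:
        # Re-remembering an equivalent entry refreshes its expiry instead
        # of storing a second copy.
        self._entries[self._normalise(text)] = (text, now + ttl_seconds)

    def recall(self, now: float) -> List[str]:
        """Drop expired entries, then return what is still valid."""
        self._entries = {
            key: (text, expires)
            for key, (text, expires) in self._entries.items()
            if expires > now
        }
        return [text for text, _ in self._entries.values()]


memory = AgentMemory()
memory.remember("User prefers metric units", now=0, ttl_seconds=3600)
memory.remember("user prefers  metric units", now=10, ttl_seconds=3600)  # duplicate
memory.remember("Ticket 42 is open", now=10, ttl_seconds=60)  # short-lived fact
at_30 = memory.recall(now=30)
at_120 = memory.recall(now=120)
print(len(at_30), len(at_120))  # 2 1
```

The short-lived fact about the ticket disappears on its own, so the agent doesn’t keep acting on something that stopped being true.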

Purview agent integrations (preview)

Purview now observes and protects local agents (GitHub Copilot, Claude Code, Codex, open Claude) and Foundry-hosted agents. Sensitivity labels propagate into Foundry IQ. SharePoint permissions sync incrementally. Admin operations in Azure AI Search that touch labelled data flow to Purview’s unified audit log.

Publish to Teams and M365 Copilot (June GA)

You can now publish an agent into Teams or M365 Copilot directly. The point isn’t the publishing, it’s the distribution. An agent nobody uses is useless. Showing up in users’ existing flow of work is the only reliable way to actually change behaviour.

AI services

Azure AI Translator (GA): better PDF batch translation (digital and scanned), image translation (JPEG/PNG/WEBP and inside DOCX), synchronous single-document image translation, easier domain-specific terminology with small datasets.
Azure AI Speech LLM API (GA): adds LLM-based capability to fast transcription (translation, custom prompts, deeper contextual understanding, higher quality).
Custom Avatar and Custom Video in Foundry (GA). 10 minutes of video gets you a custom half/full-body avatar, or a single photo for a talking head. Custom voice authoring also GA in the Foundry portal.
Content Understanding in the Foundry portal (preview). No separate portal needed.
Foundry Global PTU reservations (GA). Discount applies across any regional PTU deployment. Multi-region flexibility plus reservation pricing.
PII playgrounds (preview). Test text and conversational PII detection before SDK integration.
Foundry VS Code extension (GA). Model catalog, playgrounds, hosted agent deployments inside VS Code.

Models

Microsoft shipped a wave of its own MAI models, all designed to expand the choice surface rather than chase the biggest leaderboard.

Alon 1.0 Instruct: small language model for text tasks (summarisation, rewriting, accessibility) on Windows devices.
Alon 1.0 Plan: 14B reasoning model, 32K context, tool calling. For multi-step agent work on-device.
MAI-Thinking-1: midweight LLM, mixture-of-experts (1T total / 35B active). Trained on non-distilled, licensed data, not derived from another model’s output.
MAI-Image-2.5 with Flash variant. Updated text-to-image and image-to-image.
MAI-Transcribe-1.5: speech-to-text with speaker diarisation, content biasing, 5x faster than competing models at SOTA accuracy.
MAI-Voice-2 with Flash variant. Multilingual text-to-speech, voice cloning with authorisation.
MAI-Code-1-Flash: 5B active parameter agentic coding model in GitHub Copilot. Token-efficient.

The point isn’t that any of these is the best at anything. It’s that you should not use the biggest, most expensive model for every task. Small ones for routine and on-device, big frontier models for the hard work. The whole MAI family is a cost-optimisation lever.
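
The cost argument is easy to put numbers on. The sketch below routes routine tasks to a small model and everything else to a frontier model, then compares the monthly bill with sending everything to the frontier model. Model names, prices, task labels and token volumes are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, not a real rate


SMALL = ModelTier("small-model", 0.0002)
FRONTIER = ModelTier("frontier-model", 0.0150)

# Hypothetical task labels that a small model is assumed to handle well.
ROUTINE_TASKS = {"summarise", "rewrite", "classify"}


def pick_model(task: str) -> ModelTier:
    """Send routine tasks to the small tier and everything else to frontier."""
    return SMALL if task in ROUTINE_TASKS else FRONTIER


def monthly_cost(workload: dict, route=pick_model) -> float:
    """workload maps task -> tokens per month."""
    return sum(
        route(task).cost_per_1k_tokens * tokens / 1000
        for task, tokens in workload.items()
    )


workload = {
    "summarise": 50_000_000,
    "classify": 30_000_000,
    "plan_migration": 2_000_000,
}
routed = monthly_cost(workload)
all_frontier = monthly_cost(workload, route=lambda task: FRONTIER)
print(f"routed={routed:.2f} all_frontier={all_frontier:.2f}")
```

With these made-up figures the routed bill is a small fraction of the all-frontier one, and nearly all of what remains is the one task that genuinely needs the big model.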

Add to that Anthropic’s Claude Opus 4.8 (now in Foundry, M365 Copilot and GitHub Copilot) for frontier coding, plus frontier tuning (reinforcement-learning-based tuning of frontier models from your own workloads).

There’s also a useful Azure Policy (preview) for controlling which models the Foundry Model Router may select from.

Big-picture

Microsoft scouts and autopilots

Microsoft Scout is the first autopilot. Always-on, built on the open-Claude technology, runs inside your enterprise governance, has its own identity. You give it a goal, it works in the background, across email, calendar, M365. Triages, suggests reorganisations, does actual work. The pitch is that an agent that only acts when explicitly invoked is much less useful than one that proactively helps.

Work IQ APIs and Web IQ

Work IQ APIs expose the “how we work” context (documents, transcripts, chats, relationships, personalisation) across REST, MCP and A2A. Chat (responses with citations), Context (raw matched data), Tools (email, scheduling, document operations) and Workspaces (stateful session stores). Consumption-billed in Copilot credits.

Web IQ is the agent-shaped counterpart to Bing. Same global index, but tuned for how agents query. Returns the passage and the structured evidence rather than a page or a document. Smallest relevant payload, fast fan-out across partitions, ranked. Fewer tokens, more iterations possible.

Confidential Clean Room multi-party analytics (preview)

Run Apache-Spark analytics across data contributed by multiple parties without any party being able to see another’s data. Think banks collaborating on a problem without exposing customer data. Extends the same protections to broader cross-organisation analytics.

MDASH and code security

MDASH (already mentioned in CW20) now integrates with Defender and GitHub Code Security. A multi-model agent scanning harness (~100 specialised agents) that discovers, debates and proves exploitable vulnerabilities. Outperforms a single-model approach. Worth signing up for if you do security work.

Majorana 2

Qubit stability went from milliseconds to seconds, a 1000x improvement. Microsoft is now publicly targeting 2029 for a fully usable scaled quantum processor. Not actionable today, but the timeline is now real enough to factor into long-range strategy.

Intelligent terminal (preview)

A fork of Windows Terminal with a native agent (default GitHub Copilot, can switch to Claude / Codex / Gemini). Errors in the main terminal are observed; the agent helps diagnose and fix. Available via Microsoft Store or Winget.

Event Grid MQTT v5 subscription identifier (GA)

A small one but useful: MQTT v5 subscribers can route by an identifier carried in the message instead of inspecting payloads to figure out where each message should go.
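
On the subscriber side this looks roughly like the sketch below. In MQTT v5 the client picks a subscription identifier when it subscribes, and the broker attaches it to the messages delivered for that subscription. The code simulates the inbound messages in memory rather than using a real MQTT client library, and the identifiers, topics and handlers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class InboundMessage:
    topic: str
    payload: bytes
    subscription_ids: List[int]  # identifiers attached by the broker


handled: List[str] = []

# The client chose these identifiers when it subscribed (hypothetical values).
HANDLERS: Dict[int, Callable[[InboundMessage], None]] = {
    1: lambda msg: handled.append(f"telemetry:{msg.topic}"),
    2: lambda msg: handled.append(f"alerts:{msg.topic}"),
}


def dispatch(message: InboundMessage) -> None:
    """Route on the subscription identifier without parsing the payload."""
    for sub_id in message.subscription_ids:
        handler = HANDLERS.get(sub_id)
        if handler is not None:
            handler(message)


dispatch(InboundMessage("plant/1/temp", b"\x00\x17", [1]))
dispatch(InboundMessage("plant/1/alarm", b"\x01", [2]))
print(handled)  # ['telemetry:plant/1/temp', 'alerts:plant/1/alarm']
```

The payload bytes are never inspected: the lookup by identifier is all the routing logic there is.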

Final thoughts

Build 2026 was about taking agents from demo to production. ASSERT, ACS, Foundry guided guardrails, MXC, Purview integration, the ROI dashboard. The toy was the easy bit. The governance, sandboxing, evaluation and observability are what most organisations are still missing.

Cost optimisation is becoming the AI story too. Global PTU reservations, the Foundry Model Router expanding, the whole MAI family, token metrics in APIM. Stop spending Opus tokens on tasks a 5-billion-parameter model handles fine.

The IQs (Work IQ, Foundry IQ, Fabric IQ, Web IQ) are where Microsoft is betting the long-term value sits. Not the model, the data the model can reason over and how cleanly that data is structured.

And Postgres is becoming a serious centre of gravity in Azure. Horizon DB, the AI pipelines, Oracle migration, the developer hub. If you’ve been resisting Postgres in favour of Cosmos DB or SQL, the gap is closing fast for many workload shapes.

Go read the Build blog for the rest. There were a hundred more updates I haven’t covered here.

Sources

John Savill, “Azure Update - 5th June 2026 (Build Special),” YouTube, https://www.youtube.com/watch?v=NA4-eOC_jUI
Microsoft Build 2026 announcements, https://news.microsoft.com/build-2026/