This week in Azure
Three new videos from John this week. One compares Agent Builder, Copilot Studio, and Microsoft Foundry for building AI agents, walking through the personas and capability differences that drive the choice. The 400K subscriber AMA recording is now available. And there’s a walkthrough of the new Entra backup and recovery feature, showing how daily snapshots combined with soft delete and protected actions give you a solid defense against accidental or malicious changes to your Entra environment.
AKS dominates this week with eight updates. The application network preview is the most interesting: mesh capabilities without mesh overhead. The blue-green agent pool upgrade is the one with the most practical impact for anyone doing Kubernetes upgrades today.
| Category | Update | Status |
|---|---|---|
| Kubernetes | AKS application network | Preview |
| Kubernetes | AKS meshless Istio app routing | Preview |
| Kubernetes | AKS network logs | GA |
| Kubernetes | AKS managed GPU metrics | Preview |
| Kubernetes | AKS fleet manager cross-cluster networking | Preview |
| Kubernetes | AKS container network metric filtering | GA |
| Kubernetes | AKS network AI agent | Preview |
| Kubernetes | AKS blue-green agent pool upgrade | Preview |
| Kubernetes | Arc-enabled K8s recommended Prometheus alerts | GA |
| Storage | Azure Container Storage elastic SAN integration | GA |
| Database | SQL DB automatic index compaction | Preview |
| Database | SQL MI change event streaming | Preview |
| Database | SQL Hyperscale new 160/192 vCore SKUs | Preview |
| Database | DiskANN vector search improvements | Preview |
| Database | PostgreSQL custom timezone for cron jobs | GA |
| Database | PostgreSQL migrations from EDB and Google AlloyDB | GA |
| Fabric | MySQL mirroring to Fabric | GA |
| Fabric | Cosmos DB private endpoint mirroring to Fabric | GA |
| Monitoring | Azure Monitor OTLP ingestion | Preview |
| AI | Foundry priority processing | GA |
| Identity | Entra ID external MFA | GA |
| Identity | Entra ID tenant governance | GA |
AKS application network (preview)
AKS now has an application network layer in preview. Think of it as a new application-layer abstraction for all Kubernetes traffic that gives you service mesh capabilities without the service mesh overhead.
Traditional mesh: Pod A, Pod B, and Pod C each send traffic through their own sidecar (a sidecar per pod, with the management overhead that brings).
Application network: Pod A, Pod B, and Pod C send traffic through an abstraction layer, with no sidecars injected and no app changes needed.
It controls service-to-service communication, gives you observability without injecting a sidecar into every pod, and uses SPIFFE for identity. That’s interesting because SPIFFE is showing up more and more as the standard for workload identity in Kubernetes. The network uses those SPIFFE identities for tracking and controlling traffic.
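To make the identity part concrete, here is a minimal sketch of how a SPIFFE ID could be parsed and checked against an allow list. The `spiffe://<trust-domain>/<path>` format comes from the SPIFFE specification; the `ns/<namespace>/sa/<service-account>` path layout, the `example.org` trust domain, and the `ALLOWED_CALLERS` policy are assumptions for illustration, not something the AKS preview is confirmed to use.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str
    path: str


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Parse a SPIFFE ID of the form spiffe://<trust-domain>/<path>."""
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not a valid SPIFFE ID: {uri!r}")
    return SpiffeId(trust_domain=parsed.netloc, path=parsed.path)


# Hypothetical policy: destination identity -> caller identities allowed to reach it.
ALLOWED_CALLERS = {
    "spiffe://example.org/ns/shop/sa/checkout": {
        "spiffe://example.org/ns/shop/sa/frontend",
    },
}


def is_allowed(source: str, destination: str) -> bool:
    """Allow traffic only when both IDs are valid, share a trust domain,
    and the (source, destination) pair appears in the policy."""
    src, dst = parse_spiffe_id(source), parse_spiffe_id(destination)
    if src.trust_domain != dst.trust_domain:
        return False
    return source in ALLOWED_CALLERS.get(destination, set())
```

The point of the sketch is that policy is keyed on workload identity rather than on pod IPs, so it keeps working as pods are rescheduled.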
It is Linux only today. The key selling point is that it requires no application changes.
AKS meshless Istio app routing (preview)
In the same vein, AKS now supports meshless Istio app routing in preview. You can use the Kubernetes Gateway APIs for ingress management without deploying a full sidecar architecture. Another step toward getting the useful parts of a service mesh without the operational weight.
AKS network logs (GA)
Container network logs are now GA. These capture network flow metadata (not full packet data): IP addresses, ports, namespaces, pods, services, flow direction, policy verdicts, and more at layers 3, 4, and 7.
The logs write to local storage and optionally to a Log Analytics workspace. You can filter to only capture traffic from specific resources, which is important because full network flow data generates massive volume. There’s also an on-demand mode using Hubble for targeted captures.
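As a rough illustration of why filtering matters, the sketch below models flow records with a few of the metadata fields listed above and keeps only the ones matching a namespace and verdict. The `FlowRecord` shape, the field names, and the sample values are hypothetical; the real log schema and filter configuration will differ.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class FlowRecord:
    # Hypothetical, simplified record built from the fields named in the text.
    src_ip: str
    dst_ip: str
    dst_port: int
    namespace: str
    pod: str
    direction: str  # "ingress" or "egress"
    verdict: str    # for example "forwarded" or "dropped"


def keep_flow(flow: FlowRecord, namespaces: Set[str],
              verdicts: Optional[Set[str]] = None) -> bool:
    """Keep a flow only if it is in a watched namespace and, when a verdict
    filter is given, has one of the requested verdicts."""
    if flow.namespace not in namespaces:
        return False
    return verdicts is None or flow.verdict in verdicts


flows = [
    FlowRecord("10.0.1.4", "10.0.2.9", 5432, "payments", "api-1", "egress", "dropped"),
    FlowRecord("10.0.1.4", "10.0.2.7", 443, "payments", "api-1", "egress", "forwarded"),
    FlowRecord("10.0.3.2", "10.0.2.9", 5432, "batch", "job-7", "egress", "dropped"),
]

# Capture only dropped traffic in the payments namespace.
kept = [f for f in flows if keep_flow(f, {"payments"}, {"dropped"})]
```

Even in this toy data set the filter discards two of three records, which is the kind of reduction that keeps storage and ingestion costs manageable.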
AKS managed GPU metrics (preview)
If you’re running nodes with NVIDIA GPUs, you can now get managed GPU metrics in preview. These feed into managed Prometheus and Grafana.
If you’re running AI/ML workloads on AKS, you finally get proper GPU observability through the same monitoring stack you use for everything else.
AKS fleet manager cross-cluster networking (preview)
If you have applications spanning multiple AKS clusters, fleet manager now provides a managed Cilium cluster mesh. Once enabled, any published service from one cluster is available to every connected cluster as if it were local.
Cluster A publishes Service X (annotated as global). Cluster B can consume Service X as if it were local. Both clusters share metrics and flow logs.
You mark services with a global annotation, and cross-cluster communication just works. You also get global observability with shared metrics and flow logs across all connected clusters. The managed part is key: Cilium cluster mesh exists today, but setting it up and maintaining it yourself is significant operational work.
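Here is a small model of what "available as if it were local" means. It assumes the upstream Cilium cluster mesh annotation `service.cilium.io/global: "true"`; whether the fleet manager preview uses that exact key is an assumption, and the cluster and service names are made up.

```python
from typing import Dict, List, Set

# Upstream Cilium's global-service annotation; assumed here for the managed offering.
GLOBAL_ANNOTATION = "service.cilium.io/global"


def visible_services(clusters: Dict[str, List[dict]], from_cluster: str) -> Set[str]:
    """Return the service names a workload in `from_cluster` can resolve:
    every local service plus any remote service annotated as global."""
    visible = {svc["name"] for svc in clusters[from_cluster]}
    for name, services in clusters.items():
        if name == from_cluster:
            continue
        for svc in services:
            if svc.get("annotations", {}).get(GLOBAL_ANNOTATION) == "true":
                visible.add(svc["name"])
    return visible


clusters = {
    "cluster-a": [
        {"name": "service-x", "annotations": {GLOBAL_ANNOTATION: "true"}},
        {"name": "internal-a"},
    ],
    "cluster-b": [{"name": "service-y"}],
}
```

From `cluster-b`, `service-x` is visible alongside the local `service-y`, while `internal-a` stays private to `cluster-a` because it carries no global annotation.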
AKS container network metric filtering (GA)
Network observability metrics can generate enormous volumes of data. Metric filtering, now GA, lets you control exactly what gets captured. Only collect the signals you need, reduce storage costs, reduce noise. Simple but necessary.
AKS network AI agent (preview)
A new AI agent for AKS network troubleshooting. Describe a problem in natural language, and it turns your description into diagnostics using all the captured network data. Should make cluster networking issues easier to investigate for teams that aren’t deeply specialized in Kubernetes networking.
AKS blue-green agent pool upgrade (preview)
This is the AKS update with the most practical impact. Instead of rolling upgrades (the traditional approach), you can now do blue-green upgrades for your node pools.
Current state: Node Pool A (current config) receives all traffic.
Upgrade initiated: Node Pool A (current config) keeps serving traffic while Node Pool B (new config) is created automatically.
Validation: Node Pool A and Node Pool B each receive partial traffic so you can validate that the new pool works.
Complete: Node Pool B (new config) receives all traffic and Node Pool A is deleted.
Problem? Roll back: Node Pool A (current config, never changed) receives all traffic and Node Pool B is deleted.
Use it for Kubernetes version upgrades, node image upgrades, or configuration changes. The new node pool is created on demand when you trigger the upgrade; it doesn’t exist permanently.
The safety benefit is clear: if something goes wrong, your original node pool is untouched. Rolling upgrades modify nodes in place, which means a bad upgrade can leave you in a partially upgraded state that’s harder to recover from.
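The stages can be modeled as a tiny state machine, which makes the rollback property easy to see. This is a toy model, not the AKS API: the class, the "blue" and "green" pool labels, and the 50/50 traffic split during validation are all assumptions for illustration.

```python
from enum import Enum


class Phase(Enum):
    STEADY = "steady"
    VALIDATING = "validating"
    COMPLETE = "complete"
    ROLLED_BACK = "rolled back"


class BlueGreenUpgrade:
    """Toy model of a blue-green node pool upgrade."""

    def __init__(self, current_config: str) -> None:
        self.pools = {"blue": current_config}
        self.traffic = {"blue": 100}
        self.phase = Phase.STEADY

    def start(self, new_config: str, green_share: int = 50) -> None:
        """Create the green pool on demand and send it part of the traffic."""
        if self.phase is not Phase.STEADY:
            raise RuntimeError("an upgrade is already in progress or finished")
        self.pools["green"] = new_config
        self.traffic = {"blue": 100 - green_share, "green": green_share}
        self.phase = Phase.VALIDATING

    def complete(self) -> None:
        """Validation passed: green takes all traffic, blue is deleted."""
        self._require_validating()
        del self.pools["blue"]
        self.traffic = {"green": 100}
        self.phase = Phase.COMPLETE

    def roll_back(self) -> None:
        """Validation failed: the untouched blue pool takes all traffic again."""
        self._require_validating()
        del self.pools["green"]
        self.traffic = {"blue": 100}
        self.phase = Phase.ROLLED_BACK

    def _require_validating(self) -> None:
        if self.phase is not Phase.VALIDATING:
            raise RuntimeError("no upgrade is being validated")
```

Note that `roll_back` never has to repair the blue pool. Its configuration is never modified at any stage, which is the whole safety argument.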
Arc-enabled Kubernetes recommended alerts (GA)
Arc-enabled Kubernetes now has one-click enablement of recommended Prometheus alerts based on community rules. Coverage for clusters, nodes, and pods. This was previously available but required a template-based deployment. Now it’s a single click.
Azure Container Storage elastic SAN integration (GA)
Azure Container Storage now supports elastic SAN in GA. Previously, Container Storage was GA only for local node NVMe storage. Adding elastic SAN gives you more durable storage with flexible pools for different performance tiers.
If your AKS workloads need persistent storage that’s more durable and flexible than local NVMe but still managed through Container Storage, elastic SAN fills that gap.
SQL DB automatic index compaction (preview)
SQL Database, SQL Managed Instance, and SQL in Fabric all get automatic index compaction in preview. It runs as a background process that automatically compacts indexes to reduce storage space and CPU, memory, and disk IO consumption.
This replaces the need for scheduled index maintenance jobs. Enable it with a single command and stop thinking about it. Less storage cost, better performance, no maintenance window to schedule.
SQL MI change event streaming (preview)
SQL Managed Instance can now stream row-level changes (inserts, updates, deletes) to Event Hub in near real-time. From Event Hub, you trigger serverless functions or feed into analytics pipelines.
Flow: an INSERT, UPDATE, or DELETE on SQL Managed Instance produces a change event, the change event stream delivers it to Event Hub, and Event Hub feeds Azure Functions, Stream Analytics, or real-time dashboards.
Build event-driven architectures without modifying your application code. The streaming happens at the database level, so your application just writes to SQL as normal.
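On the consuming side, a handler typically dispatches on the operation type. The sketch below keeps an in-memory view in sync with row-level change events. The JSON payload shape (`operation`, `key`, `row`) is hypothetical; the actual event schema emitted by SQL Managed Instance is not described here, and a real consumer would read from Event Hub rather than take strings directly.

```python
import json
from typing import Dict


def apply_change(view: Dict[str, dict], event_json: str) -> None:
    """Apply one row-level change event to an in-memory view keyed by row key."""
    event = json.loads(event_json)
    operation, key = event["operation"], event["key"]
    if operation in ("insert", "update"):
        # Inserts and updates both carry the latest version of the row.
        view[key] = event["row"]
    elif operation == "delete":
        view.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {operation!r}")


orders: Dict[str, dict] = {}
events = [
    '{"operation": "insert", "key": "42", "row": {"status": "new"}}',
    '{"operation": "update", "key": "42", "row": {"status": "paid"}}',
    '{"operation": "insert", "key": "43", "row": {"status": "new"}}',
    '{"operation": "delete", "key": "43"}',
]
for raw in events:
    apply_change(orders, raw)
```

After the four events, the view holds only order 42 in its latest state, and the application that wrote those rows never had to know a consumer existed.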
SQL Hyperscale new SKUs and DiskANN improvements
SQL Hyperscale gets new 160 and 192 vCore options in preview on premium series hardware. These are for workloads that need massive compute and memory: large-scale OLTP, HTAP, analytics-heavy scenarios. Available for both single database and elastic pool.
DiskANN (Microsoft Research’s vector search) improvements landed across SQL Database and SQL Database in Fabric:
- Tables are no longer read-only after DiskANN index creation
- Filters are applied during vector searches instead of after (much more efficient)
- Better automatic selection between DiskANN and KNN algorithms
- Various other optimizations
If you’re using SQL Database for vector search with RAG applications, these improvements make it more practical for production use.
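The filtering change is easier to appreciate with a small example. The sketch below uses exact brute-force nearest-neighbor search, not DiskANN (which is a graph-based approximate index), and the rows and categories are invented. It only illustrates the general difference between filtering after the search and filtering as part of it.

```python
import math
from typing import Callable, List, Sequence


def post_filter_search(rows: List[dict], query: Sequence[float], k: int,
                       predicate: Callable[[dict], bool]) -> List[int]:
    """Take the k nearest rows first, then drop the ones failing the filter."""
    nearest = sorted(rows, key=lambda r: math.dist(r["vec"], query))[:k]
    return [r["id"] for r in nearest if predicate(r)]


def in_search_filter(rows: List[dict], query: Sequence[float], k: int,
                     predicate: Callable[[dict], bool]) -> List[int]:
    """Consider only rows passing the filter, then take the k nearest."""
    candidates = [r for r in rows if predicate(r)]
    nearest = sorted(candidates, key=lambda r: math.dist(r["vec"], query))[:k]
    return [r["id"] for r in nearest]


rows = [
    {"id": 1, "vec": (0.1, 0.0), "category": "a"},
    {"id": 2, "vec": (0.2, 0.0), "category": "a"},
    {"id": 3, "vec": (0.3, 0.0), "category": "b"},
    {"id": 4, "vec": (0.4, 0.0), "category": "a"},
    {"id": 5, "vec": (0.5, 0.0), "category": "b"},
]


def is_b(row: dict) -> bool:
    return row["category"] == "b"


after = post_filter_search(rows, (0.0, 0.0), 2, is_b)   # []
during = in_search_filter(rows, (0.0, 0.0), 2, is_b)    # [3, 5]
```

With post-filtering, the two nearest rows are both in category "a", so the filter throws everything away and the query returns nothing. Filtering during the search returns the two nearest matching rows without over-fetching and retrying.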
Azure Monitor OTLP ingestion (preview)
Azure Monitor now has a native OpenTelemetry Protocol (OTLP) endpoint. Send metrics, logs, and traces directly to an Azure Monitor workspace over standard OTLP. It uses Entra for authentication.
This is a straightforward but important addition. If your applications already emit OpenTelemetry data, you can point them directly at Azure Monitor without running a collector or adapter in between.
PostgreSQL updates
Two PostgreSQL updates this week:
Custom timezone for cron jobs (GA): You can now set a timezone for scheduled jobs instead of working around the server’s default. Schedule jobs based on the regional timezone of the users, not the server clock. Practical for ensuring maintenance doesn’t run during business hours.
Migration updates (GA): You can now migrate from EDB PostgreSQL and Google AlloyDB to Azure Database for PostgreSQL. The pgoutput logical decoding plugin is also available for minimal-downtime online migrations.
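On the cron timezone point, the reason a server-clock schedule drifts is daylight saving time. This sketch uses Python's standard `zoneinfo` module to show the UTC time that a 02:00 local job maps to in winter and in summer. The `Europe/London` zone and the dates are arbitrary examples, and the code says nothing about how the PostgreSQL setting itself is configured.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_run_in_utc(year: int, month: int, day: int, hour: int,
                     tz_name: str) -> datetime:
    """Return the UTC instant for a job scheduled at `hour`:00 local time."""
    local = datetime(year, month, day, hour, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


winter = local_run_in_utc(2026, 1, 15, 2, "Europe/London")  # 02:00 UTC
summer = local_run_in_utc(2026, 7, 15, 2, "Europe/London")  # 01:00 UTC
```

A job pinned to a fixed UTC time would therefore run an hour off from the users' local clock for half the year. Scheduling in the regional timezone keeps the local time stable instead.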
Fabric mirroring updates
MySQL mirroring (GA): Azure Database for MySQL Flexible Server can now mirror to Fabric’s OneLake in near real-time without building data pipelines. The mirrored data is immediately available to all Fabric workloads: analytics, AI, Power BI.
Cosmos DB private endpoint mirroring (GA): Cosmos DB databases using private endpoints can now mirror to Fabric. There’s temporary additional networking needed during mirror establishment, but you can remove it once the mirror is running. Your Cosmos DB stays restricted to private endpoints only.
Foundry priority processing (GA)
For time-critical AI inferencing, priority processing is now GA. It gives you lower latency and higher throughput on a pay-as-you-go basis.
Standard pay-as-you-go
- Normal latency
- Standard throughput
- Lowest cost per token
Priority processing
- Lower latency
- Higher throughput
- Price premium (varies by model)
- No upfront commitment
Provisioned Throughput (PTU)
- Guaranteed throughput
- Fixed capacity
- Upfront commitment
The use case: you have latency-sensitive AI workloads but don’t want to commit to provisioned throughput. Or you already have PTUs but occasionally need burst capacity. Priority processing fills that gap. You can combine it with standard pay-as-you-go, PTUs, and batch processing to optimize cost across different latency requirements.
Available for the latest models on global and data zone deployments.
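One way to think about combining the options is as a routing decision per request. The function below is a hypothetical sketch of that decision, following the use cases described above; the tier labels and the inputs are illustrative and do not correspond to any Foundry API parameter.

```python
def choose_tier(latency_sensitive: bool, ptu_headroom: bool,
                deferrable: bool = False) -> str:
    """Pick a processing tier for one request.

    latency_sensitive: the caller is waiting on the response.
    ptu_headroom: provisioned throughput exists and has spare capacity.
    deferrable: the work can wait, so batch processing is acceptable.
    """
    if deferrable:
        return "batch"
    if latency_sensitive:
        # Use already-paid-for PTU capacity first; burst to priority otherwise.
        return "ptu" if ptu_headroom else "priority"
    return "standard"
```

For example, `choose_tier(latency_sensitive=True, ptu_headroom=False)` returns `"priority"`, which covers both the "no PTU commitment" case and the "PTUs are saturated" burst case.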
Entra ID external MFA (GA)
You can now use an external MFA solution with Entra ID authentication via OpenID Connect. This includes conditional access policy integration. It replaces the old custom controls feature, which is being deprecated.
If your organization is committed to a third-party MFA provider and has been using the custom controls workaround, this gives you a proper, supported integration path.
Entra ID tenant governance (GA)
Tenant governance helps with a problem that’s more common than organizations realize: shadow tenants.
Based on patterns of external identities, multi-tenant apps, and billing, it discovers other tenants being used by your organization. Then it helps you create relationships to administer those tenants and enables secure tenant creation so new tenants are configured correctly from the start.
There’s also an API available now. Some features are still in preview, but core discovery and relationship management are GA.
Final thoughts
This was an AKS week. The application network preview is the most forward-looking update: mesh capabilities without sidecars, using SPIFFE for identity. That’s the direction Kubernetes networking is heading, and AKS is getting there early.
But the blue-green agent pool upgrade is what I’d actually prioritize trying. Rolling upgrades work, but they can leave you in a partially upgraded state if something goes wrong. Blue-green gives you a clean rollback path. The cost of running double resources during the upgrade window is worth it for the safety. If you’ve ever had a node image upgrade go sideways and had to figure out which nodes were upgraded and which weren’t, you’ll appreciate this.
The SQL updates are quietly significant too. Automatic index compaction is the kind of thing that saves DBA time every week, and the change event streaming for SQL MI opens up event-driven patterns without touching application code.
Sources
- John Savill, “Azure Update - 27th March 2026,” YouTube, https://www.youtube.com/watch?v=rz-7PWle174