When the basics aren’t enough

You’ve got EPAC running after following the EPAC Introduction and setting up EPAC Pipelines. Policies are deploying through pipelines. Life is good.

Then reality sets in.

“We need an exemption for this legacy application, but just for 90 days.”

“Can we have the same policy with different parameters for different business units?”

“Why is this policy showing compliant in DEV but non-compliant in production when they’re identical?”

These are the questions I kept running into after the initial EPAC setup. The basics work fine, but real environments are messy. This article covers the patterns I’ve found useful for handling that mess: exemptions, assignment strategies, initiative design, and debugging.

Policy exemptions: when rules need exceptions

Every policy framework needs escape hatches. The question is whether those escape hatches are documented, time-bound, and auditable, or whether they’re “someone disabled that in the portal three years ago.”

EPAC exemption structure

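Exemptions in EPAC live alongside your assignments. Recent EPAC releases read exemption files from a subfolder named after the pacSelector (for example policyExemptions/prod/), so check the layout your version expects. The tree below is simplified: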

Definitions/
├── policyAssignments/
│   └── security-baseline.json
└── policyExemptions/
    └── legacy-app-exemptions.json

Creating time-bound exemptions

{
  "exemptions": [
    {
      "name": "legacy-erp-storage-encryption",
      "displayName": "Legacy ERP Storage Encryption Exemption",
      "description": "Temporary exemption for legacy ERP system during migration. Expires 2026-04-30.",
      "exemptionCategory": "Mitigated",
      "scope": "/subscriptions/12345678-1234-1234-1234-123456789abc/resourceGroups/legacy-erp",
      "policyAssignmentId": "/providers/Microsoft.Management/managementGroups/production/providers/Microsoft.Authorization/policyAssignments/require-storage-encryption",
      "expiresOn": "2026-04-30T23:59:59Z",
      "metadata": {
        "ticketNumber": "SEC-2024-1234",
        "approvedBy": "security-team@company.com",
        "mitigatingControls": "Network isolation, enhanced monitoring, scheduled migration Q2 2026"
      }
    }
  ]
}

Exemption categories

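| Category  | Use when                                | Compliance state                         |
|-----------|-----------------------------------------|------------------------------------------|
| Waiver    | Risk accepted, no compensating controls | Exempt (risk acknowledged)               |
| Mitigated | Compensating controls in place          | Exempt (alternative controls documented) |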

Both categories result in an Exempt compliance state in Azure. Exempt resources count toward the compliant side of the compliance percentage, not as non-compliant. The distinction between Waiver and Mitigated is semantic: it documents your intent and risk posture, but the compliance engine treats both the same way.
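To make the effect on the percentage concrete, here is a small worked sketch. It assumes the formula from the Azure Policy documentation as I understand it (compliant, exempt, and unknown resources divided by all resources); the function name and the sample counts are made up for illustration.

```python
def compliance_percentage(compliant, exempt, non_compliant, unknown=0):
    """Share of resources that count toward compliance, as a percentage.

    Assumption: exempt and unknown resources are counted alongside
    compliant ones, and an empty scope is reported as 100%.
    """
    total = compliant + exempt + non_compliant + unknown
    if total == 0:
        return 100.0
    return round(100.0 * (compliant + exempt + unknown) / total, 2)


# Hypothetical scope: 80 compliant, 5 exempt, 15 non-compliant resources.
print(compliance_percentage(80, 5, 15))  # 85.0
```

In this model, granting an exemption raises the percentage just like fixing the resource would, which is exactly why exemptions need an owner and an expiry date.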

Exemption lifecycle

EPAC handles the full lifecycle:

# Exemptions are included in the deployment plan
Build-DeploymentPlans -PacEnvironmentSelector "prod"
# Plan shows:
# - New exemptions to create
# - Expired exemptions to remove
# - Modified exemptions to update

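One thing to watch out for: set a calendar reminder for 30 days before exemption expiry. Azure stops honouring an exemption once its expiresOn time passes, whether or not EPAC has removed the object yet. From that point the resource is evaluated again and can show as non-compliant, which might trigger unwanted remediation.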
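A calendar reminder works, but the expiry dates are already in the repository, so a pipeline step can do the reminding. The following is a minimal sketch, assuming exemption entries shaped like the example above; the function name expiring_soon and the second sample entry are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expiring_soon(exemptions, now, days=30):
    """Return (name, expiry) pairs for exemptions expiring within `days` of `now`."""
    cutoff = now + timedelta(days=days)
    hits = []
    for exemption in exemptions:
        raw = exemption.get("expiresOn")
        if not raw:
            continue  # no expiry set, nothing to remind about
        # fromisoformat() on older Python versions does not accept a trailing "Z"
        expiry = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if now <= expiry <= cutoff:
            hits.append((exemption["name"], expiry))
    return sorted(hits, key=lambda hit: hit[1])


sample = [
    {"name": "legacy-erp-storage-encryption", "expiresOn": "2026-04-30T23:59:59Z"},
    {"name": "hypothetical-permanent-exemption"},
]
today = datetime(2026, 4, 10, tzinfo=timezone.utc)  # fixed date for the example
for name, expiry in expiring_soon(sample, today):
    print(f"{name} expires on {expiry:%Y-%m-%d}")
```

In a real pipeline you would pass datetime.now(timezone.utc) and fail or warn the build when the list is not empty.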

Bulk exemptions

If you have a lot of exemptions, organise them by category (for cost-management related exemptions, the Azure Cost Management post is useful context):

policyExemptions/
├── security/
│   ├── legacy-systems.json
│   └── third-party-integrations.json
├── cost-management/
│   └── dev-environment-exemptions.json
└── compliance/
    └── regional-requirements.json

Each file can contain multiple exemptions:

{
  "exemptions": [
    { "name": "exemption-1", ... },
    { "name": "exemption-2", ... },
    { "name": "exemption-3", ... }
  ]
}
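Once exemptions are spread over many files, it gets easy to reuse a name by accident. Here is a minimal sketch of a check that walks the folder and reports names defined more than once. It assumes every file is plain JSON with a top-level "exemptions" array; find_duplicate_names is a hypothetical helper, not part of EPAC.

```python
import json
from collections import defaultdict
from pathlib import Path


def find_duplicate_names(root):
    """Map each exemption name used more than once to the files that define it."""
    root = Path(root)
    seen = defaultdict(list)
    for path in sorted(root.rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        for exemption in data.get("exemptions", []):
            # Store paths relative to the root, with forward slashes on every OS
            seen[exemption["name"]].append(path.relative_to(root).as_posix())
    return {name: files for name, files in seen.items() if len(files) > 1}
```

Calling find_duplicate_names("./Definitions/policyExemptions") returns an empty dictionary when all names are unique, so it slots into a pipeline as a simple pass/fail gate.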

Complex assignment scenarios

Real environments need more than “apply policy X to scope Y.” Here are patterns I’ve used to deal with that.

Pattern 1: same policy, different parameters per scope

Say you want to require tags, but each business unit needs different required tags.

{
  "nodeName": "/Root/",
  "children": [
    {
      "nodeName": "FinanceUnit/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/finance"
        ]
      },
      "assignment": {
        "name": "require-tags-finance",
        "displayName": "Require Tags - Finance"
      },
      "definitionEntry": {
        "policySetName": "require-resource-tags"
      },
      "parameters": {
        "requiredTags": ["CostCenter", "Project", "FinanceApprover"]
      }
    },
    {
      "nodeName": "EngineeringUnit/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/engineering"
        ]
      },
      "assignment": {
        "name": "require-tags-engineering",
        "displayName": "Require Tags - Engineering"
      },
      "definitionEntry": {
        "policySetName": "require-resource-tags"
      },
      "parameters": {
        "requiredTags": ["CostCenter", "Team", "Repository"]
      }
    }
  ]
}

Pattern 2: inheritance with override

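Base security policies at the root, stricter controls for regulated workloads. The child node inherits the root's definitionEntry and redefines only the parameters and effects it tightens. Keep in mind that, per the EPAC documentation, assignments are emitted at leaf nodes, assignment name fragments are concatenated along each branch, and scope may be set only once per branch. Azure also limits assignment names at management group scope to 24 characters. Treat the tree below as a conceptual sketch and check how it resolves in your EPAC version.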

{
  "nodeName": "/Root/",
  "scope": {
    "tenant1": [
      "/providers/Microsoft.Management/managementGroups/landing-zones"
    ]
  },
  "assignment": {
    "name": "security-baseline",
    "displayName": "Security Baseline"
  },
  "definitionEntry": {
    "policySetName": "security-baseline-initiative"
  },
  "parameters": {
    "allowedLocations": ["westeurope", "northeurope"],
    "requiredEncryption": "platform-managed"
  },
  "children": [
    {
      "nodeName": "RegulatedWorkloads/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/regulated"
        ]
      },
      "assignment": {
        "name": "security-baseline-regulated",
        "displayName": "Security Baseline - Regulated"
      },
      "parameters": {
        "allowedLocations": ["westeurope"],
        "requiredEncryption": "customer-managed"
      },
      "overrides": [
        {
          "kind": "policyEffect",
          "selectors": [
            {
              "kind": "policyDefinitionReferenceId",
              "in": ["auditStorageEncryption"]
            }
          ],
          "value": "deny"
        }
      ]
    }
  ]
}
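It helps to see what a tree like this collapses into. The sketch below is a simplified model of how I understand EPAC resolves an assignment tree: one assignment per leaf, name and displayName fragments concatenated along the branch, and parameters merged with the node nearest the leaf winning. It is not EPAC's actual code, it ignores overrides and most other fields, and the tree used here is a hypothetical variant with short name fragments.

```python
def flatten(node, inherited=None):
    """Yield one merged assignment dict per leaf node (simplified model)."""
    inherited = inherited or {"name": "", "displayName": "", "parameters": {}}
    assignment = node.get("assignment", {})
    merged = {
        # Name fragments are concatenated from root to leaf
        "name": inherited["name"] + assignment.get("name", ""),
        "displayName": inherited["displayName"] + assignment.get("displayName", ""),
        # Child parameters overwrite parent parameters with the same key
        "parameters": {**inherited["parameters"], **node.get("parameters", {})},
        "definitionEntry": node.get("definitionEntry", inherited.get("definitionEntry")),
        "scope": node.get("scope", inherited.get("scope")),
    }
    children = node.get("children", [])
    if not children:
        yield merged
        return
    for child in children:
        yield from flatten(child, merged)


# Hypothetical tree: shared settings at the root, scope only at the leaves.
tree = {
    "nodeName": "/Root/",
    "assignment": {"name": "sec-base", "displayName": "Security Baseline"},
    "definitionEntry": {"policySetName": "security-baseline-initiative"},
    "parameters": {
        "allowedLocations": ["westeurope", "northeurope"],
        "requiredEncryption": "platform-managed",
    },
    "children": [
        {"nodeName": "Default/", "scope": {"tenant1": ["<landing-zones scope>"]}},
        {
            "nodeName": "Regulated/",
            "scope": {"tenant1": ["<regulated scope>"]},
            "assignment": {"name": "-reg", "displayName": " - Regulated"},
            "parameters": {
                "allowedLocations": ["westeurope"],
                "requiredEncryption": "customer-managed",
            },
        },
    ],
}

for leaf in flatten(tree):
    too_long = len(leaf["name"]) > 24  # management group scope limit
    print(leaf["name"], leaf["parameters"]["requiredEncryption"], "TOO LONG" if too_long else "ok")
```

Running a model like this over your assignment files before Build-DeploymentPlans is a cheap way to catch names that grow past the limit once fragments are joined.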

Pattern 3: non-compliance messaging

This one is easy to skip but makes a big difference. When someone’s deployment fails, a good message saves them from filing a ticket:

{
  "assignment": {
    "name": "require-approved-images",
    "displayName": "Require Approved Container Images"
  },
  "nonComplianceMessages": [
    {
      "message": "Container images must come from the approved registry: acr.company.com. See https://wiki.company.com/approved-images for the list of approved images and how to request additions."
    },
    {
      "policyDefinitionReferenceId": "containerImageSource",
      "message": "This specific container image is not from an approved source. Contact platform-team@company.com for assistance."
    }
  ]
}
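The two entries play different roles: the one without a policyDefinitionReferenceId is the default for the whole assignment, and the other applies only to the referenced policy inside the initiative. A minimal sketch of that lookup, with a hypothetical helper name and shortened message texts:

```python
def message_for(messages, reference_id=None):
    """Pick the message shown for a policy: a specific entry if present, else the default."""
    default = None
    for entry in messages:
        ref = entry.get("policyDefinitionReferenceId")
        if ref is None:
            default = entry["message"]  # assignment-wide default
        elif ref == reference_id:
            return entry["message"]  # policy-specific message wins
    return default


messages = [
    {"message": "Container images must come from the approved registry."},
    {
        "policyDefinitionReferenceId": "containerImageSource",
        "message": "This specific container image is not from an approved source.",
    },
]

print(message_for(messages, "containerImageSource"))
print(message_for(messages, "someOtherPolicy"))
```

The practical takeaway is to always include a default entry, so that policies you add to the initiative later still produce a useful message.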

Pattern 4: resource selectors

You can limit policies to specific resource types or locations:

{
  "assignment": {
    "name": "require-private-endpoints",
    "displayName": "Require Private Endpoints for PaaS"
  },
  "resourceSelectors": [
    {
      "name": "StorageAccountsInProduction",
      "selectors": [
        {
          "kind": "resourceType",
          "in": [
            "Microsoft.Storage/storageAccounts",
            "Microsoft.KeyVault/vaults",
            "Microsoft.Sql/servers"
          ]
        },
        {
          "kind": "resourceLocation",
          "in": ["westeurope", "northeurope"]
        }
      ]
    }
  ]
}
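The matching logic is worth spelling out. As I read the Azure Policy documentation, a resource has to satisfy every selector inside one resourceSelectors entry, and satisfying any one entry is enough. The sketch below models only the resourceType and resourceLocation kinds with in/notIn; the function names and the sample resources are hypothetical.

```python
def selector_matches(selector, resource):
    """Evaluate a single selector against a resource dict with 'type' and 'location'."""
    kind = selector["kind"]
    if kind == "resourceType":
        value = resource["type"]
    elif kind == "resourceLocation":
        value = resource["location"]
    else:
        raise ValueError(f"selector kind not modelled in this sketch: {kind}")
    value = value.lower()
    if "in" in selector:
        return value in {item.lower() for item in selector["in"]}
    return value not in {item.lower() for item in selector["notIn"]}


def in_scope(resource_selectors, resource):
    """All selectors within one entry must match; any one entry is sufficient."""
    if not resource_selectors:
        return True  # no selectors means the assignment applies everywhere in scope
    return any(
        all(selector_matches(selector, resource) for selector in entry["selectors"])
        for entry in resource_selectors
    )


selectors = [
    {
        "name": "StorageAccountsInProduction",
        "selectors": [
            {"kind": "resourceType", "in": ["Microsoft.Storage/storageAccounts"]},
            {"kind": "resourceLocation", "in": ["westeurope", "northeurope"]},
        ],
    }
]

print(in_scope(selectors, {"type": "Microsoft.Storage/storageAccounts", "location": "westeurope"}))  # True
print(in_scope(selectors, {"type": "Microsoft.Storage/storageAccounts", "location": "eastus"}))  # False
```

This is also why resource selectors are handy for gradual rollouts: start with one region in the location selector and widen the list as you gain confidence.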

Initiative (policy set) design

Initiatives group related policies. Get the grouping right and assignments stay manageable. Get it wrong and you’ll spend your time figuring out which initiative a policy belongs to.

A few things I’ve learned:

  • Group by compliance domain, not by resource type
  • Keep initiatives under 20 policies if you can. 50-policy initiatives are painful to manage
  • Use parameter references so you don’t duplicate parameters across assignments
  • Version your initiatives so you can roll out changes gradually

Initiative structure

{
  "name": "storage-security-initiative",
  "properties": {
    "displayName": "Storage Security Standards",
    "description": "Security controls for Azure Storage accounts",
    "metadata": {
      "version": "2.1.0",
      "category": "Storage"
    },
    "parameters": {
      "effect": {
        "type": "String",
        "defaultValue": "Audit",
        "allowedValues": ["Audit", "Deny", "Disabled"],
        "metadata": {
          "displayName": "Effect",
          "description": "The effect for all policies in this initiative"
        }
      },
      "allowedSkus": {
        "type": "Array",
        "defaultValue": ["Standard_LRS", "Standard_GRS", "Standard_RAGRS"],
        "metadata": {
          "displayName": "Allowed Storage SKUs"
        }
      }
    },
    "policyDefinitions": [
      {
        "policyDefinitionReferenceId": "requireHttpsTraffic",
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/404c3081-a854-4457-ae30-26a93ef643f9",
        "parameters": {
          "effect": {
            "value": "[parameters('effect')]"
          }
        }
      },
      {
        "policyDefinitionReferenceId": "requireEncryption",
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/b5ec538c-daa0-4006-8596-35468b9148e2",
        "parameters": {
          "effect": {
            "value": "[parameters('effect')]"
          }
        }
      },
      {
        "policyDefinitionReferenceId": "restrictSkus",
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/7433c107-6db4-4ad1-b57a-a76dce0154a1",
        "parameters": {
          "listOfAllowedSKUs": {
            "value": "[parameters('allowedSkus')]"
          }
        }
      }
    ]
  }
}
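Parameter references are the part that breaks silently: rename an initiative parameter and a member policy may still point at the old name. Here is a minimal sketch of a lint step, assuming the initiative layout shown above. The function name, the 20-policy threshold check, and the sample data are my own illustration rather than anything EPAC provides.

```python
import re

PARAM_REF = re.compile(r"\[parameters\('([^']+)'\)\]")


def undeclared_parameters(initiative):
    """Map policyDefinitionReferenceId -> referenced parameter names that are not declared."""
    props = initiative["properties"]
    declared = set(props.get("parameters", {}))
    missing = {}
    for member in props.get("policyDefinitions", []):
        for spec in member.get("parameters", {}).values():
            value = spec.get("value")
            if not isinstance(value, str):
                continue  # literal arrays, numbers, and so on cannot be references
            for name in PARAM_REF.findall(value):
                if name not in declared:
                    missing.setdefault(member["policyDefinitionReferenceId"], []).append(name)
    return missing


# Hypothetical initiative where 'allowedSkus' was never declared.
broken = {
    "name": "storage-security-initiative",
    "properties": {
        "parameters": {"effect": {"type": "String"}},
        "policyDefinitions": [
            {
                "policyDefinitionReferenceId": "requireHttpsTraffic",
                "parameters": {"effect": {"value": "[parameters('effect')]"}},
            },
            {
                "policyDefinitionReferenceId": "restrictSkus",
                "parameters": {"listOfAllowedSKUs": {"value": "[parameters('allowedSkus')]"}},
            },
        ],
    },
}

print(undeclared_parameters(broken))  # {'restrictSkus': ['allowedSkus']}
print(len(broken["properties"]["policyDefinitions"]) <= 20)  # size guideline check
```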

Initiative versioning

policySetDefinitions/
├── storage-security/
│   ├── v1.0.0/
│   │   └── storage-security.json
│   ├── v2.0.0/
│   │   └── storage-security.json
│   └── v2.1.0/
│       └── storage-security.json

Or use metadata versioning:

{
  "metadata": {
    "version": "2.1.0",
    "changeLog": {
      "2.1.0": "Added private endpoint requirement",
      "2.0.0": "Breaking: Changed default effect to Deny",
      "1.0.0": "Initial release"
    }
  }
}
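If the version lives in metadata, a pipeline can read it and decide how carefully a change should roll out. This is a minimal sketch that assumes plain three-part version strings like the ones above; the helper names are hypothetical.

```python
def parse_version(version):
    """Turn '2.1.0' into (2, 1, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def bump_kind(old, new):
    """Classify the change between two versions as major, minor, patch, or none."""
    old_parts, new_parts = parse_version(old), parse_version(new)
    if new_parts <= old_parts:
        return "none"
    if new_parts[0] > old_parts[0]:
        return "major"
    if new_parts[1] > old_parts[1]:
        return "minor"
    return "patch"


print(bump_kind("1.0.0", "2.0.0"))  # major
print(bump_kind("2.0.0", "2.1.0"))  # minor
```

A major bump, such as the default effect changing to Deny in 2.0.0, is the signal to deploy to a non-production environment first rather than everywhere at once.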

Troubleshooting EPAC deployments

Things will go wrong. Here’s what I’ve seen most often and how to deal with it.

Plan shows changes that shouldn’t exist

Every deployment shows updates even when nothing changed. This is almost always formatting, ordering, or metadata differences between what’s in Azure and what’s in your source files.

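# Get the raw policy from Azure (custom definitions may need -ManagementGroupName or -SubscriptionId)
$azurePolicy = Get-AzPolicyDefinition -Name "your-policy" | ConvertTo-Json -Depth 100
# Round-trip the source file through ConvertFrom-Json so both sides use the same formatting
$sourcePolicy = Get-Content "./Definitions/policyDefinitions/your-policy.json" -Raw | ConvertFrom-Json | ConvertTo-Json -Depth 100
# Compare line by line to find differences (the regex handles both LF and CRLF line endings)
Compare-Object ($azurePolicy -split "\r?\n") ($sourcePolicy -split "\r?\n")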
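A line-by-line comparison is noisy when only key order or Azure-added metadata differs. As an alternative, here is a minimal sketch of a structural diff over two exported JSON documents. It assumes you have saved both sides as JSON first; the ignored keys are the audit fields Azure adds to metadata, and the function name and sample documents are hypothetical.

```python
IGNORED_KEYS = {"createdBy", "createdOn", "updatedBy", "updatedOn"}  # added by Azure


def differing_paths(left, right, path="$"):
    """Return the JSON paths where two documents differ, ignoring key order."""
    if isinstance(left, dict) and isinstance(right, dict):
        found = []
        for key in sorted(set(left) | set(right)):
            if key in IGNORED_KEYS:
                continue
            if key not in left or key not in right:
                found.append(f"{path}.{key}")  # present on one side only
            else:
                found.extend(differing_paths(left[key], right[key], f"{path}.{key}"))
        return found
    # Lists and scalars are compared as a whole (list order matters in this sketch)
    return [] if left == right else [path]


azure = {
    "displayName": "Require HTTPS",
    "metadata": {"category": "Storage", "createdBy": "00000000-0000-0000-0000-000000000000"},
    "policyRule": {"then": {"effect": "deny"}},
}
source = {
    "policyRule": {"then": {"effect": "audit"}},
    "metadata": {"category": "Storage"},
    "displayName": "Require HTTPS",
}

print(differing_paths(azure, source))  # ['$.policyRule.then.effect']
```

If this reports nothing but the plan still shows an update, the difference is more likely in how the two sides are wrapped (for example a properties envelope) than in the policy itself.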

Remediation tasks failing

You created remediation tasks but resources aren’t getting fixed. Here’s how to check:

# Check remediation task status
$remediation = Get-AzPolicyRemediation -Name "your-remediation" -Scope "/subscriptions/..."
# Look at the provisioning state
$remediation.ProvisioningState
# Check for role assignment issues (only assignments with DeployIfNotExists/Modify have identities)
$assignment = Get-AzPolicyAssignment -Id $remediation.PolicyAssignmentId
if ($null -ne $assignment.Identity -and $null -ne $assignment.Identity.PrincipalId) {
    Get-AzRoleAssignment -ObjectId $assignment.Identity.PrincipalId
} else {
    Write-Warning "Assignment has no managed identity - remediation requires DeployIfNotExists or Modify effect"
}

Most of the time it’s one of these:

  • The managed identity doesn’t have the right permissions
  • A resource provider isn’t registered
  • The template in your deployIfNotExists policy has errors

Policy showing wrong compliance state

A resource should be compliant but shows non-compliant (or the other way around).

# Trigger a compliance scan
Start-AzPolicyComplianceScan -ResourceGroupName "your-rg"
# Check the specific policy state
Get-AzPolicyState -PolicyAssignmentName "your-assignment" `
    -Filter "ResourceId eq '/subscriptions/.../your-resource'"
# Look at the detailed compliance reason
$state = Get-AzPolicyState -PolicyAssignmentName "your-assignment" -Top 1
$state.ComplianceReasonCode

Build fails with circular reference

Build-DeploymentPlans fails with a circular reference error. Usually this means policy definitions reference each other, or you have malformed JSON somewhere.

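# Validate JSON files individually
Get-ChildItem -Path "./Definitions" -Filter "*.json" -Recurse | ForEach-Object {
    # Keep the file in its own variable: inside catch, $_ is the error record, not the file
    $file = $_
    try {
        # -Raw reads the whole file as a single string before parsing
        $null = Get-Content $file.FullName -Raw | ConvertFrom-Json -ErrorAction Stop
        Write-Host "Valid: $($file.FullName)" -ForegroundColor Green
    } catch {
        Write-Host "Invalid: $($file.FullName)" -ForegroundColor Red
        Write-Host $_.Exception.Message
    }
}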

Debug mode

When nothing else works, turn on verbose output:

$VerbosePreference = "Continue"
$DebugPreference = "Continue"
Build-DeploymentPlans `
    -PacEnvironmentSelector "dev" `
    -OutputFolder "./debug-output"

Multi-tenant patterns

If you manage more than one Entra ID tenant, here’s how to structure it. Your management group hierarchy design from the cloud foundation becomes especially important in multi-tenant scenarios.

Separate configurations per tenant

{
  "pacEnvironments": [
    {
      "pacSelector": "tenant1-prod",
      "cloud": "AzureCloud",
      "tenantId": "tenant-1-guid",
      "deploymentRootScope": "/providers/Microsoft.Management/managementGroups/tenant1-root"
    },
    {
      "pacSelector": "tenant2-prod",
      "cloud": "AzureCloud",
      "tenantId": "tenant-2-guid",
      "deploymentRootScope": "/providers/Microsoft.Management/managementGroups/tenant2-root"
    }
  ]
}
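With several tenants in one settings file, copy-paste mistakes creep in: a duplicated selector or a forgotten tenant ID. Below is a minimal sketch of a sanity check over the structure shown above. It assumes the four keys in the example are the ones every environment needs; validate_environments is a hypothetical helper and the sample data is deliberately broken.

```python
REQUIRED_KEYS = ("pacSelector", "cloud", "tenantId", "deploymentRootScope")


def validate_environments(settings):
    """Return a list of human-readable problems found in pacEnvironments."""
    problems = []
    seen = set()
    for index, env in enumerate(settings.get("pacEnvironments", [])):
        label = env.get("pacSelector") or f"entry {index}"
        for key in REQUIRED_KEYS:
            if not env.get(key):
                problems.append(f"{label}: missing {key}")
        selector = env.get("pacSelector")
        if selector in seen:
            problems.append(f"{label}: duplicate pacSelector")
        elif selector:
            seen.add(selector)
    return problems


# Hypothetical settings: the second entry reuses a selector and has no tenantId.
settings = {
    "pacEnvironments": [
        {
            "pacSelector": "tenant1-prod",
            "cloud": "AzureCloud",
            "tenantId": "tenant-1-guid",
            "deploymentRootScope": "/providers/Microsoft.Management/managementGroups/tenant1-root",
        },
        {
            "pacSelector": "tenant1-prod",
            "cloud": "AzureCloud",
            "deploymentRootScope": "/providers/Microsoft.Management/managementGroups/tenant2-root",
        },
    ]
}

for problem in validate_environments(settings):
    print(problem)
```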

Shared definitions, tenant-specific assignments

Definitions/
├── policyDefinitions/          # Shared across tenants
│   └── custom-policies/
├── policySetDefinitions/       # Shared across tenants
│   └── security-baseline/
└── policyAssignments/
    ├── tenant1/                # Tenant-specific
    │   └── production.json
    └── tenant2/                # Tenant-specific
        └── production.json

What I’d do differently next time

If I were starting over, I’d focus on three things:

  1. Make every exemption expire. No exceptions. If it needs to be permanent, that’s a sign the policy itself needs adjusting.

  2. Match assignment structure to your management group tree. Fighting the hierarchy just creates confusion. If the tree is wrong, fix the tree first.

  3. Keep initiatives small. It’s tempting to put everything into one big initiative. Don’t. You’ll regret it when you need to change one parameter and it affects 40 policies.

Next in the series

This wraps up the three-part EPAC series. The natural continuation is the governance framework post, which ties EPAC and policy management into a broader Azure governance strategy.

