Introduction
You’ve got EPAC running. Policies are deploying through pipelines. Life is good.
Then reality sets in.
“We need an exemption for this legacy application, but just for 90 days.”
“Can we have the same policy with different parameters for different business units?”
“Why is this policy showing compliant in DEV but non-compliant in production when they’re identical?”
These are the questions I kept running into after the initial EPAC setup. The basics work fine, but real environments are messy. This article covers the patterns I’ve found useful for handling that mess: exemptions, assignment strategies, initiative design, and debugging.
Policy exemptions: when rules need exceptions
Every policy framework needs escape hatches. The question is whether those escape hatches are documented, time-bound, and auditable, or whether they’re “someone disabled that in the portal three years ago.”
EPAC exemption structure
Exemptions in EPAC live alongside your assignments:
```
Definitions/
├── policyAssignments/
│   └── security-baseline.json
└── policyExemptions/
    └── legacy-app-exemptions.json
```
Creating time-bound exemptions
```json
{
  "exemptions": [
    {
      "name": "legacy-erp-storage-encryption",
      "displayName": "Legacy ERP Storage Encryption Exemption",
      "description": "Temporary exemption for legacy ERP system during migration. Expires 2026-04-30.",
      "exemptionCategory": "Mitigated",
      "scope": "/subscriptions/12345678-1234-1234-1234-123456789abc/resourceGroups/legacy-erp",
      "policyAssignmentId": "/providers/Microsoft.Management/managementGroups/production/providers/Microsoft.Authorization/policyAssignments/require-storage-encryption",
      "expiresOn": "2026-04-30T23:59:59Z",
      "metadata": {
        "ticketNumber": "SEC-2024-1234",
        "approvedBy": "security-team@company.com",
        "mitigatingControls": "Network isolation, enhanced monitoring, scheduled migration Q2 2026"
      }
    }
  ]
}
```
Exemption categories
| Category | Use when | Compliance impact |
|---|---|---|
| Waiver | Risk accepted, no compensating controls | Reported as Exempt; the non-compliance is knowingly accepted |
| Mitigated | Compensating controls in place | Reported as Exempt; the policy intent is met another way |
Use Mitigated when you have alternative controls. Use Waiver when you’re accepting the risk knowingly.
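The category choice can be checked before a plan is ever built. The following is a minimal sketch, not part of EPAC: `Test-ExemptionEntry` is a hypothetical helper, and the rules that a Mitigated exemption must carry `metadata.mitigatingControls` and that every exemption needs `metadata.ticketNumber` are assumed house conventions modelled on the example above, not Azure requirements.

```powershell
# Hypothetical helper (not part of EPAC): checks one parsed exemption entry
# against assumed house conventions and returns a list of problems.
function Test-ExemptionEntry {
    param(
        [Parameter(Mandatory)] [object] $Exemption
    )

    $problems = @()

    # Azure Policy accepts exactly these two exemption categories.
    if ($Exemption.exemptionCategory -notin @('Waiver', 'Mitigated')) {
        $problems += 'exemptionCategory must be Waiver or Mitigated'
    }

    # Assumed convention: a Mitigated exemption names its compensating controls.
    if ($Exemption.exemptionCategory -eq 'Mitigated' -and
        [string]::IsNullOrWhiteSpace($Exemption.metadata.mitigatingControls)) {
        $problems += 'Mitigated exemption is missing metadata.mitigatingControls'
    }

    # Assumed convention: every exemption links back to an approval ticket.
    if ([string]::IsNullOrWhiteSpace($Exemption.metadata.ticketNumber)) {
        $problems += 'missing metadata.ticketNumber'
    }

    # The leading comma keeps an empty result an empty array instead of $null.
    return ,$problems
}

# Example: validate every entry in one exemption file (path is illustrative).
# $doc = Get-Content './Definitions/policyExemptions/legacy-app-exemptions.json' -Raw | ConvertFrom-Json
# foreach ($e in $doc.exemptions) { Test-ExemptionEntry -Exemption $e }
```

Running something like this in a pull request check keeps a Mitigated exemption without documented controls from slipping through review.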
Exemption lifecycle
EPAC handles the full lifecycle:
```powershell
# Exemptions are included in the deployment plan
Build-DeploymentPlans -PacEnvironmentSelector "prod"

# Plan shows:
# - New exemptions to create
# - Expired exemptions to remove
# - Modified exemptions to update
```
One thing to watch out for: set a calendar reminder for 30 days before exemption expiry. When EPAC removes an expired exemption, the resource becomes non-compliant, which might trigger unwanted remediation.
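A calendar reminder can be backed up by a scripted check. This is a sketch under stated assumptions: `Get-ExpiringExemption` is a hypothetical helper, not an EPAC cmdlet, and it assumes exemption files use the `exemptions` array and `expiresOn` field shown earlier.

```powershell
# Hypothetical helper (not part of EPAC): lists exemptions that expire soon.
function Get-ExpiringExemption {
    param(
        [Parameter(Mandatory)] [string] $Path,   # folder containing exemption JSON files
        [int] $WithinDays = 30,                   # look-ahead window in days
        [datetime] $Now = [datetime]::UtcNow      # injectable so the check is testable
    )

    $cutoff = $Now.AddDays($WithinDays)
    $styles = [System.Globalization.DateTimeStyles]'AssumeUniversal, AdjustToUniversal'

    Get-ChildItem -Path $Path -Filter '*.json' -Recurse | ForEach-Object {
        $file = $_
        $doc  = Get-Content -Path $file.FullName -Raw | ConvertFrom-Json

        foreach ($exemption in $doc.exemptions) {
            # Exemptions without expiresOn never expire, so there is nothing to report.
            if (-not $exemption.expiresOn) { continue }

            # PowerShell 7 may already have parsed the value as a DateTime; 5.1 leaves a string.
            $expires = $exemption.expiresOn
            if ($expires -isnot [datetime]) {
                $expires = [datetime]::Parse($expires, [cultureinfo]::InvariantCulture, $styles)
            }
            $expires = $expires.ToUniversalTime()

            # Already-expired exemptions are included and show a negative DaysLeft.
            if ($expires -le $cutoff) {
                [pscustomobject]@{
                    Name      = $exemption.name
                    ExpiresOn = $expires
                    DaysLeft  = [math]::Floor(($expires - $Now).TotalDays)
                    File      = $file.Name
                }
            }
        }
    }
}

# Example: Get-ExpiringExemption -Path './Definitions/policyExemptions' -WithinDays 30
```

Run on a schedule, the output gives exemption owners a list to act on before the plan starts removing anything.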
Bulk exemptions
If you have a lot of exemptions, organise them by category:
```
policyExemptions/
├── security/
│   ├── legacy-systems.json
│   └── third-party-integrations.json
├── cost-management/
│   └── dev-environment-exemptions.json
└── compliance/
    └── regional-requirements.json
```
Each file can contain multiple exemptions:
```json
{
  "exemptions": [
    { "name": "exemption-1", ... },
    { "name": "exemption-2", ... },
    { "name": "exemption-3", ... }
  ]
}
```
Complex assignment scenarios
Real environments need more than “apply policy X to scope Y.” Here are patterns I’ve used to deal with that.
Pattern 1: same policy, different parameters per scope
Say you want to require tags, but each business unit needs different required tags.
```json
{
  "nodeName": "/Root/",
  "children": [
    {
      "nodeName": "FinanceUnit/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/finance"
        ]
      },
      "assignment": {
        "name": "require-tags-finance",
        "displayName": "Require Tags - Finance"
      },
      "definitionEntry": {
        "policySetName": "require-resource-tags"
      },
      "parameters": {
        "requiredTags": ["CostCenter", "Project", "FinanceApprover"]
      }
    },
    {
      "nodeName": "EngineeringUnit/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/engineering"
        ]
      },
      "assignment": {
        "name": "require-tags-engineering",
        "displayName": "Require Tags - Engineering"
      },
      "definitionEntry": {
        "policySetName": "require-resource-tags"
      },
      "parameters": {
        "requiredTags": ["CostCenter", "Team", "Repository"]
      }
    }
  ]
}
```
Pattern 2: inheritance with override
Base security policies at the root, stricter controls for regulated workloads.
```json
{
  "nodeName": "/Root/",
  "scope": {
    "tenant1": [
      "/providers/Microsoft.Management/managementGroups/landing-zones"
    ]
  },
  "assignment": {
    "name": "security-baseline",
    "displayName": "Security Baseline"
  },
  "definitionEntry": {
    "policySetName": "security-baseline-initiative"
  },
  "parameters": {
    "allowedLocations": ["westeurope", "northeurope"],
    "requiredEncryption": "platform-managed"
  },
  "children": [
    {
      "nodeName": "RegulatedWorkloads/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/regulated"
        ]
      },
      "assignment": {
        "name": "security-baseline-regulated",
        "displayName": "Security Baseline - Regulated"
      },
      "parameters": {
        "allowedLocations": ["westeurope"],
        "requiredEncryption": "customer-managed"
      },
      "overrides": [
        {
          "kind": "policyEffect",
          "selectors": [
            {
              "kind": "policyDefinitionReferenceId",
              "in": ["auditStorageEncryption"]
            }
          ],
          "value": "deny"
        }
      ]
    }
  ]
}
```
Pattern 3: non-compliance messaging
This one is easy to skip but makes a big difference. When someone’s deployment fails, a good message saves them from filing a ticket:
```json
{
  "assignment": {
    "name": "require-approved-images",
    "displayName": "Require Approved Container Images"
  },
  "nonComplianceMessages": [
    {
      "message": "Container images must come from the approved registry: acr.company.com. See https://wiki.company.com/approved-images for the list of approved images and how to request additions."
    },
    {
      "policyDefinitionReferenceId": "containerImageSource",
      "message": "This specific container image is not from an approved source. Contact platform-team@company.com for assistance."
    }
  ]
}
```
Pattern 4: resource selectors
You can limit policies to specific resource types or locations:
```json
{
  "assignment": {
    "name": "require-private-endpoints",
    "displayName": "Require Private Endpoints for PaaS"
  },
  "resourceSelectors": [
    {
      "name": "StorageAccountsInProduction",
      "selectors": [
        {
          "kind": "resourceType",
          "in": [
            "Microsoft.Storage/storageAccounts",
            "Microsoft.KeyVault/vaults",
            "Microsoft.Sql/servers"
          ]
        },
        {
          "kind": "resourceLocation",
          "in": ["westeurope", "northeurope"]
        }
      ]
    }
  ]
}
```
Initiative (policy set) design
Initiatives group related policies. Get the grouping right and assignments stay manageable. Get it wrong and you’ll spend your time figuring out which initiative a policy belongs to.
A few things I’ve learned:
- Group by compliance domain, not by resource type
- Keep initiatives under 20 policies if you can. 50-policy initiatives are painful to manage
- Use parameter references so you don’t duplicate parameters across assignments
- Version your initiatives so you can roll out changes gradually
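The size guideline is easy to check mechanically. Below is a rough sketch, assuming initiative files keep their members under `properties.policyDefinitions`; `Get-InitiativeSize` is a hypothetical helper, not an EPAC cmdlet, and the default threshold of 20 simply mirrors the rule of thumb above.

```powershell
# Hypothetical helper (not part of EPAC): reports how many policies each initiative holds.
function Get-InitiativeSize {
    param(
        [Parameter(Mandatory)] [string] $Path,   # folder containing policy set JSON files
        [int] $MaxPolicies = 20                   # rule-of-thumb limit, adjust to taste
    )

    Get-ChildItem -Path $Path -Filter '*.json' -Recurse | ForEach-Object {
        $set  = Get-Content -Path $_.FullName -Raw | ConvertFrom-Json
        $defs = $set.properties.policyDefinitions

        # @($null).Count would be 1, so a missing list is handled explicitly.
        $count = if ($null -eq $defs) { 0 } else { @($defs).Count }

        [pscustomobject]@{
            Name        = $set.name
            PolicyCount = $count
            TooLarge    = $count -gt $MaxPolicies
            File        = $_.Name
        }
    }
}

# Example: show only the initiatives that exceed the limit.
# Get-InitiativeSize -Path './Definitions/policySetDefinitions' | Where-Object TooLarge
```

A check like this fits naturally next to JSON validation in a build pipeline, as a warning rather than a hard failure.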
Initiative structure
```json
{
  "name": "storage-security-initiative",
  "properties": {
    "displayName": "Storage Security Standards",
    "description": "Security controls for Azure Storage accounts",
    "metadata": {
      "version": "2.1.0",
      "category": "Storage"
    },
    "parameters": {
      "effect": {
        "type": "String",
        "defaultValue": "Audit",
        "allowedValues": ["Audit", "Deny", "Disabled"],
        "metadata": {
          "displayName": "Effect",
          "description": "The effect for all policies in this initiative"
        }
      },
      "allowedSkus": {
        "type": "Array",
        "defaultValue": ["Standard_LRS", "Standard_GRS", "Standard_RAGRS"],
        "metadata": {
          "displayName": "Allowed Storage SKUs"
        }
      }
    },
    "policyDefinitions": [
      {
        "policyDefinitionReferenceId": "requireHttpsTraffic",
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/404c3081-a854-4457-ae30-26a93ef643f9",
        "parameters": {
          "effect": { "value": "[parameters('effect')]" }
        }
      },
      {
        "policyDefinitionReferenceId": "requireEncryption",
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/b5ec538c-daa0-4006-8596-35468b9148e2",
        "parameters": {
          "effect": { "value": "[parameters('effect')]" }
        }
      },
      {
        "policyDefinitionReferenceId": "restrictSkus",
        "policyDefinitionId": "${varPolicyDefinitionStorageSku}",
        "parameters": {
          "listOfAllowedSKUs": { "value": "[parameters('allowedSkus')]" }
        }
      }
    ]
  }
}
```
Initiative versioning
```
policySetDefinitions/
├── storage-security/
│   ├── v1.0.0/
│   │   └── storage-security.json
│   ├── v2.0.0/
│   │   └── storage-security.json
│   └── v2.1.0/
│       └── storage-security.json
```
Or use metadata versioning:
```json
{
  "metadata": {
    "version": "2.1.0",
    "changeLog": {
      "2.1.0": "Added private endpoint requirement",
      "2.0.0": "Breaking: Changed default effect to Deny",
      "1.0.0": "Initial release"
    }
  }
}
```
Troubleshooting EPAC deployments
Things will go wrong. Here’s what I’ve seen most often and how to deal with it.
Plan shows changes that shouldn’t exist
Every deployment shows updates even when nothing changed. This is almost always formatting, ordering, or metadata differences between what’s in Azure and what’s in your source files.
```powershell
# Get the raw policy from Azure
$azurePolicy = Get-AzPolicyDefinition -Name "your-policy" | ConvertTo-Json -Depth 100

# Compare with your source file (round-tripped so both sides share the same formatting)
$sourcePolicy = Get-Content "./Definitions/policyDefinitions/your-policy.json" -Raw | ConvertFrom-Json | ConvertTo-Json -Depth 100

# Use a diff tool to find differences
Compare-Object ($azurePolicy -split "`n") ($sourcePolicy -split "`n")
```
Remediation tasks failing
You created remediation tasks but resources aren’t getting fixed. Here’s how to check:
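```powershell
# Check remediation task status
$remediation = Get-AzPolicyRemediation -Name "your-remediation" -Scope "/subscriptions/..."

# Look at individual deployment status
$remediation.DeploymentStatus

# Check for role assignment issues. -ObjectId expects the principal ID of the
# assignment's managed identity, not the policy assignment ID itself.
$assignment = Get-AzPolicyAssignment -Id $remediation.PolicyAssignmentId
# Recent Az.Resources versions expose IdentityPrincipalId; older ones use Identity.PrincipalId
Get-AzRoleAssignment -ObjectId $assignment.IdentityPrincipalId
```
Most of the time it’s one of these: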
- The managed identity doesn’t have the right permissions
- A resource provider isn’t registered
- The template in your deployIfNotExists policy has errors
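The resource provider cause from the list above is quick to rule out. This sketch assumes you know which provider namespaces your deployIfNotExists templates touch; `Get-UnregisteredProvider` is a hypothetical helper that works on the objects returned by `Get-AzResourceProvider -ListAvailable` and relies only on their `ProviderNamespace` and `RegistrationState` properties.

```powershell
# Hypothetical helper: reports required provider namespaces that are not registered.
function Get-UnregisteredProvider {
    param(
        # Provider objects, for example the output of Get-AzResourceProvider -ListAvailable
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $Providers,
        # Namespaces your remediation templates deploy into (you supply this list)
        [Parameter(Mandatory)] [string[]] $RequiredNamespaces
    )

    foreach ($namespace in $RequiredNamespaces) {
        $provider = $Providers |
            Where-Object { $_.ProviderNamespace -eq $namespace } |
            Select-Object -First 1

        # Report namespaces that are missing entirely or in any state other than Registered.
        if (-not $provider -or $provider.RegistrationState -ne 'Registered') {
            [pscustomobject]@{
                ProviderNamespace = $namespace
                RegistrationState = if ($provider) { $provider.RegistrationState } else { 'NotFound' }
            }
        }
    }
}

# Example (requires a signed-in Az session; the namespaces are illustrative):
# $providers = Get-AzResourceProvider -ListAvailable
# Get-UnregisteredProvider -Providers $providers -RequiredNamespaces 'Microsoft.Insights', 'Microsoft.Storage'
```

Keeping the comparison separate from the Azure call means the logic can be exercised with plain objects, without a subscription.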
Policy showing wrong compliance state
A resource should be compliant but shows non-compliant (or the other way around).
```powershell
# Trigger a compliance scan
Start-AzPolicyComplianceScan -ResourceGroupName "your-rg"

# Check the specific policy state
Get-AzPolicyState -PolicyAssignmentName "your-assignment" `
    -Filter "ResourceId eq '/subscriptions/.../your-resource'"

# Look at the detailed compliance reason
$state = Get-AzPolicyState -PolicyAssignmentName "your-assignment" -Top 1
$state.ComplianceReasonCode
```
Build fails with circular reference
Build-DeploymentPlans fails with a circular reference error. Usually this means policy definitions reference each other, or you have malformed JSON somewhere.
```powershell
# Validate JSON files individually
Get-ChildItem -Path "./Definitions" -Filter "*.json" -Recurse | ForEach-Object {
    try {
        # -Raw reads the whole file as one string before parsing
        $null = Get-Content $_.FullName -Raw | ConvertFrom-Json
        Write-Host "Valid: $($_.FullName)" -ForegroundColor Green
    }
    catch {
        Write-Host "Invalid: $($_.FullName)" -ForegroundColor Red
        Write-Host $_.Exception.Message
    }
}
```
Debug mode
When nothing else works, turn on verbose output:
```powershell
$VerbosePreference = "Continue"
$DebugPreference = "Continue"

Build-DeploymentPlans `
    -PacEnvironmentSelector "dev" `
    -OutputFolder "./debug-output"
```
Multi-tenant patterns
If you manage more than one Entra ID tenant, here’s how to structure it.
Separate configurations per tenant
```json
{
  "pacEnvironments": [
    {
      "pacSelector": "tenant1-prod",
      "cloud": "AzureCloud",
      "tenantId": "tenant-1-guid",
      "deploymentRootScope": "/providers/Microsoft.Management/managementGroups/tenant1-root"
    },
    {
      "pacSelector": "tenant2-prod",
      "cloud": "AzureCloud",
      "tenantId": "tenant-2-guid",
      "deploymentRootScope": "/providers/Microsoft.Management/managementGroups/tenant2-root"
    }
  ]
}
```
Shared definitions, tenant-specific assignments
```
Definitions/
├── policyDefinitions/          # Shared across tenants
│   └── custom-policies/
├── policySetDefinitions/       # Shared across tenants
│   └── security-baseline/
└── policyAssignments/
    ├── tenant1/                # Tenant-specific
    │   └── production.json
    └── tenant2/                # Tenant-specific
        └── production.json
```
What I’d do differently next time
If I were starting over, I’d focus on three things:
- Make every exemption expire. No exceptions. If it needs to be permanent, that’s a sign the policy itself needs adjusting.
- Match assignment structure to your management group tree. Fighting the hierarchy just creates confusion. If the tree is wrong, fix the tree first.
- Keep initiatives small. It’s tempting to put everything into one big initiative. Don’t. You’ll regret it when you need to change one parameter and it affects 40 policies.