Introduction

You’ve got EPAC running. Policies are deploying through pipelines. Life is good.

Then reality sets in.

“We need an exemption for this legacy application, but just for 90 days.”

“Can we have the same policy with different parameters for different business units?”

“Why is this policy showing compliant in DEV but non-compliant in production when they’re identical?”

These are the questions I kept running into after the initial EPAC setup. The basics work fine, but real environments are messy. This article covers the patterns I’ve found useful for handling that mess: exemptions, assignment strategies, initiative design, and debugging.

Policy exemptions: when rules need exceptions

Every policy framework needs escape hatches. The question is whether those escape hatches are documented, time-bound, and auditable, or whether they’re “someone disabled that in the portal three years ago.”

EPAC exemption structure

Exemptions in EPAC live alongside your assignments:

Definitions/
├── policyAssignments/
│   └── security-baseline.json
└── policyExemptions/
    └── legacy-app-exemptions.json

Creating time-bound exemptions

{
  "exemptions": [
    {
      "name": "legacy-erp-storage-encryption",
      "displayName": "Legacy ERP Storage Encryption Exemption",
      "description": "Temporary exemption for legacy ERP system during migration. Expires 2026-04-30.",
      "exemptionCategory": "Mitigated",
      "scope": "/subscriptions/12345678-1234-1234-1234-123456789abc/resourceGroups/legacy-erp",
      "policyAssignmentId": "/providers/Microsoft.Management/managementGroups/production/providers/Microsoft.Authorization/policyAssignments/require-storage-encryption",
      "expiresOn": "2026-04-30T23:59:59Z",
      "metadata": {
        "ticketNumber": "SEC-2024-1234",
        "approvedBy": "security-team@company.com",
        "mitigatingControls": "Network isolation, enhanced monitoring, scheduled migration Q2 2026"
      }
    }
  ]
}

Exemption categories

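Category     Use when                                   Compliance impact
Waiver       Risk accepted, no compensating controls    Exempt (non-compliance accepted for now)
Mitigated    Compensating controls in place             Exempt (policy intent met another way)

Azure reports both categories with the compliance state Exempt. The category records why the exemption exists; it does not change how the resource is counted.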

Use Mitigated when you have alternative controls. Use Waiver when you’re accepting the risk knowingly.
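
A small pre-merge check can keep the two categories honest. The sketch below is not part of EPAC; the function name Test-Exemption is hypothetical, and mitigatingControls is simply the metadata key used in the example above, not a field Azure requires. It assumes the exemption has already been parsed with ConvertFrom-Json.

```powershell
# Sketch: return a list of problems for one parsed exemption object.
# Test-Exemption is a hypothetical helper, not an EPAC cmdlet.
function Test-Exemption {
    param(
        [Parameter(Mandatory)] $Exemption
    )

    # Azure Policy accepts exactly these two categories.
    if ($Exemption.exemptionCategory -notin @("Waiver", "Mitigated")) {
        "exemptionCategory must be 'Waiver' or 'Mitigated'"
    }

    # Team convention from this article: every exemption is time-bound.
    if (-not $Exemption.expiresOn) {
        "expiresOn is missing"
    }

    # Team convention: Mitigated means the controls are written down.
    if ($Exemption.exemptionCategory -eq "Mitigated" -and
        -not $Exemption.metadata.mitigatingControls) {
        "Mitigated exemptions should list mitigatingControls in metadata"
    }
}
```

An empty result means the exemption passed. Wrapping the call in @( ) makes it easy to count problems and fail a pipeline step when the count is above zero.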

Exemption lifecycle

EPAC handles the full lifecycle:

# Exemptions are included in the deployment plan
Build-DeploymentPlans -PacEnvironmentSelector "prod"
# Plan shows:
# - New exemptions to create
# - Expired exemptions to remove
# - Modified exemptions to update

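One thing to watch out for: set a calendar reminder for 30 days before an exemption expires. Azure stops honoring an exemption once its expiresOn date passes, so the resource is evaluated again and can show as non-compliant, which might trigger unwanted remediation. EPAC then removes the expired exemption on the next deployment.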
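
The expiry check can also be scripted rather than left to a calendar. This is a minimal sketch, assuming exemption files use the "exemptions" array layout shown earlier; Get-ExpiringExemption is a hypothetical helper name, not an EPAC cmdlet, and the 30-day window is just the default.

```powershell
# Sketch: list exemptions that expire within a warning window.
# Assumption: files follow the "exemptions" array layout used in this article.
function Get-ExpiringExemption {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $WarningDays = 30,
        # Must be a UTC timestamp; defaults to the current time.
        [datetime] $Now = (Get-Date).ToUniversalTime()
    )

    $cutoff = $Now.AddDays($WarningDays)

    Get-ChildItem -Path $Path -Filter "*.json" -Recurse | ForEach-Object {
        $file = $_
        $content = Get-Content -Path $file.FullName -Raw | ConvertFrom-Json

        foreach ($exemption in $content.exemptions) {
            # Exemptions without expiresOn never expire, so skip them here.
            if (-not $exemption.expiresOn) { continue }

            # Works whether ConvertFrom-Json returned a string or a DateTime.
            $expires = ([datetime] $exemption.expiresOn).ToUniversalTime()

            # Already-expired exemptions are included (negative DaysLeft).
            if ($expires -le $cutoff) {
                [pscustomobject]@{
                    Name      = $exemption.name
                    ExpiresOn = $expires
                    DaysLeft  = [math]::Floor(($expires - $Now).TotalDays)
                    File      = $file.Name
                }
            }
        }
    }
}
```

Running it on a schedule against the policyExemptions folder and posting the output to the team channel gives the same 30-day warning without relying on anyone's calendar.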

Bulk exemptions

If you have a lot of exemptions, organise them by category:

policyExemptions/
├── security/
│   ├── legacy-systems.json
│   └── third-party-integrations.json
├── cost-management/
│   └── dev-environment-exemptions.json
└── compliance/
    └── regional-requirements.json

Each file can contain multiple exemptions:

{
  "exemptions": [
    { "name": "exemption-1", ... },
    { "name": "exemption-2", ... },
    { "name": "exemption-3", ... }
  ]
}

Complex assignment scenarios

Real environments need more than “apply policy X to scope Y.” Here are patterns I’ve used to deal with that.

Pattern 1: same policy, different parameters per scope

Say you want to require tags, but each business unit needs different required tags.

{
  "nodeName": "/Root/",
  "children": [
    {
      "nodeName": "FinanceUnit/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/finance"
        ]
      },
      "assignment": {
        "name": "require-tags-finance",
        "displayName": "Require Tags - Finance"
      },
      "definitionEntry": {
        "policySetName": "require-resource-tags"
      },
      "parameters": {
        "requiredTags": ["CostCenter", "Project", "FinanceApprover"]
      }
    },
    {
      "nodeName": "EngineeringUnit/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/engineering"
        ]
      },
      "assignment": {
        "name": "require-tags-engineering",
        "displayName": "Require Tags - Engineering"
      },
      "definitionEntry": {
        "policySetName": "require-resource-tags"
      },
      "parameters": {
        "requiredTags": ["CostCenter", "Team", "Repository"]
      }
    }
  ]
}

Pattern 2: inheritance with override

Base security policies at the root, stricter controls for regulated workloads.

{
  "nodeName": "/Root/",
  "scope": {
    "tenant1": [
      "/providers/Microsoft.Management/managementGroups/landing-zones"
    ]
  },
  "assignment": {
    "name": "security-baseline",
    "displayName": "Security Baseline"
  },
  "definitionEntry": {
    "policySetName": "security-baseline-initiative"
  },
  "parameters": {
    "allowedLocations": ["westeurope", "northeurope"],
    "requiredEncryption": "platform-managed"
  },
  "children": [
    {
      "nodeName": "RegulatedWorkloads/",
      "scope": {
        "tenant1": [
          "/providers/Microsoft.Management/managementGroups/regulated"
        ]
      },
      "assignment": {
        "name": "security-baseline-regulated",
        "displayName": "Security Baseline - Regulated"
      },
      "parameters": {
        "allowedLocations": ["westeurope"],
        "requiredEncryption": "customer-managed"
      },
      "overrides": [
        {
          "kind": "policyEffect",
          "selectors": [
            {
              "kind": "policyDefinitionReferenceId",
              "in": ["auditStorageEncryption"]
            }
          ],
          "value": "deny"
        }
      ]
    }
  ]
}

Pattern 3: non-compliance messaging

This one is easy to skip but makes a big difference. When someone’s deployment fails, a good message saves them from filing a ticket:

{
  "assignment": {
    "name": "require-approved-images",
    "displayName": "Require Approved Container Images"
  },
  "nonComplianceMessages": [
    {
      "message": "Container images must come from the approved registry: acr.company.com. See https://wiki.company.com/approved-images for the list of approved images and how to request additions."
    },
    {
      "policyDefinitionReferenceId": "containerImageSource",
      "message": "This specific container image is not from an approved source. Contact platform-team@company.com for assistance."
    }
  ]
}

Pattern 4: resource selectors

You can limit policies to specific resource types or locations:

{
  "assignment": {
    "name": "require-private-endpoints",
    "displayName": "Require Private Endpoints for PaaS"
  },
  "resourceSelectors": [
    {
      "name": "StorageAccountsInProduction",
      "selectors": [
        {
          "kind": "resourceType",
          "in": [
            "Microsoft.Storage/storageAccounts",
            "Microsoft.KeyVault/vaults",
            "Microsoft.Sql/servers"
          ]
        },
        {
          "kind": "resourceLocation",
          "in": ["westeurope", "northeurope"]
        }
      ]
    }
  ]
}

Initiative (policy set) design

Initiatives group related policies. Get the grouping right and assignments stay manageable. Get it wrong and you’ll spend your time figuring out which initiative a policy belongs to.

A few things I’ve learned:

  • Group by compliance domain, not by resource type
  • Keep initiatives under 20 policies if you can. 50-policy initiatives are painful to manage
  • Use parameter references so you don’t duplicate parameters across assignments
  • Version your initiatives so you can roll out changes gradually
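
The size guideline is easy to check automatically. The sketch below assumes each initiative file has a top-level "name" and a "properties.policyDefinitions" array, which is the shape used in this article; Get-InitiativeSize is a hypothetical helper name, and the limit of 20 is only my rule of thumb.

```powershell
# Sketch: report how many policies each initiative file contains.
# Assumption: files have "name" and "properties.policyDefinitions".
function Get-InitiativeSize {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $MaxPolicies = 20
    )

    Get-ChildItem -Path $Path -Filter "*.json" -Recurse | ForEach-Object {
        $file = $_
        $initiative = Get-Content -Path $file.FullName -Raw | ConvertFrom-Json

        # Filter out $null so a missing array counts as 0, not 1.
        $count = @($initiative.properties.policyDefinitions |
                Where-Object { $null -ne $_ }).Count

        [pscustomobject]@{
            Name        = $initiative.name
            PolicyCount = $count
            TooLarge    = $count -gt $MaxPolicies
            File        = $file.Name
        }
    }
}
```

Filtering the output on TooLarge gives a quick list of initiatives that are candidates for splitting.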

Initiative structure

{
  "name": "storage-security-initiative",
  "properties": {
    "displayName": "Storage Security Standards",
    "description": "Security controls for Azure Storage accounts",
    "metadata": {
      "version": "2.1.0",
      "category": "Storage"
    },
    "parameters": {
      "effect": {
        "type": "String",
        "defaultValue": "Audit",
        "allowedValues": ["Audit", "Deny", "Disabled"],
        "metadata": {
          "displayName": "Effect",
          "description": "The effect for all policies in this initiative"
        }
      },
      "allowedSkus": {
        "type": "Array",
        "defaultValue": ["Standard_LRS", "Standard_GRS", "Standard_RAGRS"],
        "metadata": {
          "displayName": "Allowed Storage SKUs"
        }
      }
    },
    "policyDefinitions": [
      {
        "policyDefinitionReferenceId": "requireHttpsTraffic",
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/404c3081-a854-4457-ae30-26a93ef643f9",
        "parameters": {
          "effect": {
            "value": "[parameters('effect')]"
          }
        }
      },
      {
        "policyDefinitionReferenceId": "requireEncryption",
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/b5ec538c-daa0-4006-8596-35468b9148e2",
        "parameters": {
          "effect": {
            "value": "[parameters('effect')]"
          }
        }
      },
      {
        "policyDefinitionReferenceId": "restrictSkus",
        "policyDefinitionId": "${varPolicyDefinitionStorageSku}",
        "parameters": {
          "listOfAllowedSKUs": {
            "value": "[parameters('allowedSkus')]"
          }
        }
      }
    ]
  }
}

Initiative versioning

policySetDefinitions/
├── storage-security/
│   ├── v1.0.0/
│   │   └── storage-security.json
│   ├── v2.0.0/
│   │   └── storage-security.json
│   └── v2.1.0/
│       └── storage-security.json

Or use metadata versioning:

{
  "metadata": {
    "version": "2.1.0",
    "changeLog": {
      "2.1.0": "Added private endpoint requirement",
      "2.0.0": "Breaking: Changed default effect to Deny",
      "1.0.0": "Initial release"
    }
  }
}
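
Metadata versioning only helps if the change log keeps up with the version number. As a rough sketch, the check below confirms that the current version has a changeLog entry. Both "version" and "changeLog" are the custom metadata keys from the example above, not fields Azure interprets, and Test-InitiativeChangeLog is a hypothetical helper name.

```powershell
# Sketch: does metadata.version have a matching changeLog entry?
# Assumption: metadata uses the custom "version" and "changeLog" keys above.
function Test-InitiativeChangeLog {
    param(
        [Parameter(Mandatory)] $Metadata
    )

    # Property names of the changeLog object are the logged versions.
    $logged = @($Metadata.changeLog.PSObject.Properties.Name)

    return $logged -contains $Metadata.version
}
```

It takes the parsed "metadata" object, so it can be called on the result of ConvertFrom-Json for each initiative file.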

Troubleshooting EPAC deployments

Things will go wrong. Here’s what I’ve seen most often and how to deal with it.

Plan shows changes that shouldn’t exist

Every deployment shows updates even when nothing changed. This is almost always formatting, ordering, or metadata differences between what’s in Azure and what’s in your source files.

# Get the raw policy from Azure
$azurePolicy = Get-AzPolicyDefinition -Name "your-policy" | ConvertTo-Json -Depth 100
# Compare with your source file
$sourcePolicy = Get-Content "./Definitions/policyDefinitions/your-policy.json" | ConvertFrom-Json | ConvertTo-Json -Depth 100
# Use a diff tool to find differences
Compare-Object ($azurePolicy -split "`n") ($sourcePolicy -split "`n")
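
Property ordering is the noisiest of these differences in a line-by-line diff, because the same JSON can serialize in a different order on each side. One way to take it out of the picture is to sort object keys recursively before converting back to JSON. This is a minimal sketch; ConvertTo-SortedObject is a hypothetical helper, and it deliberately leaves array order alone because array order can be meaningful.

```powershell
# Sketch: rebuild an object from ConvertFrom-Json with keys sorted by name.
# ConvertTo-SortedObject is a hypothetical helper, not an EPAC cmdlet.
function ConvertTo-SortedObject {
    param($InputObject)

    if ($InputObject -is [System.Management.Automation.PSCustomObject]) {
        $sorted = [ordered]@{}
        foreach ($prop in ($InputObject.PSObject.Properties | Sort-Object Name)) {
            $sorted[$prop.Name] = ConvertTo-SortedObject $prop.Value
        }
        return [pscustomobject]$sorted
    }

    if ($InputObject -is [array]) {
        # Keep element order; only normalize the elements themselves.
        # The leading comma stops PowerShell from unrolling the array.
        return , @($InputObject | ForEach-Object { ConvertTo-SortedObject $_ })
    }

    # Strings, numbers, booleans and dates pass through unchanged.
    return $InputObject
}

# Two documents that differ only in key order serialize identically.
$a = '{"b":1,"a":{"d":[1,2],"c":"x"}}' | ConvertFrom-Json
$b = '{"a":{"c":"x","d":[1,2]},"b":1}' | ConvertFrom-Json
$jsonA = ConvertTo-SortedObject $a | ConvertTo-Json -Depth 100 -Compress
$jsonB = ConvertTo-SortedObject $b | ConvertTo-Json -Depth 100 -Compress
$jsonA -eq $jsonB
```

Applying the same function to both the Azure copy and the source file before diffing leaves only the differences in values and metadata.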

Remediation tasks failing

You created remediation tasks but resources aren’t getting fixed. Here’s how to check:

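# Check remediation task status
$remediation = Get-AzPolicyRemediation -Name "your-remediation" -Scope "/subscriptions/..."
# Look at the deployment summary (total, successful, failed)
$remediation.DeploymentSummary
# Check for role assignment issues on the assignment's managed identity.
# Get-AzRoleAssignment needs the identity's principal ID, not the assignment ID.
$assignment = Get-AzPolicyAssignment -Id $remediation.PolicyAssignmentId
# Depending on your Az.Resources version, the property may be IdentityPrincipalId instead
Get-AzRoleAssignment -ObjectId $assignment.Identity.PrincipalId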

Most of the time it’s one of these:

  • The managed identity doesn’t have the right permissions
  • A resource provider isn’t registered
  • The template in your deployIfNotExists policy has errors

Policy showing wrong compliance state

A resource should be compliant but shows non-compliant (or the other way around).

# Trigger a compliance scan
Start-AzPolicyComplianceScan -ResourceGroupName "your-rg"
# Check the specific policy state
Get-AzPolicyState -PolicyAssignmentName "your-assignment" `
    -Filter "ResourceId eq '/subscriptions/.../your-resource'"
# Look at the detailed compliance reason
$state = Get-AzPolicyState -PolicyAssignmentName "your-assignment" -Top 1
$state.ComplianceReasonCode

Build fails with circular reference

Build-DeploymentPlans fails with a circular reference error. Usually this means policy definitions reference each other, or you have malformed JSON somewhere.

# Validate JSON files individually
Get-ChildItem -Path "./Definitions" -Filter "*.json" -Recurse | ForEach-Object {
    try {
        # -Raw reads the file as one string so it is parsed as a single document
        $null = Get-Content $_.FullName -Raw | ConvertFrom-Json
        Write-Host "Valid: $($_.FullName)" -ForegroundColor Green
    } catch {
        Write-Host "Invalid: $($_.FullName)" -ForegroundColor Red
        Write-Host $_.Exception.Message
    }
}

Debug mode

When nothing else works, turn on verbose output:

$VerbosePreference = "Continue"
$DebugPreference = "Continue"
Build-DeploymentPlans `
    -PacEnvironmentSelector "dev" `
    -OutputFolder "./debug-output"

Multi-tenant patterns

If you manage more than one Entra ID tenant, here’s how to structure it.

Separate configurations per tenant

{
  "pacEnvironments": [
    {
      "pacSelector": "tenant1-prod",
      "cloud": "AzureCloud",
      "tenantId": "tenant-1-guid",
      "deploymentRootScope": "/providers/Microsoft.Management/managementGroups/tenant1-root"
    },
    {
      "pacSelector": "tenant2-prod",
      "cloud": "AzureCloud",
      "tenantId": "tenant-2-guid",
      "deploymentRootScope": "/providers/Microsoft.Management/managementGroups/tenant2-root"
    }
  ]
}

Shared definitions, tenant-specific assignments

Definitions/
├── policyDefinitions/           # Shared across tenants
│   └── custom-policies/
├── policySetDefinitions/        # Shared across tenants
│   └── security-baseline/
└── policyAssignments/
    ├── tenant1/                 # Tenant-specific
    │   └── production.json
    └── tenant2/                 # Tenant-specific
        └── production.json

What I’d do differently next time

If I were starting over, I’d focus on three things:

  1. Make every exemption expire. No exceptions. If it needs to be permanent, that’s a sign the policy itself needs adjusting.

  2. Match assignment structure to your management group tree. Fighting the hierarchy just creates confusion. If the tree is wrong, fix the tree first.

  3. Keep initiatives small. It’s tempting to put everything into one big initiative. Don’t. You’ll regret it when you need to change one parameter and it affects 40 policies.


Sources

  1. Microsoft, “Azure Policy Exemption Structure,” Azure Documentation, https://learn.microsoft.com/azure/governance/policy/concepts/exemption-structure

  2. Microsoft, “Azure Policy Assignment Structure,” Azure Documentation, https://learn.microsoft.com/azure/governance/policy/concepts/assignment-structure

  3. Microsoft, “Azure Policy Initiative Definition,” Azure Documentation, https://learn.microsoft.com/azure/governance/policy/concepts/initiative-definition-structure

  4. Microsoft, “Remediate Non-Compliant Resources,” Azure Documentation, https://learn.microsoft.com/azure/governance/policy/how-to/remediate-resources

  5. Azure, “EPAC Documentation,” GitHub, https://github.com/Azure/enterprise-azure-policy-as-code/tree/main/Docs

  6. Microsoft, “Azure Policy Troubleshooting,” Azure Documentation, https://learn.microsoft.com/azure/governance/policy/troubleshoot/general