DLP and Classification Scanner Integration
This article shows where a host-owned DLP, classification, or governance-screening scanner fits into an AsiBackbone flow.
Issue: #286.
In this software project, ASI means Accountable Systems Infrastructure. AsiBackbone supplies provider-neutral governance policy primitives after a host reports a DLP/classification outcome. It is not a DLP scanner, content-safety service, Microsoft Purview adapter, regex engine, classifier, compliance product, or external screening provider by itself.
Important
Scanner execution is host-owned and provider-specific. The host chooses and invokes its scanner before provider emission or before a governed execution path continues. AsiBackbone receives the normalized failure or policy-handling context through DlpFailurePolicyContext, resolves it with IAsiBackboneDlpFailurePolicyResolver, and returns a DlpFailurePolicyResolution containing a GovernanceDecision.
Where the scanner fits
The scanner sits between the host's proposed action or emission payload and the policy decision to proceed.
Host request / payload / governance emission envelope
-> host-owned scanner
regex scanner
Azure AI Content Safety
Microsoft Purview / DLP-adjacent service
proprietary classifier
internal compliance API
-> scanner result or scanner failure
-> normalize to DlpFailurePolicyContext when policy handling is needed
-> IAsiBackboneDlpFailurePolicyResolver
-> DlpFailurePolicyResolution
-> GovernanceDecision
-> host applies allow / warn / deny / defer / acknowledgment / escalation
AsiBackbone does not inspect raw content for you. It gives the host a stable way to decide what to do when screening is unavailable, times out, returns an indeterminate result, blocks content, or classifies content that requires policy handling.
Scanner outcomes and normalized failure kinds
| Host-owned scanner outcome | Normalize as | Typical meaning |
|---|---|---|
| Service unavailable | DlpClassificationFailureKind.ServiceUnavailable |
Scanner endpoint, local service, or provider dependency could not be reached. |
| Timeout | DlpClassificationFailureKind.Timeout |
Scanner did not return within the host-defined timeout. |
| Indeterminate result | DlpClassificationFailureKind.IndeterminateResult |
Scanner completed but the host cannot safely treat the result as allowed, blocked, or classified. |
| Blocked result | DlpClassificationFailureKind.BlockedResult |
Scanner explicitly blocked the payload or action. |
| Classified result requiring policy handling | DlpClassificationFailureKind.ClassifiedResult |
Scanner detected a category such as sensitive, regulated, confidential, or high-risk and the host wants governance policy to decide what happens next. |
A clean allowed scanner result does not need to call the failure-policy resolver. The host can continue with the normal policy, outbox, emission, or execution path.
Minimal fake scanner sample
This sample uses a simple regex-style scanner with no Azure dependency, no external service, and no package beyond AsiBackbone.Core.
using AsiBackbone.Core.Classification;
using AsiBackbone.Core.Decisions;
public enum FakeScannerOutcome
{
Allowed,
ServiceUnavailable,
Timeout,
Indeterminate,
Blocked,
Classified
}
public sealed record FakeScannerResult(
FakeScannerOutcome Outcome,
string? Category = null,
TimeSpan? Timeout = null);
public static class FakeRegexScanner
{
public static FakeScannerResult Scan(string payload)
{
if (payload.Contains("scanner-down", StringComparison.OrdinalIgnoreCase))
{
return new FakeScannerResult(FakeScannerOutcome.ServiceUnavailable);
}
if (payload.Contains("simulate-timeout", StringComparison.OrdinalIgnoreCase))
{
return new FakeScannerResult(FakeScannerOutcome.Timeout, Timeout: TimeSpan.FromSeconds(2));
}
if (payload.Contains("unknown-sensitive", StringComparison.OrdinalIgnoreCase))
{
return new FakeScannerResult(FakeScannerOutcome.Indeterminate);
}
if (payload.Contains("blocked-secret", StringComparison.OrdinalIgnoreCase))
{
return new FakeScannerResult(FakeScannerOutcome.Blocked, Category: "secret");
}
if (payload.Contains("regulated-data", StringComparison.OrdinalIgnoreCase))
{
return new FakeScannerResult(FakeScannerOutcome.Classified, Category: "regulated");
}
return new FakeScannerResult(FakeScannerOutcome.Allowed);
}
}
The host then maps scanner outcomes into DlpFailurePolicyContext only when policy handling is needed.
public static async ValueTask<GovernanceDecision> EvaluateScannerOutcomeAsync(
string payload,
IAsiBackboneDlpFailurePolicyResolver resolver,
CancellationToken cancellationToken)
{
FakeScannerResult result = FakeRegexScanner.Scan(payload);
if (result.Outcome == FakeScannerOutcome.Allowed)
{
return GovernanceDecision.Allow(
correlationId: "correlation-123",
policyVersion: "policy-v1",
policyHash: "policy-hash");
}
DlpFailurePolicyContext context = ToDlpContext(result);
DlpFailurePolicyResolution resolution = await resolver.ResolveAsync(
context,
cancellationToken);
return resolution.Decision;
}
The normalization function is intentionally host-owned. It keeps scanner-specific categories and provider details out of Core while preserving safe metadata.
private static DlpFailurePolicyContext ToDlpContext(FakeScannerResult result)
{
DlpClassificationFailureKind failureKind = result.Outcome switch
{
FakeScannerOutcome.ServiceUnavailable => DlpClassificationFailureKind.ServiceUnavailable,
FakeScannerOutcome.Timeout => DlpClassificationFailureKind.Timeout,
FakeScannerOutcome.Indeterminate => DlpClassificationFailureKind.IndeterminateResult,
FakeScannerOutcome.Blocked => DlpClassificationFailureKind.BlockedResult,
FakeScannerOutcome.Classified => DlpClassificationFailureKind.ClassifiedResult,
_ => throw new InvalidOperationException("Allowed scanner results do not need DLP failure policy resolution.")
};
DlpIntentRiskLevel riskLevel = result.Outcome switch
{
FakeScannerOutcome.Blocked => DlpIntentRiskLevel.High,
FakeScannerOutcome.Classified => DlpIntentRiskLevel.Medium,
FakeScannerOutcome.Timeout => DlpIntentRiskLevel.Medium,
FakeScannerOutcome.ServiceUnavailable => DlpIntentRiskLevel.Medium,
_ => DlpIntentRiskLevel.High
};
IReadOnlyDictionary<string, string> metadata = new Dictionary<string, string>
{
["scanner.family"] = "fake-regex",
["scanner.category"] = result.Category ?? "none"
};
return DlpFailurePolicyContext.Create(
failureKind,
riskLevel,
intentCategory: "governance-emission",
environment: "production",
correlationId: "correlation-123",
policyVersion: "policy-v1",
policyHash: "policy-hash",
timeout: result.Timeout,
metadata: metadata);
}
Resolver configuration sample
The default resolver uses a conservative baseline: low risk warns and allows, medium risk requires acknowledgment, and high risk denies. Hosts can override that behavior by risk level or by exact risk/failure pair.
var options = new DlpFailurePolicyOptions
{
LowRiskBehavior = DlpFailureBehavior.WarnAndAllow,
MediumRiskBehavior = DlpFailureBehavior.RequireAcknowledgment,
HighRiskBehavior = DlpFailureBehavior.Deny
};
options.BehaviorOverrides[new DlpFailurePolicyKey(
DlpIntentRiskLevel.Medium,
DlpClassificationFailureKind.Timeout)] = DlpFailureBehavior.Defer;
options.BehaviorOverrides[new DlpFailurePolicyKey(
DlpIntentRiskLevel.High,
DlpClassificationFailureKind.BlockedResult)] = DlpFailureBehavior.Escalate;
IAsiBackboneDlpFailurePolicyResolver resolver =
new DefaultAsiBackboneDlpFailurePolicyResolver(options);
The host applies the returned decision using the same execution rules it uses for any other GovernanceDecision:
GovernanceDecision decision = await EvaluateScannerOutcomeAsync(
payload,
resolver,
cancellationToken);
if (!decision.CanProceed)
{
// Host-owned behavior: return a response, defer work, require acknowledgment,
// escalate to review, write audit residue, or stop provider emission.
return decision;
}
// Continue with provider emission or governed execution.
return decision;
External provider pseudo-sample
A host adapter for Azure AI Content Safety, Microsoft Purview, a proprietary classifier, or an internal DLP API should follow the same shape:
try
{
ExternalScannerResult scannerResult = await hostScanner.ScanAsync(
minimizedPayload,
cancellationToken);
if (scannerResult.Allowed)
{
return GovernanceDecision.Allow(
correlationId: correlationId,
policyVersion: policyVersion,
policyHash: policyHash);
}
DlpClassificationFailureKind failureKind = scannerResult.Blocked
? DlpClassificationFailureKind.BlockedResult
: DlpClassificationFailureKind.ClassifiedResult;
DlpFailurePolicyContext context = DlpFailurePolicyContext.Create(
failureKind,
scannerResult.RiskLevel,
intentCategory: scannerResult.IntentCategory,
environment: environment,
correlationId: correlationId,
traceId: traceId,
policyVersion: policyVersion,
policyHash: policyHash,
metadata: scannerResult.SafeMetadata);
return (await resolver.ResolveAsync(context, cancellationToken)).Decision;
}
catch (TimeoutException)
{
var context = DlpFailurePolicyContext.TimeoutFailure(
DlpIntentRiskLevel.Medium,
timeout: TimeSpan.FromSeconds(2),
intentCategory: "governance-emission",
environment: environment,
correlationId: correlationId,
traceId: traceId,
policyVersion: policyVersion,
policyHash: policyHash,
metadata: new Dictionary<string, string>
{
["scanner.family"] = "external-provider",
["scanner.failure"] = "timeout"
});
return (await resolver.ResolveAsync(context, cancellationToken)).Decision;
}
catch (HttpRequestException)
{
var context = DlpFailurePolicyContext.Create(
DlpClassificationFailureKind.ServiceUnavailable,
DlpIntentRiskLevel.Medium,
intentCategory: "governance-emission",
environment: environment,
correlationId: correlationId,
traceId: traceId,
policyVersion: policyVersion,
policyHash: policyHash,
metadata: new Dictionary<string, string>
{
["scanner.family"] = "external-provider",
["scanner.failure"] = "service_unavailable"
});
return (await resolver.ResolveAsync(context, cancellationToken)).Decision;
}
Keep that adapter in the host application or in a provider-specific package. Do not make Core depend on the external scanner SDK.
Choosing fail-open or fail-closed behavior
Choose behavior by risk, data class, environment, and operational need.
| Scenario | Recommended posture | Reason |
|---|---|---|
| Low-risk diagnostic metadata emission | Warn and allow or allow | Blocking all low-risk telemetry during scanner outage may create unnecessary operational blind spots. |
| Medium-risk business workflow with recoverable delay | Require acknowledgment or defer | Human acknowledgment or delayed retry preserves accountability without silently allowing uncertain content. |
| High-risk regulated payload, secret, credential, protected record, or external transfer | Deny or escalate | Fail closed when the consequence of unclassified or blocked content leaving the boundary is unacceptable. |
| Provider emission after local durable audit is already written | Defer provider emission | Preserve the local record and retry or review external emission later. |
| Scanner returns explicit blocked result | Deny or escalate | Treat provider block differently from provider outage. A block says the content is unsafe or disallowed, not merely unknown. |
| Scanner returns classified/sensitive result | Require acknowledgment, defer, deny, or escalate depending on policy | Classification may be allowed in some workflows and forbidden in others. |
Default to fail closed for high-risk, regulated, externally visible, or irreversible actions. Use fail-open only when the host has accepted the risk and records the decision with safe reason codes and metadata.
Privacy and metadata guidance
DlpFailurePolicyContext.Metadata should contain only minimized, safe, provider-neutral values.
Safe examples:
scanner.family = regexscanner.failure = timeoutscanner.category = regulatedclassification.family = piiretry.category = transient
Unsafe examples:
- raw prompt text;
- raw document content;
- raw scanner findings that include personal data;
- secrets, credentials, keys, tokens, or connection strings;
- raw provider exception payloads;
- unredacted labels that reveal confidential workflow details.
Operational placement
Common placements include:
| Placement | Use |
|---|---|
| Before provider emission | Classify and minimize a governance emission envelope before it leaves the local boundary. |
| Before outbox drain delivery | Defer or dead-letter external provider delivery while preserving local audit/outbox state. |
| Before governed execution | Stop, defer, acknowledge, or escalate an execution request before side effects occur. |
| Before Purview/Event Hubs enrichment | Keep strategy/design provider paths downstream of classification and minimization. |
The host should persist the governance decision, scanner failure category, resolved behavior, and safe metadata in audit residue or lifecycle records according to its retention policy.