Checklist: Cross-Zone Federation Validation

Purpose

Provide a concise validation and acceptance checklist for an operational cross-zone BSFG federation relationship.

This checklist verifies that a federation relationship preserves the intended cross-zone guarantees.

Scope

This checklist validates:

  • One federation relationship or peer pair
  • Authentication and trust
  • Authorization enforcement
  • Replay semantics and cursor behavior
  • Artifact retrieval semantics, where artifact exchange is enabled
  • Partition tolerance
  • Reconciliation behavior
  • Operational visibility

This checklist does not validate:

  • Intra-zone deployment
  • Application-level correctness
  • Business logic validation

Reference

This checklist validates conformance to the Reference Interaction Pattern: Cross-Zone BSFG Federation.


Usage Notes

  • Complete only the controls applicable to the federation relationship under review.
  • Mark controls as waived only with explicit justification and approval.
  • Controls involving simulated certificate failure, incompatible history, large-gap replay, or induced partition should be treated as controlled drills and executed only in an approved test or maintenance window.
  • CLI, API, and metric examples below are illustrative unless the implementation has already standardized them as canonical operator interfaces.

1. Identity and Trust Checks

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| Peer certificate identity matches policy | Inspect peer-presented certificate through approved TLS probe or operator interface | Certificate identity matches configured peer identity policy | [ ] | Subject or SAN per local policy |
| Trust chain valid | Verify peer certificate chains to configured trust anchor | Chain validates to trusted root or approved cross-signed anchor | [ ] | |
| Certificate validity acceptable | Check certificate validity horizon | Validity window meets policy threshold | [ ] | Example: 30+ days remaining |
| Unauthorized peer rejected | Attempt connection using certificate not authorized for this relationship | TLS handshake fails or application rejects peer | [ ] | Controlled drill |
| Expired certificate rejected | Use expired test certificate or controlled time-window simulation | Connection rejected; alert generated | [ ] | Controlled drill |
| Mismatched identity rejected | Present certificate with non-matching configured identity | Rejected by identity validation path | [ ] | Controlled drill |
| Revocation checked | Verify CRL/OCSP behavior if deployed | Revoked certificate rejected, or revocation control explicitly not in scope | [ ] | Optional if CRL/OCSP not deployed |
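
The "Certificate validity acceptable" control can be scripted once the peer certificate's validity dates are known. The following is a minimal sketch, not a canonical operator interface: the function names, the status labels, and the 30-day default are assumptions taken from the example threshold in the table, and the actual threshold should come from local policy. The date parser relies on the standard-library `ssl.cert_time_to_seconds`, which accepts the `notBefore`/`notAfter` strings returned by `ssl.SSLSocket.getpeercert()`.

```python
import ssl
from datetime import datetime, timedelta, timezone

# Assumed policy threshold; the checklist only offers "30+ days remaining" as an example.
MIN_REMAINING = timedelta(days=30)


def parse_cert_time(text):
    """Convert a certificate time string (e.g. 'Jan  5 09:34:43 2018 GMT') to an aware UTC datetime."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(text), tz=timezone.utc)


def validity_status(not_before, not_after, now, min_remaining=MIN_REMAINING):
    """Classify a validity window against a remaining-lifetime threshold.

    The returned labels are hypothetical names used only in this sketch.
    """
    if now < not_before:
        return "not-yet-valid"
    if now >= not_after:
        return "expired"
    if not_after - now < min_remaining:
        return "expiring-soon"
    return "ok"
```

A result other than "ok" would mark the control as failed or flag it for review, depending on local policy. This sketch checks only the validity window; it does not validate the trust chain or the peer identity.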

2. Authorization Checks

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| Allowed exported streams fetch successfully | Use approved fetch interface against an authorized stream | Facts returned without authorization error | [ ] | |
| Disallowed exported streams denied | Use approved fetch interface against a non-authorized stream | Request denied by authorization policy | [ ] | |
| Artifact access respects policy | Request artifact of authorized type | Access succeeds where policy allows | [ ] | Only if artifact exchange enabled |
| Artifact access denied for unauthorized type | Request artifact outside authorized scope | Request denied by authorization policy | [ ] | Only if artifact exchange enabled |
| Connectivity without authorization does not grant access | Establish authenticated connectivity with peer not present in allow-list or not authorized for resource | Application-level denial despite transport success | [ ] | |
| Authorization matrix documented | Review approved configuration or policy artifact | Peer relationship, exported streams, and artifact permissions explicitly recorded | [ ] | |
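
To make the distinction between transport success and application-level authorization concrete, the sketch below models an authorization matrix as a per-peer allow-list. Everything here is hypothetical: the `PeerPolicy` type, the peer identifier, and the stream and artifact names are illustrative and do not describe the BSFG policy format. The point is that a peer absent from the matrix is denied even when its connection is authenticated, and that stream and artifact permissions are evaluated separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerPolicy:
    """Hypothetical per-peer policy record: what this peer may fetch."""
    exported_streams: frozenset = frozenset()
    artifact_types: frozenset = frozenset()


# Illustrative matrix; identifiers are made up for this sketch.
AUTHZ_MATRIX = {
    "zone-b.example": PeerPolicy(
        exported_streams=frozenset({"orders"}),
        artifact_types=frozenset({"report"}),
    ),
}


def may_fetch_stream(matrix, peer_id, stream):
    """Deny by default: unknown peers and unlisted streams are rejected."""
    policy = matrix.get(peer_id)
    return policy is not None and stream in policy.exported_streams


def may_fetch_artifact(matrix, peer_id, artifact_type):
    """Artifact permission is checked independently of stream permission."""
    policy = matrix.get(peer_id)
    return policy is not None and artifact_type in policy.artifact_types
```

For example, `may_fetch_stream(AUTHZ_MATRIX, "zone-c.example", "orders")` returns `False` for a peer that authenticated successfully but has no entry in the matrix.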

3. Replication and Cursor Checks

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| Initial fetch succeeds | Execute first approved fetch after bring-up | Facts returned without replay error | [ ] | |
| Cursor advances only after durable append | Fetch facts, inspect durable local append state, then inspect cursor | Cursor reflects durable local append progress rather than transport receipt alone | [ ] | |
| Cursor monotonic | Record cursor, fetch additional facts, re-query cursor | Cursor does not regress | [ ] | |
| Duplicate replay harmless | Re-fetch overlapping cursor range and verify stable IDs / uniqueness behavior | Duplicate replay does not create duplicate durable effects | [ ] | Do not rely solely on raw count |
| Notification loss does not break correctness | Disable or ignore advisory notification path and rely on polling | Replication continues correctly using receiver-driven fetch | [ ] | Optional where notifications are enabled |
| Replay resumes from durable cursor | Pause fetch, resume later, inspect replay start point | Replay resumes from previously durable cursor position | [ ] | |
| Per-stream cursor independence | Advance one exported stream while leaving another idle | Cursor movement for one stream does not alter another | [ ] | |
| Cursor initialization policy documented | Review cursor policy configuration or approved operator record | Initialization mode recorded per stream with justification where needed | [ ] | |
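
The cursor controls above describe three properties that are easier to test against a concrete model: the cursor moves only after a local append, it never regresses, and replaying an overlapping range has no duplicate effect. The sketch below illustrates those properties under simplifying assumptions. The `StreamReceiver` class is hypothetical, an in-memory dictionary stands in for the durable local store, and facts are assumed to carry a source offset and a stable ID.

```python
class StreamReceiver:
    """Toy receiver for one exported stream (illustrative, not the BSFG implementation)."""

    def __init__(self):
        self.log = {}    # fact_id -> payload; stand-in for a durable local store
        self.cursor = 0  # highest source offset whose fact has been appended locally

    def apply_batch(self, facts):
        """Apply (offset, fact_id, payload) tuples and return the resulting cursor."""
        for offset, fact_id, payload in facts:
            if fact_id not in self.log:
                # Append first; a real store would flush to stable storage here.
                self.log[fact_id] = payload
            # Advance only after the append, and never move backwards.
            self.cursor = max(self.cursor, offset)
        return self.cursor


# One receiver per stream keeps cursors independent of each other.
receivers = {"orders": StreamReceiver(), "telemetry": StreamReceiver()}
receivers["orders"].apply_batch([(1, "a", "x"), (2, "b", "y")])
receivers["orders"].apply_batch([(2, "b", "y"), (3, "c", "z")])  # overlapping replay
```

After the two batches, the "orders" receiver holds three facts at cursor 3 and the idle "telemetry" cursor is still 0. As the table notes, a validation run should check stable IDs or uniqueness behavior rather than raw counts alone.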

4. Partition Checks

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| Local zone continues durable work while peer unavailable | Block peer connectivity in controlled drill and attempt local append | Local durable append succeeds without remote dependency | [ ] | Controlled drill |
| Backlog accumulates for affected peer relationship | Observe backlog metrics during partition | Backlog grows for affected peer relationship without unexpected data loss | [ ] | Controlled drill |
| Reconnect triggers cursor comparison | Restore connectivity and inspect logs or recovery telemetry | Recovery path performs cursor comparison or equivalent reconciliation step | [ ] | Controlled drill |
| Small-gap backfill automatic | Induce short partition and restore | Small recovery gap handled automatically without operator intervention | [ ] | Controlled drill |
| Large-gap handling matches policy | Induce longer partition or seeded gap | Behavior matches configured policy: extended replay, bounded backfill, or operator escalation | [ ] | Controlled drill |
| Incompatible history halts automatic reconciliation | Simulate incompatible cursor/history condition | Automatic reconciliation halts; operator alert raised | [ ] | Controlled drill, lab preferred |
| No destructive re-init in normal recovery | Review continuity of stream/store identity, cursor progression, and operator actions | Recovery preserves existing state; no unplanned reset or re-initialization occurs | [ ] | Logs may support but not prove this |
| Autonomous-mode alert fires | Trigger controlled partition | Partition alert visible | [ ] | Controlled drill |
| Partition-resolved alert clears | Restore connectivity after controlled partition | Resolution alert visible and active partition alert clears | [ ] | Controlled drill |
| Replication lag metric operationally consistent | Compare observed replay delay with lag metric during recovery | Lag signal is directionally and operationally consistent with observed delay | [ ] | Tolerance per local policy |
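
The recovery controls in this section imply a decision made at reconnect: compare cursors, then choose between automatic backfill, policy-defined large-gap handling, and halting. The sketch below shows one way that decision could be expressed so that drill results can be compared against it. The function name, the action names, and the `small_gap_limit` default are assumptions made for illustration. The actual boundary between a small and a large gap, and how incompatible history is detected, are defined by the implementation and local policy.

```python
from enum import Enum


class RecoveryAction(Enum):
    NONE = "none"
    AUTO_BACKFILL = "auto_backfill"
    LARGE_GAP_POLICY = "large_gap_policy"  # extended replay, bounded backfill, or escalation
    HALT_AND_ALERT = "halt_and_alert"


def plan_recovery(local_cursor, peer_head, history_compatible, small_gap_limit=1000):
    """Pick a recovery action from a cursor comparison.

    local_cursor: last durable cursor held by the receiving zone.
    peer_head: latest offset the peer reports for the stream.
    history_compatible: result of whatever history check the implementation performs.
    small_gap_limit: hypothetical threshold separating small gaps from large ones.
    """
    if not history_compatible:
        # Never reconcile automatically across incompatible history.
        return RecoveryAction.HALT_AND_ALERT
    gap = peer_head - local_cursor
    if gap <= 0:
        return RecoveryAction.NONE
    if gap <= small_gap_limit:
        return RecoveryAction.AUTO_BACKFILL
    return RecoveryAction.LARGE_GAP_POLICY
```

During a drill, the action recorded in logs or recovery telemetry can be compared with the action this model predicts for the induced gap.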

5. Artifact Checks

Complete this section only if artifact exchange is enabled for the relationship.

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| Referenced artifact fetch succeeds when authorized | Fetch artifact using reference carried by a fact | Authorized artifact retrieved successfully | [ ] | |
| Artifact integrity verified | Check content hash, checksum, or content-address match | Retrieved artifact matches integrity reference | [ ] | |
| Missing artifact surfaces retry and alert path | Request known-missing artifact in controlled drill | Failure surfaced cleanly; retry and alert path observable | [ ] | Controlled drill |
| Artifact flow remains distinct from fact replay | Observe transfer behavior and operator telemetry | Artifact retrieval is separately observable from fact replay path | [ ] | Separate endpoint, control path, or telemetry acceptable |
| Large artifact does not block fact replay | Fetch large artifact while observing fact replay | Fact replay continues within expected operating envelope | [ ] | Controlled drill |
| Artifact authorization independent of fact authorization | Test relationship where fact access and artifact access differ | Artifact policy enforced independently where configured | [ ] | |
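
For the "Artifact integrity verified" control, a content-hash comparison can be scripted in a few lines. The sketch below assumes the integrity reference is a hex-encoded SHA-256 digest. The checklist does not specify the hash algorithm or reference format, so the function name and the digest choice are illustrative and should be adapted to whatever the implementation records.

```python
import hashlib
import hmac


def artifact_matches(data: bytes, expected_sha256_hex: str) -> bool:
    """Return True if the retrieved bytes hash to the expected SHA-256 digest (assumed format)."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_sha256_hex.strip().lower())
```

A mismatch should fail the control. It does not by itself indicate whether the artifact or the reference is at fault.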

6. Operational Visibility Checks

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| Replication lag visible | Dashboard, exporter, or approved operator interface | Lag visible per peer and exported stream | [ ] | |
| Authentication failures visible | Security dashboard, logs, or alert stream | Failed TLS, rejected identity, or denied access visible | [ ] | |
| Partition alerts visible | Alerting system or operations dashboard | Partition-detected and partition-resolved signals visible | [ ] | |
| Backlog metrics visible | Metrics dashboard or approved operator interface | Backlog signals visible per affected relationship | [ ] | |
| Recovery completion visible | Logs, dashboard, or recovery telemetry | Reconciliation start, progress, and completion observable | [ ] | |
| Cursor position queryable | Approved operator interface | Current durable cursor position queryable per peer and stream | [ ] | |
| Authenticated health endpoint responds | Health probe using configured trust and identity path | Health response returned successfully | [ ] | |
| Certificate expiry monitored | Dashboard, script output, or alert source | Certificate validity horizon visible and alertable | [ ] | |
| Fetch rate and error rate visible | Metrics dashboard or approved telemetry | Request rate and error rate visible by peer or stream | [ ] | |

7. Federation Variants (If Applicable)

Complete only the subsection(s) applicable to the relationship under review.

7.1 Chain: Enterprise ↔ IDMZ ↔ Plant

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| No direct Enterprise–Plant relationship where prohibited | Attempt direct connection or inspect policy/firewall path | Direct connectivity absent or rejected per design | [ ] | |
| IDMZ mediates correctly | Exercise end-to-end replay through IDMZ | End-to-end flow succeeds through mediated path | [ ] | |
| Latency within approved envelope | Measure hop or end-to-end delay | Observed delay within approved operating envelope | [ ] | |

7.2 Hub-and-Spoke

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| Hub handles configured number of spokes | Operate configured relationships concurrently | Behavior remains within approved load envelope | [ ] | |
| Spoke isolation preserved | Attempt or inspect direct spoke-to-spoke relationship where prohibited | Non-permitted spoke relationship absent or rejected | [ ] | |

7.3 Selective Mesh

| Control | Method | Expected Result | Status | Notes |
|---|---|---|---|---|
| Peer relationships remain independent | Pause one peer relationship while others operate | Unaffected relationships continue normally | [ ] | |
| Partition impact isolated per relationship | Partition one peer in controlled drill | Impact isolated to affected relationship | [ ] | Controlled drill |

8. Acceptance Gate

8.1 Summary

| Category | Checks Passed | Checks Failed | Waived |
|---|---|---|---|
| Identity and Trust | | | |
| Authorization | | | |
| Replication and Cursor | | | |
| Partition | | | |
| Artifact | | | |
| Operational Visibility | | | |
| Federation Variants | | | |

8.2 Critical Failures

List any failed checks that block acceptance.

| Failed Check | Severity | Remediation Required | Owner | Due Date |
|---|---|---|---|---|
| | | | | |

8.3 Waivers

List any waived checks with justification.

| Waived Check | Justification | Approved By | Date |
|---|---|---|---|
| | | | |

8.4 Overall Status

Select one:

  • Passed — All critical checks passed; federation relationship accepted
  • Passed with Exception — Minor issues documented and waived
  • Failed — Critical checks failed; remediation required before acceptance
  • Requires Escalation — Uncertainty or blocker requires architecture or platform review
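
One possible reading of the four outcomes above can be written as a small decision function, which may help keep the selection consistent across reviews. This is an interpretive sketch rather than a rule stated by the checklist: the function name, its inputs, and the precedence (escalation first, then failure, then exceptions) are assumptions.

```python
def overall_status(critical_failures, waived_checks, needs_escalation=False):
    """Map summary counts to one of the four overall status labels.

    critical_failures: number of failed checks that block acceptance.
    waived_checks: number of checks waived with documented justification.
    needs_escalation: True when an unresolved uncertainty or blocker needs
        architecture or platform review (assumed to take precedence).
    """
    if needs_escalation:
        return "Requires Escalation"
    if critical_failures > 0:
        return "Failed"
    if waived_checks > 0:
        return "Passed with Exception"
    return "Passed"
```

For example, a review with no critical failures and two approved waivers would map to "Passed with Exception" under this reading.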

8.5 Sign-off

| Role | Name | Date | Signature |
|---|---|---|---|
| Validation Engineer | | | |
| Zone A Platform Lead | | | |
| Zone B Platform Lead | | | |
| Security/Compliance (if required) | | | |

9. Post-Validation Reference

| Document | Purpose |
|---|---|
| Runbook: Cross-Zone Federation Bring-Up | Establish or extend peer relationships |
| Checklist: Triad-HA Commissioning | Validate zones before federation |
| Reference Interaction Pattern: Cross-Zone BSFG Federation | Architecture reference |
| Reference Deployment Pattern: Triad-HA with Keepalived Failover | Intra-zone substrate reference |