Deployment

BSFG Reference Deployment Pattern

Reference Deployment Pattern: Triad-HA with Keepalived Failover

Pattern Name

Triad-HA — Three-node single-zone deployment pattern with active-passive BSFG controller failover and quorum-backed JetStream durability.

For host-level embodiment of this pattern, see Reference Physical Realization: Triad-HA Zone.

Classification

  • Layer: Substrate
  • Kind: Reference deployment pattern
  • Scope: One BSFG zone
  • Failure model: Any single node may fail without loss of acknowledged data, subject to the configured JetStream durability tier
  • Availability model: Active-passive controller failover via Keepalived VIP
  • Persistence model: Three-node JetStream quorum
  • Operational model: Host-based services, no Kubernetes, no shared storage

Intent

Defines the minimal host-level deployment pattern for one BSFG zone that provides:

  • Automatic controller failover between two service-bearing nodes
  • Quorum-backed JetStream durability across three nodes
  • No shared storage between nodes
  • Bounded operational complexity for industrial IT/OT boundary deployments

Applicability

Use this pattern when:

  • One BSFG zone must tolerate any single node failure
  • Brief failover interruption is acceptable
  • Host-based operations are preferred over cluster orchestration
  • Cross-zone replication is acceptable for artifact recovery
  • Zone unavailability under dual-node failure is an acceptable risk

Non-Goals

This pattern does not provide:

  • Zone availability after simultaneous loss of both controller-bearing nodes
  • Synchronous within-zone artifact replication
  • Point-in-time recovery
  • Dynamic orchestration or automatic workload rescheduling beyond host-level failover

Logical Node Roles

| Node | Role | Controller | JetStream |
| --- | --- | --- | --- |
| Alpha | Primary service-bearing | Active when VIP held | RAFT voter |
| Beta | Secondary service-bearing | Standby; promotable | RAFT voter |
| Gamma | Non-controller quorum | None | RAFT voter with reduced workload expectations |

Canonical host placement and physical embodiment are defined in the physical realization document.

Network Model

Logical Bindings

| Service | Bind Target | Notes |
| --- | --- | --- |
| BSFG Connect RPC | Zone VIP | mTLS required |
| NATS client | Localhost | BSFG connects only to local JetStream |
| JetStream cluster | Node IPs | Full mesh among Alpha, Beta, Gamma |
| Keepalived VRRP | Alpha ↔ Beta only | Dedicated coordination path |

Communication Requirements

  • VIP on active service-bearing node
  • JetStream cluster communication among the three nodes
  • VRRP coordination between Alpha and Beta

Physical NIC, segment, and bind details are defined in the physical realization document.

Security Profile

TLS

  • Minimum version: TLS 1.2
  • Preferred version: TLS 1.3
  • Authentication mode: Mutual TLS
  • Cipher policy: Enterprise-approved baseline; Mozilla Intermediate is acceptable if no stricter standard exists
  • Certificate rotation window: Alert at 30 days before expiry; rotate via rolling restart
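
The 30-day rotation alert can be probed with a standard `openssl` check; a minimal sketch, where the default certificate path is illustrative:

```shell
# Example: alert when a certificate is inside the 30-day rotation window.
# The default path below is an assumption; point it at the local server cert.

check_cert() {
  # $1: certificate path. Prints "cert ok" and returns 0 when more than
  # 30 days of validity remain; otherwise warns and returns 1.
  if openssl x509 -in "$1" -noout -checkend $((30 * 24 * 3600)) >/dev/null 2>&1; then
    echo "cert ok: more than 30 days remaining"
  else
    echo "cert WARNING: expires within 30 days (or unreadable); schedule rolling rotation"
    return 1
  fi
}

check_cert "${CERT:-/opt/bsfg/certs/server.crt}" || true
```

After rotation, apply the new certificates via rolling restart, one node at a time, as stated above.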

VRRP / Keepalived

  • VRRP traffic is not treated as a security boundary
  • Keepalived peers must communicate only across a dedicated, restricted segment
  • Unicast peering is preferred over multicast
  • VRRP exposure outside the zone-local control segment is prohibited
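
A keepalived.conf fragment satisfying these requirements might look like the following; the interface names, addresses, and virtual router ID are illustrative assumptions, not part of the pattern:

```conf
# Example: /etc/keepalived/keepalived.conf on Alpha (illustrative)
vrrp_instance BSFG_VIP {
    state MASTER                  # BACKUP on Beta
    interface eth1                # dedicated, restricted coordination segment
    virtual_router_id 51
    priority 150                  # lower (e.g. 100) on Beta
    advert_int 1

    unicast_src_ip 10.10.0.1      # Alpha's address on the coordination segment
    unicast_peer {
        10.10.0.2                 # Beta only; no multicast, no peers outside the zone
    }

    virtual_ipaddress {
        10.0.0.100/24 dev eth0    # zone VIP on the service segment
    }
}
```

Note that VRRP authentication, where used, is not relied on as a security control; segment restriction is the boundary.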

Storage Model

Artifact Semantics

| Aspect | Behavior |
| --- | --- |
| Storage location | Local storage on current active node |
| Intra-zone replication | None |
| Cross-zone replication | Via BSFG only |
| Failover behavior | Artifacts may be temporarily unavailable after controller failover |
| Recovery model | Rehydrate from peer zones or restore from backup |

Storage Principles

  • Artifact storage is local to the active service-bearing node
  • No shared storage is required between nodes
  • JetStream durability is quorum-backed across all three nodes
  • Artifact availability after failover depends on prior cross-zone replication

Mount paths, RAID configuration, and physical disk specifications are defined in the physical realization document.

Service Model

The canonical runtime substrate is host-based:

  • Host OS with systemd for service lifecycle
  • Docker Compose or equivalent host-local container runner
  • Keepalived for VIP failover on service-bearing nodes

The examples below illustrate one canonical host-based realization; other packaging formats are acceptable provided the pattern invariants are preserved.

Example Service Layout (Alpha / Beta)

# Example: /opt/bsfg/docker-compose.yml
services:
  jetstream:
    image: nats:2.10-jetstream
    volumes:
      - /data/jetstream:/data/jetstream
      - ./jetstream.conf:/etc/nats/nats-server.conf:ro
    network_mode: host

  bsfg-controller:
    image: bsfg:v1.x
    environment:
      - ZONE_ID=${ZONE_ID}
      - NODE_NAME=${NODE_NAME}
      - PEER_ENDPOINTS=${PEER_ZONE_ENDPOINTS}
      - JETSTREAM_URL=nats://localhost:4222
      - BIND_ADDRESS=${VIP}:9443
    volumes:
      - ./certs:/certs:ro
      - /artifacts:/artifacts:ro
    network_mode: host
    # Started only by promotion flow
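
The compose files above mount a `./jetstream.conf` that is not shown; a minimal sketch for Alpha, in which the cluster name, node addresses, and certificate paths are assumptions (Beta and Gamma differ only in `server_name` and addresses):

```conf
# Example: /opt/bsfg/jetstream.conf on Alpha (illustrative)
server_name: alpha
listen: 0.0.0.0:4222
http: 127.0.0.1:8222                # local monitoring endpoint

jetstream {
  store_dir: /data/jetstream
}

cluster {
  name: bsfg-zone
  listen: 0.0.0.0:6222
  routes: [
    nats-route://10.0.0.1:6222,     # Alpha
    nats-route://10.0.0.2:6222,     # Beta
    nats-route://10.0.0.3:6222      # Gamma
  ]
}

tls {
  cert_file: "/certs/server.crt"
  key_file:  "/certs/server.key"
  ca_file:   "/certs/ca.crt"
  verify: true                      # mutual TLS
}
```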

Example Service Layout (Gamma)

# Example: /opt/bsfg/docker-compose.yml
services:
  jetstream:
    image: nats:2.10-jetstream
    volumes:
      - /data/jetstream:/data/jetstream
      - ./jetstream.conf:/etc/nats/nats-server.conf:ro
    network_mode: host
    # No BSFG controller

Detailed host placement, resource controls, and systemd configuration are defined in the physical realization document.

JetStream Durability Profile

Baseline Settings

| Parameter | Value | Rationale |
| --- | --- | --- |
| Stream replicas | 3 | Survives any single node loss |
| Consumer replicas | 3 | Consumer state survives single node loss |
| Node count | 3 | Smallest practical quorum configuration |
| sync_interval | Tier-dependent | Controls window between acknowledgment and durable flush |

Durability Tiers

Standard

  • sync_interval: default
  • Intended for general zone traffic
  • Survives any single node loss
  • May lose recently acknowledged messages under correlated crash or sudden power-loss scenarios before flush

Critical

  • sync_interval: always
  • Intended for streams whose acknowledged writes must survive correlated crash conditions
  • Higher latency and lower throughput accepted as tradeoff for stronger durability
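
In NATS, `sync_interval` is a server-level setting in the `jetstream {}` block rather than a per-stream option, so the two tiers correspond to distinct server configurations; a sketch:

```conf
# Standard tier: rely on periodic flushing at the server default interval
jetstream {
  store_dir: /data/jetstream
  # sync_interval left at its default (2m at the time of writing)
}

# Critical tier: fsync on every write before acknowledging
jetstream {
  store_dir: /data/jetstream
  sync_interval: always
}
```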

Semantics

  • Single-node failure tolerance is guaranteed only within the selected durability tier
  • The document does not claim immunity to all correlated crash modes under the Standard tier
  • Durability claims apply to acknowledged JetStream data, not to local artifact files stored only on the active node

Failover Contract

A node may run the BSFG controller in active mode only when all of the following are true:

  1. The node currently holds the zone VIP
  2. Local JetStream is reachable
  3. JetStream cluster quorum is present
  4. Required artifact storage is available
  5. Required certificates satisfy the local certificate-validity policy
  6. The controller binds successfully to the VIP address

Loss of any of the above must cause controller demotion or failed promotion.

Additionally:

  • The BSFG controller must not be independently enabled for unconditional boot-time start
  • Active startup is governed by VIP ownership and promotion gates, not by generic service auto-start

Promotion Gates

Promotion to active controller is gated by:

  • Local JetStream health
  • Cluster quorum presence
  • Artifact storage availability
  • Certificate validity satisfying local policy
  • VIP ownership confirmation

The exact certificate-validity threshold is a local policy choice. Implementation thresholds belong in the physical realization, runbooks, or local operations policy—not in this pattern.
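
One way to wire these gates into the promotion flow is a guard script invoked before the controller starts. A sketch under assumed probe endpoints and paths; the `check_*` helpers are hypothetical stand-ins to be replaced with local equivalents:

```shell
# Sketch of a promotion guard, as might be invoked from a Keepalived
# notify_master hook. Endpoints, paths, and probes are assumptions,
# not part of the pattern contract.

gate() {
  # $1: gate description; remaining args: probe command.
  desc="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "gate ok: $desc"
  else
    echo "gate FAILED: $desc" >&2
    return 1
  fi
}

promote_allowed() {
  gate "VIP held locally"           check_vip       &&
  gate "local JetStream healthy"    check_jetstream &&
  gate "JetStream quorum present"   check_quorum    &&
  gate "artifact storage writable"  check_storage   &&
  gate "certificates within policy" check_certs
}

# Hypothetical probes; replace with local equivalents.
check_vip()       { ip -brief addr | grep -qw "${VIP:?VIP unset}"; }
check_jetstream() { curl -fsS http://127.0.0.1:8222/healthz; }   # NATS monitor port
check_quorum()    { curl -fsS http://127.0.0.1:8222/jsz | grep -q '"leader"'; }
check_storage()   { test -d /artifacts && test -w /artifacts; }
check_certs()     { openssl x509 -in /certs/server.crt -noout -checkend 86400; }

# promote_allowed && docker compose -f /opt/bsfg/docker-compose.yml up -d bsfg-controller
```

The final gate, a successful bind to the VIP address, is enforced by the controller's own startup rather than by this guard.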

Failover Mechanics

Keepalived Integration

  • Keepalived manages VIP ownership between Alpha and Beta
  • State transitions trigger controller promotion/demotion
  • VRRP health tracking confirms JetStream availability before VIP acquisition
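
The transition wiring might look like the following keepalived.conf additions; the script paths are assumptions:

```conf
# Example keepalived.conf additions (illustrative)
vrrp_script check_jetstream {
    script "/opt/bsfg/bin/check-jetstream.sh"   # local JetStream health probe
    interval 2
    fall 3
    rise 2
}

vrrp_instance BSFG_VIP {
    # base VRRP settings omitted for brevity
    track_script {
        check_jetstream                # blocks VIP acquisition while unhealthy
    }
    notify_master "/opt/bsfg/bin/promote.sh"   # runs promotion gates, then starts controller
    notify_backup "/opt/bsfg/bin/demote.sh"    # stops controller
    notify_fault  "/opt/bsfg/bin/demote.sh"
}
```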

Detailed Keepalived configuration, notification scripts, and health probes are defined in the physical realization document and deployment runbook.

Dual-Active Prevention

Dual-active operation is prevented by all of the following:

  • Keepalived triggers explicit stop on BACKUP and FAULT transitions
  • BSFG binds specifically to the VIP, not to 0.0.0.0
  • Controller start is gated by local dependency checks
  • A node lacking the VIP cannot successfully bind the service endpoint
  • The controller is not independently auto-started outside promotion flow

Failure Semantics

| Scenario | Result | Recovery Action |
| --- | --- | --- |
| Alpha fails | VIP moves to Beta; Beta promotes if all gates pass | Repair Alpha; rejoin as standby |
| Beta fails | No role change; Alpha remains active | Repair Beta; rejoin as standby |
| Gamma fails | JetStream remains available with 2-node quorum | Repair Gamma; rejoin cluster |
| Alpha and Beta fail | Zone unavailable; no controller and no quorum | Recover at least one service-bearing node |
| Alpha isolated from Beta+Gamma | Alpha loses effective quorum; Beta may promote if healthy | Restore network; Alpha rejoins as standby |
| Gamma isolated from Alpha+Beta | No controller change; Alpha+Beta retain quorum | Restore Gamma connectivity |
| Artifact storage unavailable on promoted node | Promotion denied | Restore storage or perform operator intervention |
| Certificate validity below policy threshold | Promotion denied or warned according to local policy | Rotate certificates or apply approved exception |

Bootstrap Semantics

  • Static cluster configuration is assumed
  • Startup order is not semantically significant
  • Gamma-first startup is conventional, not required
  • BSFG controller startup is driven by Keepalived state transitions, not by container orchestration
  • Alpha is expected to acquire the VIP first under normal conditions
  • Beta remains standby until promotion criteria are met
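
Under this model only the substrate services are enabled at boot, while the controller is left disabled and started exclusively by the promotion flow. An illustrative systemd unit, with the unit name and paths assumed:

```conf
# Example: /etc/systemd/system/bsfg-substrate.service (illustrative)
[Unit]
Description=BSFG zone substrate (JetStream via host-local compose)
After=network-online.target docker.service
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/opt/bsfg
# Only JetStream starts at boot; the controller is started by the
# Keepalived promotion flow and is never enabled here.
ExecStart=/usr/bin/docker compose up -d jetstream
ExecStop=/usr/bin/docker compose stop jetstream

[Install]
WantedBy=multi-user.target
```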

Step-by-step bootstrap procedures are defined in the deployment runbook.

Backup and Recovery

This pattern supports snapshot-based recovery only. It does not provide point-in-time recovery.

| Failure | Recovery |
| --- | --- |
| Single disk failure | Replace disk; rebuild RAID; verify JetStream rejoin |
| Single node loss | Rebuild node; restore from peers or snapshot as appropriate |
| Logical corruption / ransomware | Restore clean snapshot to new cluster; replay from peer zones if available |
| Complete zone loss | Rebuild zone and rehydrate from peer-zone replication and retained snapshots |

Snapshot procedures and recovery commands are defined in the deployment runbook.

Monitoring Requirements

| Check | Target | Alert Threshold |
| --- | --- | --- |
| VIP held by expected active node | Alpha / Beta | VIP absent from both, or present on unexpected host |
| Keepalived role transition churn | Alpha / Beta | Excessive transition rate |
| BSFG controller health | Zone VIP | Non-200 health response |
| Controller bound to VIP only | Active node | Bound to non-VIP address |
| Unexpected controller on standby node | Beta / Gamma | BSFG process running unexpectedly |
| Local JetStream health | All nodes | Health probe failure |
| JetStream cluster membership | Any node | Fewer than 3 configured voters visible |
| JetStream quorum availability | Any node | Quorum not confirmed |
| Artifact storage availability | Active node | Storage unavailable |
| Certificate expiry | All nodes | Less than policy threshold remaining |
| Cross-zone replication lag | Peer zones | Excessive lag |
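
The VIP-placement check reduces to a local probe on each service-bearing node; a sketch, with the VIP value assumed for illustration:

```shell
# Local probe for "VIP held by expected active node"; the VIP value
# below is an illustrative assumption.

vip_held_locally() {
  # Returns 0 when the address is configured on any local interface.
  ip -brief addr | grep -qw "$1"
}

if vip_held_locally "${VIP:-10.0.0.100}"; then
  echo "vip: held locally"
else
  echo "vip: not held locally"
fi
```

The alerting layer then raises when neither Alpha nor Beta reports the VIP, or when an unexpected host does.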

Detailed monitoring configuration, alert thresholds, and escalation tiers are defined in the physical realization document and deployment runbook.

Accepted Limitations

| Limitation | Rationale |
| --- | --- |
| Zone unavailable if both Alpha and Beta are lost | Higher-node-count patterns rejected for complexity |
| Artifacts may be temporarily unavailable after failover | Within-zone synchronous artifact replication rejected |
| Failover may take 5–30 seconds | Health gates plus service startup are an accepted operational cost |
| Standard durability tier may lose very recent acknowledged messages under correlated crash conditions | Flush interval tradeoff accepted for general traffic |
| No point-in-time recovery | Snapshot recovery sufficient for intended use |
| No automatic recovery of local artifacts absent peer-zone copy | Cross-zone replay accepted as recovery mechanism |

Invariants Preserved by This Pattern

  • No shared storage is required for zone operation
  • No Kubernetes or distributed scheduler dependency is introduced
  • A single node failure does not require operator intervention to preserve message durability
  • Controller leadership is coupled to VIP ownership and local health gates
  • Durable message state remains quorum-backed
  • Artifact handling remains outside durable middleware semantics at the zone boundary

References