Context
BSFG fact streams are the durable semantic history of cross-zone communication. Large binaries such as PDFs, images, batch artifacts, archives, and generated documents have different operational characteristics from facts:
- they are much larger than ordinary fact bodies
- they inflate replay cost if embedded inline
- they often need separate retention and lifecycle policy
- they must still remain referentially linked to the fact history
JetStream Object Store provides bucketed object storage with chunked transfer and retrieval. That makes it suitable for large files while keeping them inside the zone-local JetStream estate.
Options Considered
| Option | Description | Benefits | Drawbacks |
|---|---|---|---|
| Inline binary in fact message | Embed file bytes directly inside the fact payload or attachment body. | single append operation; simple reference model | bloats fact streams; hurts replay and retention efficiency; mixes semantic history with artifact transport |
| External storage outside zone estate | Store artifacts in a separate object system outside JetStream and reference them from facts. | can scale independently; may use existing enterprise storage | weakens zone-local autonomy; adds another cross-system dependency; reference durability becomes less aligned with local boundary storage |
| Zone-local object storage with fact references (Selected) | Store large artifacts in zone-local Object Store buckets, then append facts that reference them by metadata. | keeps fact log small and replayable; preserves zone-local durability; allows separate retention and lifecycle rules; maintains explicit provenance through references | two-step write flow; orphan cleanup must be handled; reference validation becomes part of append policy |
Decision
BSFG will store large artifacts out-of-band in zone-local Object Store buckets. Facts will carry references to those artifacts rather than embedding large binary bodies inline.
PutObject(blob) -> object store
AppendFact(ref) -> fact stream
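The two-step flow above can be sketched with in-memory stand-ins. This is a minimal illustration, not a JetStream client: `InMemoryObjectStore`, `InMemoryFactStream`, `publish_with_artifact`, the fact type name, and the `sha256:<hex>` digest format are all assumptions introduced here.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class InMemoryObjectStore:
    """Hypothetical stand-in for a zone-local Object Store (not a NATS client)."""
    buckets: dict = field(default_factory=dict)

    def put_object(self, bucket: str, key: str, blob: bytes) -> dict:
        self.buckets.setdefault(bucket, {})[key] = blob
        # Return the reference metadata that the fact will carry.
        return {
            "bucket": bucket,
            "key": key,
            "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),  # assumed format
            "size": len(blob),
        }


@dataclass
class InMemoryFactStream:
    """Hypothetical stand-in for a fact stream: an append-only list."""
    facts: list = field(default_factory=list)

    def append_fact(self, fact: dict) -> int:
        self.facts.append(fact)
        return len(self.facts)  # 1-based sequence number


def publish_with_artifact(store, stream, bucket, key, blob, fact_type):
    # Step 1: upload the artifact out-of-band.
    ref = store.put_object(bucket, key, blob)
    # Step 2: append a fact that carries only the reference, never the bytes.
    seq = stream.append_fact({"type": fact_type, "artifact": ref})
    return seq, ref


store = InMemoryObjectStore()
stream = InMemoryFactStream()
seq, ref = publish_with_artifact(
    store, stream, "batch-files", "batch-42/report.pdf",
    b"%PDF-1.7 example", "batch.report.attached",
)
```

Because the upload happens first, a failure between the two steps leaves an unreferenced object rather than a fact pointing at nothing.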
Artifact references must include enough metadata to make retrieval and integrity verification explicit, typically including:
- bucket
- key
- digest
- size
- media type
- optional file name
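One way to model the reference fields listed above is a small immutable record. The sketch below is an assumption about shape only: the class name `ArtifactRef`, its method names, and the `sha256:<hex>` digest encoding are hypothetical and not defined by BSFG.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def _sha256_digest(blob: bytes) -> str:
    # Assumed digest encoding: algorithm prefix plus lowercase hex.
    return "sha256:" + hashlib.sha256(blob).hexdigest()


@dataclass(frozen=True)
class ArtifactRef:
    bucket: str
    key: str
    digest: str
    size: int
    media_type: str
    file_name: Optional[str] = None

    @classmethod
    def for_blob(cls, bucket, key, blob, media_type, file_name=None):
        """Build a reference from the bytes that were (or will be) uploaded."""
        return cls(bucket, key, _sha256_digest(blob), len(blob), media_type, file_name)

    def verify(self, blob: bytes) -> bool:
        """True when retrieved bytes match the recorded size and digest."""
        return len(blob) == self.size and _sha256_digest(blob) == self.digest


blob = b"example artifact bytes"
ref = ArtifactRef.for_blob("document-files", "doc-7/manual.pdf", blob,
                           "application/pdf", file_name="manual.pdf")
```

Checking the size before the digest lets a consumer reject a truncated download cheaply; the digest comparison then covers content changes that preserve length.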
Bucket layout is organized by subject kind, for example:
batch-files
asset-files
alarm-files
document-files
lot-files
recipe-files
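A producer might resolve the bucket from the subject kind with a fixed lookup, as in the sketch below. The subject kind names (`batch`, `asset`, and so on) are inferred from the bucket names above and are an assumption, as is the function name `bucket_for`.

```python
# Assumed mapping: subject kind -> bucket name, taken from the layout above.
BUCKETS_BY_SUBJECT_KIND = {
    "batch": "batch-files",
    "asset": "asset-files",
    "alarm": "alarm-files",
    "document": "document-files",
    "lot": "lot-files",
    "recipe": "recipe-files",
}


def bucket_for(subject_kind: str) -> str:
    """Return the bucket for a subject kind, failing loudly on unknown kinds."""
    try:
        return BUCKETS_BY_SUBJECT_KIND[subject_kind]
    except KeyError:
        raise ValueError(
            f"no artifact bucket configured for subject kind {subject_kind!r}"
        ) from None
```

Rejecting unknown kinds, rather than deriving a bucket name on the fly, keeps the set of buckets explicit so that retention rules can be attached to each one deliberately.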
Small payloads may still be carried inline when they are semantically part of the fact and operationally trivial, but this is the exception rather than the default.
Consequences
Benefits:
- fact streams remain compact and efficient to replay
- large artifact lifecycle is governed separately from fact retention
- zone-local durability is preserved
- provenance remains explicit through fact-to-artifact references
Tradeoffs:
- producers perform a two-step write path
- implementations must handle orphaned uploads and cleanup
- object existence and integrity should be validated before or during fact append
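The last two tradeoffs can be illustrated with plain data structures. In this sketch the object store is a dict keyed by `(bucket, key)` and facts are dicts with an optional `artifact` entry; these shapes, the function names, and the `sha256:<hex>` digest format are assumptions for illustration, not the BSFG implementation.

```python
import hashlib


def validate_ref(objects: dict, ref: dict) -> None:
    """Raise ValueError unless the referenced object exists and matches the ref."""
    blob = objects.get((ref["bucket"], ref["key"]))
    if blob is None:
        raise ValueError("referenced object does not exist")
    if len(blob) != ref["size"]:
        raise ValueError("size mismatch")
    if "sha256:" + hashlib.sha256(blob).hexdigest() != ref["digest"]:
        raise ValueError("digest mismatch")


def find_orphans(objects: dict, facts: list) -> set:
    """Return the (bucket, key) pairs that no fact references."""
    referenced = {
        (f["artifact"]["bucket"], f["artifact"]["key"])
        for f in facts
        if "artifact" in f
    }
    return set(objects) - referenced


objects = {
    ("lot-files", "lot-1/coa.pdf"): b"certificate",
    ("lot-files", "lot-2/coa.pdf"): b"never referenced",
}
good_ref = {
    "bucket": "lot-files",
    "key": "lot-1/coa.pdf",
    "size": len(b"certificate"),
    "digest": "sha256:" + hashlib.sha256(b"certificate").hexdigest(),
}
validate_ref(objects, good_ref)  # passes silently
facts = [{"type": "lot.coa.attached", "artifact": good_ref}]
orphans = find_orphans(objects, facts)
```

A real cleanup job would likely also need a grace period before deleting orphans, since an object uploaded in step one is briefly unreferenced until its fact is appended.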