SLICE-1.1: Multi-Tag Telemetry
- Status: Proposed
- Date: 2026-04-24
- Depends on: Requirements, Evolution Roadmap, SLICE-006: Observability Baseline
Goal
Replace the two-field MachineTelemetry(Temperature, Pressure) snapshot with a keyed tag stream — TagSample(Name, Timestamp, Value, Quality) driven by a TagDefinition registry — so the simulator can emit dozens of independent signals at independent rates. This is the first slice that puts real Phase-1 load through the pipeline.
Why This Slice
Today the simulator emits exactly two scalars (TemperatureCelsius, PressureBar) on a single ~5 Hz ticker (SimulatedTelemetrySource at SimulatorProfile.TelemetryIntervalMs = 200). The whole telemetry path — channel, pipeline service, AppState field, UI binding — is shaped around one snapshot type with two named scalars. That shape is the load-bearing reason the prototype "feels healthy": there is almost nothing to coalesce.
A real wafer inspection tool publishes hundreds of named signals — vacuum, temperatures across multiple zones, lamp currents, encoder deltas, air-knife pressures, gas flows — at heterogeneous rates from a few Hz up into the hundreds. Until the simulator produces something shaped like that, every measurement on the phase-1 measurements table is measuring an artificially light workload.
This slice introduces the shape of multi-tag telemetry and seeds enough of it (50 tags, 1–500 Hz mix) to expose the first real per-tag and aggregate pressure on the existing single-AppState-snapshot pipeline. The store is not refactored here — that is Phase 2's job, after this slice produces evidence that it needs refactoring.
Requirements Coverage
- 04. UI and Technical Requirements: telemetry surface for inspection; bounded streaming with measurable behavior
- 05. Failure Modes and Workflow Requirements: high-rate streams must not destabilize the workflow state machine
- 07. AI Delivery Constraints and Roadmap: each phase ships a measurable before-and-after; this is row slice-1-1-multi-tag-telemetry in the measurements table
In Scope
- a new domain shape replacing MachineTelemetry:
  - TagSample(string Name, DateTimeOffset Timestamp, double Value, TagQuality Quality)
  - TagQuality enum: Good, Uncertain, Bad
  - TagDefinition(string Name, string Unit, double IntervalMs, NoiseModel Noise): the registry entry
  - NoiseModel discriminated record with at least four variants:
    - SineNoise(double Baseline, double Amplitude, double PeriodMs)
    - DriftNoise(double Baseline, double SlopePerSecond, double JitterStdDev)
    - RandomWalkNoise(double Baseline, double StepStdDev, double ClampMin, double ClampMax)
    - StepNoise(double Low, double High, double PeriodMs, double DutyCycle)
- a new latest-values snapshot in AppState that carries the keyed tag map:
  - LatestTagValues: an immutable name → TagSample map
  - the existing AppState.LatestTelemetry scalar field is removed; the existing UI temperature/pressure readout is rewired to read named tags (temperature.celsius, pressure.bar) from LatestTagValues
- a new producer abstraction replacing ITelemetrySource:
  - ITagStream exposing a bounded channel of TagSnapshot (an immutable name → TagSample map captured at one instant), plus per-tag and aggregate coalesce counters
  - SimulatedTagSource (in Infrastructure.Simulator) that runs one logical emitter per TagDefinition at its configured IntervalMs and publishes coalesced snapshots downstream
- pipeline rewire:
  - TelemetryPipelineService is renamed TagStreamPipelineService (or reshaped; implementation choice in the task) and updates AppState.LatestTagValues from incoming snapshots
  - the existing PipelineCounters.TelemetryCoalesced field is preserved as the aggregate coalesce count; per-tag coalesce numbers live in metrics, not in AppState
- configuration: a Simulator:Tags section bound to SimulatorTagsOptions; the seed appsettings.json ships 50 tags with realistic names, units, emit rates (a mix of 1, 5, 10, 50, 100, 250, and 500 Hz, expressed as IntervalMs), and noise models, including the two well-known names temperature.celsius and pressure.bar so the existing UI binding keeps working
- metrics: extend the InspectionPrototype meter (see SLICE-006) so per-tag observability exists:
  - samples.ingested (counter, dimension tag.name): incremented on every produced TagSample
  - samples.coalesced (counter, dimension tag.name): incremented when a per-tag producer overwrites a still-unread value
  - tags.active (observable gauge): count of TagDefinitions currently emitting
  - the existing telemetry.ingested and telemetry.coalesced counters in AppMetrics continue to exist, but now reflect snapshot publishes and snapshot drops respectively (semantics documented in the runbook)
- a new measurement scenario, §4.2 Multi-tag soak (30 min, MultiTag profile), added to docs/runbook/capturing-measurements.md, plus a new MultiTag simulator profile in seed configuration with a TelemetryIntervalMs of 50 (snapshot publish at 20 Hz; per-tag rates come from Simulator:Tags)
- before/after rows in docs/reviews/phase-1-measurements.md against the row-0 demo baseline, captured under the new scenario, including the full 16-metric set already used by row 0
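The domain shape in the first In Scope item can be sketched directly from the listed signatures. The records below copy those signatures; everything else is an assumption made for illustration: the abstract-record encoding of the discriminated NoiseModel, the NoiseMath helper, and in particular the evaluation formulas (the spec names the parameters but not the math). DriftNoise and RandomWalkNoise need a random source and are left unevaluated here.

```csharp
using System;

// Demo: evaluate a sine model a quarter period in, and wrap it in a sample.
var definition = new TagDefinition(
    "temperature.celsius", "°C", 200,
    new SineNoise(Baseline: 60.0, Amplitude: 2.0, PeriodMs: 4000));

var sample = new TagSample(
    definition.Name,
    DateTimeOffset.UnixEpoch.AddMilliseconds(1000),
    NoiseMath.Evaluate((SineNoise)definition.Noise, elapsedMs: 1000),
    TagQuality.Good);

Console.WriteLine(sample);

public enum TagQuality { Good, Uncertain, Bad }

public sealed record TagSample(string Name, DateTimeOffset Timestamp, double Value, TagQuality Quality);

// Assumption: "discriminated record" is encoded as an abstract base with sealed variants.
public abstract record NoiseModel;
public sealed record SineNoise(double Baseline, double Amplitude, double PeriodMs) : NoiseModel;
public sealed record DriftNoise(double Baseline, double SlopePerSecond, double JitterStdDev) : NoiseModel;
public sealed record RandomWalkNoise(double Baseline, double StepStdDev, double ClampMin, double ClampMax) : NoiseModel;
public sealed record StepNoise(double Low, double High, double PeriodMs, double DutyCycle) : NoiseModel;

public sealed record TagDefinition(string Name, string Unit, double IntervalMs, NoiseModel Noise);

// Hypothetical helper: the formulas are illustrative guesses, not taken from the spec.
public static class NoiseMath
{
    // Baseline plus a sine of the elapsed time, one full cycle per PeriodMs.
    public static double Evaluate(SineNoise n, double elapsedMs) =>
        n.Baseline + n.Amplitude * Math.Sin(2 * Math.PI * elapsedMs / n.PeriodMs);

    // Assumption: High for the first DutyCycle fraction of each period, Low afterwards.
    public static double Evaluate(StepNoise n, double elapsedMs) =>
        (elapsedMs % n.PeriodMs) < n.DutyCycle * n.PeriodMs ? n.High : n.Low;
}
```

Both evaluated variants stay inside a closed range by construction (Baseline ± Amplitude, and {Low, High}), which is the property the "bounded values" test in the acceptance criteria would check.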
Out of Scope
- moving telemetry data out of AppState (Phase 2 / SLICE 2.3: ITelemetryBuffer lift-out)
- per-tag observables, slice-level subscriptions on IAppStateStore (Phase 2 / SLICE 2.4)
- real frame payloads (SLICE-1.2)
- the encoder-rate motion stream (SLICE-1.3)
- storm / soak / chaos profiles beyond the single new MultiTag profile (SLICE-1.4)
- charting or trending UI for tag history; the slice only needs the latest value per tag visible somewhere
- persisting TagSample history to disk (Phase 3 / SLICE 3.3)
- changing the DiagnosticsEntries cap or routing
- changing the existing snapshot-channel policy from DropOldest capacity 1; producers may use a different per-tag channel internally, but the snapshot channel keeps latest-value semantics
- removing or renaming the existing frames.* and runs.* counters
Runtime Behavior
Tag registry and configuration
- SimulatorTagsOptions binds the Simulator:Tags section to a list of SimulatorTagOptions { Name, Unit, IntervalMs, Noise }, where Noise is a polymorphic block with a Kind discriminator (Sine, Drift, RandomWalk, Step).
- The seed appsettings.json ships exactly 50 tags. Names use a dotted convention (zone1.temperature.celsius, vacuum.chamber.pressure.bar, lamp.current.amps, etc.). At least one tag per noise variant is present.
- IntervalMs per tag is configurable; valid range is [2, 1000] (2 ms ≈ 500 Hz, 1000 ms = 1 Hz). Configuration outside this range fails fast at startup with a clear validation error; the app does not partially start.
- The existing tag names temperature.celsius and pressure.bar are reserved and must be present in any seeded configuration so the UI's existing temperature/pressure readout keeps working.
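The validation rules above (interval range, reserved names) and the invalid cases listed in the acceptance criteria (duplicate name, empty name, unrecognized Kind) could be checked in one pass that collects every error before startup is aborted. This is a sketch only: the SimulatorTagOptions record here is a reduced stand-in that flattens the Noise block to its Kind string, and TagOptionsValidator, its error wording, and the ordinal (case-sensitive) name comparison are all assumptions.

```csharp
using System;
using System.Collections.Generic;

var errors = TagOptionsValidator.Validate(new[]
{
    new SimulatorTagOptions("temperature.celsius", "°C", 200, "Sine"),
    new SimulatorTagOptions("temperature.celsius", "°C", 1, "Sine"), // duplicate name, interval too small
    new SimulatorTagOptions("", "bar", 100, "Wobble"),               // empty name, unknown kind
});

foreach (var error in errors) Console.WriteLine(error);

// Reduced stand-in for the real options type; the real Noise block also carries per-kind parameters.
public sealed record SimulatorTagOptions(string Name, string Unit, double IntervalMs, string NoiseKind);

// Hypothetical validator: collects all problems so the startup error can list them together.
public static class TagOptionsValidator
{
    private static readonly HashSet<string> KnownKinds =
        new(StringComparer.Ordinal) { "Sine", "Drift", "RandomWalk", "Step" };

    private static readonly string[] ReservedNames = { "temperature.celsius", "pressure.bar" };

    public static IReadOnlyList<string> Validate(IEnumerable<SimulatorTagOptions> tags)
    {
        var errors = new List<string>();
        var seen = new HashSet<string>(StringComparer.Ordinal);

        foreach (var tag in tags)
        {
            if (string.IsNullOrWhiteSpace(tag.Name))
                errors.Add("Tag name must not be empty.");
            else if (!seen.Add(tag.Name))
                errors.Add($"Duplicate tag name '{tag.Name}'.");

            // Written as a negated range check so that NaN is rejected too.
            if (!(tag.IntervalMs >= 2 && tag.IntervalMs <= 1000))
                errors.Add($"Tag '{tag.Name}': IntervalMs {tag.IntervalMs} is outside [2, 1000].");

            if (!KnownKinds.Contains(tag.NoiseKind))
                errors.Add($"Tag '{tag.Name}': unrecognized Noise.Kind '{tag.NoiseKind}'.");
        }

        foreach (var reserved in ReservedNames)
        {
            if (!seen.Contains(reserved))
                errors.Add($"Reserved tag '{reserved}' is missing.");
        }

        return errors;
    }
}
```

A non-empty result would be turned into a startup failure by the host. One common way to do that in .NET is an IValidateOptions<T> implementation combined with ValidateOnStart, though the spec does not say which mechanism the task should use.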
Producer (SimulatedTagSource)
- One logical emitter per TagDefinition, each driven by its own IntervalMs cadence.
- Each emitter writes its newest TagSample into a shared latest-value cell for its tag. Overwriting an unread cell increments samples.coalesced{tag.name=…} for that tag.
- A single snapshot publisher ticks at the active profile's TelemetryIntervalMs and publishes a frozen TagSnapshot (immutable map of all current per-tag latest values) to the bounded channel.
- The snapshot channel keeps the existing capacity-1 / DropOldest policy from SimulatedTelemetrySource. A snapshot drop increments the existing telemetry.coalesced counter and aggregate PipelineCounters.TelemetryCoalesced, exactly as today.
- The producer survives a 30-minute continuous run without throwing.
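The two coalescing points described above can be illustrated in isolation. The sketch below is not the SimulatedTagSource design: LatestValueCell is a hypothetical name, the lock-based cell is just one simple way to detect "overwrote an unread value", and strings stand in for TagSnapshot. The channel part uses the System.Threading.Channels bounded-channel overload that accepts an item-dropped callback.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Channels;

// Per-tag cell: a second write before any read counts as one coalesce.
var cell = new LatestValueCell<double>();
cell.Write(1.0);
cell.Write(2.0); // overwrites the unread 1.0
cell.TryRead(out var latest);
Console.WriteLine($"latest={latest} coalesced={cell.Coalesced}");

// Snapshot channel: capacity 1, DropOldest, with a callback that counts evicted snapshots.
long snapshotDrops = 0;
var channel = Channel.CreateBounded<string>(
    new BoundedChannelOptions(1) { FullMode = BoundedChannelFullMode.DropOldest, SingleReader = true },
    _ => Interlocked.Increment(ref snapshotDrops));

channel.Writer.TryWrite("snapshot-1");
channel.Writer.TryWrite("snapshot-2"); // evicts snapshot-1
channel.Reader.TryRead(out var current);
Console.WriteLine($"current={current} drops={Interlocked.Read(ref snapshotDrops)}");

// Hypothetical latest-value cell; the real producer may use a different structure.
public sealed class LatestValueCell<T>
{
    private readonly object _gate = new();
    private T _value = default!;
    private bool _unread;
    private long _coalesced;

    public long Coalesced { get { lock (_gate) return _coalesced; } }

    public void Write(T value)
    {
        lock (_gate)
        {
            if (_unread) _coalesced++; // the previous value was never observed
            _value = value;
            _unread = true;
        }
    }

    // Always returns the latest value; the bool reports whether it was new since the last read.
    public bool TryRead(out T value)
    {
        lock (_gate)
        {
            value = _value;
            var wasUnread = _unread;
            _unread = false;
            return wasUnread;
        }
    }
}
```

In the real producer the two counters would feed samples.coalesced (per tag) and telemetry.coalesced (per snapshot) respectively; here they are plain fields so the behavior is visible without a meter.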
Pipeline (TagStreamPipelineService)
- Drains the snapshot channel and, for each snapshot, replaces AppState.LatestTagValues with the new map via a single IAppStateStore.Update call.
- Per-snapshot coalesce events still log a Warning diagnostics entry (preserving SLICE-004 behavior). Per-tag coalesce events are not logged to diagnostics; they are metric-only, to keep the diagnostics timeline readable.
- AppMetrics.TelemetryIngested increments once per snapshot published, not once per tag (snapshot semantics). Per-tag totals are available via samples.ingested.
AppState shape
- LatestTelemetry: MachineTelemetry? is removed.
- LatestTagValues: ImmutableDictionary<string, TagSample> is added with Initial = ImmutableDictionary<string, TagSample>.Empty.
- MachineTelemetry and ITelemetrySource are deleted along with SimulatedTelemetrySource and FakeTelemetrySource. New fakes (FakeTagStream) replace them in the test stubs.
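A reduced sketch of how the state shape and the pipeline's "one Update call per snapshot" rule fit together follows. It makes several assumptions: AppState is cut down to the single new field, InMemoryStore is a hypothetical stand-in for IAppStateStore (whose real Update signature is not given in this spec), and DrainAsync is an illustrative name for the drain loop.

```csharp
#nullable enable
using System;
using System.Collections.Immutable;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

var store = new InMemoryStore();
var channel = Channel.CreateBounded<ImmutableDictionary<string, TagSample>>(
    new BoundedChannelOptions(1) { FullMode = BoundedChannelFullMode.DropOldest });

var snapshot = ImmutableDictionary<string, TagSample>.Empty.Add(
    "pressure.bar",
    new TagSample("pressure.bar", DateTimeOffset.UnixEpoch, 1.013, TagQuality.Good));

channel.Writer.TryWrite(snapshot);
channel.Writer.Complete();

await DrainAsync(channel.Reader, store, CancellationToken.None);
Console.WriteLine($"tags in state: {store.Current.LatestTagValues.Count}");

// Drain loop: each snapshot replaces the whole map in a single Update call.
static async Task DrainAsync(
    ChannelReader<ImmutableDictionary<string, TagSample>> reader,
    InMemoryStore store,
    CancellationToken ct)
{
    await foreach (var next in reader.ReadAllAsync(ct))
    {
        store.Update(state => state with { LatestTagValues = next });
    }
}

public enum TagQuality { Good, Uncertain, Bad }

public sealed record TagSample(string Name, DateTimeOffset Timestamp, double Value, TagQuality Quality);

// Reduced AppState: the real record carries many other fields.
public sealed record AppState(ImmutableDictionary<string, TagSample> LatestTagValues)
{
    public static AppState Initial { get; } = new(ImmutableDictionary<string, TagSample>.Empty);
}

// Hypothetical store with a reducer-style Update, applied via compare-and-swap.
public sealed class InMemoryStore
{
    private AppState _current = AppState.Initial;

    public AppState Current => Volatile.Read(ref _current);

    public void Update(Func<AppState, AppState> reducer)
    {
        AppState before, after;
        do
        {
            before = Volatile.Read(ref _current);
            after = reducer(before);
        }
        while (!ReferenceEquals(Interlocked.CompareExchange(ref _current, after, before), before));
    }
}
```

Because the map is immutable, replacing it wholesale never mutates a value that a UI binding may still be reading; the cost is one state publish per snapshot regardless of how many tags changed, which is the pressure this slice is meant to measure.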
UI
- The existing temperature/pressure readout in MainViewModel reads state.LatestTagValues.GetValueOrDefault("temperature.celsius") and …("pressure.bar") and formats them identically to today ("Temp: {Value:F1}°C Pressure: {Value:F3} bar"), falling back to "–" when the named tag has not yet been observed.
- No new tag panel, no new chart. A future slice may add a tag-browser panel; this slice only needs the legacy readout to stay correct.
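The readout rule above, with its per-tag fallback, might look like the following. This is an illustration rather than the MainViewModel code: Readout.Format is a hypothetical helper, TryGetValue is used in place of the GetValueOrDefault lookup for the same effect, the single space between the two parts is a guess at the existing layout, and invariant-culture formatting is an assumption (the current readout may use the UI culture).

```csharp
#nullable enable
using System;
using System.Collections.Immutable;
using System.Globalization;

var empty = ImmutableDictionary<string, TagSample>.Empty;
var both = empty
    .Add("temperature.celsius",
        new TagSample("temperature.celsius", DateTimeOffset.UnixEpoch, 61.27, TagQuality.Good))
    .Add("pressure.bar",
        new TagSample("pressure.bar", DateTimeOffset.UnixEpoch, 1.0132, TagQuality.Good));

Console.WriteLine(Readout.Format(empty));
Console.WriteLine(Readout.Format(both));

public enum TagQuality { Good, Uncertain, Bad }

public sealed record TagSample(string Name, DateTimeOffset Timestamp, double Value, TagQuality Quality);

// Hypothetical formatter for the legacy temperature/pressure readout.
public static class Readout
{
    public static string Format(ImmutableDictionary<string, TagSample> tags)
    {
        // Each part falls back to an en dash independently when its tag is absent.
        var temperature = tags.TryGetValue("temperature.celsius", out var t)
            ? t.Value.ToString("F1", CultureInfo.InvariantCulture) + "°C"
            : "–";
        var pressure = tags.TryGetValue("pressure.bar", out var p)
            ? p.Value.ToString("F3", CultureInfo.InvariantCulture) + " bar"
            : "–";
        return $"Temp: {temperature} Pressure: {pressure}";
    }
}
```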
Metrics surfaces
dotnet-counters monitor --name InspectionPrototype.App --counters InspectionPrototype shows samples.ingested and samples.coalesced as totals across all tag dimensions (the live console aggregates dimensions). For per-tag breakdowns, the runbook §4.2 procedure uses dotnet-counters collect --format csv, which preserves the tag.name dimension column.
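For orientation, this is roughly how dimensioned counters and an observable gauge are declared with System.Diagnostics.Metrics, and how an in-process MeterListener sees the tag.name dimension. The meter and instrument names come from this spec; the listener, the totals dictionary, and the fixed activeTagCount are illustrative only and are not part of AppMetrics.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

using var meter = new Meter("InspectionPrototype");
var ingested = meter.CreateCounter<long>("samples.ingested");
var coalesced = meter.CreateCounter<long>("samples.coalesced");
var activeTagCount = 2; // placeholder for the number of emitting TagDefinitions
meter.CreateObservableGauge<int>("tags.active", () => activeTagCount);

// In-process listener that sums counter increments per (instrument, tag.name).
var totals = new Dictionary<string, long>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "InspectionPrototype") l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    var tagName = "";
    foreach (var tag in tags)
    {
        if (tag.Key == "tag.name") tagName = tag.Value?.ToString() ?? "";
    }
    var key = $"{instrument.Name}|{tagName}";
    totals.TryGetValue(key, out var previous);
    totals[key] = previous + value;
});
listener.Start();

// Each Add carries the tag.name dimension as a key/value pair.
ingested.Add(1, new KeyValuePair<string, object?>("tag.name", "temperature.celsius"));
ingested.Add(1, new KeyValuePair<string, object?>("tag.name", "temperature.celsius"));
ingested.Add(1, new KeyValuePair<string, object?>("tag.name", "pressure.bar"));
coalesced.Add(1, new KeyValuePair<string, object?>("tag.name", "temperature.celsius"));

foreach (var entry in totals) Console.WriteLine($"{entry.Key} = {entry.Value}");
```

With 50 tags the tag.name dimension yields 50 time series per counter, which is why the per-tag breakdown is read from a CSV capture and not from the live console.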
Acceptance Criteria
This slice is satisfied only if all of the following are true:
1. MachineTelemetry, ITelemetrySource, SimulatedTelemetrySource, and FakeTelemetrySource are deleted from the codebase; no production or test code references them.
2. TagSample, TagQuality, TagDefinition, the four NoiseModel variants, ITagStream, SimulatedTagSource, and TagStreamPipelineService exist with the shapes described above and are wired through AddInfrastructureServices and AddApplicationServices.
3. AppState exposes LatestTagValues instead of LatestTelemetry; the UI temperature/pressure readout renders the same string as before for any tag map containing both temperature.celsius and pressure.bar, and renders "Temp: – Pressure: –" (or the equivalent existing fallback) when neither is present.
4. The seed appsettings.json contains exactly 50 entries under Simulator:Tags, including temperature.celsius and pressure.bar. At least one tag uses each of the four noise models.
5. Configuration that yields a tag with IntervalMs < 2 or > 1000, a duplicate Name, an empty Name, or an unrecognized Noise.Kind fails app startup with a clear validation error message; no partial startup state is written.
6. samples.ingested and samples.coalesced counters appear in dotnet-counters monitor --name InspectionPrototype.App --counters InspectionPrototype, and a CSV capture (dotnet-counters collect --format csv) shows non-zero rows for at least 50 distinct tag.name dimension values within a 30-minute run.
7. A 30-minute continuous run under the new MultiTag profile completes without an unhandled exception and without runs.faulted incrementing for telemetry-caused faults. All 50 tags appear in the samples.ingested CSV output with the tag.name dimension. Per-tag rate accuracy is documented in the row block, not gated: the simulator's Task.Delay-based per-tag emitter loop is subject to OS scheduler and timer-resolution constraints that scale with concurrent emitter count. Observed under the seeded 50-tag MultiTag profile on Windows 11 (rates normalized to scenario duration, not CSV span): ≤5 Hz tags within ±2% of configured; 10–50 Hz tags hit 60–90% of configured (sub-OS-tick rounding + scheduling overhead); ≥100 Hz tags cap near 64 Hz from the default Windows 15.6 ms timer tick. Tightening this is out of scope for SLICE-1.1; see roadmap-progress for the follow-up tracking simulator emit-rate accuracy under concurrent load.
8. A row block tagged slice-1-1-multi-tag-telemetry is appended to docs/reviews/phase-1-measurements.md covering the full 16-metric set, with a CSV under docs/captures/slice-1-1-multi-tag-<date>.csv and the commit hash under measurement.
9. docs/runbook/capturing-measurements.md gains a §4.2 entry that names the MultiTag profile, the 30-minute scenario, and the post-processing step that converts the CSV into per-tag rate and coalesce numbers.
10. The full existing test suite still passes, plus new tests covering: TagSample construction, each NoiseModel variant emitting bounded values across 1 000 ticks, configuration validation rejecting all four invalid cases in criterion 5, and TagStreamPipelineService updating LatestTagValues from a FakeTagStream snapshot.
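The observed per-tag rates quoted for the MultiTag run are roughly what a simple tick-rounding model predicts. The sketch below assumes each delay is rounded up to a whole number of timer ticks and that the tick is 15.625 ms (64 per second, the figure the spec rounds to 15.6 ms). It ignores scheduling overhead and contention between emitters, so it is a back-of-envelope aid for reading the row block, not a measurement; TimerModel is a hypothetical name.

```csharp
using System;

const double tickMs = 15.625; // assumption: default Windows timer period, 64 ticks per second

foreach (var intervalMs in new[] { 200.0, 100.0, 20.0, 10.0, 2.0 })
{
    var configured = 1000.0 / intervalMs;
    var modeled = TimerModel.AchievedHz(intervalMs, tickMs);
    Console.WriteLine(
        $"{configured,6:F0} Hz configured -> {modeled,6:F2} Hz modeled ({modeled / configured:P0})");
}

// Hypothetical model: a delay of intervalMs completes on the first timer tick at or after it.
public static class TimerModel
{
    public static double AchievedHz(double intervalMs, double tickMs) =>
        1000.0 / (Math.Ceiling(intervalMs / tickMs) * tickMs);
}
```

Under this model a 5 Hz tag lands at about 4.92 Hz (within 2%), a 10 Hz tag at about 9.14 Hz, a 50 Hz tag at 32 Hz (64%), and anything configured at 100 Hz or faster at 64 Hz, which lines up with the bands reported above.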
Verification Notes
The implementation task for this spec must include verification for:
- the simulator does not allocate per-TagSample on the hot path beyond what the producer's bounded buffers hold (verified by checking that the samples.ingested rate at 50 tags × 100 Hz aggregate does not visibly elevate gen-0-gc-count against the row-0 baseline; this is a soft check in this slice, and the hard check is the Phase-1 exit gate)
- per-tag coalesce dimensions actually appear in the CSV output (verified by piping the CSV through a small script that groups by the Tags column and sums)
- no production code path references the deleted types via using, partial class, or generated code
- the UI temperature/pressure readout binds correctly under both "no tags yet" and "both tags present" states (manual verification; note in runbook)
- profile switching at runtime (the existing SLICE-004 behavior) continues to work: switching from Normal to MultiTag does not require an app restart, and the per-tag emitter set is rebuilt from the new profile + tag registry without leaking the previous timers