SLICE-1.1: Multi-Tag Telemetry

Goal

Replace the two-field MachineTelemetry(Temperature, Pressure) snapshot with a keyed tag stream — TagSample(Name, Timestamp, Value, Quality) driven by a TagDefinition registry — so the simulator can emit dozens of independent signals at independent rates. This is the first slice that puts real Phase-1 load through the pipeline.

Why This Slice

Today the simulator emits exactly two scalars (TemperatureCelsius, PressureBar) on a single ~5 Hz ticker (SimulatedTelemetrySource at SimulatorProfile.TelemetryIntervalMs = 200). The whole telemetry path — channel, pipeline service, AppState field, UI binding — is shaped around one snapshot type with two named scalars. That shape is the load-bearing reason the prototype "feels healthy": there is almost nothing to coalesce.

A real wafer inspection tool publishes hundreds of named signals — vacuum, temperatures across multiple zones, lamp currents, encoder deltas, air-knife pressures, gas flows — at heterogeneous rates from a few Hz up into the hundreds. Until the simulator produces something shaped like that, every measurement on the phase-1 measurements table is measuring an artificially light workload.

This slice introduces the shape of multi-tag telemetry and seeds enough of it (50 tags, 1–500 Hz mix) to expose the first real per-tag and aggregate pressure on the existing single-AppState-snapshot pipeline. The store is not refactored here — that is Phase 2's job, after this slice produces evidence that it needs refactoring.

Requirements Coverage

In Scope

  • a new domain shape replacing MachineTelemetry:
    • TagSample(string Name, DateTimeOffset Timestamp, double Value, TagQuality Quality)
    • TagQuality enum: Good, Uncertain, Bad
    • TagDefinition(string Name, string Unit, double IntervalMs, NoiseModel Noise) — the registry entry
    • NoiseModel discriminated record with at least four variants:
      • SineNoise(double Baseline, double Amplitude, double PeriodMs)
      • DriftNoise(double Baseline, double SlopePerSecond, double JitterStdDev)
      • RandomWalkNoise(double Baseline, double StepStdDev, double ClampMin, double ClampMax)
      • StepNoise(double Low, double High, double PeriodMs, double DutyCycle)
  • a new latest-values snapshot in AppState that carries the keyed tag map:
    • LatestTagValues — an immutable name → TagSample map
    • the existing AppState.LatestTelemetry scalar field is removed; the existing UI temperature/pressure readout is rewired to read named tags (temperature.celsius, pressure.bar) from LatestTagValues
  • a new producer abstraction replacing ITelemetrySource:
    • ITagStream exposing a bounded channel of TagSnapshot (an immutable name → TagSample map captured at one instant), plus per-tag and aggregate coalesce counters
    • SimulatedTagSource (in Infrastructure.Simulator) that runs one logical emitter per TagDefinition at its configured IntervalMs and publishes coalesced snapshots downstream
  • pipeline rewire:
    • TelemetryPipelineService is renamed TagStreamPipelineService (or reshaped — implementation choice in the task) and updates AppState.LatestTagValues from incoming snapshots
    • the existing PipelineCounters.TelemetryCoalesced field is preserved as the aggregate coalesce count; per-tag coalesce numbers live in metrics, not in AppState
  • configuration: a Simulator:Tags section bound to SimulatorTagsOptions; the seed appsettings.json ships 50 tags with realistic names, units, rates (a mix of 1, 5, 10, 50, 100, 250, and 500 Hz, each expressed as an IntervalMs), and noise models. The 50 tags include the two well-known names temperature.celsius and pressure.bar so the existing UI binding keeps working
  • metrics: extend the InspectionPrototype meter (see SLICE-006) so per-tag observability exists:
    • samples.ingested (counter, dimension tag.name) — incremented on every produced TagSample
    • samples.coalesced (counter, dimension tag.name) — incremented when a per-tag producer overwrites a still-unread value
    • tags.active (observable gauge) — count of TagDefinitions currently emitting
    • the existing telemetry.ingested and telemetry.coalesced counters in AppMetrics continue to exist, but now reflect snapshot publishes and snapshot drops respectively (semantics documented in the runbook)
  • a new measurement scenario §4.2 Multi-tag soak (30 min, MultiTag profile) added to docs/runbook/capturing-measurements.md, plus a new MultiTag simulator profile in seed configuration with a TelemetryIntervalMs of 50 (snapshot publish at 20 Hz; per-tag rates come from Simulator:Tags)
  • before/after rows in docs/reviews/phase-1-measurements.md against the row-0 demo baseline, captured under the new scenario, including the full 16-metric set already used by row 0
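
The four noise variants listed under In Scope can be read as functions of elapsed time, plus a random source for the stochastic ones. The following is a minimal sketch of the intended signal shapes, written in Python rather than the project's C# purely for illustration. The function names and the exact formulas (for example, treating DutyCycle as the fraction of each period spent at High) are assumptions, not definitions taken from the spec.

```python
import math
import random

def sine_noise(t_ms, baseline, amplitude, period_ms):
    """Baseline plus a sine wave with the given amplitude and period."""
    return baseline + amplitude * math.sin(2 * math.pi * t_ms / period_ms)

def drift_noise(t_ms, baseline, slope_per_second, jitter_std_dev, rng):
    """Linear drift away from the baseline plus Gaussian jitter."""
    return baseline + slope_per_second * (t_ms / 1000.0) + rng.gauss(0.0, jitter_std_dev)

def random_walk_step(previous, step_std_dev, clamp_min, clamp_max, rng):
    """One Gaussian step from the previous value, clamped to [clamp_min, clamp_max]."""
    return min(clamp_max, max(clamp_min, previous + rng.gauss(0.0, step_std_dev)))

def step_noise(t_ms, low, high, period_ms, duty_cycle):
    """Square wave: high for the first duty_cycle fraction of each period, low otherwise."""
    phase = (t_ms % period_ms) / period_ms
    return high if phase < duty_cycle else low

# Example run with a fixed seed so the walk is reproducible.
rng = random.Random(42)
walk = 20.0
walk_values = []
for _ in range(1000):
    walk = random_walk_step(walk, 0.5, 15.0, 25.0, rng)
    walk_values.append(walk)

sine_values = [sine_noise(t, 100.0, 5.0, 1000.0) for t in range(0, 1000, 10)]
step_values = [step_noise(t, 0.0, 1.0, 100.0, 0.25) for t in range(0, 100, 5)]
```

Of the four, only the clamped random walk, the sine, and the step are bounded by construction. A drift signal grows without limit unless the implementation resets or clamps it, which matters for any test that asserts bounded values.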

Out of Scope

  • moving telemetry data out of AppState (Phase 2 / SLICE 2.3 — ITelemetryBuffer lift-out)
  • per-tag observables, slice-level subscriptions on IAppStateStore (Phase 2 / SLICE 2.4)
  • real frame payloads (SLICE-1.2)
  • the encoder-rate motion stream (SLICE-1.3)
  • storm / soak / chaos profiles beyond the single new MultiTag profile (SLICE-1.4)
  • charting or trending UI for tag history — the slice only needs the latest value per tag visible somewhere
  • persisting TagSample history to disk (Phase 3 / SLICE 3.3)
  • changing the DiagnosticsEntries cap or routing
  • changing the existing snapshot-channel policy from DropOldest capacity 1; producers may use a different per-tag channel internally, but the snapshot channel keeps latest-value semantics
  • removing or renaming the existing frames.* and runs.* counters

Runtime Behavior

Tag registry and configuration

  • SimulatorTagsOptions binds the Simulator:Tags section to a list of SimulatorTagOptions { Name, Unit, IntervalMs, Noise }, where Noise is a polymorphic block with a Kind discriminator (Sine, Drift, RandomWalk, Step).
  • The seed appsettings.json ships exactly 50 tags. Names use a dotted convention (zone1.temperature.celsius, vacuum.chamber.pressure.bar, lamp.current.amps, etc.). At least one tag per noise variant is present.
  • IntervalMs per tag is configurable; valid range is [2, 1000] (2 ms = 500 Hz, 1000 ms = 1 Hz). Configuration outside this range fails fast at startup with a clear validation error; the app does not partially start.
  • The existing tag names temperature.celsius and pressure.bar are reserved and must be present in any seeded configuration so the UI's existing temperature/pressure readout keeps working.
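
The validation rules above, together with the invalid cases named in acceptance criterion 5, can be sketched as a single pass over the configured tag list. This is an illustrative Python model, not the project's C# options validator. The helper names (`validate_tags`, `require_valid`), the dictionary layout, and the decision to enforce the reserved names inside the validator are all assumptions.

```python
VALID_KINDS = {"Sine", "Drift", "RandomWalk", "Step"}
MIN_INTERVAL_MS, MAX_INTERVAL_MS = 2, 1000
RESERVED_NAMES = ("temperature.celsius", "pressure.bar")

def validate_tags(tags):
    """Return a list of validation errors; an empty list means the config is acceptable."""
    errors = []
    seen = set()
    for index, tag in enumerate(tags):
        name = (tag.get("Name") or "").strip()
        if not name:
            errors.append(f"Tags[{index}]: Name is empty")
        elif name in seen:
            errors.append(f"Tags[{index}]: duplicate Name '{name}'")
        seen.add(name)

        interval = tag.get("IntervalMs")
        if not isinstance(interval, (int, float)) or not (MIN_INTERVAL_MS <= interval <= MAX_INTERVAL_MS):
            errors.append(f"Tags[{index}]: IntervalMs {interval!r} is outside [2, 1000]")

        kind = (tag.get("Noise") or {}).get("Kind")
        if kind not in VALID_KINDS:
            errors.append(f"Tags[{index}]: unrecognized Noise.Kind {kind!r}")

    # Assumption: the reserved names are checked here rather than elsewhere.
    for reserved in RESERVED_NAMES:
        if reserved not in seen:
            errors.append(f"reserved tag '{reserved}' is missing")
    return errors

def require_valid(tags):
    """Fail fast: raise before anything else starts if any rule is violated."""
    errors = validate_tags(tags)
    if errors:
        raise ValueError("Invalid Simulator:Tags configuration:\n" + "\n".join(errors))
```

Collecting every error before raising gives one startup message that lists all problems at once, instead of making the operator fix them one restart at a time.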

Producer (SimulatedTagSource)

  • One logical emitter per TagDefinition, each driven by its own IntervalMs cadence.
  • Each emitter writes its newest TagSample into a shared latest-value cell for its tag. Overwriting an unread cell increments samples.coalesced{tag.name=…} for that tag.
  • A single snapshot publisher ticks at the active profile's TelemetryIntervalMs and publishes a frozen TagSnapshot (immutable map of all current per-tag latest values) to the bounded channel.
  • The snapshot channel keeps the existing capacity-1 / DropOldest policy from SimulatedTelemetrySource. A snapshot drop increments the existing telemetry.coalesced counter and the aggregate PipelineCounters.TelemetryCoalesced, exactly as today.
  • The producer survives a 30-minute continuous run without throwing.
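
The per-tag latest-value cell and its coalesce counter can be illustrated with a small single-threaded model. The class name `LatestValueCells`, its fields, and the rule that taking a snapshot marks every cell as read are assumptions made for this sketch. The real producer is C# and would need synchronization between emitters and the snapshot publisher.

```python
from types import MappingProxyType

class LatestValueCells:
    """Per-tag latest-value cells with a per-tag coalesce count (single-threaded sketch)."""

    def __init__(self):
        self._values = {}     # tag name -> newest sample value
        self._unread = set()  # tags written since the last snapshot
        self.coalesced = {}   # tag name -> number of overwrites of an unread value

    def write(self, name, sample):
        """Store the newest sample; count a coalesce if the previous one was never read."""
        if name in self._unread:
            self.coalesced[name] = self.coalesced.get(name, 0) + 1
        self._values[name] = sample
        self._unread.add(name)

    def snapshot(self):
        """Return a read-only copy of all current values and mark every cell as read."""
        self._unread.clear()
        return MappingProxyType(dict(self._values))

# A 100 Hz tag writes five times between two 20 Hz snapshot ticks.
cells = LatestValueCells()
for value in (1.0, 2.0, 3.0, 4.0, 5.0):
    cells.write("lamp.current.amps", value)
frozen = cells.snapshot()
```

In this model a 100 Hz tag under a 20 Hz snapshot publisher coalesces four of every five samples, so a high samples.coalesced count for fast tags is expected behavior rather than a fault.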

Pipeline (TagStreamPipelineService)

  • Drains the snapshot channel and, for each snapshot, replaces AppState.LatestTagValues with the new map via a single IAppStateStore.Update call.
  • Per-snapshot coalesce events still log a Warning diagnostics entry (preserving SLICE-004 behavior). Per-tag coalesce events are not logged to diagnostics; they are metric-only, to keep the diagnostics timeline readable.
  • AppMetrics.TelemetryIngested increments once per snapshot published, not once per tag (snapshot semantics). Per-tag totals are available via samples.ingested.

AppState shape

  • LatestTelemetry: MachineTelemetry? is removed.
  • LatestTagValues: ImmutableDictionary<string, TagSample> is added with Initial = ImmutableDictionary<string, TagSample>.Empty.
  • MachineTelemetry and ITelemetrySource are deleted along with SimulatedTelemetrySource and FakeTelemetrySource. New fakes (FakeTagStream) replace them in the test stubs.

UI

  • The existing temperature/pressure readout in MainViewModel reads state.LatestTagValues.GetValueOrDefault("temperature.celsius") and …("pressure.bar") and formats them identically to today ("Temp: {Value:F1}°C Pressure: {Value:F3} bar"), falling back to "–" when the named tag has not yet been observed.
  • No new tag panel, no new chart. A future slice may add a tag-browser panel; this slice only needs the legacy readout to stay correct.
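
The readout rule above can be modeled as a lookup with a per-field fallback. This is a hypothetical Python sketch: `format_readout` is an invented name, and the map holds bare numbers here instead of TagSample records. Dropping the unit suffix when a tag is missing is inferred from the fallback string quoted in the acceptance criteria.

```python
def format_readout(latest_tag_values):
    """Format the legacy temperature/pressure line from a name -> value map."""
    temp = latest_tag_values.get("temperature.celsius")
    pressure = latest_tag_values.get("pressure.bar")
    # Each field falls back independently when its tag has not been observed yet.
    temp_text = f"{temp:.1f}°C" if temp is not None else "–"
    pressure_text = f"{pressure:.3f} bar" if pressure is not None else "–"
    return f"Temp: {temp_text} Pressure: {pressure_text}"
```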

Metrics surfaces

  • dotnet-counters monitor --name InspectionPrototype.App --counters InspectionPrototype shows samples.ingested and samples.coalesced as totals across all tag dimensions (the live console aggregates dimensions). For per-tag breakdowns, the runbook §4.2 procedure uses dotnet-counters collect --format csv, which preserves the tag.name dimension column.

Acceptance Criteria

This slice is satisfied only if all of the following are true:

  1. MachineTelemetry, ITelemetrySource, SimulatedTelemetrySource, and FakeTelemetrySource are deleted from the codebase; no production or test code references them.
  2. TagSample, TagQuality, TagDefinition, the four NoiseModel variants, ITagStream, SimulatedTagSource, and TagStreamPipelineService exist with the shapes described above and are wired through AddInfrastructureServices and AddApplicationServices.
  3. AppState exposes LatestTagValues instead of LatestTelemetry; the UI temperature/pressure readout renders the same string as before for any tag map containing both temperature.celsius and pressure.bar, and renders "Temp: – Pressure: –" (or the equivalent existing fallback) when neither is present. When only one of the two is present, the missing field alone falls back to "–".
  4. The seed appsettings.json contains exactly 50 entries under Simulator:Tags, including temperature.celsius and pressure.bar. At least one tag uses each of the four noise models.
  5. Configuration that yields a tag with IntervalMs < 2 or > 1000, a duplicate Name, an empty Name, or an unrecognized Noise.Kind fails app startup with a clear validation error message; no partial startup state is written.
  6. samples.ingested and samples.coalesced counters appear in dotnet-counters monitor --name InspectionPrototype.App --counters InspectionPrototype, and a CSV capture (dotnet-counters collect --format csv) shows non-zero rows for at least 50 distinct tag.name dimension values within a 30-minute run.
  7. A 30-minute continuous run under the new MultiTag profile completes without an unhandled exception and without runs.faulted incrementing for telemetry-caused faults. All 50 tags appear in the samples.ingested CSV output with the tag.name dimension. Per-tag rate accuracy is documented in the row block, not gated: the simulator's Task.Delay-based per-tag emitter loop is subject to OS scheduler and timer-resolution constraints that scale with concurrent emitter count. Observed under the seeded 50-tag MultiTag profile on Windows 11 (rates normalized to scenario duration, not CSV span):
     • ≤5 Hz tags: within ±2% of configured
     • 10–50 Hz tags: 60–90% of configured (sub-OS-tick rounding plus scheduling overhead)
     • ≥100 Hz tags: capped near 64 Hz by the default Windows 15.6 ms timer tick
     Tightening this is out of scope for SLICE-1.1; see roadmap-progress for the follow-up tracking simulator emit-rate accuracy under concurrent load.
  8. A row block tagged slice-1-1-multi-tag-telemetry is appended to docs/reviews/phase-1-measurements.md covering the full 16-metric set, with a CSV under docs/captures/slice-1-1-multi-tag-<date>.csv and the commit hash under measurement.
  9. docs/runbook/capturing-measurements.md gains a §4.2 entry that names the MultiTag profile, the 30-minute scenario, and the post-processing step that converts the CSV into per-tag rate and coalesce numbers.
  10. The full existing test suite still passes, plus new tests covering: TagSample construction, each NoiseModel variant emitting bounded values across 1,000 ticks, configuration validation rejecting all four invalid cases in criterion 5, and TagStreamPipelineService updating LatestTagValues from a FakeTagStream snapshot.
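
The per-tag rates reported in criterion 7 follow largely from timer-tick rounding, which a few lines of arithmetic can reproduce. The sketch below assumes a deliberately simple model in which every delay is rounded up to a whole number of 15.6 ms ticks and scheduling overhead is ignored. The function name `effective_rate_hz` is hypothetical.

```python
import math

TIMER_TICK_MS = 15.6  # default Windows timer tick quoted in criterion 7

def effective_rate_hz(interval_ms, tick_ms=TIMER_TICK_MS):
    """Rate achieved if each delay is rounded up to a whole number of timer ticks."""
    ticks = max(1, math.ceil(interval_ms / tick_ms))
    return 1000.0 / (ticks * tick_ms)

for configured_hz in (1, 5, 10, 50, 100, 250, 500):
    interval_ms = 1000.0 / configured_hz
    actual_hz = effective_rate_hz(interval_ms)
    percent = 100.0 * actual_hz / configured_hz
    print(f"{configured_hz:>3} Hz configured -> {actual_hz:6.2f} Hz ({percent:5.1f}%)")
```

Under this model, 1 Hz and 5 Hz tags land at about 98.6% of their configured rate, a 10 Hz tag at about 9.2 Hz (92%), a 50 Hz tag at about 32 Hz (64%), and every tag configured at 100 Hz or above at about 64 Hz. That is broadly in line with the observed figures; the remaining gap in the mid range is attributed in the criterion to scheduling overhead, which this model leaves out.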

Verification Notes

The implementation task for this spec must include verification for:

  • the simulator does not allocate per-TagSample on the hot path beyond what the producer's bounded buffers hold (verified by checking that samples.ingested rate at 50 tags × 100 Hz aggregate does not visibly elevate gen-0-gc-count against the row-0 baseline; this is a soft check in this slice — the hard check is the Phase-1 exit gate)
  • per-tag coalesce dimensions actually appear in the CSV output (verified by piping the CSV through a small script that groups by Tags column and sums)
  • no production code path references the deleted types via using, partial class, or generated code
  • the UI temperature/pressure readout binds correctly under both "no tags yet" and "both tags present" states (manual verification — note in runbook)
  • profile switching at runtime (the existing SLICE-004 behavior) continues to work: switching from Normal to MultiTag does not require an app restart, and the per-tag emitter set is rebuilt from the new profile + tag registry without leaking the previous timers
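
The "group by tag and sum" check on the captured CSV can be sketched as follows. The column names used here ("Counter Name", "Tags", "Mean/Increment") and the sample rows are assumptions made for illustration; adjust them to whatever header the installed dotnet-counters version actually writes. The helper names `sum_by_tag` and `rate_hz` are hypothetical.

```python
import csv
import io
from collections import defaultdict

def sum_by_tag(csv_text, counter, name_col="Counter Name",
               tags_col="Tags", value_col="Mean/Increment"):
    """Sum the increments of one counter, grouped by its tag dimension."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[name_col] == counter:
            totals[row[tags_col]] += float(row[value_col])
    return dict(totals)

def rate_hz(total_samples, scenario_seconds):
    """Normalize to the scenario duration, not to the span of CSV timestamps."""
    return total_samples / scenario_seconds

# Hypothetical capture excerpt; real files will be far longer.
SAMPLE_CSV = """Timestamp,Provider,Counter Name,Tags,Mean/Increment
10:00:01,InspectionPrototype,samples.ingested,tag.name=pressure.bar,10
10:00:01,InspectionPrototype,samples.coalesced,tag.name=pressure.bar,4
10:00:02,InspectionPrototype,samples.ingested,tag.name=pressure.bar,9
10:00:02,InspectionPrototype,samples.ingested,tag.name=temperature.celsius,5
10:00:02,InspectionPrototype,samples.coalesced,tag.name=pressure.bar,3
"""

ingested = sum_by_tag(SAMPLE_CSV, "samples.ingested")
coalesced = sum_by_tag(SAMPLE_CSV, "samples.coalesced")
```

For the 30-minute scenario, `rate_hz(total, 1800)` gives the per-tag rate to compare against the configured rate, and `len(ingested)` gives the number of distinct tags seen.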
