Capturing Measurements Runbook

This runbook covers how to capture the before/after numbers that live in the phase-1-measurements table. Every Phase 1 slice's exit gate — and the baseline "row 0" that precedes Phase 1 — is captured with this procedure.

Read this cold, end to end, the first time. Subsequent captures should take about 15 minutes (10 minutes of scenario + 5 minutes of CSV extraction and table editing).

1. When to use this runbook

Use this procedure when any of the following is true:

you need to capture the demo baseline (row 0 of the measurements table)
you are about to start work on a Phase 1 or Phase 2 slice and need the before numbers
you have just finished a Phase 1 or Phase 2 slice and need the after numbers for the exit gate
something about the system's behavior under load surprised you, and you want a reproducible capture you can point to in a discussion

Do not use this procedure for debugging a single operator session — the observability runbook covers live counters and log reading.

2. Prerequisites

Check these once per machine:

dotnet-counters installed globally:
```
dotnet tool install -g dotnet-counters
```
TASK-006 (observability baseline) Pass 3 merged — without the InspectionPrototype meter, there is nothing to collect beyond System.Runtime.
The docs/captures/ directory exists (create it if not; it is committed).
docs/reviews/phase-1-measurements.md exists.

Check these at the start of every capture:

Working tree is clean and on the commit you intend to measure. A capture that cannot be tied to a specific commit hash is not useful evidence.
No other dotnet-counters session is attached to the same process.

3. The capture procedure

Three terminals. Sequence matters.

Terminal 1 — launch the app

dotnet build --configuration Release
dotnet run --configuration Release --project src/InspectionPrototype.App --no-build

Wait until the main window appears and settles (about 2 seconds after launch).

Terminal 2 — start the CSV collector

dotnet-counters ps
# confirm exactly one InspectionPrototype.App row is listed; collect below attaches by --name

dotnet-counters collect \
  --name InspectionPrototype.App \
  --counters InspectionPrototype,System.Runtime \
  --format csv \
  --output docs/captures/<capture-name>.csv \
  --refresh-interval 1

<capture-name> convention: <row-tag>-<yyyy-MM-dd>, without the extension (the --output argument above already appends .csv). Example file names: demo-baseline-2026-04-22.csv, slice-1-1-frame-payloads-after-2026-05-03.csv.

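collect runs until you press Ctrl+C. With --refresh-interval 1 it writes one CSV row per counter per second, in the columns Timestamp, Provider, Counter Name, Counter Type, and Mean/Increment.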
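The collector's CSV is easiest to reason about as a long table. The sketch below is in Python purely for illustration (the repo's own tooling is PowerShell). It assumes the column names that the §4.2 post-processing script relies on (Timestamp, Counter Name, Mean/Increment) plus Provider and Counter Type; the sample rows and values are invented, not taken from a real capture.

```python
import csv
import io
from collections import defaultdict

# Invented sample in the long layout: one row per counter per refresh interval.
SAMPLE = """Timestamp,Provider,Counter Name,Counter Type,Mean/Increment
04/22/2026 10:00:01,InspectionPrototype,runs.started (Count / 1 sec),Rate,1
04/22/2026 10:00:01,InspectionPrototype,tags.active,Metric,50
04/22/2026 10:00:02,InspectionPrototype,runs.started (Count / 1 sec),Rate,0
04/22/2026 10:00:02,InspectionPrototype,tags.active,Metric,50
04/22/2026 10:00:03,InspectionPrototype,runs.started (Count / 1 sec),Rate,2
04/22/2026 10:00:03,InspectionPrototype,tags.active,Metric,50
"""

def series_by_counter(text):
    """Group Mean/Increment values by counter name, preserving row order."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        series[row["Counter Name"]].append(float(row["Mean/Increment"]))
    return dict(series)

series = series_by_counter(SAMPLE)
# For a rate counter sampled once per second, the increments sum to a total count.
total_runs_started = sum(series["runs.started (Count / 1 sec)"])
# For a gauge, the last value is the most recent reading.
last_tags_active = series["tags.active"][-1]
print(total_runs_started, last_tags_active)
```

Grouping by Counter Name first is what makes the later sanity checks (for example, tags.active reading 50) a one-line lookup.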

Terminal 3 — follow the scenario script

Pick the scenario for the capture you are running. The demo-baseline scenario lives in section 4 of this runbook. Every Phase 1 slice adds its own scenario stub to that section once it lands.

Finishing cleanly

When the scenario completes:

Ctrl+C Terminal 2 (stops collection, flushes the CSV file).
Close the app cleanly in Terminal 1 (do not use taskkill /F — a forced kill skips OnExit and can leave the single-instance mutex briefly in limbo; the OS cleans it up, but the log trail is cleaner with a graceful exit).
Commit the CSV file and the measurements-table edit in a single commit. CSV files are evidence; they belong in git.

3a. Disable system sleep before any capture

A captured CSV records timestamps at wall-clock granularity. If the machine sleeps mid-capture, the process pauses with it, but the CSV span still grows to include the wall-time gap, so every rate metric computed over the full span is diluted by the ratio of awake time to total span. The row 0a / slice-1-1 notes in phase-1-measurements.md describe the failure mode after a 63-min mid-capture sleep stretched a 30-min scenario into a 96-min CSV.
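To make the size of the effect concrete, here is a small Python sketch of the dilution arithmetic. It uses the 30-minute and 96-minute figures from the incident above and treats the whole 30 minutes as awake time; the 20 Hz true rate is a hypothetical value chosen only for illustration.

```python
def diluted_rate(true_rate_hz, awake_seconds, csv_span_seconds):
    """Rate that a full-span average reports when part of the span was asleep."""
    return true_rate_hz * awake_seconds / csv_span_seconds

# Figures from the incident: a 30-minute scenario inside a 96-minute CSV.
awake_s = 30 * 60
span_s = 96 * 60
# 20 Hz is a hypothetical true rate, used only to make the effect concrete.
reported = diluted_rate(20.0, awake_s, span_s)
print(f"reported {reported:.2f} Hz, {awake_s / span_s:.0%} of the true rate")
```

Under these assumptions every rate metric reads at roughly 31% of its true value, which is why a capture that spans a sleep is discarded rather than corrected.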

Before any capture longer than a few minutes:

powershell

# Disable AC sleep timeout for the capture session.
powercfg /change standby-timeout-ac 0

# Optionally also disable display sleep so the dotnet-counters terminal stays alive.
powercfg /change monitor-timeout-ac 0

# After the capture, restore your usual settings by running the same commands
# with your preferred minute values.

Note on the retired headless capture rig. TASK-1.5 / TASK-1.5.1 added a --scenario / Capture-Measurements.ps1 headless-capture path that drove IOperatorCommands directly without rendering the UI. That rig was retired in favor of the UI-Automation-driven FlaUI rig (SLICE-1.6, described in §3b); rows 0a, 0b, and slice-1-1-multi-tag-telemetry were captured under the headless rig and remain valid historical evidence. The headless scenario classes, ScenarioRunner, the original Capture-Measurements.ps1, and the --scenario CLI flag have all been removed; the tools/Capture-Measurements.ps1 described in §3b is the SLICE-1.6 replacement. The manual §3 procedure remains a supported capture path. The CSV-math helpers in tools/MeasurementExtraction.psm1 (ConvertTo-MeasurementRow, Get-GcPauseP95, Get-LohAllocRateAvg) are kept and are the recommended way to extract a row from a manually-collected CSV.

3b. Automated capture with the FlaUI rig (SLICE-1.6)

An earlier headless rig (SLICE-1.5) drove ICommand instances directly without rendering the UI; that rig was retired 2026-04-27 in favour of this UI-Automation approach.

SLICE-1.6 ships a tools/Capture-Measurements.ps1 orchestrator that drives the real WPF main window via FlaUI/UIA3. This exercises the full XAML-binding path, exactly as a real operator would, while automating away the stopwatch, button-clicking, and CSV extraction steps. A run of any registered scenario takes about DurationSeconds + 40 s of wall-clock time; the orchestrator exits with code 0 on success or 1 on any scenario or extraction failure.

Invocation:

powershell

cd <repo root>

tools\Capture-Measurements.ps1 `
  -Scenario <name> `
  -DurationSeconds <secs> `
  -OutputCsv docs/captures/<name>-<date>.csv `
  -CommitHash (git rev-parse --short HEAD) `
  [-Profile <profileName>] `
  [-SliceTag <tag>] `
  [-AppendToTable]

-AppendToTable appends the 18-metric markdown block to docs/reviews/phase-1-measurements.md under the ## Phase 1 rows heading automatically. Without it, the block is printed to stdout and must be pasted manually.

FlaUI prerequisites

Foreground window required. FlaUI drives the real WPF window. The app must be visible on the primary display — not minimised, not covered by a full-screen overlay. The orchestrator launches the app visibly (-NoNewWindow / no -WindowStyle Hidden); do not move it to the background.
No screen lock during capture. A locked workstation pauses the WPF dispatcher, which pauses the scenario. The powercfg commands in §3a eliminate the automatic lock; do not lock manually.
Display scaling = 100% recommended. FlaUI element hit-testing uses logical pixels. Non-100% display scaling can shift coordinates under some controls. If a click misses its target, setting scaling to 100% usually fixes it.

Registered scenarios

| `-Scenario` value | FlaUI test class | Default `-DurationSeconds` | Profile flag needed |
| --- | --- | --- | --- |
| `DemoBaseline` | `DemoBaselineFlaUi` | 600 | none (Normal) |
| `MultiTagSoak` | `MultiTagSoakFlaUi` | 1800 | none (MultiTag) |
| `MultiTagSoak` | `MultiTagSoakFlaUi` | 600 | `-Profile HighFrameRate` for SLICE-1.2 |
| `MultiTagSoak` | `MultiTagSoakFlaUi` | 600 | `-Profile EncoderRate` for SLICE-1.3 |
| `MultiTagSoak` | `MultiTagSoakFlaUi` | 1800 | `-Profile ChaosMonkey` for SLICE-1.4 (requires `FlakySdk:Enabled=true`; see §4.5) |
| `MultiTagSoak` | `MultiTagSoakFlaUi` | 28800 | `-Profile Soak8h` for SLICE-1.4 (dedicated session; see §4.6) |
| `MultiTagSoak` | `MultiTagSoakFlaUi` | 1800 | `-Profile HighDefect` for SLICE-2.4 (see §5.2) |

When to fall back to manual §3

Use the manual §3 procedure instead of the FlaUI rig when:

doing a quick exploratory capture shorter than 60 s (manual setup is faster than the rig's 40 s overhead for very short windows)
on a machine that does not have the full dev environment cloned (dotnet test and the AcceptanceTests project are required by the orchestrator)
troubleshooting a FlaUI element-not-found failure — run the app interactively, use Inspect.exe or Accessibility Insights to verify the AutomationId is present, then re-run the rig

4. Scenarios

Each capture runs against a named scenario so that captures are comparable across commits. A scenario is a fixed sequence of operator actions plus timing that the app is driven through. Scenarios are not optional — an undefined sequence produces unreproducible numbers.

4.1 Demo baseline (row 0)

Used for the reference row that precedes Phase 1. Captures the current simulator at current rates with no architectural changes.

Duration: 10 minutes (stopwatch; the last Start click must land before 10:00)
Preconditions: app just launched; no prior state; simulator profile "Normal"

Note: Start Run is disabled after a run reaches Completed (CommandGuards.CanStart requires Idle | Ready). Home returns the workflow to Ready.

Steps:
  1.  Click Connect.                        Wait until status = Connected.
  2.  In the Recipe panel, click Refresh.   Wait until catalog populates.
  3.  Select the recipe "standard-5pt-wafer-scan".
  4.  Click Load Recipe.                    Wait until recipe is loaded.
  5.  Click Home.                           Wait until homing completes.
  6.  Click Start Run.                      Wait until the run completes.
  7.  Click Home.                           Wait until homing completes.
  8.  Repeat steps 6-7 continuously until the 10:00 mark on the stopwatch.
  9.  If a run is in progress at 10:00, click Stop; otherwise skip.
 10.  Click Disconnect.
Do not: change the simulator profile mid-capture; open unrelated UI panes;
        minimize the window.

At current simulator rates this produces 20–40 completed runs in the 10-minute window, enough volume to produce meaningful counter totals.

4.2 Multi-tag soak — slice-1.1, `MultiTag` profile

Used for the slice-1-1-multi-tag-telemetry row of the measurements table. Drives the new tag stream (50 tags, 1–500 Hz) under the MultiTag simulator profile so we can measure per-tag emit rates and the snapshot-pipeline overhead introduced by SLICE-1.1.

Additional prerequisites (beyond §2):

on a build with the per-tag metrics wired (TASK-1.1 Pass 3 or later — the build must expose samples.ingested and samples.coalesced on the InspectionPrototype meter)
the seed appsettings.json contains exactly 50 entries under Simulator:Tags, including the reserved names temperature.celsius and pressure.bar
a MultiTag profile is present in Simulator:Profiles (built-in fallback in InfrastructureServiceCollectionExtensions if absent from appsettings.json)

Duration: 30 minutes (stopwatch; the last Start click must land before 30:00)
Preconditions: app just launched; no prior state; simulator profile MultiTag

Note: Start Run is disabled after a run reaches Completed (CommandGuards.CanStart requires Idle | Ready). Home returns the workflow to Ready.

Steps:
  1.  In the Simulator Profile selector, switch to "MultiTag" BEFORE Connect.
  2.  Click Connect.                        Wait until status = Connected.
  3.  In the Recipe panel, click Refresh.   Wait until catalog populates.
  4.  Select the recipe "standard-5pt-wafer-scan".
  5.  Click Load Recipe.                    Wait until recipe is loaded.
  6.  Click Home.                           Wait until homing completes.
  7.  Click Start Run.                      Wait until the run completes.
  8.  Click Home.                           Wait until homing completes.
  9.  Repeat steps 7-8 continuously until the 30:00 mark on the stopwatch.
 10.  If a run is in progress at 30:00, click Stop; otherwise skip.
 11.  Click Disconnect.
Do not: change the simulator profile mid-capture; edit Simulator:Tags;
        open unrelated UI panes; minimize the window.

Capture command (Terminal 2):

dotnet-counters collect \
  --name InspectionPrototype.App \
  --counters InspectionPrototype,System.Runtime \
  --format csv \
  --output docs/captures/slice-1-1-multi-tag-<YYYY-MM-DD>.csv \
  --refresh-interval 1

Start collection ~30 seconds before clicking Connect (warm-up) and stop it ~60 seconds after Disconnect (cool-down). The 16-metric row in phase-1-measurements.md is computed across the full CSV duration, the same convention as row 0; the per-tag rate check below uses the same window.

Sanity check before extracting numbers:

After Ctrl+C on the collector, eyeball the CSV:

tags.active should read 50 for the steady-state portion of the run. If it reads 0, the producer started with an empty tag registry — appsettings.json did not load. Re-launch from a working directory that contains appsettings.json and re-capture.
telemetry.ingested (Count / 1 sec) should read ~20 Hz steady-state (the MultiTag profile publishes snapshots every 50 ms). If it reads ~5, the active profile was Normal, not MultiTag — re-capture after switching the profile selector.
samples.ingested (Count / 1 sec)[tag.name=…] rows should appear for all 50 configured tag names. If none appear, the per-tag metric wiring is not in the running build (Pass 3 not yet merged or built).

If any of these three checks fails, the capture is not measuring what SLICE-1.1 cares about. Throw it out and re-run.

Per-tag rate-error post-processing (PowerShell):

This is the verifier for SLICE-1.1 acceptance criterion 7 (every tag's observed rate within ±2% of the rate implied by its configured IntervalMs, that is, 1000 / IntervalMs Hz). It groups samples.ingested rows by their tag.name dimension, compares the observed Hz with the expected Hz derived from Simulator:Tags[i].IntervalMs, and exits non-zero if any tag is out of bounds or missing.

powershell

$csvPath = 'docs/captures/slice-1-1-multi-tag-<YYYY-MM-DD>.csv'
$cfgPath = 'src/InspectionPrototype.App/appsettings.json'

# Load expected per-tag IntervalMs from config (strip JSON comments first).
$cfgRaw = Get-Content $cfgPath -Raw
$cfgRaw = $cfgRaw -replace '/\*[\s\S]*?\*/','' -replace '(?m)//.*$',''
$cfg = $cfgRaw | ConvertFrom-Json
$expected = @{}
foreach ($t in $cfg.Simulator.Tags) { $expected[$t.Name] = [double]$t.IntervalMs }

$csv = Import-Csv $csvPath
$first  = [datetime]::Parse($csv[0].Timestamp)
$last   = [datetime]::Parse($csv[-1].Timestamp)
$durSec = ($last - $first).TotalSeconds

# Counter Name shape: 'samples.ingested (Count / 1 sec)[tag.name=foo.bar]'
$pattern = '^samples\.ingested.*\[tag\.name=(?<tag>[^\]]+)\]$'
$samples = $csv | Where-Object { $_.'Counter Name' -match $pattern } |
    ForEach-Object {
        $null = $_.'Counter Name' -match $pattern
        [pscustomobject]@{ Tag = $Matches.tag; Inc = [double]$_.'Mean/Increment' }
    }

$rows    = New-Object System.Collections.Generic.List[object]
$missing = New-Object System.Collections.Generic.List[string]
$worst   = 0.0

foreach ($name in $expected.Keys) {
    $expHz = 1000.0 / $expected[$name]
    $hits  = $samples | Where-Object { $_.Tag -eq $name }
    if (-not $hits) { $missing.Add($name); continue }
    $actHz = (($hits | Measure-Object -Property Inc -Sum).Sum) / $durSec
    $err   = 100.0 * ($actHz - $expHz) / $expHz
    $rows.Add([pscustomobject]@{
        Tag        = $name
        ExpectedHz = [math]::Round($expHz, 2)
        ActualHz   = [math]::Round($actHz, 2)
        ErrorPct   = [math]::Round($err, 2)
    })
    if ([math]::Abs($err) -gt [math]::Abs($worst)) { $worst = $err }
}

# Write the report next to the CSV, named after the capture rather than today's date.
$reportPath = $csvPath -replace '\.csv$', '-rate-check.txt'
$rows | Sort-Object { [math]::Abs($_.ErrorPct) } -Descending |
    Format-Table -AutoSize | Out-String | Tee-Object -FilePath $reportPath

if ($missing.Count -gt 0) {
    Write-Error "MISSING samples.ingested for $($missing.Count) tag(s): $($missing -join ', ')"
    exit 2
}
if ([math]::Abs($worst) -gt 2.0) {
    Write-Error "FAIL: max per-tag rate error $([math]::Round($worst,2))% exceeds ±2%."
    exit 1
}
"PASS: max per-tag rate error $([math]::Round($worst,2))%."

Commit the rate-check .txt next to the CSV. The 16-metric row goes through the §5 PowerShell extraction script unchanged — telemetry.ingested rate (Hz) will read at the snapshot rate (~20 Hz under MultiTag), not the per-tag rate; per-tag totals live in samples.ingested only and do not appear in the 16-metric table by design.
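For a single tag, the rate check above reduces to the following arithmetic. This is a Python sketch for illustration only; the tag's IntervalMs, sample total, and duration are invented, and the function name is not part of the repo.

```python
def rate_error_pct(interval_ms, total_samples, duration_s):
    """Signed percentage error of the observed rate against 1000 / IntervalMs."""
    expected_hz = 1000.0 / interval_ms
    actual_hz = total_samples / duration_s
    return 100.0 * (actual_hz - expected_hz) / expected_hz

# Hypothetical tag with IntervalMs = 2 (500 Hz expected) observed for 1800 s.
err = rate_error_pct(interval_ms=2, total_samples=891_000, duration_s=1800)
within_tolerance = abs(err) <= 2.0
print(round(err, 2), within_tolerance)
```

In this invented case the tag runs at 495 Hz against an expected 500 Hz, an error of -1%, which is inside the ±2% band.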

4.3 — Real frame payloads (SLICE-1.2, 30 fps × 2 MP, HighFrameRate profile)

Scenario: MultiTagSoak with the HighFrameRate simulator profile, 10 minutes.

Use the §3b FlaUI rig:

powershell

.\tools\Capture-Measurements.ps1 `
    -Scenario MultiTagSoak `
    -Profile HighFrameRate `
    -DurationSeconds 600 `
    -OutputCsv docs/captures/slice-1-2-high-fps-YYYY-MM-DD.csv `
    -SliceTag slice-1-2-real-frame-payloads `
    -CommitHash $(git rev-parse --short HEAD) `
    -AppendToTable

Profile: HighFrameRate. Frames are 2048 × 1024 × 1 byte at a 33 ms frame interval (≈ 30 fps), with a 50 ms telemetry interval.

Expected LOH activity: gen-2 GC count > 0; LOH-alloc-rate-avg ≈ 1 MB/s (averaged over the full 600 s capture); alloc-rate ≈ 300× higher than Normal-profile baseline.

frames.ingested note: SimulatedCamera streams only while a run is actively executing (Connected + Running state). The criterion (≥ 17 500) assumes continuous streaming; with the multi-cycle MultiTagSoak scenario, the observed count will be lower (~8 000–12 000 depending on run count). See the criterion-scope clarification note in docs/reviews/phase-1-measurements.md row slice-1-2-real-frame-payloads.
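The frame-size and frame-count figures above can be cross-checked with a few lines of arithmetic. The Python sketch below assumes each frame is a single contiguous buffer (which is what would put it on the .NET large object heap) and that the frame interval is exactly 33 ms; neither assumption is confirmed by this runbook.

```python
WIDTH, HEIGHT, BYTES_PER_PIXEL = 2048, 1024, 1
FRAME_INTERVAL_MS = 33
CAPTURE_SECONDS = 600

# 2 MiB per frame, far above the 85,000-byte large-object-heap threshold.
frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
nominal_fps = 1000 / FRAME_INTERVAL_MS
# Upper bound on frames if streaming never paused during the capture.
max_frames = CAPTURE_SECONDS * 1000 // FRAME_INTERVAL_MS

def streaming_share(observed_frames):
    """Fraction of the capture during which the camera must have been streaming."""
    return observed_frames / max_frames

print(frame_bytes, round(nominal_fps, 1), max_frames)
print(f"{streaming_share(8_000):.0%} to {streaming_share(12_000):.0%}")
```

Under these assumptions, the 8 000 to 12 000 frames seen with the multi-cycle scenario correspond to the camera streaming for roughly half to two thirds of the capture, which is consistent with the criterion-scope note.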

4.4 Encoder-rate motion — SLICE-1.3, `EncoderRate` profile

Scenario: MultiTagSoak with the EncoderRate simulator profile, 10 minutes.

Use the §3b FlaUI rig:

powershell

.\tools\Capture-Measurements.ps1 `
    -Scenario MultiTagSoak `
    -Profile EncoderRate `
    -DurationSeconds 600 `
    -OutputCsv docs/captures/slice-1-3-encoder-rate-YYYY-MM-DD.csv `
    -SliceTag slice-1-3-encoder-rate-motion `
    -CommitHash $(git rev-parse --short HEAD) `
    -AppendToTable

Profile: EncoderRate — mirrors MultiTag (640 × 480 × 1 frames @ 500 ms, 50 ms telemetry snapshot, 50 tags) but adds EncoderIntervalMs = 1 so SimulatedEncoderSource ticks at 1 ms nominal cadence. The producer acquires winmm!timeBeginPeriod(1) on StartAsync to lift the Windows timer-resolution floor from ~15.6 ms to 1 ms.

Architectural design under test: the encoder stream is drained by EncoderStreamPipelineService and emits per-axis metrics, but does not write to AppState. The MultiTagSoakFlaUi scenario continues unchanged; the encoder stream runs as a background IHostedService for the lifetime of the host. A passing capture is one where runs.faulted = 0, frames.dropped = 0, tags.active = 50 even with the encoder producer ticking at 1 ms.

Sanity checks before extracting numbers:

tags.active reads 50 (regression check from the TASK-1.1 workdir bug).
frames.dropped (Count / 1 sec) is absent or zero (the encoder stream must not starve the frame pipeline).
runs.faulted (Count / 1 sec) is absent or zero (no encoder-pipeline-caused faults).
The 20-metric row block (printed by ConvertTo-MeasurementRow) includes encoder-rate-x and encoder-rate-y rows — both should be in the hundreds-of-Hz range. Exact target is documented, not gated (see SLICE-1.3 criterion-7 amendment); a typical PeriodicTimer + timeBeginPeriod(1) combination on a default Windows host lands in the 600–800 Hz range.

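System-wide effect of timeBeginPeriod(1). The producer raises the Windows multimedia-timer resolution while the app is running. On Windows versions before Windows 10 version 2004 this is a process-issued, system-wide effect: other processes on the host see the same elevated timer resolution until the app exits. From version 2004 onward the request no longer changes the global timer resolution, so the impact on other processes is smaller, but treat it as shared state unless you have verified otherwise on the capture host. On a dedicated capture machine this is invisible; on a shared/dev workstation, latency-sensitive applications (audio, real-time video) may behave differently while the capture is running. Prefer a dedicated session.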

When the receiver rate falls below ~500 Hz: the most likely cause is that AcquireOrFallback returned the no-op disposable instead of the real WinMmTimePeriod. Check the app's startup log (%LOCALAPPDATA%\InspectionPrototype\logs\) for a warning containing timeBeginPeriod — if present, the P/Invoke failed (non-Windows host, sandbox restriction, or winmm.dll absent) and the producer ran at the default ~15.6 ms tick (~64 Hz cap). The capture is still architecturally valid evidence of the stream-bypass design but the rate row should be flagged.
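The ~64 Hz cap follows directly from the timer resolution. The Python sketch below shows the arithmetic and a simple flagging rule based on the ~500 Hz threshold mentioned above; the function names are hypothetical and the model (one tick per timer quantum at best) is a simplification.

```python
def max_tick_rate_hz(timer_resolution_ms):
    """Upper bound on ticks per second when each wait lasts at least one timer quantum."""
    return 1000.0 / timer_resolution_ms

default_cap = max_tick_rate_hz(15.6)   # timeBeginPeriod not in effect
raised_cap = max_tick_rate_hz(1.0)     # timeBeginPeriod(1) in effect

def likely_fallback(observed_hz, threshold_hz=500.0):
    """Flag captures whose encoder rate suggests the no-op fallback was used."""
    return observed_hz < threshold_hz

print(round(default_cap), round(raised_cap), likely_fallback(64.0), likely_fallback(700.0))
```

A rate near 64 Hz points to the fallback path, while the typical 600 to 800 Hz range sits comfortably above the threshold.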

Implemented by: MultiTagSoakFlaUi with -Profile EncoderRate (no new IScenario or new FlaUI test class — the existing scenario reads SIMULATOR_PROFILE env var and the orchestrator wires -Profile through).

4.5 Chaos-monkey scenario — SLICE-1.4, `ChaosMonkey` profile

SLICE-1.4 criterion-11 evidence capture. Drives all four fault branches in WorkflowService (connect-failure, fault-during-home, fault-during-run, fault-clear-and-recover) under aggressive chaos settings so that log inspection can confirm branch coverage.

PREREQUISITE — flip FlakySdk:Enabled before building. The merged appsettings.json ships Simulator:FlakySdk:Enabled = false so that earlier rows reproduce against the current build. Before running this scenario, open src/InspectionPrototype.App/appsettings.json, change "Enabled": false to "Enabled": true in the Simulator:FlakySdk block, then build. Restore to false and rebuild after the capture. The row's Notes section documents this so the capture remains interpretable.

powershell

# After flipping Enabled=true and rebuilding:
$date = Get-Date -Format 'yyyy-MM-dd'
.\tools\Capture-Measurements.ps1 `
    -Scenario MultiTagSoak `
    -Profile ChaosMonkey `
    -DurationSeconds 1800 `
    -OutputCsv "docs/captures/slice-1-4-chaos-monkey-$date.csv" `
    -SliceTag slice-1-4-chaos-monkey `
    -CommitHash (git rev-parse --short HEAD) `
    -AppendToTable
# Then restore Enabled=false and rebuild.

Sanity checks before recording the row:

runs.started ≥ 5, runs.faulted ≥ 5, fault-cycles (count) ≥ 5 (the chaos knobs are aggressive enough that these thresholds are met in any 30-min window with AlarmBurstEveryMs = 45 000).
frames.dropped recorded (absent means zero — valid).
All four fault-branch log-line types present (use the Select-String recipe below to count each).

PowerShell log inspection — fault-branch coverage:

powershell

# Defaults to today's log file; edit the date if the capture ran on a different day.
$log = "Logs\inspection-prototype-$(Get-Date -Format 'yyyyMMdd').log"

# Branch (a) — connect-failure paths
Select-String -Path $log -Pattern 'Connection failed \(simulated failure\)|FlakySdk: out-of-band|Connection error:' |
    Measure-Object | Select-Object -ExpandProperty Count

# Branch (b)/(c) — critical fault injection total
Select-String -Path $log -Pattern 'CRITICAL FAULT: \[CHAOS-' |
    Measure-Object | Select-Object -ExpandProperty Count

# Branch (d) — fault cleared + recovery
Select-String -Path $log -Pattern 'Fault condition cleared: \[CHAOS-' |
    Measure-Object | Select-Object -ExpandProperty Count
Select-String -Path $log -Pattern 'Recovery completed\.' |
    Measure-Object | Select-Object -ExpandProperty Count

# Defect-shower transitions
Select-String -Path $log -Pattern 'DefectShower' |
    Measure-Object | Select-Object -ExpandProperty Count

All four branch types must have a non-zero count.

The row block is 22-metric — the two new metrics beyond the standard 20 are:

working-set growth (MB): derived from the CSV's working_set gauge as (last_value − first_value) / 1 MB (see Get-WorkingSetGrowthMb in tools/MeasurementExtraction.psm1).
fault-cycles (count): the sum of runs.faulted over the capture (computed by Get-FaultCyclesCount).
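The two derived metrics can be sketched as follows. This is an illustrative Python version, not the PowerShell helpers themselves; it assumes 1 MB means 1 048 576 bytes and that the working_set samples are already in time order, and all input values are invented.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: the real helper may divide by 1,000,000 instead

def working_set_growth_mb(samples_bytes):
    """(last - first) working-set reading, in MB, from samples in time order."""
    if len(samples_bytes) < 2:
        raise ValueError("need at least two working_set samples")
    return (samples_bytes[-1] - samples_bytes[0]) / BYTES_PER_MB

def fault_cycles(runs_faulted_increments):
    """Sum of per-interval runs.faulted increments over the capture."""
    return int(sum(runs_faulted_increments))

# Invented readings, in bytes, purely for illustration.
growth = working_set_growth_mb([200 * BYTES_PER_MB, 230 * BYTES_PER_MB, 225 * BYTES_PER_MB])
cycles = fault_cycles([0, 1, 0, 0, 2, 1])
print(growth, cycles)
```

Because growth uses only the first and last samples, a peak in the middle of the capture (230 MB here) does not affect the result.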

Implemented by: MultiTagSoakFlaUi with -Profile ChaosMonkey (no new FlaUI scenario class — the existing MultiTagSoakFlaUi reads the SIMULATOR_PROFILE env var; the orchestrator wires -Profile through).

4.6 Soak scenario — SLICE-1.4, `Soak8h` profile

SLICE-1.4 criterion-12 evidence capture. Runs the full application under sustained low-chaos load for 8 real-time hours to detect working-set growth (memory leak bar).

Run this on a host you do not need for other work during the capture. The 8-hour window requires the display to remain on and unlocked continuously, and FlaUI needs a visible, unobstructed foreground window. Light background work on a secondary monitor is acceptable; CPU-intensive workloads (builds, video encoding) will skew GC and alloc-rate metrics.

Prerequisites:

Hibernate disabled: powercfg /hibernate off (requires elevated shell)
AC standby timeout disabled: powercfg /change standby-timeout-ac 0
Screen lock disabled for the session
No other active dotnet-counters session attached to the app
Simulator:FlakySdk:Enabled = false (the default in appsettings.json)

powershell

$date = Get-Date -Format 'yyyy-MM-dd'
.\tools\Capture-Measurements.ps1 `
    -Scenario MultiTagSoak `
    -Profile Soak8h `
    -DurationSeconds 28800 `
    -OutputCsv "docs/captures/slice-1-4-soak-8h-$date.csv" `
    -SliceTag slice-1-4-soak-8h `
    -CommitHash (git rev-parse --short HEAD) `
    -AppendToTable

Sanity checks after the run:

Capture span ≥ 28 500 s (within about 1% of the 28 800 s target). If less, the host slept or paused; discard the CSV and restart.
working-set growth (MB) ≤ 50 (criterion-12 ceiling). Note: the last − first metric captures the startup ramp (first 30 min of initialisation) as well as any in-flight leak. The 2026-05-03 reference capture showed a 186.5 MB last − first that is entirely startup cost (stable 228–238 MB sawtooth for hours 1–8); a criterion-12 amendment replacing last − first with a startup-excluded steady-state metric is filed as a follow-up. Until amended, treat last − first > 50 MB as a flag requiring time-series inspection before concluding a real leak exists.
Gen-2 GC count rate ≤ 4× the slice-1-2-real-frame-payloads rate (16 072/hr × 4 = 64 288/hr ceiling).
No unhandled-exception entries in Logs/inspection-prototype-*.log.
runs.faulted near zero (Soak8h has AlarmBurstEveryMs = 0; only ConnectionFailureProbability = 0.05 misconnects apply, handled by retry).
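The three numeric checks in the list above can be expressed as one small function. The Python sketch below is hypothetical (it is not part of tools/MeasurementExtraction.psm1); the thresholds come from the list, the 186.5 MB figure is the 2026-05-03 reference value, and the other inputs are invented.

```python
def soak_checks(span_s, growth_mb, gen2_per_hr, baseline_gen2_per_hr=16_072):
    """Return check name -> bool for the numeric soak sanity checks."""
    gen2_ceiling = 4 * baseline_gen2_per_hr   # 64,288/hr with the quoted baseline
    return {
        "span": span_s >= 28_500,
        "working_set": growth_mb <= 50,       # False is a flag, not proof of a leak
        "gen2_rate": gen2_per_hr <= gen2_ceiling,
    }

# Growth is the reference-capture figure; span and gen-2 rate are invented.
result = soak_checks(span_s=28_790, growth_mb=186.5, gen2_per_hr=20_000)
needs_inspection = not result["working_set"]
print(result, needs_inspection)
```

A failed working_set check only signals that the time series needs a closer look, in line with the note about startup ramp above.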

If the capture is interrupted: discard the partial CSV and restart. The last − first growth math is only meaningful on an uninterrupted real-time run; a partial CSV will show artificially low or misleading growth.

Implemented by: MultiTagSoakFlaUi with -Profile Soak8h.

4.7+ — Phase 2 scenarios

Phase 2 was deferred following the SLICE-2.0 measurement (2026-05-07). None of the three rubric gates was triggered: store alloc share was 0.5% (gate: ≥10%), lock-wait p95 was 0.4 µs (gate: ≥100 µs), and no data-plane pipeline dominated calls (WorkflowService and TagStreamPipelineService each hold 46.4%). No further Phase 2 scenarios are scheduled in this section; the captures that were run are documented in §5. If a future slice re-triggers the rubric, its scenarios will be added there.

5. Phase 2 and Phase 3 capture scenarios

5.1 Store-allocation profiling — SLICE-2.0, `MultiTag` profile

SLICE-2.0 baseline capture. Measures AppStateStore.Update allocation share, lock-wait distribution, call rate, and caller distribution under the MultiTag profile, which drives the highest sustained store write rate of any baseline profile (~43 calls/sec from 50-tag emitters + position events + run lifecycle).

Scenario: MultiTagSoak with the default MultiTag simulator profile, 30 minutes.

powershell

$date = Get-Date -Format 'yyyy-MM-dd'
.\tools\Capture-Measurements.ps1 `
    -Scenario MultiTagSoak `
    -Profile MultiTag `
    -DurationSeconds 1800 `
    -OutputCsv "docs/captures/slice-2-0-store-profiling-$date.csv" `
    -SliceTag slice-2-0-store-profiling `
    -CommitHash (git rev-parse --short HEAD)

Note: -AppendToTable is not used here because the row goes to docs/reviews/phase-2-measurements.md, not phase-1-measurements.md. After the capture completes, run ConvertTo-MeasurementRow manually and paste the output into phase-2-measurements.md.

Sanity checks before recording the row:

tags.active reads 50 steady-state. If 0, appsettings.json did not load (launch from the app's bin directory).
telemetry.ingested (Count / 1 sec) reads ~20 Hz steady-state. If ~5 Hz, the profile was Normal not MultiTag.
store.update.calls (Count / 1 sec)[caller=…] rows appear for at least 2 distinct callers. If absent, the SLICE-2.0 instrumentation is not in the running build.
store.update.lock_wait.micros histogram rows appear. If absent, same cause.

Extracting the 26-metric row:

powershell

Import-Module tools\MeasurementExtraction.psm1 -Force
$csv = Import-Csv "docs\captures\slice-2-0-store-profiling-$date.csv"
ConvertTo-MeasurementRow `
    -CsvPath    "docs\captures\slice-2-0-store-profiling-$date.csv" `
    -SliceTag   'slice-2-0-store-profiling' `
    -Scenario   'MultiTagSoak' `
    -CommitHash (git rev-parse --short HEAD) `
    -Date       $date

Full caller distribution (not in the 26-metric table — goes in Notes):

powershell

Get-StoreUpdateCallerDistribution -Csv $csv | Format-Table -AutoSize

Reference capture: docs/captures/slice-2-0-store-profiling-2026-05-07.csv (1812 s, 77 125 total store.update calls). Row committed in docs/reviews/phase-2-measurements.md.
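As a quick consistency check, the reference capture's totals reproduce the ~43 calls/sec figure quoted at the top of this section. A minimal Python sketch, using only the two numbers cited above:

```python
def mean_call_rate(total_calls, span_seconds):
    """Average calls per second over the whole capture span."""
    return total_calls / span_seconds

# Both figures are from the 2026-05-07 reference capture cited above.
rate = mean_call_rate(77_125, 1_812)
print(f"{rate:.1f} calls/sec")
```

A new capture whose mean rate differs substantially from this is worth a second look before its row is recorded.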

5.2 Subscriber-invocation profiling — SLICE-2.4, `HighDefect` profile

SLICE-2.4 capture. Measures subscriber.invocations rate, subscriber-to-store ratio, and selector distribution under the HighDefect simulator profile. The HighDefect profile is chosen because it produces the highest defect-event density, which drives the most discriminating subscriber call pattern (subscribers for defect-related state fire at full rate, while unrelated subscribers should remain idle).

Exit gate: subscriber-to-store ratio ≤ 4.0 (≥ 80% reduction from the pre-slice baseline of ~20.0, which reflected one monolithic Project(state) call per store update touching ~20 property groups).
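The gate arithmetic can be sketched as below, in Python for illustration. The function names and the invocation totals are invented; only the ~20.0 baseline and the 4.0 gate come from the text above.

```python
def subscriber_to_store_ratio(subscriber_invocations, store_updates):
    """Subscriber invocations per store update over the same capture window."""
    if store_updates <= 0:
        raise ValueError("store_updates must be positive")
    return subscriber_invocations / store_updates

def reduction_pct(baseline_ratio, ratio):
    """Percentage reduction relative to the pre-slice baseline."""
    return 100.0 * (baseline_ratio - ratio) / baseline_ratio

# Invented totals for a 30-minute capture.
ratio = subscriber_to_store_ratio(subscriber_invocations=270_000, store_updates=90_000)
passes_gate = ratio <= 4.0
print(ratio, reduction_pct(20.0, ratio), passes_gate, reduction_pct(20.0, 4.0))
```

A ratio of exactly 4.0 corresponds to the 80% reduction named in the gate, so any ratio at or below 4.0 passes.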

Scenario: MultiTagSoak with -Profile HighDefect, 30 minutes.

powershell

$date = Get-Date -Format 'yyyy-MM-dd'
.\tools\Capture-Measurements.ps1 `
    -Scenario MultiTagSoak `
    -Profile HighDefect `
    -DurationSeconds 1800 `
    -OutputCsv "docs/captures/slice-2-4-per-slice-observables-$date.csv" `
    -SliceTag slice-2-4-per-slice-observables `
    -CommitHash (git rev-parse --short HEAD)

Note: After the capture completes, run ConvertTo-MeasurementRow manually and paste the output into phase-2-measurements.md.

Sanity checks before recording the row:

tags.active reads 50 steady-state.
telemetry.ingested (Count / 1 sec) reads ~20 Hz steady-state.
subscriber.invocations (Count / 1 sec)[selector=…] rows appear for at least 3 distinct selectors (one per per-slice subscription in MainViewModel). If absent, Pass 1 AppMetrics instrumentation is not in the running build.
store.update.calls (Count / 1 sec)[caller=…] rows appear (needed for ratio computation).

Extracting the 29-metric row (26 base + 3 subscriber metrics):

powershell

Import-Module tools\MeasurementExtraction.psm1 -Force
$date = Get-Date -Format 'yyyy-MM-dd'
$csvPath = "docs\captures\slice-2-4-per-slice-observables-$date.csv"
ConvertTo-MeasurementRow `
    -CsvPath    $csvPath `
    -SliceTag   'slice-2-4-per-slice-observables' `
    -Scenario   'MultiTagSoak (HighDefect)' `
    -CommitHash (git rev-parse --short HEAD) `
    -Date       $date

Full selector distribution (not in the 29-metric table — goes in Notes):

powershell

$csv = Import-Csv $csvPath
Get-SubscriberSelectorDistribution -Csv $csv | Format-Table -AutoSize

5.3 SQLite persistence profiling — SLICE-3.3, `MultiTag` profile

SLICE-3.3 capture. Measures the same 26 Phase 2 metrics plus three new persistence metrics: runs and alarms written to SQLite during the capture window, and the p95 history-hydration load time. The DB is pre-populated with 10 000 synthetic rows so that pagination and LoadRecentAsync behaviour is exercised from a non-trivial starting state.

Scenario: MultiTagSoak with the default MultiTag simulator profile, 30 minutes.

Step 1 — Disable sleep and build

powershell

powercfg /change standby-timeout-ac 0
powercfg /change monitor-timeout-ac 0
dotnet build --configuration Release

Step 2 — Locate the database file

powershell

$dbPath = "$env:LOCALAPPDATA\InspectionPrototype\inspection.db"

If the DB does not exist yet, launch the app once (the MigrationRunner hosted service creates the schema on first start) and then close it before proceeding. Check:

powershell

Test-Path $dbPath

Step 3 — Back up and pre-populate

powershell

$date = Get-Date -Format 'yyyy-MM-dd'

# Optional: back up any existing DB to avoid contaminating the snapshot delta
if (Test-Path $dbPath) {
    Copy-Item $dbPath "$dbPath.pre-$date"
}

# Pre-populate with 10 000 synthetic rows
.\tools\Populate-SyntheticHistory.ps1 -DatabasePath $dbPath -RowCount 10000

# Snapshot before state
Copy-Item $dbPath "$env:TEMP\inspection-before-$date.db"

The populate script is idempotent — if the DB already has ≥10 000 rows it prints a no-op message and exits.

Step 4 — Run the capture

powershell

.\tools\Capture-Measurements.ps1 `
    -Scenario       MultiTagSoak `
    -Profile        MultiTag `
    -DurationSeconds 1800 `
    -OutputCsv      "docs/captures/slice-3-3-sqlite-persistence-$date.csv" `
    -SliceTag       slice-3-3-sqlite-persistence `
    -CommitHash     (git rev-parse --short HEAD)

Note: -AppendToTable is not used here because the row goes to docs/reviews/phase-3-measurements.md, not phase-1-measurements.md.

Step 5 — Snapshot after state

powershell

Copy-Item $dbPath "$env:TEMP\inspection-after-$date.db"

Step 6 — Extract the 29-metric row

powershell

Import-Module tools\MeasurementExtraction.psm1 -Force

$logDate    = Get-Date -Format 'yyyyMMdd'
$logPath    = "$env:LOCALAPPDATA\InspectionPrototype\logs\app-$logDate.log"
$dbBefore   = "$env:TEMP\inspection-before-$date.db"
$dbAfter    = "$env:TEMP\inspection-after-$date.db"

ConvertTo-MeasurementRow `
    -CsvPath        "docs\captures\slice-3-3-sqlite-persistence-$date.csv" `
    -SliceTag       'slice-3-3-sqlite-persistence' `
    -Scenario       'MultiTagSoak' `
    -CommitHash     (git rev-parse --short HEAD) `
    -Date           $date `
    -DatabaseBefore $dbBefore `
    -DatabaseAfter  $dbAfter `
    -LogPath        $logPath

The output block ends with three Phase 3 rows:

| runs.persisted (count)              | <value>  |
| alarms.persisted (count)            | <value>  |
| recent-history-load p95 (ms)        | <value>  |

Paste the full output block into docs/reviews/phase-3-measurements.md under ### Row — slice-3-3-sqlite-persistence.

Sanity checks before recording the row:

runs.persisted > 0. If 0, the SqliteRunHistoryStore.SaveAsync path is not being hit (check that a run actually completed — runs.completed > 0).
alarms.persisted ≥ 0. Under MultiTag (no ChaosMonkey) there may be zero alarms, which is expected and correct.
recent-history-load p95 is a non-zero value. If —, the app log at $logPath did not contain a matching hydration line; verify the correct log date is being read.
runs.persisted + 10000 ≈ the row count in $dbAfter's run_summaries table.

5.4 Rich defect model capture — SLICE-3.1, `HighDefect` profile

SLICE-3.1 capture. Measures the standard 29 Phase 2/3 metrics plus three new defect-specific metrics: defect rows written to SQLite during the capture window, classification distribution, and p95 persist latency. The HighDefect simulator profile fires frequent defects to stress the defect pipeline.

Scenario: MultiTagSoak with -Profile HighDefect, 30 minutes. Exit gate: defects.persisted ≥ 5000.

Step 1 — Disable sleep and build

powershell

powercfg /change standby-timeout-ac 0
powercfg /change monitor-timeout-ac 0
dotnet build --configuration Release

Step 2 — Locate the database file

powershell

$dbPath = "$env:LOCALAPPDATA\InspectionPrototype\inspection.db"

If the DB does not exist yet, launch the app once (the MigrationRunner hosted service creates the schema on first start) and then close it before proceeding:

powershell

Test-Path $dbPath

Step 3 — Snapshot before state

No pre-population is required — defects are generated by the simulator profile.

powershell

$date = Get-Date -Format 'yyyy-MM-dd'

# Optional: back up any existing DB
if (Test-Path $dbPath) {
    Copy-Item $dbPath "$dbPath.pre-$date"
}

# Snapshot before state
Copy-Item $dbPath "$env:TEMP\inspection-before-$date.db"

Step 4 — Run the capture

powershell

.\tools\Capture-Measurements.ps1 `
    -Scenario       MultiTagSoak `
    -Profile        HighDefect `
    -DurationSeconds 1800 `
    -OutputCsv      "docs/captures/slice-3-1-rich-defect-model-$date.csv" `
    -SliceTag       slice-3-1-rich-defect-model `
    -CommitHash     (git rev-parse --short HEAD)

Note: -AppendToTable is not used here because the row goes to docs/reviews/phase-3-measurements.md, not phase-1-measurements.md.

Step 5 — Snapshot after state

powershell

Copy-Item $dbPath "$env:TEMP\inspection-after-$date.db"

Step 6 — Extract the 32-metric row

powershell

Import-Module tools\MeasurementExtraction.psm1 -Force

$logDate    = Get-Date -Format 'yyyyMMdd'
$logPath    = "$env:LOCALAPPDATA\InspectionPrototype\logs\app-$logDate.log"
$dbBefore   = "$env:TEMP\inspection-before-$date.db"
$dbAfter    = "$env:TEMP\inspection-after-$date.db"

ConvertTo-MeasurementRow `
    -CsvPath        "docs\captures\slice-3-1-rich-defect-model-$date.csv" `
    -SliceTag       'slice-3-1-rich-defect-model' `
    -Scenario       'MultiTagSoak (HighDefect)' `
    -CommitHash     (git rev-parse --short HEAD) `
    -Date           $date `
    -DatabaseBefore $dbBefore `
    -DatabaseAfter  $dbAfter `
    -LogPath        $logPath

The output block ends with three Phase 3 defect rows:

| defects.persisted (count)           | <value>  |
| defect-classification distribution  | <value>  |
| defect-persist p95 (ms)             | <value>  |

Paste the full output block into docs/reviews/phase-3-measurements.md under ### Row — slice-3-1-rich-defect-model.

Sanity checks before recording the row:

defects.persisted ≥ 5 000. If below, the HighDefect profile is not configured to fire at the expected rate — check AppSettings.json simulator profile settings.
defect-classification distribution shows all five classification names with roughly equal percentages (the HighDefect profile generates a uniform mix).
defect-persist p95 (ms) will be — unless Debug-level Serilog logging is enabled (it requires LogLevel.Debug for InspectionPrototype.Application.Services.FramePipelineService). That is acceptable.

dotnet-counters collect --format csv writes long format: one row per counter per refresh interval, not one column per counter. The columns are:

Timestamp, Provider, Counter Name, Counter Type, Mean/Increment

Counter Type is Rate (a per-interval delta — e.g. frames ingested in this second) or Metric (a gauge snapshot — e.g. working-set bytes right now).
For Rate counters, sum Mean/Increment across rows to get the scenario total. Taking the last row gives you the last second's delta, not the cumulative total.
For Metric counters, take Max / Min / Avg as the metric demands.
Counters that never get incremented at all do not appear in the CSV — dotnet-counters only emits rows for counters that produced at least one sample. A frames.dropped counter that stays at zero will be absent from the file, not present-with-zero. Record it as 0 in the table.
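
As a minimal sketch of these aggregation rules, the snippet below builds a tiny long-format capture inline and applies them. The sample rows and the helper name Get-RateTotal are invented for illustration; only the column names and the Rate/Metric semantics come from the description above.

```powershell
# Hypothetical sample in the long format described above (all values are invented).
$sample = @'
Timestamp,Provider,Counter Name,Counter Type,Mean/Increment
01/01/2025 10:00:01,InspectionPrototype,frames.ingested (Count / 1 sec),Rate,30
01/01/2025 10:00:01,System.Runtime,dotnet.process.memory.working_set (By),Metric,104857600
01/01/2025 10:00:02,InspectionPrototype,frames.ingested (Count / 1 sec),Rate,28
01/01/2025 10:00:02,System.Runtime,dotnet.process.memory.working_set (By),Metric,125829120
'@ | ConvertFrom-Csv

# Rate counters: sum every per-interval delta. A counter with no rows totals 0.
function Get-RateTotal($rows, $name) {
    $total = 0.0
    foreach ($row in $rows) {
        if ($row.'Counter Name' -eq $name) { $total += [double]$row.'Mean/Increment' }
    }
    return $total
}

$framesTotal  = Get-RateTotal $sample 'frames.ingested (Count / 1 sec)'
$droppedTotal = Get-RateTotal $sample 'frames.dropped (Count / 1 sec)'   # absent from the sample

# Metric (gauge) counters: aggregate the snapshots, here the peak working set in MB.
$peakMb = ($sample |
    Where-Object { $_.'Counter Name' -eq 'dotnet.process.memory.working_set (By)' } |
    ForEach-Object { [double]$_.'Mean/Increment' } |
    Measure-Object -Maximum).Maximum / 1MB

"frames.ingested total: $framesTotal"
"frames.dropped total:  $droppedTotal"
"working-set peak MB:   $peakMb"
```

With this sample the totals are 58 frames ingested, 0 frames dropped, and a 120 MB working-set peak. Taking only the last frames.ingested row would have reported 28, which is the mistake the Rate rule guards against.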

Values to extract for the measurements table

The Counter Name column uses the raw OTel-style names from the runtime's built-in meter, not the legacy System.Runtime.* EventCounter names.

| Metric | Counter Name (Provider) | How to aggregate |
|---|---|---|
| frames.ingested (total) | `frames.ingested (Count / 1 sec)` (InspectionPrototype) | sum `Mean/Increment` |
| frames.ingested rate (fps) | same | total ÷ duration seconds |
| frames.dropped (total) | `frames.dropped (Count / 1 sec)` (InspectionPrototype) | sum, or `0` if absent |
| telemetry.ingested (total) | `telemetry.ingested (Count / 1 sec)` (InspectionPrototype) | sum |
| telemetry.ingested rate (Hz) | same | total ÷ duration seconds |
| telemetry.coalesced (total) | `telemetry.coalesced (Count / 1 sec)` (InspectionPrototype) | sum, or `0` if absent |
| runs.started | `runs.started (Count / 1 sec)` (InspectionPrototype) | sum |
| runs.completed | `runs.completed (Count / 1 sec)` (InspectionPrototype) | sum |
| runs.faulted | `runs.faulted (Count / 1 sec)` (InspectionPrototype) | sum, or `0` if absent |
| working-set peak (MB) | `dotnet.process.memory.working_set (By)` (System.Runtime) | max ÷ 1 048 576 |
| gen-0-gc-count (total) | `dotnet.gc.collections ({collection} / 1 sec)[gc.heap.generation=gen0]` | sum |
| gen-1-gc-count (total) | `dotnet.gc.collections ({collection} / 1 sec)[gc.heap.generation=gen1]` | sum |
| gen-2-gc-count (total) | `dotnet.gc.collections ({collection} / 1 sec)[gc.heap.generation=gen2]` | sum |
| alloc-rate avg (B/s) | `dotnet.gc.heap.total_allocated (By / 1 sec)` (System.Runtime) | avg |
| cpu-usage avg / peak (%) | `dotnet.process.cpu.time (s / 1 sec)[cpu.mode=user]` + `[cpu.mode=system]` | see below |

CPU% is not a direct counter. Per timestamp, sum the user and system increments (seconds of CPU time consumed in that wall-clock second), divide by dotnet.process.cpu.count ({cpu}) to normalize to a share of the machine's total CPU capacity, and multiply by 100. Without the division the figure is a percentage of a single core and can exceed 100 on a multi-core machine. Take avg across timestamps for the avg row, max for the peak row.

PowerShell extraction script

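Copy this into a .ps1 file or run it inline, replacing the capture filename. It prints the values that can be read directly from the CSV. Counters that may be absent from the file (frames.dropped, telemetry.coalesced, runs.faulted) and the derived rates (total ÷ duration_s) are not printed and must be filled in by hand.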

powershell

$csv = Import-Csv docs/captures/<capture-name>.csv
$first = [datetime]::Parse($csv[0].Timestamp)
$last  = [datetime]::Parse($csv[-1].Timestamp)
$dur   = ($last - $first).TotalSeconds

function SumCounter($name) { (($csv | Where-Object { $_.'Counter Name' -eq $name } |
    ForEach-Object { [double]$_.'Mean/Increment' }) | Measure-Object -Sum).Sum }
function AvgCounter($name) { (($csv | Where-Object { $_.'Counter Name' -eq $name } |
    ForEach-Object { [double]$_.'Mean/Increment' }) | Measure-Object -Average).Average }
function MaxCounter($name) { (($csv | Where-Object { $_.'Counter Name' -eq $name } |
    ForEach-Object { [double]$_.'Mean/Increment' }) | Measure-Object -Maximum).Maximum }

# Logical core count reported by the runtime (a gauge, so the first sample is enough)
$cpuCount = [int](($csv | Where-Object { $_.'Counter Name' -eq 'dotnet.process.cpu.count ({cpu})' } |
    Select-Object -First 1).'Mean/Increment')
# Per timestamp: start with user-mode CPU seconds, then add system-mode CPU seconds
$cpuByTs = @{}
$csv | Where-Object { $_.'Counter Name' -eq 'dotnet.process.cpu.time (s / 1 sec)[cpu.mode=user]' } |
    ForEach-Object { $cpuByTs[$_.Timestamp] = [double]$_.'Mean/Increment' }
$csv | Where-Object { $_.'Counter Name' -eq 'dotnet.process.cpu.time (s / 1 sec)[cpu.mode=system]' } |
    ForEach-Object { $cpuByTs[$_.Timestamp] = ($cpuByTs[$_.Timestamp] + [double]$_.'Mean/Increment') }
# Convert CPU seconds per wall-clock second into a percentage of all cores
$cpuPct = $cpuByTs.Values | ForEach-Object { 100.0 * $_ / $cpuCount }

"duration_s:            $dur"
"frames.ingested:       $(SumCounter 'frames.ingested (Count / 1 sec)')"
"telemetry.ingested:    $(SumCounter 'telemetry.ingested (Count / 1 sec)')"
"runs.started:          $(SumCounter 'runs.started (Count / 1 sec)')"
"runs.completed:        $(SumCounter 'runs.completed (Count / 1 sec)')"
"working-set peak MB:   {0:F1}" -f ((MaxCounter 'dotnet.process.memory.working_set (By)') / 1MB)
"gen-0 GCs:             $(SumCounter 'dotnet.gc.collections ({collection} / 1 sec)[gc.heap.generation=gen0]')"
"gen-1 GCs:             $(SumCounter 'dotnet.gc.collections ({collection} / 1 sec)[gc.heap.generation=gen1]')"
"gen-2 GCs:             $(SumCounter 'dotnet.gc.collections ({collection} / 1 sec)[gc.heap.generation=gen2]')"
"alloc-rate avg B/s:    {0:N0}" -f (AvgCounter 'dotnet.gc.heap.total_allocated (By / 1 sec)')
"cpu avg %:             {0:F2}" -f (($cpuPct | Measure-Object -Average).Average)
"cpu peak %:            {0:F2}" -f (($cpuPct | Measure-Object -Maximum).Maximum)

Remember to add 0 for any InspectionPrototype counter that is absent from the file. Absence means "never incremented"; in the measurements table that's still a data point.

5.5 Cassette cadence capture — SLICE-3.2, 25-wafer Soak8h

SLICE-3.2 capture. Verifies the full 25-wafer cassette loop under the Soak8h simulator profile and confirms that the FK band-aid retirement (stub-row pattern introduced in this slice) produces zero SqliteException (Error 19) entries and zero orphan defect rows.

IMPORTANT: this is NOT a Capture-Measurements.ps1 invocation. The cassette scheduler does not go through MultiTagSoakFlaUi. This scenario is invoked directly via the CassetteSoakFlaUi acceptance test:

powershell

$env:APP_PROCESS_ID = $appPid
dotnet test tests/InspectionPrototype.AcceptanceTests --configuration Release --no-build `
    --filter "FullyQualifiedName~CassetteSoakFlaUi"

Exit criteria: wafers.completed = 25, cassette.wall-clock ≤ 1 200 s (20 min), runs.faulted = 0, zero SqliteException Error 19 in app log.

Step 1 — Disable sleep and rebuild

powershell

powercfg /change standby-timeout-ac 0
powercfg /change monitor-timeout-ac 0
Stop-Process -Name 'InspectionPrototype.App' -Force -EA SilentlyContinue
Stop-Process -Name 'dotnet-counters' -Force -EA SilentlyContinue
dotnet build --configuration Release

Step 2 — Clean the database

Remove any existing WAL/SHM and the DB file so the capture starts from an empty state:

powershell

$dbPath = "$env:LOCALAPPDATA\LcnWaferInspection\inspection.db"
Remove-Item $dbPath -EA SilentlyContinue
Remove-Item "$dbPath-wal" -EA SilentlyContinue
Remove-Item "$dbPath-shm" -EA SilentlyContinue

Note: the DB path is LcnWaferInspection (not InspectionPrototype) — this is where the SqlitePersistenceOptions default path places the file.

Step 3 — Launch app and take before-snapshot

powershell

$date = Get-Date -Format 'yyyy-MM-dd'
$appDir = ".\src\InspectionPrototype.App\bin\Release\net10.0-windows"
Start-Process -FilePath "$appDir\InspectionPrototype.App.exe" -WorkingDirectory $appDir
Start-Sleep -Seconds 5
$appPid = (Get-Process 'InspectionPrototype.App').Id
Write-Host "App PID: $appPid"

# Before-snapshot (app has just initialised, 0 run_summaries rows)
Copy-Item $dbPath "$env:TEMP\inspection.db.before-$date" -Force

Note on WAL mode: SQLite WAL mode means the DB file may be 4096 bytes (header only) immediately after startup even though the schema is present — the schema pages live in the .db-wal file until the first checkpoint. The before-snapshot at this stage records the "0 run_summaries" baseline; Get-SqliteCount handles the missing-table case gracefully by returning 0.

Step 4 — Start dotnet-counters

powershell

$finalCsv = "docs\captures\slice-3-2-cassette-cadence-$date.csv"
Start-Process -FilePath 'dotnet-counters' `
    -ArgumentList @('collect','--process-id',"$appPid",
        '--counters','InspectionPrototype,System.Runtime',
        '--format','csv','--output',$finalCsv,'--refresh-interval','1') `
    -NoNewWindow
Start-Sleep -Seconds 3
Test-Path $finalCsv  # expect True

Important: run this step in a dedicated terminal, not the one you will use for the FlaUI test command. Because of -NoNewWindow, dotnet-counters shares the console of the terminal that launched it, and a Ctrl+C pressed in that terminal is delivered to every process attached to the console, which stops the collector prematurely.

Step 5 — Run the CassetteSoakFlaUi acceptance test

In a separate terminal:

powershell

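# $appPid from Step 3 is not defined in a new terminal session, so look it up again
$appPid = (Get-Process 'InspectionPrototype.App').Id
$env:APP_PROCESS_ID = "$appPid"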
dotnet test tests\InspectionPrototype.AcceptanceTests `
    --configuration Release --no-build `
    --filter "FullyQualifiedName~CassetteSoakFlaUi" `
    --logger "console;verbosity=detailed"

The test sequence:

Switches simulator to Soak8h profile.
Connects, refreshes recipe catalog, loads recipe, homes.
Clicks Load Cassette → Start Cassette Run.
Waits for the Unload Cassette button to become enabled (fires when phase = Complete).
Clicks Unload Cassette → Disconnect.

Expected duration: about 5 minutes for 25 wafers at Soak8h timing. Each wafer takes 2 000 ms load + 1 500 ms align + ~5 500 ms run + 1 500 ms unload ≈ 10.5 s, which gives a wafer loop of ≈ 262 s, plus ~30 s of setup and teardown. Allow 5–6 minutes of total test wall-clock.
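
The duration estimate can be reproduced with a few lines of arithmetic. The per-wafer timings are the ones quoted above; the variable names are illustrative only, and the run time and setup overhead are approximations.

```powershell
# Per-wafer Soak8h timings quoted above, in milliseconds (run time is approximate).
$loadMs   = 2000
$alignMs  = 1500
$runMs    = 5500
$unloadMs = 1500
$wafers   = 25
$setupTeardownS = 30   # approximate setup/teardown overhead, in seconds

$perWaferS  = ($loadMs + $alignMs + $runMs + $unloadMs) / 1000   # 10.5 s per wafer
$waferLoopS = $perWaferS * $wafers                                # 262.5 s for the cassette
$totalS     = $waferLoopS + $setupTeardownS                       # 292.5 s overall

"wafer loop: $waferLoopS s, total: $totalS s"
```

The estimated total sits far below the 1 200 s exit criterion, so a cassette run that approaches that limit points at a timing regression rather than normal variance.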

Step 6 — Take after-snapshot and stop counters

After the test reports Passed:

powershell

# Kill the app to force WAL checkpoint before taking the after-snapshot
Stop-Process -Name 'InspectionPrototype.App' -Force -EA SilentlyContinue
Start-Sleep -Seconds 3

Copy-Item $dbPath "$env:TEMP\inspection.db.after-$date" -Force
Write-Host "After-snapshot: $((Get-Item "$env:TEMP\inspection.db.after-$date").Length) bytes"
# Expect ~65 000–80 000 bytes (schema + 25 run_summaries + defects + alarm_history)

# Stop counters (in the dedicated dotnet-counters terminal, press Q or use:)
Stop-Process -Name 'dotnet-counters' -Force -EA SilentlyContinue

Step 7 — Verify DB contents

powershell

$dotNetQDir = "$env:TEMP\InspProto_Q2"
# (see existing InspProto_Q2 dotnet-run project, or use the psm1 helpers directly)

Import-Module .\tools\MeasurementExtraction.psm1 -Force
$lotId = 'LOT-<date>-HHMMSS'   # read from app log or DB

Get-WafersCompletedCount    -DatabasePath "$env:TEMP\inspection.db.after-$date" -LotId $lotId
# expect 25

Get-CassetteWallClockSeconds -DatabasePath "$env:TEMP\inspection.db.after-$date" -LotId $lotId
# expect ≤ 1200

# FK error check
Select-String -Path "$env:LOCALAPPDATA\InspectionPrototype\logs\app-$($date -replace '-','').log" `
    -Pattern 'SqliteException|Error 19'
# expect 0 matches

To find the LotId without a query tool:

powershell

Select-String -Path "$env:LOCALAPPDATA\InspectionPrototype\logs\app-$($date -replace '-','').log" `
    -Pattern 'LotId=LOT-' | Select-Object -Last 1
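
To feed the result straight into $lotId, the matched log line can be reduced to just the identifier. This is a sketch that assumes the identifier follows LotId= and contains no whitespace; the sample line is invented, and the real log format may differ.

```powershell
# Hypothetical log line; the real format in the app log may differ.
$line = '2025-01-01 10:00:00 [INF] Cassette run started LotId=LOT-20250101-100000 Wafers=25'

# Capture everything after 'LotId=' up to the next whitespace character.
if ($line -match 'LotId=(LOT-\S+)') {
    $lotId = $Matches[1]
} else {
    $lotId = $null   # no lot identifier found on this line
}
"LotId: $lotId"
```

Against a real log, $line would be the .Line property of the last Select-String match. If the log wraps the identifier in quotes or follows it with punctuation, the pattern needs tightening, because \S+ captures those characters too.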

Step 8 — Extract the 34-metric row

powershell

Import-Module .\tools\MeasurementExtraction.psm1 -Force

ConvertTo-MeasurementRow `
    -CsvPath        "docs\captures\slice-3-2-cassette-cadence-$date.csv" `
    -SliceTag       'slice-3-2-cassette-cadence' `
    -Scenario       'CassetteCadence' `
    -CommitHash     (git rev-parse --short HEAD) `
    -Date           $date `
    -DatabaseBefore "$env:TEMP\inspection.db.before-$date" `
    -DatabaseAfter  "$env:TEMP\inspection.db.after-$date" `
    -LotId          $lotId

The output block ends with two cassette-specific rows:

| wafers.completed (count)            | 25       |
| cassette.wall-clock (s)             | <value>  |

Paste the full block into docs/reviews/phase-3-measurements.md under ### Row — slice-3-2-cassette-cadence.

Sanity checks before recording the row:

wafers.completed = 25. If below, check the RunWaferAsync guard in SimulatedCassetteScheduler — CommandGuards.CanStart must allow WorkflowState.Completed (added in SLICE-3.2 Pass 3).
cassette.wall-clock ≤ 1 200. If above, check the Soak8h profile timing fields (WaferLoadMs, WaferAlignMs, WaferUnloadMs).
runs.faulted = 0 in the CSV. If non-zero, a fault was injected during the cassette run — check the app log for alarm entries.
Zero SqliteException (Error 19) in the app log. This is the FK band-aid retirement proof. Non-zero means the stub-row pattern is not working (e.g., stub-row insert is failing silently).
defects.persisted ≥ 0. Soak8h has low but non-zero DefectProbabilityPerFrame; a few defects per wafer is normal.
Phase 2 trigger assessment: store.update alloc share < 10% and store.update lock-wait p95 < 100 µs means Phase 2 remains deferred. If either threshold is crossed, open the relevant Phase 2 slice.
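
The Phase 2 trigger assessment is a simple two-threshold check. A minimal sketch follows; the function and parameter names are hypothetical, and the input values are invented, while the thresholds are the ones stated above.

```powershell
# Hypothetical helper: returns $true when Phase 2 stays deferred.
# Thresholds come from the sanity check above; names are illustrative only.
function Test-Phase2Deferred {
    param(
        [double]$AllocSharePercent,   # store.update alloc share, in percent
        [double]$LockWaitP95Micros    # store.update lock-wait p95, in microseconds
    )
    return ($AllocSharePercent -lt 10) -and ($LockWaitP95Micros -lt 100)
}

$bothBelow    = Test-Phase2Deferred -AllocSharePercent 4.2  -LockWaitP95Micros 35
$allocCrossed = Test-Phase2Deferred -AllocSharePercent 12.0 -LockWaitP95Micros 35

"both below thresholds -> deferred: $bothBelow"
"alloc share crossed   -> deferred: $allocCrossed"
```

Both conditions must hold for the deferral to stand; crossing either threshold alone is enough to open the relevant Phase 2 slice.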

6. Writing to the measurements table

The table lives at docs/reviews/phase-1-measurements.md. Its conventions:

one row per (slice, metric) pair
the slice column identifies which change produced the delta; row 0's slice tag is demo-baseline (pre-Phase-1)
the baseline column is the number from the before capture
the after column is the number from the after capture
the delta column is after − baseline (or after ÷ baseline for rates — pick one convention per metric and stick with it)
the capture method column names the CSV file and the scenario
the date column is the capture date in ISO format

See the table file itself for the live columns and filled rows.
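
The two delta conventions can be sketched as a small helper. The function name, parameters, and sample numbers are hypothetical; the point is only that each metric uses one convention consistently.

```powershell
# Hypothetical helper illustrating the two delta conventions described above.
function Get-MeasurementDelta {
    param(
        [double]$Baseline,
        [double]$After,
        [ValidateSet('Difference', 'Ratio')]
        [string]$Convention = 'Difference'
    )
    if ($Convention -eq 'Ratio') {
        if ($Baseline -eq 0) { return $null }   # a ratio is undefined for a zero baseline
        return $After / $Baseline
    }
    return $After - $Baseline
}

$difference = Get-MeasurementDelta -Baseline 180 -After 150                  # after - baseline
$ratio      = Get-MeasurementDelta -Baseline 30  -After 60 -Convention Ratio # after / baseline

"difference: $difference"
"ratio:      $ratio"
```

With these invented numbers the difference is -30 and the ratio is 2. A zero baseline is common for counters such as frames.dropped, which is one reason to prefer the difference convention for totals.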

When a metric does not apply

Some metrics only matter for some slices. For example, telemetry.coalesced is near zero at demo rates and only becomes interesting at 200 tags × 10 Hz. Record every metric for every row anyway — a zero is a data point. The shape of the table stays uniform.

7. Troubleshooting

`dotnet-counters ps` does not show the app

The app must be running as a managed .NET process the tool can attach to. Common causes:

the app was launched from Visual Studio with "attach to process" disabled
the app crashed during launch (check %LOCALAPPDATA%\InspectionPrototype\logs\ and %LOCALAPPDATA%\InspectionPrototype\crashes\)
the PID printed is of the dotnet host rather than the .App child — attach by PID in that case, not by name

The CSV file is empty or has only a header

You stopped collection before the first refresh interval elapsed, or the process exited before collection started. Re-run and confirm the app is responding before Ctrl+C.

Counter values look wrong (all zero, or a single giant spike)

All zero: the meter name is wrong in the --counters flag. It must be exactly InspectionPrototype (case sensitive).
Giant spike: you captured across the warm-up period. The first ~3 seconds after launch include JIT and DI startup work, so those spikes are not representative. For soak measurements, drop the first 30 seconds when computing averages.
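
Dropping the warm-up window before averaging can be sketched as follows. The samples are invented and use a simplified Timestamp/Value shape rather than the real CSV columns; the 30-second cutoff is the one recommended above.

```powershell
# Hypothetical gauge samples taken every 10 s (values are invented).
$rows = @(
    [pscustomobject]@{ Timestamp = [datetime]'2025-01-01T10:00:00'; Value = 900.0 }   # warm-up spike
    [pscustomobject]@{ Timestamp = [datetime]'2025-01-01T10:00:10'; Value = 400.0 }
    [pscustomobject]@{ Timestamp = [datetime]'2025-01-01T10:00:20'; Value = 200.0 }
    [pscustomobject]@{ Timestamp = [datetime]'2025-01-01T10:00:30'; Value = 100.0 }
    [pscustomobject]@{ Timestamp = [datetime]'2025-01-01T10:00:40'; Value = 110.0 }
    [pscustomobject]@{ Timestamp = [datetime]'2025-01-01T10:00:50'; Value = 90.0 }
)

$warmupSeconds = 30
$start  = ($rows | Sort-Object Timestamp | Select-Object -First 1).Timestamp
$cutoff = $start.AddSeconds($warmupSeconds)

# Keep only samples taken at or after the cutoff, then average them.
$steady    = @($rows | Where-Object { $_.Timestamp -ge $cutoff })
$steadyAvg = ($steady | Measure-Object -Property Value -Average).Average
$rawAvg    = ($rows   | Measure-Object -Property Value -Average).Average

"average including warm-up: $rawAvg"
"average after warm-up:     $steadyAvg"
```

In this sample the warm-up rows triple the average (300 versus 100), which is the distortion the 30-second trim is meant to remove. Apply the trim to averages only; totals for Rate counters should still sum the whole capture.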

`dotnet-counters collect` says "process not found"

The process you named exited. If you attach by --name rather than PID and the app crashes, collection silently stops. Attach by PID (--process-id) for longer captures, and cross-check by opening a second monitor session.

8. Adding a new capture type

Each Phase 1 slice adds its own scenario entry in section 4 and, after the slice merges, produces two CSV files in docs/captures/ (before and after) and two rows per metric in the measurements table.

Checklist when adding a new capture type:

Add the scenario stub to section 4 of this runbook. Scenario must be deterministic — fixed duration, fixed button sequence, fixed simulator profile.
Ensure the new slice's code exposes any new counters via the existing AppMetrics class. Do not create a second meter.
Capture the before row using the new scenario before starting the slice's implementation work. If you only capture after, you have no baseline to compare against, and the slice's exit gate cannot be evaluated.
After the slice merges, capture the after row with the same scenario.
Commit both CSVs and the table edit in the same commit as the slice work, so the evidence travels with the change.

9. Related material

Observability runbook: what counters exist, where logs and crash files live, live monitor mode
Evolution roadmap: why the measurement discipline exists at all
Phase 1 measurements table: the table being populated
