
TASK-1.5: Implement Automated Measurement Capture

Status: Superseded as of 2026-04-27. The shipped Pass 1/2/3 work (IScenario, ScenarioRunner, Capture-Measurements.ps1, etc.) was removed when the slice was retired in favor of a UI-Automation approach (shipped as SLICE-1.6, FlaUI, 2026-04-27). The Copilot pass prompts below are no longer the right execution contract — read them only for context on the design tradeoffs that were debated. Do not run any of the prompts.

Objective

Add an in-app IScenario abstraction and --scenario, --duration, and --output-csv CLI flags to InspectionPrototype.App so that it runs a named scenario to completion without showing a window. Add a tools/Capture-Measurements.ps1 orchestrator that wraps dotnet-counters collect around the scenario run and emits a 16-metric markdown row block. Re-baseline the demo scenario as row 0a so that every Phase 1 delta from this slice forward compares against an automated-capture reference.

Scope

  • introduce IScenario and IOperatorCommands abstractions in the Application layer
  • add a ScenarioRunner hosted-style service that resolves scenarios by name and drives them to completion under a duration-bounded cancellation token
  • add CLI parsing to App.OnStartup so --scenario <name> routes to scenario mode (no window) and absent flag preserves today's interactive behavior
  • implement two concrete scenarios — DemoBaselineScenario and MultiTagSoakScenario — calling the existing IRelayCommand instances on MainViewModel
  • add tools/Capture-Measurements.ps1 orchestrator and tools/MeasurementExtraction.psm1 extraction module (the module lifts the runbook's §5 extraction script so the extraction logic has one source of truth)
  • capture row 0a using DemoBaselineScenario and append it to docs/reviews/phase-1-measurements.md
  • add §3a. Automated capture (preferred) to docs/runbook/capturing-measurements.md and name the IScenario class in the existing §4.1 and §4.2 entries
  • add tests for: scenario name resolution, unknown-scenario rejection, FakeScenario completion under cancellation, both concrete scenarios reaching their final step against in-memory fakes
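
The runner behavior listed in the scope (resolve by name, run under a duration-bounded cancellation token) can be sketched as below. This is a minimal sketch under stated assumptions, not the removed implementation: IScenario is reduced to a name and a cancellable RunAsync, an Action<string> stands in for ILogger<ScenarioRunner>, and the exit codes (0, 2, 3) follow the Pass 1 prompt further down. Returning 0 when a scenario finishes on its own and 3 when the outer token cancels are choices of this sketch.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Simplified stand-in: the IScenario in the Pass 1 prompt also receives
// IOperatorCommands and IAppStateStore.
public interface IScenario
{
    string Name { get; }
    Task RunAsync(CancellationToken ct);
}

public sealed class ScenarioRunner
{
    private readonly IReadOnlyList<IScenario> _scenarios;
    private readonly Action<string> _log; // assumption: stand-in for ILogger<ScenarioRunner>

    public ScenarioRunner(IEnumerable<IScenario> scenarios, Action<string> log)
    {
        _scenarios = scenarios.ToList();
        _log = log;
    }

    public async Task<int> RunAsync(string scenarioName, TimeSpan duration, CancellationToken stoppingToken)
    {
        IScenario? scenario = _scenarios.FirstOrDefault(
            s => string.Equals(s.Name, scenarioName, StringComparison.OrdinalIgnoreCase));
        if (scenario is null)
        {
            _log($"Unknown scenario '{scenarioName}'. Registered: {string.Join(", ", _scenarios.Select(s => s.Name))}");
            return 2;
        }
        if (duration <= TimeSpan.Zero)
        {
            _log("Duration must be positive.");
            return 2;
        }

        // The duration timer and the caller's token are linked, so either one stops the scenario.
        using var durationCts = new CancellationTokenSource(duration);
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(stoppingToken, durationCts.Token);
        _log($"Starting {scenario.Name} ({_scenarios.Count} registered) for {duration}.");
        try
        {
            await scenario.RunAsync(linked.Token);
            return 0; // scenario returned without throwing (choice of this sketch)
        }
        catch (OperationCanceledException) when (durationCts.IsCancellationRequested)
        {
            _log($"Scenario {scenario.Name} reached duration {duration}.");
            return 0; // hitting the duration is the normal exit path
        }
        catch (Exception ex)
        {
            _log($"Scenario {scenario.Name} failed: {ex}");
            return 3;
        }
    }
}

// Mirrors the FakeScenario idea: wait forever so only the duration timeout ends the run.
public sealed class FakeScenario : IScenario
{
    public string Name => "Fake";
    public Task RunAsync(CancellationToken ct) => Task.Delay(Timeout.InfiniteTimeSpan, ct);
}
```

The exception filter on durationCts is what separates "duration reached" from any other cancellation; without it, a cancellation from the outer token would also be reported as a clean exit.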

Non-Scope

  • replacing dotnet-counters with an in-process exporter
  • driving scenarios via Windows UI Automation (FlaUI / TestStack.White)
  • CI integration / GitHub Actions wiring
  • new measurement metrics beyond the existing 16
  • new simulator profiles or scenarios beyond Demo Baseline and Multi-Tag Soak (SLICE-1.2/1.3/1.4 add their own)
  • a GUI for picking scenarios
  • back-porting any other historical row under the new tooling — only row 0a is captured

Touched Projects

  • src/InspectionPrototype.Application — IScenario, IOperatorCommands, ScenarioRunner, Scenarios/ folder
  • src/InspectionPrototype.Presentation — MainViewModel exposes the existing IRelayCommand instances via the new IOperatorCommands facade (no behavior change)
  • src/InspectionPrototype.App — CLI parsing in App.OnStartup, DI registration of IScenario implementations
  • tests/InspectionPrototype.Tests — ScenarioRunnerTests, DemoBaselineScenarioTests, MultiTagSoakScenarioTests, FakeOperatorCommands, FakeAppStateStore (or extend an existing one)
  • tools/Capture-Measurements.ps1, tools/MeasurementExtraction.psm1 — new directory
  • docs/runbook/capturing-measurements.md — new §3a, plus one-line additions to §4.1 and §4.2
  • docs/reviews/phase-1-measurements.md — new row 0a block
  • docs/captures/demo-baseline-automated-<date>.csv — evidence
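
The "facade with no behavior change" in the Presentation entry can be illustrated with a property-passthrough adapter. This is a rough sketch with stand-ins: System.Windows.Input.ICommand replaces IRelayCommand, MainViewModelStub and its ConnectCommand property are hypothetical, and only one command and one selection property are shown.

```csharp
#nullable enable
using System;
using System.Windows.Input;

// Minimal ICommand used only to keep this sketch self-contained.
public sealed class DelegateCommand : ICommand
{
    private readonly Action _execute;
    public DelegateCommand(Action execute) => _execute = execute;
    public event EventHandler? CanExecuteChanged { add { } remove { } }
    public bool CanExecute(object? parameter) => true;
    public void Execute(object? parameter) => _execute();
}

// Hypothetical slice of MainViewModel; the real property names may differ.
public sealed class MainViewModelStub
{
    public int ConnectCalls { get; private set; }
    public ICommand ConnectCommand { get; }
    public string? SelectedRecipeName { get; set; }
    public MainViewModelStub() => ConnectCommand = new DelegateCommand(() => ConnectCalls++);
}

// Reduced facade: the task lists eight commands and two selection properties.
public interface IOperatorCommands
{
    ICommand Connect { get; }
    string? SelectedRecipeName { get; set; }
}

public sealed class MainViewModelOperatorCommandsAdapter : IOperatorCommands
{
    private readonly MainViewModelStub _viewModel;
    public MainViewModelOperatorCommandsAdapter(MainViewModelStub viewModel) => _viewModel = viewModel;

    // Forward the same command instance the view binds to; no logic lives here.
    public ICommand Connect => _viewModel.ConnectCommand;

    public string? SelectedRecipeName
    {
        get => _viewModel.SelectedRecipeName;
        set => _viewModel.SelectedRecipeName = value;
    }
}
```

Because the adapter returns the view model's own command object, a scenario calling Execute(null) takes the same path as a button click.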

AI Tool Guidance

This task spans abstraction work, scripting, and a measurement run. Split it into three Copilot passes; do not paste all three prompts into a single session.

  1. Scenario abstraction and entry-point routing — IScenario, IOperatorCommands, ScenarioRunner, CLI parsing in App.OnStartup, a FakeScenario for tests. No concrete scenarios, no orchestrator script. Existing interactive launch is unchanged.
  2. Concrete scenarios and orchestrator script — DemoBaselineScenario, MultiTagSoakScenario, the PowerShell orchestrator and extraction module. Tests verify each scenario reaches its final step against in-memory fakes.
  3. Row 0a capture and runbook §3a — run the orchestrator end-to-end against the freshly-built app, commit the CSV, append the row 0a block, write the new runbook section, update CLAUDE.md and the session log.

Each pass ends with its own commit. Run dotnet test and confirm acceptance criteria for that pass before kicking off the next session.

Acceptance Criteria Mapping

The implementation must satisfy all acceptance criteria from SLICE-1.5:

  • Pass 1 covers criteria 1 (entry-point routing — initially with FakeScenario only), 3 (unknown-scenario rejection), 8 (scenario-runner tests), 9 (no behavior change to existing command surface)
  • Pass 2 covers criteria 1 and 2 (both concrete scenarios producing valid CSVs), 4 (orchestrator script + markdown block), 8 (concrete-scenario tests)
  • Pass 3 covers criteria 5, 6, 7, 10 (row 0a capture, comparability footnote, runbook §3a, no Phase 1 row added)

Copilot Agent Prompts

Pass 1 — Scenario abstraction and entry-point routing

You are implementing Pass 1 of TASK-1.5 in this repository: introduce the IScenario
abstraction, the ScenarioRunner, and CLI routing in App.OnStartup so the app can be
launched headless via `--scenario <name>`. NO concrete scenarios yet — those land in
Pass 2. The interactive WPF launch (no flag) must behave exactly as today.

## Authoritative references

Read these before making changes:
- docs/specs/SLICE-1.5-automated-measurement-capture.md           (the requirements)
- docs/tasks/TASK-1.5-implement-automated-measurement-capture.md  (this task)
- src/InspectionPrototype.App/App.xaml.cs                          (entry point)
- src/InspectionPrototype.Presentation/ViewModels/MainViewModel.cs (IRelayCommand surface to facade)
- src/InspectionPrototype.Application/Abstractions/IAppStateStore.cs
- src/InspectionPrototype.Application/ApplicationServiceCollectionExtensions.cs

Spec acceptance criteria 1 (entry-point routing — using FakeScenario), 3
(unknown-scenario rejection), 8 (runner tests), and 9 (no caller delta on the
existing command surface) are the definition of done for this pass.

## Scope of this pass

Abstractions, runner, CLI parsing, FakeScenario, tests. NO DemoBaselineScenario
or MultiTagSoakScenario. NO PowerShell scripts. NO captures.

## Deliverables

1. New abstractions under src/InspectionPrototype.Application/Scenarios/:
   - IScenario.cs:
       string Name { get; }                                       // case-insensitive match key
       Task RunAsync(IOperatorCommands ops, IAppStateStore state, CancellationToken ct);
   - IOperatorCommands.cs — facade over the existing IRelayCommand instances on
     MainViewModel. Expose only what scenarios need:
       IRelayCommand Connect { get; }
       IRelayCommand Disconnect { get; }
       IRelayCommand RefreshCatalog { get; }
       IRelayCommand LoadRecipe { get; }
       IRelayCommand Home { get; }
       IRelayCommand StartRun { get; }
       IRelayCommand Stop { get; }
       IRelayCommand ApplySimulatorProfile { get; }
       string? SelectedRecipeName { get; set; }      // for LoadRecipe
       string? SelectedSimulatorProfileName { get; set; }  // for ApplySimulatorProfile
     The facade is a property-passthrough, NOT a re-implementation. It binds to
     MainViewModel and forwards each property to the corresponding command /
     selection on MainViewModel. Do NOT introduce new commands or change behavior.

2. ScenarioRunner under src/InspectionPrototype.Application/Scenarios/:
   - file: ScenarioRunner.cs
   - constructor: IEnumerable<IScenario>, IOperatorCommands, IAppStateStore,
     ILogger<ScenarioRunner>
   - public method: Task<int> RunAsync(string scenarioName, TimeSpan duration,
     CancellationToken stoppingToken)
       * resolves the IScenario whose Name matches scenarioName (StringComparer.OrdinalIgnoreCase)
       * if no match: log a Warning naming the requested name and the registered names,
         return exit code 2
       * if duration <= TimeSpan.Zero: log Warning, return exit code 2
       * otherwise: link stoppingToken with a CancellationTokenSource cancelled at
         `duration`, await scenario.RunAsync(ops, state, linkedToken)
       * on OperationCanceledException from the duration timeout: this is the normal
         exit path — log Information "Scenario {Name} reached duration {Duration}",
         return exit code 0
       * on any other exception: log Critical with the exception, return exit code 3
       * log Information at scenario start with the resolved scenario name, registered
         scenario count, and duration

3. CLI parsing in src/InspectionPrototype.App/App.xaml.cs:
   - Define a record `ScenarioCliArgs(string Name, TimeSpan Duration, string? OutputCsv,
     string? Profile, TimeSpan OperatorDelay)` in a new file
     src/InspectionPrototype.App/ScenarioCliArgs.cs along with a static
     `TryParse(string[] args, out ScenarioCliArgs? parsed, out string? error)` method.
     Recognized flags: --scenario, --duration (seconds, integer), --output-csv,
     --profile, --operator-delay (milliseconds, integer, default 0). --output-csv
     is parsed but NOT yet acted on (Pass 2 wires it to the actual capture).
     Unknown flags produce a clear error string. --scenario without --duration is
     an error.
   - In App.OnStartup, after the single-instance mutex acquisition but BEFORE Serilog
     bootstrap and Host build, call ScenarioCliArgs.TryParse(e.Args, ...). If parsed
     is non-null, set a private bool _isScenarioMode = true and store the parsed args
     on a field. If TryParse returns an error, write to bootstrap.log and Environment.Exit(2).
     If --scenario is absent, behavior is unchanged.
   - When _isScenarioMode is true:
       * still build the Host with the same ConfigureServices (scenarios need the
         full DI graph)
       * still start the Host (await _host.StartAsync())
       * register the unhandled-exception handlers EXCEPT the one that calls
         workflowService.AbortAsync() on UI exceptions — in scenario mode there is
         no UI dispatcher loop, so log the exception and let the runner surface it
         via its return code instead
       * resolve ScenarioRunner from the host's services and await
         runner.RunAsync(args.Name, args.Duration, CancellationToken.None)
         (args.Duration is already a TimeSpan; TryParse converts the integer seconds)
       * on completion: await _host.StopAsync, dispose the mutex, call
         Application.Current.Shutdown(exitCode) so WPF returns the runner's exit code
       * do NOT call mainWindow.Show()
   - When _isScenarioMode is false: call mainWindow.Show() exactly as today.

4. DI registration:
   - In ApplicationServiceCollectionExtensions.AddApplicationServices, register
     ScenarioRunner as a transient (it holds no state between runs).
   - Register IScenario implementations via services.AddSingleton<IScenario, FakeScenario>()
     (Pass 2 adds the real ones; FakeScenario stays registered for test runs only —
     gate it behind an `#if DEBUG` or a configuration flag so production release
     builds do not expose it. Implementation choice: prefer a `services.AddScenarios()`
     extension method on the Application layer that takes a flag, called from
     App.xaml.cs's ConfigureServices with `includeFakes: Debugger.IsAttached`).
   - Register IOperatorCommands -> MainViewModelOperatorCommandsAdapter (a new
     class in src/InspectionPrototype.Presentation/Scenarios/ that takes
     MainViewModel via constructor and exposes the IRelayCommand properties).
     Lifetime: singleton, same as MainViewModel.

5. FakeScenario under src/InspectionPrototype.Application/Scenarios/Testing/:
   - file: FakeScenario.cs
   - Name = "Fake"
   - RunAsync awaits Task.Delay(Timeout.InfiniteTimeSpan, ct) so the runner's
     duration timeout is what ends it. The point is to verify the runner's
     duration / cancellation plumbing without exercising any real commands.

6. Tests under tests/InspectionPrototype.Tests/Scenarios/:
   - ScenarioCliArgsTests.cs:
       * --scenario X --duration 30 parses OK with default profile null and
         operator-delay zero
       * --scenario X without --duration produces an error
       * --scenario X --duration 0 (or negative) produces an error
       * --scenario X --duration 30 --output-csv path --profile MultiTag
         --operator-delay 1500 parses all five fields
       * unknown flag produces an error naming the flag
       * absent --scenario returns parsed = null with no error (interactive launch path)
   - ScenarioRunnerTests.cs:
       * resolves "Fake" (case-insensitive: "fake", "FAKE", "Fake" all match)
         and runs to completion under a 100 ms duration with exit code 0
       * unknown name "Nonsense" returns exit code 2 without invoking any
         scenario
       * duration TimeSpan.Zero returns exit code 2
       * a scenario that throws InvalidOperationException returns exit code 3
         and logs the exception at Critical
       * registers MULTIPLE IScenario implementations; the runner picks the right
         one by name (use FakeScenario plus a second test-only AnotherFakeScenario
         with Name = "Another")

## Constraints

- Do NOT change any existing behavior of MainViewModel or its commands.
- Do NOT add Program.cs — WPF projects use App.xaml.cs as the entry point.
- Do NOT capture any measurements in this pass.
- Do NOT add any concrete scenarios beyond FakeScenario.
- Do NOT introduce a separate console-app project; the same App.xaml.cs entry
  point routes both modes.
- Do NOT change the single-instance mutex behavior. A second scenario launch
  while one is in flight should be rejected by the existing mutex, exactly the
  same as a second interactive launch.
- The IOperatorCommands adapter must NOT call MainViewModel methods directly —
  it forwards IRelayCommand properties only. Scenarios go through Execute(null)
  on the same command instances XAML binds to.

## Verification before you report done

  dotnet build --configuration Release
  dotnet test --configuration Release

Manual smoke tests:
  - launch the app with no args: window appears as today, normal interactive flow
  - launch with `--scenario Fake --duration 5` (in a build or launch where the
    FakeScenario gate from deliverable 4 is open): no window appears, app exits
    after ~5 seconds with code 0, app-<date>.log shows scenario-start and
    scenario-end Information entries
  - launch with `--scenario Nonsense --duration 5`: exits within ~1 second with
    code 2, log entry names the requested scenario and lists registered
    scenario names
  - launch with `--unknown-flag`: exits with code 2 and bootstrap.log shows the
    parse error

## Report format when finished

- files created and files modified
- confirmation that all existing tests still pass plus new scenario-runner tests
- a single commit hash
- commit message: "feat(scenarios): add IScenario abstraction, runner, and CLI routing for headless captures (pass 1/3 of TASK-1.5)"
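
The ScenarioCliArgs contract in deliverable 3 of the Pass 1 prompt can be sketched as follows. The record shape and the TryParse signature come from the prompt; the bool return value, the error wording, and the decision to ignore scenario-only flags when --scenario is absent are assumptions of this sketch.

```csharp
#nullable enable
using System;

public sealed record ScenarioCliArgs(
    string Name, TimeSpan Duration, string? OutputCsv, string? Profile, TimeSpan OperatorDelay)
{
    // Returns false and sets `error` on bad input. `parsed` stays null when
    // --scenario is absent, which signals the interactive launch path.
    public static bool TryParse(string[] args, out ScenarioCliArgs? parsed, out string? error)
    {
        parsed = null;
        error = null;
        string? name = null, outputCsv = null, profile = null;
        int? durationSeconds = null;
        int operatorDelayMs = 0;

        for (int i = 0; i < args.Length; i++)
        {
            string flag = args[i];
            if (flag is not ("--scenario" or "--duration" or "--output-csv" or "--profile" or "--operator-delay"))
            {
                error = $"Unknown flag '{flag}'.";
                return false;
            }
            if (i + 1 >= args.Length)
            {
                error = $"Flag '{flag}' requires a value.";
                return false;
            }
            string value = args[++i];
            switch (flag)
            {
                case "--scenario": name = value; break;
                case "--output-csv": outputCsv = value; break;
                case "--profile": profile = value; break;
                case "--duration":
                    if (!int.TryParse(value, out int seconds) || seconds <= 0)
                    {
                        error = "--duration must be a positive integer number of seconds.";
                        return false;
                    }
                    durationSeconds = seconds;
                    break;
                case "--operator-delay":
                    if (!int.TryParse(value, out operatorDelayMs) || operatorDelayMs < 0)
                    {
                        error = "--operator-delay must be a non-negative integer number of milliseconds.";
                        return false;
                    }
                    break;
            }
        }

        if (name is null) return true; // interactive launch: parsed == null, no error
        if (durationSeconds is null)
        {
            error = "--scenario requires --duration.";
            return false;
        }
        parsed = new ScenarioCliArgs(name, TimeSpan.FromSeconds(durationSeconds.Value),
            outputCsv, profile, TimeSpan.FromMilliseconds(operatorDelayMs));
        return true;
    }
}
```

Converting seconds and milliseconds to TimeSpan inside TryParse keeps unit handling in one place, so callers pass Duration and OperatorDelay through unchanged.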

Pass 2 — Concrete scenarios and orchestrator script

You are implementing Pass 2 of TASK-1.5. Pass 1 (IScenario, ScenarioRunner, CLI
routing, FakeScenario) is already merged. This pass adds the two concrete
scenarios — DemoBaselineScenario and MultiTagSoakScenario — and the PowerShell
orchestrator that wraps `dotnet-counters collect` around a scenario run and
emits a 16-metric markdown row block.

## Authoritative references

Read these before making changes:
- docs/specs/SLICE-1.5-automated-measurement-capture.md   (criteria 1, 2, 4, 8)
- docs/runbook/capturing-measurements.md                   (existing §4.1 step list, §5 extraction script)
- docs/reviews/phase-1-measurements.md                     (row 0 column shape to mirror)
- src/InspectionPrototype.Application/Scenarios/           (Pass 1 output)
- src/InspectionPrototype.Application/State/AppState.cs    (WorkflowState shape — what scenarios wait on)
- src/InspectionPrototype.Presentation/ViewModels/MainViewModel.cs  (command names and selection properties)

Pass 1's IScenario, IOperatorCommands, ScenarioRunner, and CLI routing must
already be in place. Confirm by running `--scenario Fake --duration 5` and
checking exit code 0 before starting.

## Scope of this pass

Two concrete IScenario implementations, a thin PowerShell orchestrator, an
extraction module, and tests. NO row 0a capture (Pass 3), NO runbook updates
beyond what is needed to document the script's invocation.

## Deliverables

1. DemoBaselineScenario under src/InspectionPrototype.Application/Scenarios/:
   - Name = "DemoBaseline"
   - RunAsync sequence — each step awaits a state transition via IAppStateStore.StateChanged
     with a per-step timeout (default 10 s; on timeout throw with a message naming
     the expected state and the actual state):
       1. ops.Connect.Execute(null);     wait until state.WorkflowState == Connected (or any
                                         state that is downstream of Connected)
       2. ops.RefreshCatalog.Execute(null); wait until state.RecipeCatalog is non-empty
       3. ops.SelectedRecipeName = "standard-5pt-wafer-scan";
       4. ops.LoadRecipe.Execute(null);  wait until state.LoadedRecipe?.Name == "standard-5pt-wafer-scan"
       5. ops.Home.Execute(null);        wait until state.WorkflowState == Idle (post-home)
       6. loop until ct.IsCancellationRequested:
            ops.StartRun.Execute(null);
            wait until state.WorkflowState == Running
            wait until state.WorkflowState is one of { Idle, Faulted } (run terminated)
            (no operator-delay sleep at default; if config injects one, sleep here —
             see deliverable 3 for the wiring)
       7. on cancellation: if state.WorkflowState == Running, ops.Stop.Execute(null)
          and wait up to 10 s for terminal state; then ops.Disconnect.Execute(null)
   - Helper: a private static `WaitForStateAsync(IAppStateStore state, Func<AppState, bool> predicate,
     TimeSpan timeout, CancellationToken ct)` that subscribes to StateChanged, checks
     predicate against Current first (avoid missing already-true conditions), and
     completes the awaiter on first match. The same helper is reused by
     MultiTagSoakScenario.

2. MultiTagSoakScenario under src/InspectionPrototype.Application/Scenarios/:
   - Name = "MultiTagSoak"
   - RunAsync sequence — identical to DemoBaselineScenario EXCEPT step 0 inserted
     before Connect:
       0. ops.SelectedSimulatorProfileName = "MultiTag";
          ops.ApplySimulatorProfile.Execute(null);
          wait until state.ActiveSimulatorProfileName == "MultiTag"
     The §4.2 sanity check exists exactly because this step gets skipped under
     manual operation; in the automated path it is sequential and unmissable.
   - Reuse the same WaitForStateAsync helper and the same StartRun loop.

3. Operator-delay wiring:
   - Both scenarios accept an optional TimeSpan operator-delay via constructor or
     options pattern. The CLI flag --operator-delay (parsed in Pass 1) flows from
     ScenarioCliArgs into the runner via a new IScenarioOptions interface that
     ScenarioRunner sets before calling RunAsync, OR via a ScenarioContext record
     passed to RunAsync (implementation choice — pick whichever is simpler given
     the Pass 1 shape; document the choice in the commit message).
   - Default operator-delay is TimeSpan.Zero. When non-zero, sleep for that
     duration after each terminal-state observation in the StartRun loop. This
     is the only place operator-delay is honored — pre-Connect sequencing is
     gated on state transitions, not delays.

4. DI registration:
   - In src/InspectionPrototype.App/App.xaml.cs ConfigureServices (or in the
     AddScenarios extension introduced in Pass 1):
       services.AddSingleton<IScenario, DemoBaselineScenario>();
       services.AddSingleton<IScenario, MultiTagSoakScenario>();
   - FakeScenario registration stays gated as in Pass 1.

5. tools/Capture-Measurements.ps1:
   - Create the tools/ directory.
   - Parameters:
       [string]$Scenario      (mandatory: 'DemoBaseline' or 'MultiTagSoak')
       [int]$Duration         (mandatory: seconds)
       [string]$OutputCsv     (mandatory: full path under docs/captures/)
       [string]$CommitHash    (mandatory: short or full hash; included in the row block header)
       [string]$Profile       (optional)
       [int]$OperatorDelayMs  (optional; default 0)
       [switch]$AppendToTable (optional; if present, append the markdown block to
                              docs/reviews/phase-1-measurements.md under the right
                              ## Phase 1 rows subsection rather than just printing)
   - Sequence:
       1. dotnet build --configuration Release (fail script on non-zero)
        2. Start the app in the background. Pass --profile only when $Profile is
           set; an empty value would leave the flag without an argument:
             $appArgs = "--scenario $Scenario --duration $Duration --output-csv `"$OutputCsv`" --operator-delay $OperatorDelayMs"
             if ($Profile) { $appArgs += " --profile $Profile" }
             $appProcess = Start-Process -FilePath ".\src\InspectionPrototype.App\bin\Release\net10.0-windows\InspectionPrototype.App.exe" `
                 -ArgumentList $appArgs -PassThru
       3. Poll `dotnet-counters ps` for the new PID (timeout 30 s; fail if not found)
       4. Start dotnet-counters collect against that PID with --refresh-interval 1
          and --counters InspectionPrototype,System.Runtime, output to $OutputCsv.
          Run as a background job so the script can wait on the app process.
       5. Wait-Process -InputObject $appProcess (no timeout; the runner enforces
          duration internally, then exits)
       6. Sleep 2 seconds (one extra refresh interval to let the collector flush)
       7. Stop the dotnet-counters job; confirm $OutputCsv exists and is non-empty
       8. If $appProcess.ExitCode -ne 0: rename $OutputCsv to $OutputCsv + ".partial",
          write an error, exit 1
       9. Import-Module .\tools\MeasurementExtraction.psm1
      10. $row = ConvertTo-MeasurementRow -CsvPath $OutputCsv -SliceTag $Scenario `
              -Scenario $Scenario -CommitHash $CommitHash -Date (Get-Date -Format yyyy-MM-dd)
      11. If $Scenario -eq 'MultiTagSoak': run the per-tag rate-check from the
          existing §4.2 PowerShell snippet and write the .txt file next to the CSV.
          Fail script if any tag exceeds ±2% or any tag is missing.
      12. If $AppendToTable: read docs/reviews/phase-1-measurements.md, find the
          "## Phase 1 rows" header (insert if missing), append $row block under it
          with a blank line separator. Otherwise: Write-Output $row.

6. tools/MeasurementExtraction.psm1:
   - Lift the §5 extraction script from docs/runbook/capturing-measurements.md
     into a module function:
       function ConvertTo-MeasurementRow {
           param([string]$CsvPath, [string]$SliceTag, [string]$Scenario,
                 [string]$CommitHash, [string]$Date)
           # returns a multi-line string: a 16-row markdown table block
           # plus a one-line header naming SliceTag, Scenario, CommitHash, Date
       }
   - Helper functions SumCounter, AvgCounter, MaxCounter, plus the CPU% computation
     are private to the module.
   - Export-ModuleMember -Function ConvertTo-MeasurementRow.
   - The module's output for an existing capture (e.g. docs/captures/demo-baseline-2026-04-23.csv)
     must produce numbers that match row 0's existing values to within ±0.5% on every
     numeric cell. This is the verification that the lifted module is faithful to the
     original embedded script. Document this comparison run in the commit message.

7. Tests under tests/InspectionPrototype.Tests/Scenarios/:
   - DemoBaselineScenarioTests.cs:
       * In-memory FakeOperatorCommands (a stub IOperatorCommands that records each
         Execute call and exposes flags for "should the next state transition fire?")
       * In-memory FakeAppStateStore that lets the test push state transitions
       * Test: scenario advances through Connect → RefreshCatalog → LoadRecipe →
         Home → first StartRun → Idle, then the test cancels the token and
         scenario calls Stop + Disconnect cleanly
       * Test: when WaitForStateAsync times out (the fake never produces the
         expected state within the 10 s step timeout), scenario throws with a
         message naming the expected state
   - MultiTagSoakScenarioTests.cs:
       * Test: scenario sets SelectedSimulatorProfileName = "MultiTag" and
         calls ApplySimulatorProfile BEFORE Connect
       * Test: rest of sequence mirrors DemoBaselineScenario
   - WaitForStateAsyncTests.cs (or inlined in the scenario tests):
       * Predicate already true at entry: completes immediately without waiting
         for a StateChanged event
       * Predicate becomes true after one event: completes
       * Timeout reached: throws TimeoutException naming the timeout duration

## Constraints

- Do NOT capture row 0a in this pass. The capture run is Pass 3.
- Do NOT modify docs/runbook/capturing-measurements.md beyond adding the script's
  invocation example as a code block in the existing §5 (a one-paragraph
  forward-reference is fine; the §3a section lands in Pass 3).
- Do NOT use Thread.Sleep or Task.Delay as a substitute for state-transition
  waits. The whole point is determinism via observable state.
- Do NOT change the bounded-channel policies, AppMetrics, or any other
  cross-cutting infrastructure.
- The orchestrator script must run on PowerShell 7 on Windows (use forward
  slashes in paths where possible, single-quoted here-strings for any inlined
  text). It does NOT need to run on Windows PowerShell 5.1.

## Verification before you report done

  dotnet build --configuration Release
  dotnet test --configuration Release

Plus:
  - run `tools/Capture-Measurements.ps1 -Scenario DemoBaseline -Duration 60
    -OutputCsv docs/captures/_smoke.csv -CommitHash $(git rev-parse --short HEAD)`
    end-to-end. Confirm the CSV is non-empty, the printed markdown block has
    16 rows, and the app exited with code 0. Delete the smoke CSV before commit.
  - run the same against MultiTagSoak with -Duration 120 and confirm the
    rate-check .txt prints "PASS" (the 2-minute window is enough to verify
    plumbing; the real 30-min capture is Pass 3-equivalent for SLICE-1.1, not
    this slice).
  - confirm ConvertTo-MeasurementRow against docs/captures/demo-baseline-2026-04-23.csv
    produces numbers matching row 0 within ±0.5%.

## Report format when finished

- files created and files modified
- confirmation that all tests pass plus new concrete-scenario tests
- the smoke-test stdout (the markdown block) included in the report
- the row-0 fidelity comparison results (which cells matched, which differed and by how much)
- a single commit hash
- commit message: "feat(scenarios): add DemoBaseline + MultiTagSoak scenarios and capture orchestrator (pass 2/3 of TASK-1.5)"
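
The WaitForStateAsync helper described in deliverable 1 of the Pass 2 prompt might look roughly like this. The shapes of AppState and IAppStateStore are hypothetical (a string WorkflowState, a Current property, and an Action<AppState> StateChanged event), and InMemoryStateStore is a throwaway fake in the spirit of the FakeAppStateStore the tests call for.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical minimal shapes; the real types live in the Application layer.
public sealed record AppState(string WorkflowState);

public interface IAppStateStore
{
    AppState Current { get; }
    event Action<AppState>? StateChanged;
}

public sealed class InMemoryStateStore : IAppStateStore
{
    public AppState Current { get; private set; } = new("Disconnected");
    public event Action<AppState>? StateChanged;
    public void Push(AppState next) { Current = next; StateChanged?.Invoke(next); }
}

public static class ScenarioWaits
{
    public static async Task WaitForStateAsync(
        IAppStateStore state, Func<AppState, bool> predicate, TimeSpan timeout, CancellationToken ct)
    {
        var tcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
        void OnChanged(AppState s) { if (predicate(s)) tcs.TrySetResult(); }

        // Subscribe before checking Current so a transition that lands between
        // the two steps cannot be missed.
        state.StateChanged += OnChanged;
        try
        {
            if (predicate(state.Current)) return; // already true: no event needed
            try
            {
                // Task.WaitAsync throws TimeoutException on timeout and
                // OperationCanceledException when ct is cancelled.
                await tcs.Task.WaitAsync(timeout, ct);
            }
            catch (TimeoutException)
            {
                throw new TimeoutException(
                    $"Expected state not reached within {timeout}; actual state: {state.Current.WorkflowState}.");
            }
        }
        finally
        {
            state.StateChanged -= OnChanged;
        }
    }
}
```

A scenario step would then read as a command followed by a wait, for example `await ScenarioWaits.WaitForStateAsync(store, s => s.WorkflowState == "Connected", TimeSpan.FromSeconds(10), ct);`, which keeps pacing tied to observable state instead of sleeps.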

Pass 3 — Row 0a capture and runbook §3a

You are implementing Pass 3 of TASK-1.5, the final pass. Passes 1 and 2 are
already merged. This pass runs the orchestrator end-to-end, captures the row 0a
demo-baseline-automated reference, writes the new §3a runbook section, and
updates the session-handoff documents.

## Authoritative references

Read these before making changes:
- docs/specs/SLICE-1.5-automated-measurement-capture.md   (criteria 5, 6, 7, 10)
- docs/runbook/capturing-measurements.md                   (existing §3, §4.1, §4.2 to extend)
- docs/reviews/phase-1-measurements.md                     (row 0 to mirror; row 0a appended below it)
- CLAUDE.md                                                (current-position block to update)
- docs/reviews/roadmap-progress.md                         (session log to append)

## Scope of this pass

Capture, table edit, runbook §3a, scenario-class cross-references in §4.1 and
§4.2, session-handoff updates. NO code changes — Passes 1 and 2 own those.

## Deliverables

1. Run the row 0a capture:
   - command:
       tools/Capture-Measurements.ps1 -Scenario DemoBaseline -Duration 600 `
         -OutputCsv docs/captures/demo-baseline-automated-$(Get-Date -Format yyyy-MM-dd).csv `
         -CommitHash $(git rev-parse --short HEAD) -OperatorDelayMs 0
   - duration is 600 s to match row 0's 10-minute scenario length
   - operator-delay is 0 — see SLICE-1.5 §"Verification Notes" for why this is
     the production setting and not a fudged human-pacing value
   - if the run fails (any exit code other than 0), capture the bootstrap.log
     and the partial CSV in the report, do NOT proceed to step 2 — fix the
     failure, re-run, and report both attempts

2. Append row 0a block to docs/reviews/phase-1-measurements.md:
   - new "### Row 0a — demo-baseline (automated, pre-Phase-1 reference)" header
     under "## Row 0 — demo baseline (pre-Phase-1)" and BEFORE "## Phase 1 rows"
   - mirror the row 0 header (Scenario / Capture / Commit / Date lines)
   - 16-metric table with Slice column "demo-baseline-automated (pre-Phase-1)"
     and Capture method naming the new CSV file plus "§3a"
   - "### Notes on row 0a" block with three notes:
       (a) why row 0a exists and which row Phase 1 deltas now compare against
           (row 0a; row 0 is preserved as historical evidence only)
       (b) the comparability data from criterion 6:
             samples.ingested ÷ runs.completed: row 0a vs row 0 = N% delta
           (must be within ±5%; if outside, STOP and investigate before
            committing — do not paper over a real difference)
       (c) the runs.completed delta (row 0a will exceed row 0 because automation
           paces tighter; document the absolute numbers and note this is
           expected, not a regression)

3. Add §3a to docs/runbook/capturing-measurements.md:
   - section title: "## 3a. Automated capture (preferred)"
   - placed AFTER §3 ("The capture procedure") and BEFORE §4 ("Scenarios")
   - content covers:
       * one-paragraph rationale (links back to SLICE-1.5 spec for the design
         tradeoff discussion)
       * the one-command invocation:
           tools/Capture-Measurements.ps1 -Scenario <name> -Duration <seconds> `
             -OutputCsv docs/captures/<name>-<date>.csv -CommitHash <hash>
       * a table of supported scenarios mapping <name> -> IScenario class:
           DemoBaseline -> DemoBaselineScenario
           MultiTagSoak -> MultiTagSoakScenario
           (future slices add rows here when their scenarios land)
       * a "when to fall back to manual §3" note:
           - verifying a UI-binding regression specifically (the automated
             path bypasses the dispatcher loop)
           - debugging a capture that the automated path produced and you
             cannot reproduce
           - first capture against a brand-new scenario class (one manual
             run as a sanity check before committing the automated row)
       * the operator-delay knob: documented as a diagnostic-only flag for
         comparing against the human baseline; default 0 is the production
         setting

4. Update §4.1 and §4.2 of docs/runbook/capturing-measurements.md:
   - §4.1: add a one-line "Implemented by: DemoBaselineScenario
     (src/InspectionPrototype.Application/Scenarios/DemoBaselineScenario.cs)"
     under the Steps block
   - §4.2: same, naming MultiTagSoakScenario
   - DO NOT delete the manual step list. Both step lists remain authoritative;
     a divergence between the manual step list and the IScenario implementation
     is a bug in whichever was changed without updating the other.

5. Update CLAUDE.md "Current position" block:
   - Phase: 1 (Simulator to scale) — SLICE-1.5 complete
   - Last completed slice: TASK-1.5 Pass 3 — automated capture orchestrator,
     row 0a baseline, runbook §3a; commit <hash>
   - Next action: kick off SLICE-1.2 (Real frame payloads) — first capture
     under the new automation can be done in one command
   - Blocked on: nothing
   - Last updated: <today's date>

6. Append session-log entry to docs/reviews/roadmap-progress.md under today's
   date: "TASK-1.5 Pass 3 — captured row 0a (CSV: …), added runbook §3a, scenarios
   §4.1/§4.2 cross-referenced their IScenario classes. Commit <hash>. SLICE-1.5
   marked Completed."

7. Update the SLICE-1.5 progress-table row in docs/reviews/roadmap-progress.md
   from Proposed/In-progress to Completed.

## Constraints

- Do NOT add a row for SLICE-1.5 itself in docs/reviews/phase-1-measurements.md.
  This slice produces tooling, not a Phase 1 performance row (criterion 10).
  The only edit to the table is row 0a.
- Do NOT delete or rewrite row 0 or its notes block. Row 0 is historical
  evidence; row 0a sits below it.
- Do NOT modify any code in this pass. If a defect surfaces during the capture
  run, stop and file a follow-up task — do not patch in Pass 3.
- Do NOT skip the comparability check in deliverable 2(b). The whole point of
  row 0a is that it is comparable to row 0; if the per-run-normalized delta is
  outside ±5%, that is a finding worth investigating before declaring the
  slice done.

## Verification before you report done

  dotnet build --configuration Release
  dotnet test --configuration Release

Plus:
  - the docs/captures/demo-baseline-automated-<date>.csv file exists and is
    committed
  - the row 0a block is present in docs/reviews/phase-1-measurements.md with
    all 16 metrics filled
  - the comparability footnote shows samples.ingested÷runs.completed within ±5%
  - the runbook §3a renders correctly (no broken markdown tables)
  - CLAUDE.md current-position block reflects SLICE-1.5 closure

## Report format when finished

- files created and modified
- the row 0a markdown block included in the report
- the comparability footnote numbers (row 0 vs row 0a, normalized and absolute)
- the docs/captures/ CSV path
- a single commit hash
- commit message: "docs(measurements): capture row 0a automated baseline and add runbook §3a (pass 3/3 of TASK-1.5)"
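
The comparability check in deliverable 2(b) of the Pass 3 prompt is plain arithmetic: normalize samples.ingested by runs.completed for each row, then take the percent delta of row 0a against row 0. A small sketch follows; the class and method names are hypothetical, and the guard clauses are additions of this sketch.

```csharp
using System;

public static class Comparability
{
    // Percent delta of samples.ingested / runs.completed, row 0a relative to row 0.
    public static double NormalizedDeltaPercent(
        double row0Samples, double row0Runs, double row0aSamples, double row0aRuns)
    {
        if (row0Runs <= 0 || row0aRuns <= 0)
            throw new ArgumentException("runs.completed must be positive for both rows.");
        double baseline = row0Samples / row0Runs;
        if (baseline <= 0)
            throw new ArgumentException("Row 0 must have a positive samples-per-run value.");
        double automated = row0aSamples / row0aRuns;
        return (automated - baseline) / baseline * 100.0;
    }

    // The task sets the acceptance band at plus or minus 5 percent.
    public static bool IsWithinTolerance(double deltaPercent, double tolerancePercent = 5.0)
        => Math.Abs(deltaPercent) <= tolerancePercent;
}
```

With made-up inputs (not measured values), 12000 samples over 40 runs against 18360 samples over 60 runs gives 300 and 306 samples per run, a delta of about 2%, which passes. The absolute run counts differ by 50% in that example, which is why the check normalizes per run before comparing.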

Operator notes

  • One pass per Copilot session. Same protocol as TASK-1.1. Do not feed all three prompts into a single agent.
  • Pass 1 is the riskiest for regressions. Touching App.OnStartup is touching the single-instance mutex, Serilog bootstrap, and crash-handler registration. Verify the no-flag interactive path still launches identically before reporting Pass 1 done — a regression in startup sequencing is exactly the kind of thing the existing test suite cannot catch.
  • Pass 2's row-0 fidelity check is non-negotiable. Lifting the §5 extraction script into a module is a refactor; the comparison against row 0's known-good numbers is the only verification that the refactor is faithful. If numbers drift by more than ±0.5%, find the divergence before adding new functionality on top.
  • Pass 3's capture is the slice's exit gate. Without row 0a in the table and the §3a runbook section, this slice has not delivered its purpose. The orchestrator script being green on a 60-second smoke run (Pass 2) is not a substitute for the real 600-second capture (Pass 3).
  • Update the index files only at the end of the phase, not per-slice. Same rationale as TASK-1.1's operator notes.
