TASK-1.5: Implement Automated Measurement Capture
Status: Superseded as of 2026-04-27. The shipped Pass 1/2/3 work (IScenario, ScenarioRunner, Capture-Measurements.ps1, etc.) was removed when the slice was retired in favor of a UI-Automation approach (shipped as SLICE-1.6, FlaUI, 2026-04-27). The Copilot pass prompts below are no longer the right execution contract; read them only for context on the design tradeoffs that were debated. Do not run any of the prompts.
- Status: Superseded by SLICE-1.6 (FlaUI capture, shipped)
- Date: 2026-04-25
- Spec: SLICE-1.5: Automated Measurement Capture
- Depends on: TASK-006: Implement Observability Baseline, TASK-1.1: Implement Multi-Tag Telemetry
Objective
Add an in-app IScenario abstraction and --scenario / --duration / --output-csv CLI flags to InspectionPrototype.App so that it can run a named scenario to completion without showing a window, plus a tools/Capture-Measurements.ps1 orchestrator that wraps dotnet-counters collect around the scenario run and emits a 16-metric markdown row block. Re-baseline the demo scenario as row 0a so every Phase 1 delta from this slice forward compares against an automated-capture reference.
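The flag handling described above can be sketched as follows. This is a minimal illustration, not the code that shipped: the record shape and flag names follow the Pass 1 prompt further down, while the bool return value, the pairwise flag/value walk, and the error wording are assumptions made for the example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Sketch only: record shape taken from the Pass 1 prompt; return type and
// error messages are assumptions.
public sealed record ScenarioCliArgs(
    string Name, TimeSpan Duration, string? OutputCsv, string? Profile, TimeSpan OperatorDelay)
{
    private static readonly HashSet<string> KnownFlags = new(StringComparer.Ordinal)
    {
        "--scenario", "--duration", "--output-csv", "--profile", "--operator-delay",
    };

    // Returns false only when the arguments are malformed. A true result with
    // parsed == null means "no --scenario flag": take the interactive launch path.
    public static bool TryParse(string[] args, out ScenarioCliArgs? parsed, out string? error)
    {
        parsed = null;
        error = null;
        var values = new Dictionary<string, string>(StringComparer.Ordinal);

        // Every recognized flag takes exactly one value, so walk the array in pairs.
        for (var i = 0; i < args.Length; i += 2)
        {
            if (!KnownFlags.Contains(args[i])) { error = $"Unknown flag '{args[i]}'."; return false; }
            if (i + 1 >= args.Length) { error = $"Flag '{args[i]}' requires a value."; return false; }
            values[args[i]] = args[i + 1];
        }

        if (!values.TryGetValue("--scenario", out var name)) return true;

        if (!values.TryGetValue("--duration", out var rawDuration))
        { error = "--scenario requires --duration."; return false; }
        if (!int.TryParse(rawDuration, out var seconds) || seconds <= 0)
        { error = "--duration must be a positive integer number of seconds."; return false; }

        var delayMs = 0; // default: no operator delay
        if (values.TryGetValue("--operator-delay", out var rawDelay)
            && (!int.TryParse(rawDelay, out delayMs) || delayMs < 0))
        { error = "--operator-delay must be a non-negative integer number of milliseconds."; return false; }

        values.TryGetValue("--output-csv", out var outputCsv);
        values.TryGetValue("--profile", out var profile);
        parsed = new ScenarioCliArgs(name, TimeSpan.FromSeconds(seconds), outputCsv, profile,
            TimeSpan.FromMilliseconds(delayMs));
        return true;
    }
}
```

Keeping "no --scenario flag" distinct from "malformed arguments" lets the caller preserve the interactive launch path without treating it as an error.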
Scope
- introduce IScenario and IOperatorCommands abstractions in the Application layer
- add a ScenarioRunner hosted-style service that resolves scenarios by name and drives them to completion under a duration-bounded cancellation token
- add CLI parsing to App.OnStartup so --scenario <name> routes to scenario mode (no window) and an absent flag preserves today's interactive behavior
- implement two concrete scenarios, DemoBaselineScenario and MultiTagSoakScenario, calling the existing IRelayCommand instances on MainViewModel
- add tools/Capture-Measurements.ps1 orchestrator and tools/MeasurementExtraction.psm1 extraction module (the latter is the lifted §5 extraction script with one source of truth)
- capture row 0a using DemoBaselineScenario and append it to docs/reviews/phase-1-measurements.md
- add §3a. Automated capture (preferred) to docs/runbook/capturing-measurements.md and name the IScenario class in the existing §4.1 and §4.2 entries
- add tests for: scenario name resolution, unknown-scenario rejection, FakeScenario completion under cancellation, both concrete scenarios reaching their final step against in-memory fakes
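The "duration-bounded cancellation token" in the scope list can be illustrated with a small sketch. It assumes the exit-code mapping spelled out in the Pass 1 prompt (0 for a normal duration timeout, 2 for invalid input, 3 for any other failure). The `DurationBoundedRun` type name and the delegate-based scenario are hypothetical simplifications, and logging is left out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DurationBoundedRun
{
    // Runs the scenario until it finishes or the duration elapses.
    public static async Task<int> RunAsync(
        Func<CancellationToken, Task> scenario, TimeSpan duration, CancellationToken stoppingToken)
    {
        if (duration <= TimeSpan.Zero) return 2;

        // The timeout source cancels itself after `duration`; the linked source
        // also observes the caller's stopping token.
        using var timeout = new CancellationTokenSource(duration);
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(stoppingToken, timeout.Token);

        try
        {
            await scenario(linked.Token);
            return 0;
        }
        catch (OperationCanceledException) when (timeout.IsCancellationRequested)
        {
            // Reaching the duration is the normal exit path, not a failure.
            return 0;
        }
        catch (Exception)
        {
            // Assumption: cancellation from stoppingToken alone also lands here,
            // because the prompt only singles out the duration timeout.
            return 3;
        }
    }
}
```

Filtering on `timeout.IsCancellationRequested` is what separates "the duration elapsed" from every other cancellation or fault.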
Non-Scope
- replacing dotnet-counters with an in-process exporter
- driving scenarios via Windows UI Automation (FlaUI / TestStack.White)
- CI integration / GitHub Actions wiring
- new measurement metrics beyond the existing 16
- new simulator profiles or scenarios beyond Demo Baseline and Multi-Tag Soak (SLICE-1.2/1.3/1.4 add their own)
- a GUI for picking scenarios
- back-porting any other historical row under the new tooling — only row 0a is captured
Touched Projects
- src/InspectionPrototype.Application: IScenario, IOperatorCommands, ScenarioRunner, Scenarios/ folder
- src/InspectionPrototype.Presentation: MainViewModel exposes the existing IRelayCommand instances via the new IOperatorCommands facade (no behavior change)
- src/InspectionPrototype.App: CLI parsing in App.OnStartup, DI registration of IScenario implementations
- tests/InspectionPrototype.Tests: ScenarioRunnerTests, DemoBaselineScenarioTests, MultiTagSoakScenarioTests, FakeOperatorCommands, FakeAppStateStore (or extend an existing one)
- tools/Capture-Measurements.ps1, tools/MeasurementExtraction.psm1: new directory
- docs/runbook/capturing-measurements.md: new §3a, plus one-line additions to §4.1 and §4.2
- docs/reviews/phase-1-measurements.md: new row 0a block
- docs/captures/demo-baseline-automated-<date>.csv: evidence
AI Tool Guidance
This task spans abstraction work, scripting, and a measurement run. Split it into three Copilot passes; do not paste all three prompts into a single session.
- Scenario abstraction and entry-point routing: IScenario, IOperatorCommands, ScenarioRunner, CLI parsing in App.OnStartup, a FakeScenario for tests. No concrete scenarios, no orchestrator script. Existing interactive launch is unchanged.
- Concrete scenarios and orchestrator script: DemoBaselineScenario, MultiTagSoakScenario, the PowerShell orchestrator and extraction module. Tests verify each scenario reaches its final step against in-memory fakes.
- Row 0a capture and runbook §3a: run the orchestrator end-to-end against the freshly-built app, commit the CSV, append the row 0a block, write the new runbook section, update CLAUDE.md and the session log.
Each pass ends with its own commit. Run dotnet test and confirm acceptance criteria for that pass before kicking off the next session.
Acceptance Criteria Mapping
The implementation must satisfy all acceptance criteria from SLICE-1.5:
- Pass 1 covers criteria 1 (entry-point routing, initially with FakeScenario only), 3 (unknown-scenario rejection), 8 (scenario-runner tests), 9 (no behavior change to existing command surface)
- Pass 2 covers criteria 1 and 2 (both concrete scenarios producing valid CSVs), 4 (orchestrator script + markdown block), 8 (concrete-scenario tests)
- Pass 3 covers criteria 5, 6, 7, 10 (row 0a capture, comparability footnote, runbook §3a, no Phase 1 row added)
Copilot Agent Prompts
Pass 1 — Scenario abstraction and entry-point routing
You are implementing Pass 1 of TASK-1.5 in this repository: introduce the IScenario
abstraction, the ScenarioRunner, and CLI routing in App.OnStartup so the app can be
launched headless via `--scenario <name>`. NO concrete scenarios yet — those land in
Pass 2. The interactive WPF launch (no flag) must behave exactly as today.
## Authoritative references
Read these before making changes:
- docs/specs/SLICE-1.5-automated-measurement-capture.md (the requirements)
- docs/tasks/TASK-1.5-implement-automated-measurement-capture.md (this task)
- src/InspectionPrototype.App/App.xaml.cs (entry point)
- src/InspectionPrototype.Presentation/ViewModels/MainViewModel.cs (IRelayCommand surface to facade)
- src/InspectionPrototype.Application/Abstractions/IAppStateStore.cs
- src/InspectionPrototype.Application/ApplicationServiceCollectionExtensions.cs
Spec acceptance criteria 1 (entry-point routing — using FakeScenario), 3
(unknown-scenario rejection), 8 (runner tests), and 9 (no caller delta on the
existing command surface) are the definition of done for this pass.
## Scope of this pass
Abstractions, runner, CLI parsing, FakeScenario, tests. NO DemoBaselineScenario
or MultiTagSoakScenario. NO PowerShell scripts. NO captures.
## Deliverables
1. New abstractions under src/InspectionPrototype.Application/Scenarios/:
- IScenario.cs:
string Name { get; } // case-insensitive match key
Task RunAsync(IOperatorCommands ops, IAppStateStore state, CancellationToken ct);
- IOperatorCommands.cs — facade over the existing IRelayCommand instances on
MainViewModel. Expose only what scenarios need:
IRelayCommand Connect { get; }
IRelayCommand Disconnect { get; }
IRelayCommand RefreshCatalog { get; }
IRelayCommand LoadRecipe { get; }
IRelayCommand Home { get; }
IRelayCommand StartRun { get; }
IRelayCommand Stop { get; }
IRelayCommand ApplySimulatorProfile { get; }
string? SelectedRecipeName { get; set; } // for LoadRecipe
string? SelectedSimulatorProfileName { get; set; } // for ApplySimulatorProfile
The facade is a property-passthrough, NOT a re-implementation. It binds to
MainViewModel and forwards each property to the corresponding command /
selection on MainViewModel. Do NOT introduce new commands or change behavior.
2. ScenarioRunner under src/InspectionPrototype.Application/Scenarios/:
- file: ScenarioRunner.cs
- constructor: IEnumerable<IScenario>, IOperatorCommands, IAppStateStore,
ILogger<ScenarioRunner>
- public method: Task<int> RunAsync(string scenarioName, TimeSpan duration,
CancellationToken stoppingToken)
* resolves the IScenario whose Name matches scenarioName (StringComparer.OrdinalIgnoreCase)
* if no match: log a Warning naming the requested name and the registered names,
return exit code 2
* if duration <= TimeSpan.Zero: log Warning, return exit code 2
* otherwise: link stoppingToken with a CancellationTokenSource cancelled at
`duration`, await scenario.RunAsync(ops, state, linkedToken)
* on OperationCanceledException from the duration timeout: this is the normal
exit path — log Information "Scenario {Name} reached duration {Duration}",
return exit code 0
* on any other exception: log Critical with the exception, return exit code 3
* log Information at scenario start with the resolved scenario name, registered
scenario count, and duration
3. CLI parsing in src/InspectionPrototype.App/App.xaml.cs:
- Define a record `ScenarioCliArgs(string Name, TimeSpan Duration, string? OutputCsv,
string? Profile, TimeSpan OperatorDelay)` in a new file
src/InspectionPrototype.App/ScenarioCliArgs.cs along with a static
`TryParse(string[] args, out ScenarioCliArgs? parsed, out string? error)` method.
Recognized flags: --scenario, --duration (seconds, integer), --output-csv,
--profile, --operator-delay (milliseconds, integer, default 0). --output-csv
is parsed but NOT yet acted on (Pass 2 wires it to the actual capture).
Unknown flags produce a clear error string. --scenario without --duration is
an error.
- In App.OnStartup, after the single-instance mutex acquisition but BEFORE Serilog
bootstrap and Host build, call ScenarioCliArgs.TryParse(e.Args, ...). If parsed
is non-null, set a private bool _isScenarioMode = true and store the parsed args
on a field. If TryParse returns an error, write to bootstrap.log and Environment.Exit(2).
If --scenario is absent, behavior is unchanged.
- When _isScenarioMode is true:
* still build the Host with the same ConfigureServices (scenarios need the
full DI graph)
* still start the Host (await _host.StartAsync())
* register the unhandled-exception handlers EXCEPT the one that calls
workflowService.AbortAsync() on UI exceptions — in scenario mode there is
no UI dispatcher loop, so log the exception and let the runner surface it
via its return code instead
* resolve ScenarioRunner from the host's services and await
runner.RunAsync(args.Name, args.Duration, CancellationToken.None)
(ScenarioCliArgs.Duration is already a TimeSpan; TryParse converts the seconds value)
* on completion: await _host.StopAsync, dispose the mutex, call
Application.Current.Shutdown(exitCode) so WPF returns the runner's exit code
* do NOT call mainWindow.Show()
- When _isScenarioMode is false: call mainWindow.Show() exactly as today.
4. DI registration:
- In ApplicationServiceCollectionExtensions.AddApplicationServices, register
ScenarioRunner as a transient (it holds no state between runs).
- Register IScenario implementations via services.AddSingleton<IScenario, FakeScenario>()
(Pass 2 adds the real ones; FakeScenario stays registered for test runs only —
gate it behind an `#if DEBUG` or a configuration flag so production release
builds do not expose it. Implementation choice: prefer a `services.AddScenarios()`
extension method on the Application layer that takes a flag, called from
App.xaml.cs's ConfigureServices with `includeFakes: Debugger.IsAttached`).
- Register IOperatorCommands -> MainViewModelOperatorCommandsAdapter (a new
class in src/InspectionPrototype.Presentation/Scenarios/ that takes
MainViewModel via constructor and exposes the IRelayCommand properties).
Lifetime: singleton, same as MainViewModel.
5. FakeScenario under src/InspectionPrototype.Application/Scenarios/Testing/:
- file: FakeScenario.cs
- Name = "Fake"
- RunAsync awaits Task.Delay(Timeout.InfiniteTimeSpan, ct) so the runner's
duration timeout is what ends it. The point is to verify the runner's
duration / cancellation plumbing without exercising any real commands.
6. Tests under tests/InspectionPrototype.Tests/Scenarios/:
- ScenarioCliArgsTests.cs:
* --scenario X --duration 30 parses OK with default profile null and
operator-delay zero
* --scenario X without --duration produces an error
* --scenario X --duration 0 (or negative) produces an error
* --scenario X --duration 30 --output-csv path --profile MultiTag
--operator-delay 1500 parses all five fields
* unknown flag produces an error naming the flag
* absent --scenario returns parsed = null with no error (interactive launch path)
- ScenarioRunnerTests.cs:
* resolves "Fake" (case-insensitive: "fake", "FAKE", "Fake" all match)
and runs to completion under a 100 ms duration with exit code 0
* unknown name "Nonsense" returns exit code 2 without invoking any
scenario
* duration TimeSpan.Zero returns exit code 2
* a scenario that throws InvalidOperationException returns exit code 3
and logs the exception at Critical
* registers MULTIPLE IScenario implementations; the runner picks the right
one by name (use FakeScenario plus a second test-only AnotherFakeScenario
with Name = "Another")
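The name-resolution behaviour that the runner tests above exercise might look like the following sketch. `IScenario` is reduced here to a single-argument `RunAsync` so the example stays self-contained (the real interface also takes IOperatorCommands and IAppStateStore), and `ScenarioResolver` is a hypothetical helper name, not something the prompt defines.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Simplified stand-in for the real IScenario.
public interface IScenario
{
    string Name { get; }
    Task RunAsync(CancellationToken ct);
}

public sealed class FakeScenario : IScenario
{
    public string Name => "Fake";

    // Never completes on its own; only cancellation of ct ends it.
    public Task RunAsync(CancellationToken ct) => Task.Delay(Timeout.InfiniteTimeSpan, ct);
}

public sealed class AnotherFakeScenario : IScenario
{
    public string Name => "Another";
    public Task RunAsync(CancellationToken ct) => Task.CompletedTask;
}

public static class ScenarioResolver
{
    // Case-insensitive ordinal match; null when nothing is registered under that name.
    public static IScenario? Resolve(IEnumerable<IScenario> registered, string requestedName) =>
        registered.FirstOrDefault(
            s => string.Equals(s.Name, requestedName, StringComparison.OrdinalIgnoreCase));

    // Comma-separated list for the "unknown scenario" warning.
    public static string DescribeRegistered(IEnumerable<IScenario> registered) =>
        string.Join(", ", registered.Select(s => s.Name));
}
```

A null result is the signal for the runner to log the requested and registered names and return exit code 2 without invoking any scenario.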
## Constraints
- Do NOT change any existing behavior of MainViewModel or its commands.
- Do NOT add Program.cs — WPF projects use App.xaml.cs as the entry point.
- Do NOT capture any measurements in this pass.
- Do NOT add any concrete scenarios beyond FakeScenario.
- Do NOT introduce a separate console-app project; the same App.xaml.cs entry
point routes both modes.
- Do NOT change the single-instance mutex behavior. A second scenario launch
while one is in flight should be rejected by the existing mutex, exactly the
same as a second interactive launch.
- The IOperatorCommands adapter must NOT call MainViewModel methods directly —
it forwards IRelayCommand properties only. Scenarios go through Execute(null)
on the same command instances XAML binds to.
## Verification before you report done
dotnet build --configuration Release
dotnet test --configuration Release
Manual smoke tests:
- launch the app with no args: window appears as today, normal interactive flow
- launch with `--scenario Fake --duration 5` (in a build or launch mode where
FakeScenario is registered; see the gating in deliverable 4): no window appears,
app exits after ~5 seconds with code 0, app-<date>.log shows scenario-start and
scenario-end Information entries
- launch with `--scenario Nonsense --duration 5`: exits within ~1 second with
a non-zero code, log entry names the requested scenario and lists registered
scenario names
- launch with `--unknown-flag`: exits with code 2 and bootstrap.log shows the
parse error
## Report format when finished
- files created and files modified
- confirmation that all existing tests still pass plus new scenario-runner tests
- a single commit hash
- commit message: "feat(scenarios): add IScenario abstraction, runner, and CLI routing for headless captures (pass 1/3 of TASK-1.5)"
Pass 2 — Concrete scenarios and orchestrator script
You are implementing Pass 2 of TASK-1.5. Pass 1 (IScenario, ScenarioRunner, CLI
routing, FakeScenario) is already merged. This pass adds the two concrete
scenarios — DemoBaselineScenario and MultiTagSoakScenario — and the PowerShell
orchestrator that wraps `dotnet-counters collect` around a scenario run and
emits a 16-metric markdown row block.
## Authoritative references
Read these before making changes:
- docs/specs/SLICE-1.5-automated-measurement-capture.md (criteria 1, 2, 4, 8)
- docs/runbook/capturing-measurements.md (existing §4.1 step list, §5 extraction script)
- docs/reviews/phase-1-measurements.md (row 0 column shape to mirror)
- src/InspectionPrototype.Application/Scenarios/ (Pass 1 output)
- src/InspectionPrototype.Application/State/AppState.cs (WorkflowState shape — what scenarios wait on)
- src/InspectionPrototype.Presentation/ViewModels/MainViewModel.cs (command names and selection properties)
Pass 1's IScenario, IOperatorCommands, ScenarioRunner, and CLI routing must
already be in place. Confirm by running `--scenario Fake --duration 5` and
checking exit code 0 before starting.
## Scope of this pass
Two concrete IScenario implementations, a thin PowerShell orchestrator, an
extraction module, and tests. NO row 0a capture (Pass 3), NO runbook updates
beyond what is needed to document the script's invocation.
## Deliverables
1. DemoBaselineScenario under src/InspectionPrototype.Application/Scenarios/:
- Name = "DemoBaseline"
- RunAsync sequence — each step awaits a state transition via IAppStateStore.StateChanged
with a per-step timeout (default 10 s; on timeout throw with a message naming
the expected state and the actual state):
1. ops.Connect.Execute(null); wait until state.WorkflowState == Connected (or any
state that is downstream of Connected)
2. ops.RefreshCatalog.Execute(null); wait until state.RecipeCatalog is non-empty
3. ops.SelectedRecipeName = "standard-5pt-wafer-scan";
4. ops.LoadRecipe.Execute(null); wait until state.LoadedRecipe?.Name == "standard-5pt-wafer-scan"
5. ops.Home.Execute(null); wait until state.WorkflowState == Idle (post-home)
6. loop until ct.IsCancellationRequested:
ops.StartRun.Execute(null);
wait until state.WorkflowState == Running
wait until state.WorkflowState is one of { Idle, Faulted } (run terminated)
(no operator-delay sleep at default; if config injects one, sleep here;
see deliverable 3 for the wiring)
7. on cancellation: if state.WorkflowState == Running, ops.Stop.Execute(null)
and wait up to 10 s for terminal state; then ops.Disconnect.Execute(null)
- Helper: a static `WaitForStateAsync(IAppStateStore state, Func<AppState, bool> predicate,
TimeSpan timeout, CancellationToken ct)` that subscribes to StateChanged, checks
predicate against Current first (avoid missing already-true conditions), and
completes the awaiter on first match. Declare it on a shared helper type visible
to both scenarios and the tests (not private to DemoBaselineScenario), because
MultiTagSoakScenario reuses the same helper.
2. MultiTagSoakScenario under src/InspectionPrototype.Application/Scenarios/:
- Name = "MultiTagSoak"
- RunAsync sequence — identical to DemoBaselineScenario EXCEPT step 0 inserted
before Connect:
0. ops.SelectedSimulatorProfileName = "MultiTag";
ops.ApplySimulatorProfile.Execute(null);
wait until state.ActiveSimulatorProfileName == "MultiTag"
The §4.2 sanity check exists exactly because this step gets skipped under
manual operation; in the automated path it is sequential and unmissable.
- Reuse the same WaitForStateAsync helper and the same StartRun loop.
3. Operator-delay wiring:
- Both scenarios accept an optional TimeSpan operator-delay via constructor or
options pattern. The CLI flag --operator-delay (parsed in Pass 1) flows from
ScenarioCliArgs into the runner via a new IScenarioOptions interface that
ScenarioRunner sets before calling RunAsync, OR via a ScenarioContext record
passed to RunAsync (implementation choice — pick whichever is simpler given
the Pass 1 shape; document the choice in the commit message).
- Default operator-delay is TimeSpan.Zero. When non-zero, sleep for that
duration after each terminal-state observation in the StartRun loop. This
is the only place operator-delay is honored — pre-Connect sequencing is
gated on state transitions, not delays.
4. DI registration:
- In src/InspectionPrototype.App/App.xaml.cs ConfigureServices (or in the
AddScenarios extension introduced in Pass 1):
services.AddSingleton<IScenario, DemoBaselineScenario>();
services.AddSingleton<IScenario, MultiTagSoakScenario>();
- FakeScenario registration stays gated as in Pass 1.
5. tools/Capture-Measurements.ps1:
- Create the tools/ directory.
- Parameters:
[string]$Scenario (mandatory: 'DemoBaseline' or 'MultiTagSoak')
[int]$Duration (mandatory: seconds)
[string]$OutputCsv (mandatory: full path under docs/captures/)
[string]$CommitHash (mandatory: short or full hash; included in the row block header)
[string]$Profile (optional)
[int]$OperatorDelayMs (optional; default 0)
[switch]$AppendToTable (optional; if present, append the markdown block to
docs/reviews/phase-1-measurements.md under the right
## Phase 1 rows subsection rather than just printing)
- Sequence:
1. dotnet build --configuration Release (fail script on non-zero)
2. Start the app in the background:
$appArgs = @('--scenario', $Scenario, '--duration', $Duration, '--output-csv', $OutputCsv, '--operator-delay', $OperatorDelayMs)
if ($Profile) { $appArgs += @('--profile', $Profile) }   # omit the flag when no profile is given
$appProcess = Start-Process -FilePath ".\src\InspectionPrototype.App\bin\Release\net10.0-windows\InspectionPrototype.App.exe" `
-ArgumentList $appArgs -PassThru
3. Poll `dotnet-counters ps` for the new PID (timeout 30 s; fail if not found)
4. Start dotnet-counters collect against that PID with --refresh-interval 1
and --counters InspectionPrototype,System.Runtime, output to $OutputCsv.
Run as a background job so the script can wait on the app process.
5. Wait-Process -InputObject $appProcess (no timeout; the runner enforces
duration internally, then exits)
6. Sleep 2 seconds (one extra refresh interval to let the collector flush)
7. Stop the dotnet-counters job; confirm $OutputCsv exists and is non-empty
8. If $appProcess.ExitCode -ne 0: rename $OutputCsv to $OutputCsv + ".partial",
write an error, exit 1
9. Import-Module .\tools\MeasurementExtraction.psm1
10. $row = ConvertTo-MeasurementRow -CsvPath $OutputCsv -SliceTag $Scenario `
-Scenario $Scenario -CommitHash $CommitHash -Date (Get-Date -Format yyyy-MM-dd)
11. If $Scenario -eq 'MultiTagSoak': run the per-tag rate-check from the
existing §4.2 PowerShell snippet and write the .txt file next to the CSV.
Fail script if any tag exceeds ±2% or any tag is missing.
12. If $AppendToTable: read docs/reviews/phase-1-measurements.md, find the
"## Phase 1 rows" header (insert if missing), append $row block under it
with a blank line separator. Otherwise: Write-Output $row.
6. tools/MeasurementExtraction.psm1:
- Lift the §5 extraction script from docs/runbook/capturing-measurements.md
into a module function:
function ConvertTo-MeasurementRow {
param([string]$CsvPath, [string]$SliceTag, [string]$Scenario,
[string]$CommitHash, [string]$Date)
# returns a multi-line string: a 16-row markdown table block
# plus a one-line header naming SliceTag, Scenario, CommitHash, Date
}
- Helper functions SumCounter, AvgCounter, MaxCounter, plus the CPU% computation
are private to the module.
- Export-ModuleMember -Function ConvertTo-MeasurementRow.
- The module's output for an existing capture (e.g. docs/captures/demo-baseline-2026-04-23.csv)
must produce numbers that match row 0's existing values to within ±0.5% on every
numeric cell. This is the verification that the lifted module is faithful to the
original embedded script. Document this comparison run in the commit message.
7. Tests under tests/InspectionPrototype.Tests/Scenarios/:
- DemoBaselineScenarioTests.cs:
* In-memory FakeOperatorCommands (a stub IOperatorCommands that records each
Execute call and exposes flags for "should the next state transition fire?")
* In-memory FakeAppStateStore that lets the test push state transitions
* Test: scenario advances through Connect → RefreshCatalog → LoadRecipe →
Home → first StartRun → Idle, then the test cancels the token and
scenario calls Stop + Disconnect cleanly
* Test: when WaitForStateAsync times out (10 s expected state, fake never
transitions), scenario throws with a message naming the expected state
- MultiTagSoakScenarioTests.cs:
* Test: scenario sets SelectedSimulatorProfileName = "MultiTag" and
calls ApplySimulatorProfile BEFORE Connect
* Test: rest of sequence mirrors DemoBaselineScenario
- WaitForStateAsyncTests.cs (or inlined in the scenario tests):
* Predicate already true at entry: completes immediately without waiting
for a StateChanged event
* Predicate becomes true after one event: completes
* Timeout reached: throws TimeoutException naming the timeout duration
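The WaitForStateAsync helper from deliverable 1 could be sketched as below. The `AppState` record, the `IAppStateStore` shape (a `Current` property plus a `StateChanged` event), `InMemoryStateStore`, and the `ScenarioWaits` type name are all hypothetical stand-ins, since the real types are not shown in this task. `Task.WaitAsync` requires .NET 6 or later.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins for the real application-state types.
public sealed record AppState(string WorkflowState);

public interface IAppStateStore
{
    AppState Current { get; }
    event Action<AppState>? StateChanged;
}

// Test double that lets a caller push state transitions by hand.
public sealed class InMemoryStateStore : IAppStateStore
{
    public AppState Current { get; private set; } = new("Disconnected");
    public event Action<AppState>? StateChanged;
    public void Push(AppState next) { Current = next; StateChanged?.Invoke(next); }
}

public static class ScenarioWaits
{
    public static async Task WaitForStateAsync(
        IAppStateStore store, Func<AppState, bool> predicate, TimeSpan timeout, CancellationToken ct)
    {
        var matched = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
        void OnChanged(AppState s) { if (predicate(s)) matched.TrySetResult(); }

        // Subscribe before checking Current so a transition cannot slip between
        // the check and the subscription.
        store.StateChanged += OnChanged;
        try
        {
            if (predicate(store.Current)) return; // already true: no event needed
            await matched.Task.WaitAsync(timeout, ct);
        }
        catch (TimeoutException)
        {
            throw new TimeoutException(
                $"Expected state not reached within {timeout}; actual state: {store.Current.WorkflowState}.");
        }
        finally
        {
            store.StateChanged -= OnChanged;
        }
    }
}
```

The timeout here only bounds the wait; completion is still driven by an observed state change, which keeps the helper in line with the "no sleeps as a substitute for state-transition waits" constraint.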
## Constraints
- Do NOT capture row 0a in this pass. The capture run is Pass 3.
- Do NOT modify docs/runbook/capturing-measurements.md beyond adding the script's
invocation example as a code block in the existing §5 (a one-paragraph
forward-reference is fine; the §3a section lands in Pass 3).
- Do NOT use Thread.Sleep or Task.Delay as a substitute for state-transition
waits. The whole point is determinism via observable state.
- Do NOT change the bounded-channel policies, AppMetrics, or any other
cross-cutting infrastructure.
- The orchestrator script must run under PowerShell 7 on Windows (use forward
slashes in paths where possible, single-quoted here-strings for any inlined
text). It does NOT need to run on Windows PowerShell 5.1.
## Verification before you report done
dotnet build --configuration Release
dotnet test --configuration Release
Plus:
- run `tools/Capture-Measurements.ps1 -Scenario DemoBaseline -Duration 60
-OutputCsv docs/captures/_smoke.csv -CommitHash $(git rev-parse --short HEAD)`
end-to-end. Confirm the CSV is non-empty, the printed markdown block has
16 rows, and the app exited with code 0. Delete the smoke CSV before commit.
- run the same against MultiTagSoak with -Duration 120 and confirm the
rate-check .txt prints "PASS" (the 2-minute window is enough to verify
plumbing; the real 30-min capture is Pass 3-equivalent for SLICE-1.1, not
this slice).
- confirm ConvertTo-MeasurementRow against docs/captures/demo-baseline-2026-04-23.csv
produces numbers matching row 0 within ±0.5%.
## Report format when finished
- files created and files modified
- confirmation that all tests pass plus new concrete-scenario tests
- the smoke-test stdout (the markdown block) included in the report
- the row-0 fidelity comparison results (which cells matched, which differed and by how much)
- a single commit hash
- commit message: "feat(scenarios): add DemoBaseline + MultiTagSoak scenarios and capture orchestrator (pass 2/3 of TASK-1.5)"
Pass 3 — Row 0a capture and runbook §3a
You are implementing Pass 3 of TASK-1.5, the final pass. Passes 1 and 2 are
already merged. This pass runs the orchestrator end-to-end, captures the row 0a
demo-baseline-automated reference, writes the new §3a runbook section, and
updates the session-handoff documents.
## Authoritative references
Read these before making changes:
- docs/specs/SLICE-1.5-automated-measurement-capture.md (criteria 5, 6, 7, 10)
- docs/runbook/capturing-measurements.md (existing §3, §4.1, §4.2 to extend)
- docs/reviews/phase-1-measurements.md (row 0 to mirror; row 0a appended below it)
- CLAUDE.md (current-position block to update)
- docs/reviews/roadmap-progress.md (session log to append)
## Scope of this pass
Capture, table edit, runbook §3a, scenario-class cross-references in §4.1 and
§4.2, session-handoff updates. NO code changes — Passes 1 and 2 own those.
## Deliverables
1. Run the row 0a capture:
- command:
tools/Capture-Measurements.ps1 -Scenario DemoBaseline -Duration 600 `
-OutputCsv docs/captures/demo-baseline-automated-$(Get-Date -Format yyyy-MM-dd).csv `
-CommitHash $(git rev-parse --short HEAD) -OperatorDelayMs 0
- duration is 600 s to match row 0's 10-minute scenario length
- operator-delay is 0 — see SLICE-1.5 §"Verification Notes" for why this is
the production setting and not a fudged human-pacing value
- if the run fails (any exit code other than 0), capture the bootstrap.log
and the partial CSV in the report, do NOT proceed to step 2 — fix the
failure, re-run, and report both attempts
2. Append row 0a block to docs/reviews/phase-1-measurements.md:
- new "### Row 0a — demo-baseline (automated, pre-Phase-1 reference)" header
under "## Row 0 — demo baseline (pre-Phase-1)" and BEFORE "## Phase 1 rows"
- mirror the row 0 header (Scenario / Capture / Commit / Date lines)
- 16-metric table with Slice column "demo-baseline-automated (pre-Phase-1)"
and Capture method naming the new CSV file plus "§3a"
- "### Notes on row 0a" block with three notes:
(a) why row 0a exists and which row Phase 1 deltas now compare against
(row 0a; row 0 is preserved as historical evidence only)
(b) the comparability data from criterion 6:
samples.ingested ÷ runs.completed: row 0a vs row 0 = N% delta
(must be within ±5%; if outside, STOP and investigate before
committing — do not paper over a real difference)
(c) the runs.completed delta (row 0a will exceed row 0 because automation
paces tighter; document the absolute numbers and note this is
expected, not a regression)
3. Add §3a to docs/runbook/capturing-measurements.md:
- section title: "## 3a. Automated capture (preferred)"
- placed AFTER §3 ("The capture procedure") and BEFORE §4 ("Scenarios")
- content covers:
* one-paragraph rationale (links back to SLICE-1.5 spec for the design
tradeoff discussion)
* the one-command invocation:
tools/Capture-Measurements.ps1 -Scenario <name> -Duration <seconds> `
-OutputCsv docs/captures/<name>-<date>.csv -CommitHash <hash>
* a table of supported scenarios mapping <name> -> IScenario class:
DemoBaseline -> DemoBaselineScenario
MultiTagSoak -> MultiTagSoakScenario
(future slices add rows here when their scenarios land)
* a "when to fall back to manual §3" note:
- verifying a UI-binding regression specifically (the automated
path bypasses the dispatcher loop)
- debugging a capture that the automated path produced and you
cannot reproduce
- first capture against a brand-new scenario class (one manual
run as a sanity check before committing the automated row)
* the operator-delay knob: documented as a diagnostic-only flag for
comparing against the human baseline; default 0 is the production
setting
4. Update §4.1 and §4.2 of docs/runbook/capturing-measurements.md:
- §4.1: add a one-line "Implemented by: DemoBaselineScenario
(src/InspectionPrototype.Application/Scenarios/DemoBaselineScenario.cs)"
under the Steps block
- §4.2: same, naming MultiTagSoakScenario
- DO NOT delete the manual step list. Both step lists remain authoritative;
a divergence between the manual step list and the IScenario implementation
is a bug in whichever was changed without updating the other.
5. Update CLAUDE.md "Current position" block:
- Phase: 1 (Simulator to scale) — SLICE-1.5 complete
- Last completed slice: TASK-1.5 Pass 3 — automated capture orchestrator,
row 0a baseline, runbook §3a; commit <hash>
- Next action: kick off SLICE-1.2 (Real frame payloads) — first capture
under the new automation can be done in one command
- Blocked on: nothing
- Last updated: <today's date>
6. Append session-log entry to docs/reviews/roadmap-progress.md under today's
date: "TASK-1.5 Pass 3 — captured row 0a (CSV: …), added runbook §3a, scenarios
§4.1/§4.2 cross-referenced their IScenario classes. Commit <hash>. SLICE-1.5
marked Completed."
7. Update the SLICE-1.5 progress-table row in docs/reviews/roadmap-progress.md
from Proposed/In-progress to Completed.
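The comparability check from deliverable 2(b) is plain arithmetic, sketched here for clarity. The `Comparability` type and its method names are hypothetical; only the samples.ingested ÷ runs.completed normalization and the ±5% bound come from the text above.

```csharp
using System;

public static class Comparability
{
    // Per-run normalization: samples ingested divided by runs completed.
    public static double SamplesPerRun(double samplesIngested, double runsCompleted)
    {
        if (runsCompleted <= 0) throw new ArgumentOutOfRangeException(nameof(runsCompleted));
        return samplesIngested / runsCompleted;
    }

    // Signed delta of the candidate (row 0a) relative to the reference (row 0), in percent.
    public static double DeltaPercent(double candidatePerRun, double referencePerRun) =>
        (candidatePerRun - referencePerRun) / referencePerRun * 100.0;

    // True when the absolute delta is inside the tolerance (5% by default).
    public static bool IsComparable(double deltaPercent, double tolerancePercent = 5.0) =>
        Math.Abs(deltaPercent) <= tolerancePercent;
}
```

With made-up numbers (not taken from any capture): 10,300 samples over 100 runs gives 103 per run, 8,000 samples over 80 runs gives 100 per run, so the normalized delta is about +3% and passes, even though the absolute run counts differ by 25%.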
## Constraints
- Do NOT add a row for SLICE-1.5 itself in docs/reviews/phase-1-measurements.md.
This slice produces tooling, not a Phase 1 performance row (criterion 10).
The only edit to the table is row 0a.
- Do NOT delete or rewrite row 0 or its notes block. Row 0 is historical
evidence; row 0a sits below it.
- Do NOT modify any code in this pass. If a defect surfaces during the capture
run, stop and file a follow-up task — do not patch in Pass 3.
- Do NOT skip the comparability check in deliverable 2(b). The whole point of
row 0a is that it is comparable to row 0; if the per-run-normalized delta is
outside ±5%, that is a finding worth investigating before declaring the
slice done.
## Verification before you report done
dotnet build --configuration Release
dotnet test --configuration Release
Plus:
- the docs/captures/demo-baseline-automated-<date>.csv file exists and is
committed
- the row 0a block is present in docs/reviews/phase-1-measurements.md with
all 16 metrics filled
- the comparability footnote shows samples.ingested÷runs.completed within ±5%
- the runbook §3a renders correctly (no broken markdown tables)
- CLAUDE.md current-position block reflects SLICE-1.5 closure
## Report format when finished
- files created and modified
- the row 0a markdown block included in the report
- the comparability footnote numbers (row 0 vs row 0a, normalized and absolute)
- the docs/captures/ CSV path
- a single commit hash
- commit message: "docs(measurements): capture row 0a automated baseline and add runbook §3a (pass 3/3 of TASK-1.5)"
Operator notes
- One pass per Copilot session. Same protocol as TASK-1.1. Do not feed all three prompts into a single agent.
- Pass 1 is the riskiest for regressions. Touching App.OnStartup is touching the single-instance mutex, Serilog bootstrap, and crash-handler registration. Verify the no-flag interactive path still launches identically before reporting Pass 1 done; a regression in startup sequencing is exactly the kind of thing the existing test suite cannot catch.
- Pass 2's row-0 fidelity check is non-negotiable. Lifting the §5 extraction script into a module is a refactor; the comparison against row 0's known-good numbers is the only verification that the refactor is faithful. If numbers drift by more than ±0.5%, find the divergence before adding new functionality on top.
- Pass 3's capture is the slice's exit gate. Without row 0a in the table and the §3a runbook section, this slice has not delivered its purpose. The orchestrator script being green on a 60-second smoke run (Pass 2) is not a substitute for the real 600-second capture (Pass 3).
- Update the index files only at the end of the phase, not per-slice. Same rationale as TASK-1.1's operator notes.