
SLICE-1.6 Design Notes — FlaUI-driven Measurement Capture

  • Slice: SLICE-1.6
  • Implementation status: Completed (2026-04-27, pass 3/3)
  • Audience: anyone modifying the capture rig, the FlaUI driver, the scenario classes, the PowerShell orchestrator, or the AutomationIds in MainWindow.xaml

This doc explains how the FlaUI capture rig actually works in code — why it's structured as a driver abstraction with two implementations (a fake for unit tests and FlaUI for live captures), how the PowerShell orchestrator drives the whole pipeline (build → launch → attach → run → collect → extract), why scenarios are exposed as xUnit [Fact]s with a Category=Capture trait, and the four hardening fixes that ChaosMonkey's chaos rate forced into the rig. Read this if you're adding a new capture scenario, hardening an existing one against profile-specific failures, or porting the rig to a new automation framework.

This slice has a different shape from the other Phase 1 slices: it's infrastructure scaffolding, not a runtime subsystem. There's no producer, no consumer, no channel. There's a framework that lets a PowerShell script tell a running WPF application "click this button, wait for that text to appear, then run for 10 minutes" and that produces a dotnet-counters CSV at the end.

1. Quick reference

Key types:

Type                       | Project                                        | Role
IUiDriver                  | InspectionPrototype.UiDriver                   | The driver abstraction; 5 methods
FakeUiDriver               | InspectionPrototype.UiDriver                   | In-process test double; ConcurrentDictionary of elements + click log
FlaUiDriver                | InspectionPrototype.AcceptanceTests            | Real driver; FlaUI.UIA3 5.0.0 attached to a running process
DemoBaselineFlaUi          | InspectionPrototype.AcceptanceTests/Scenarios/ | The 10-min Demo scenario; setup + iteration + teardown helpers
MultiTagSoakFlaUi          | same                                           | Wraps DemoBaselineFlaUi with profile-selection prelude
Capture-Measurements.ps1   | tools/                                         | The orchestrator: build → launch → counters → test → extract
MeasurementExtraction.psm1 | tools/                                         | CSV-math helpers: ConvertTo-MeasurementRow, etc.

Key files:

src/InspectionPrototype.UiDriver/IUiDriver.cs
src/InspectionPrototype.UiDriver/FakeUiDriver.cs
src/InspectionPrototype.UiDriver/InspectionPrototype.UiDriver.csproj          // net10.0, no WPF/FlaUI deps
tests/InspectionPrototype.AcceptanceTests/FlaUiDriver.cs
tests/InspectionPrototype.AcceptanceTests/Scenarios/DemoBaselineFlaUi.cs
tests/InspectionPrototype.AcceptanceTests/Scenarios/MultiTagSoakFlaUi.cs
tests/InspectionPrototype.AcceptanceTests/SmokeTests.cs                       // 5 in-process FakeUiDriver tests
tests/InspectionPrototype.AcceptanceTests/InspectionPrototype.AcceptanceTests.csproj  // net10.0-windows
src/InspectionPrototype.App/MainWindow.xaml                                   // 13 AutomationIds (later: +RecoverButton)
tools/Capture-Measurements.ps1                                                // ~245-line orchestrator
tools/MeasurementExtraction.psm1                                              // CSV → markdown row
tests/InspectionPrototype.Tests/MainWindowAutomationIdRegressionTests.cs      // XAML attr-presence tests
tests/InspectionPrototype.Tests/SimulatorProfilesOptionsBindingTests.cs       // SLICE-1.2 silent-ignore guard

Project structure (the layering matters):

Project                             | TFM             | References                     | Allowed callers
InspectionPrototype.UiDriver        | net10.0         | (none)                         | All test projects + future App layer
InspectionPrototype.AcceptanceTests | net10.0-windows | UiDriver + FlaUI 5.0.0 + xUnit | Captures only

UiDriver is deliberately TFM-portable and dependency-free so unit tests outside the AcceptanceTests project can use FakeUiDriver without dragging in WPF or FlaUI.

Key tests:

Test                                                        | Purpose
SmokeTests (5 tests)                                        | FakeUiDriver exercises each IUiDriver method
MainWindowAutomationIdRegressionTests (2 tests)             | Asserts each AutomationId is present in MainWindow.xaml (the gotcha is the XAML inline attached-property LocalName: it's "AutomationProperties.AutomationId", not just "AutomationId")
SimulatorProfilesOptionsBindingTests (4 tests)              | Guards the SLICE-1.2 silent-ignore bug (binder dropped fields)
MultiTagSoakFlaUi.RunAsync ([Trait("Category", "Capture")]) | The capture entry point; never run in CI, only invoked by the orchestrator

2. Class shape

       PowerShell orchestrator                         WPF App (under test)
       ───────────────────────                          ──────────────────
   Capture-Measurements.ps1                                   │
       │                                                       │
       │ 1. dotnet build --configuration Release               │
       │ 2. Start-Process App.exe                              │
       │     ── captures ProcessId                             ▼
       │ 3. Start-Process dotnet-counters collect              app launches; XAML
       │     ── writes <output>.csv                            renders MainWindow with
       │ 4. Set env vars:                                      13 AutomationIds:
       │      APP_PROCESS_ID=<pid>                             - ConnectButton
       │      DURATION_SECONDS=<seconds>                       - DisconnectButton
       │      SIMULATOR_PROFILE=<name>                         - HomeButton
       │ 5. dotnet test --filter <Scenario>FlaUi               - StartRunButton
       │       ┌────────────────────────────────────────┐      - StopButton
       │       │ xUnit invokes [Fact] [Trait("Capture")]│      - AbortButton
       │       │   Scenarios/MultiTagSoakFlaUi.cs       │      - AcknowledgeFaultButton
       │       │   - reads APP_PROCESS_ID env           │      - RefreshRecipeCatalogButton
       │       │   - Process.GetProcessById(pid)        │      - LoadRecipeButton
       │       │   - new FlaUiDriver(process)           │      - RecoverButton  ◀── added 0f1596a
       │       │     │ Application.Attach(process)      │      - SimulatorProfileSelectorComboBox
       │       │     │ new UIA3Automation()             │      - RecipeCatalogComboBox
       │       │     ▼                                   │     - WorkflowStateText
       │       │ ┌─────────────────────────────────────┐│      - ActiveSimulatorProfileText
       │       │ │ FlaUiDriver : IUiDriver             ││      - (and more readouts)
       │       │ │  ClickByAutomationIdAsync(id)       ││            │
       │       │ │   FindElementAsync(id, timeout)     ││            │
       │       │ │     poll @ 100 ms; return element   ││            │
       │       │ │   wait for IsEnabled; AsButton().Inv││            │
       │       │ │  ReadTextByAutomationIdAsync(id)    ││────────────│ UIA3
       │       │ │   Properties.Name.ValueOrDefault    ││            │ (Windows
       │       │ │  WaitForTextAsync(id, expected, t)  ││            │  Automation)
       │       │ │  SelectComboBoxItemAsync(id, item)  ││            │
       │       │ │   AsComboBox().Expand()             ││            │
       │       │ │   Items.FirstOrDefault(...).Select()││            │
       │       │ │ }                                   ││            │
       │       │ └─────────────────────────────────────┘│            │
       │       │                                         │           │
       │       │ scenario flow:                          │           │
       │       │   await SetupAsync(driver)              │           │
       │       │     SelectComboBoxItem("Profile",…)     │ ──────────┘
       │       │     ApplySimulatorProfileButton click   │   click → AppMetrics counters
       │       │     ConnectButton click (retry-Connect) │           emit; dotnet-counters
       │       │     RefreshRecipeCatalogButton click    │           collects them in CSV.
       │       │     LoadRecipeButton click              │
       │       │     HomeButton click → wait Ready       │
       │       │   while !durationCts:                   │
       │       │     RunSingleIterationAsync(driver, ct) │
       │       │     wait Faulted → not-Faulted (chaos)  │ ──── 4 hardening fixes for ChaosMonkey
       │       │     retry-Home loop (3× attempts)       │      (added during SLICE-1.4 capture)
       │       │   await TearDownAsync(driver)           │
       │       │     DisconnectButton click              │
       │       └────────────────────────────────────────┘
       │                                                       app keeps running
       │                                                       until step 7

       │ 6. Stop-Process dotnet-counters    ◀── ends collect
       │ 7. Stop-Process App                ◀── ends capture
       │ 8. Import-Module MeasurementExtraction.psm1
       │ 9. ConvertTo-MeasurementRow CsvPath ...
       │     produces 22-26 metric markdown row
       │ 10. (optional) -AppendToTable: prepend row to
       │      docs/reviews/phase-1-measurements.md

   exit 0 / non-zero

The two-implementation IUiDriver abstraction:

                 IUiDriver (5 methods)

            ┌──────────┴──────────┐
            │                     │
       FakeUiDriver           FlaUiDriver
       (UiDriver)             (AcceptanceTests)
       net10.0                net10.0-windows
       no deps                FlaUI 5.0.0
            │                     │
            ▼                     ▼
       Used by:                Used by:
       - SmokeTests.cs         - DemoBaselineFlaUi
       - SlicePassN unit tests - MultiTagSoakFlaUi
       (in-process,            (out-of-process,
        deterministic,          attaches to running
        no app launch)          App via Process.Id)

3. Lifecycle — capture session

A capture session is the entire arc from tools/Capture-Measurements.ps1 ... invocation to the row block being written:

   t=0                                                                   t=N+ε
    │                                                                      │
    ▼                                                                      ▼
   ┌──────────┐    ┌──────────┐    ┌──────────────────────────┐    ┌────────────┐
   │  Build   │───▶│  Launch  │───▶│  Run (DurationSeconds)    │───▶│  Cleanup + │
   │  Release │    │  3 procs │    │                           │    │  Extract   │
   │  config  │    │          │    │  scenario [Fact] runs:    │    │            │
   │          │    │  ① App   │    │   - SelectComboBoxItem    │    │ kill       │
   │  dotnet  │    │  ② counters│  │     (profile)             │    │  dotnet-   │
   │  build   │    │     collect│  │   - Connect (retry × 3)   │    │  counters  │
   │          │    │  ③ test  │    │   - LoadRecipe            │    │ kill App   │
   │          │    │     scenario│ │   - Home (retry × 3)      │    │            │
   │  fail →  │    │          │    │   - while !duration:      │    │ Import-    │
   │  exit 1  │    │  set env │    │       Run                 │    │  Module    │
   │          │    │  vars    │    │       wait Faulted→Ready  │    │  Measure-  │
   │          │    │          │    │       Home (retry × 3)    │    │  mentExtr- │
   │          │    │          │    │   - Disconnect            │    │  action    │
   │          │    │          │    │                           │    │            │
   │          │    │          │    │  test exits;              │    │ ConvertTo- │
   │          │    │          │    │  PowerShell waits for     │    │  Measure-  │
   │          │    │          │    │  duration to elapse so    │    │  mentRow   │
   │          │    │          │    │  CSV captures the full    │    │            │
   │          │    │          │    │  window                   │    │ optional:  │
   │          │    │          │    │                           │    │  -Append-  │
   │          │    │          │    │                           │    │  ToTable   │
   └──────────┘    └──────────┘    └──────────────────────────┘    └────────────┘

The [Trait("Category", "Capture")] attribute on capture-mode [Fact]s is what lets CI exclude them. The trait alone does nothing: a bare dotnet test with no --filter would still run every [Fact], so CI filters on Category!=Capture. The orchestrator passes --filter "FullyQualifiedName~MultiTagSoakFlaUi" (or similar) to invoke one specific capture scenario.

4. Runtime flow — single capture iteration

The headline flow inside a MultiTagSoakFlaUi.RunAsync while-loop iteration:

  Test scenario           FlaUiDriver       App (under test)        AppMetrics      dotnet-counters
  ─────────────           ───────────       ────────────────         ──────────      ───────────────

   ┌────┴────┐
   │ Run     │
   │ Single- │
   │ Iter    │
   │ Async   │
   └────┬────┘

        │ ClickByAutomationIdAsync("StartRunButton")
        │ ──────────▶ │
        │             │ FindElementAsync("StartRunButton", 10s)
        │             │   poll @ 100 ms via UIA3
        │             │   ────────────────▶ │ window.FindFirstDescendant
        │             │                     │   ── element returned
        │             │ ◀────────────────── │
        │             │ wait IsEnabled = true (poll 100 ms)
        │             │ AsButton().Invoke() ──────────▶ │ click handler:
        │             │                                  │  WorkflowService.StartRunAsync()
        │             │                                  │  …run loop on background work…
        │             │                                  │  ─────────▶ │ runs.started.Add(1)
        │             │                                  │             │ ────────────▶ csv row
        │             │                                  │  …frame pipeline emits…
        │             │                                  │  ─────────▶ │ frames.ingested.Add(N)
        │             │                                  │             │ ────────────▶ csv row
        │             │                                  │  …tag pipeline emits…
        │             │                                  │  ─────────▶ │ samples.ingested.Add(M)
        │             │                                  │             │ ────────────▶ csv row
        │             │                                  │  …run completes…
        │             │                                  │  WorkflowState ─▶ "Completed"
        │             │                                  │  ─────────▶ │ runs.completed.Add(1)
        │             │                                  │             │ ────────────▶ csv row

        │ WaitForTextAsync("WorkflowStateText", "Completed", 60s)
        │ ──────────▶ │ poll @ 200 ms via UIA3
        │             │   ReadTextByAutomationIdAsync
        │             │     Properties.Name.ValueOrDefault
        │             │     ── eventually returns "Completed"
        │             │ ◀──── true (matched within 60 s)

        │ if "Faulted":  ── chaos handling, see SLICE-1.4 design notes §5(h)
        │   wait Faulted→Idle (poll @ 300 ms, 15 s deadline)

        │ Click("HomeButton") + WaitForText("Ready", 30s)
        │   retry up to 3× — fault during homing forces re-Home

        │ next iteration ────────────────▶

The polling cadence (100 ms find, 200 ms text-wait, 300 ms chaos-recovery) is empirically tuned. Faster polling spikes UIA3 CPU; slower polling misses fast state transitions. None of these intervals is exposed as a tunable knob; they're hard-coded constants in FlaUiDriver and the scenario.
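The poll-until-deadline pattern behind those three cadences can be sketched as follows. This is a minimal illustration, not the code in FlaUiDriver: the Polling class, PollUntilAsync, and the constant names are hypothetical, and only the interval values come from the notes above.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class Polling
{
    // Interval values quoted in the notes; the constant names are made up here.
    public static readonly TimeSpan FindInterval = TimeSpan.FromMilliseconds(100);
    public static readonly TimeSpan TextWaitInterval = TimeSpan.FromMilliseconds(200);
    public static readonly TimeSpan ChaosRecoveryInterval = TimeSpan.FromMilliseconds(300);

    // Hypothetical helper: re-evaluates `condition` every `interval` until it
    // returns true (result: true) or `timeout` has elapsed (result: false).
    public static async Task<bool> PollUntilAsync(
        Func<bool> condition,
        TimeSpan interval,
        TimeSpan timeout,
        CancellationToken ct = default)
    {
        var clock = Stopwatch.StartNew(); // monotonic, unaffected by wall-clock changes
        while (true)
        {
            ct.ThrowIfCancellationRequested();
            if (condition()) return true;
            if (clock.Elapsed >= timeout) return false;
            await Task.Delay(interval, ct).ConfigureAwait(false);
        }
    }
}
```

The condition is always checked at least once, so a zero timeout still detects a state that is already correct. Keeping the interval a parameter also makes the trade-off explicit at each call site: a shorter interval costs more UIA3 round-trips, a longer one risks missing a short-lived state.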

5. Decisions made during implementation

(a) IUiDriver abstraction with two implementations. The drive-by-AutomationId interface is small (5 methods) — small enough that a fake implementation covers every test scenario without dragging in FlaUI/UIA3 runtime dependencies. SmokeTests exercises each method against FakeUiDriver purely in-process; the FlaUI implementation is only used for live captures. The cost was building the InspectionPrototype.UiDriver project as a separate net10.0 (not -windows) target — but that's a one-time cost, and the win is that any future test or tool that wants to script the UI can use IUiDriver without the WPF/FlaUI surface.
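To make the shape of that abstraction concrete, here is a hedged sketch of a five-method driver interface with an in-process fake. The method names follow these notes; the parameter lists, return types, and the type names IUiDriverSketch and FakeUiDriverSketch are assumptions for illustration and may not match the real IUiDriver.cs or FakeUiDriver.cs.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Assumed shape: names come from the notes, signatures are guesses.
public interface IUiDriverSketch
{
    Task ClickByAutomationIdAsync(string automationId, CancellationToken ct = default);
    Task<string> ReadTextByAutomationIdAsync(string automationId, CancellationToken ct = default);
    Task<bool> WaitForTextAsync(string automationId, string expected, TimeSpan timeout, CancellationToken ct = default);
    Task SelectComboBoxItemAsync(string automationId, string item, CancellationToken ct = default);
    Task<byte[]> CaptureScreenshotPngAsync(CancellationToken ct = default);
}

// In-process fake: element text lives in a dictionary, clicks go to a log.
public sealed class FakeUiDriverSketch : IUiDriverSketch
{
    private readonly ConcurrentDictionary<string, string> _text = new();
    private readonly ConcurrentQueue<string> _clicks = new();

    public IReadOnlyCollection<string> ClickLog => _clicks.ToArray();

    // Test hook (hypothetical): lets a test script what an element "displays".
    public void SetText(string automationId, string text) => _text[automationId] = text;

    public Task ClickByAutomationIdAsync(string automationId, CancellationToken ct = default)
    {
        _clicks.Enqueue(automationId);
        return Task.CompletedTask;
    }

    public Task<string> ReadTextByAutomationIdAsync(string automationId, CancellationToken ct = default)
        => Task.FromResult(_text.TryGetValue(automationId, out var value) ? value : string.Empty);

    public async Task<bool> WaitForTextAsync(string automationId, string expected, TimeSpan timeout, CancellationToken ct = default)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            if (await ReadTextByAutomationIdAsync(automationId, ct) == expected) return true;
            if (clock.Elapsed >= timeout) return false;
            await Task.Delay(10, ct);
        }
    }

    public Task SelectComboBoxItemAsync(string automationId, string item, CancellationToken ct = default)
    {
        _text[automationId] = item; // the fake records the selection as the element's text
        return Task.CompletedTask;
    }

    public Task<byte[]> CaptureScreenshotPngAsync(CancellationToken ct = default)
        => Task.FromResult(Array.Empty<byte>());
}
```

Nothing in the sketch references WPF, FlaUI, or a Windows-only API, which is the property that lets a project like this target plain net10.0.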

(b) Application.Attach, not Application.Launch. The orchestrator launches the app (via Start-Process), captures the PID, and passes it to the test via the APP_PROCESS_ID environment variable. The test calls Process.GetProcessById(pid) and Application.Attach(process). The alternative — Application.Launch(path) from inside the test — would couple the test to the app's launch arguments and make dotnet-counters collect setup harder (counters needs the PID before the test starts). Attach decouples the lifecycle: PowerShell owns process creation and shutdown; xUnit owns scenario logic.

(c) [Trait("Category", "Capture")] on capture-mode [Fact]s. dotnet test's --filter can include or exclude tests by trait. The orchestrator runs dotnet test --filter "Category=Capture"; CI runs dotnet test --filter "Category!=Capture". This means CI never accidentally launches a capture run (which needs an APP_PROCESS_ID env var that isn't set in CI). The trait is the boundary between "tests" (unit/integration, run in CI) and "scenarios" (capture-only, manual or orchestrator-only).

(d) Scenario as xUnit [Fact] rather than a console app. Reusing xUnit means the test runner provides timeout handling, exception-to-failure mapping, parallel-test prevention (Collection attribute), and the --filter mechanism above. A custom console app would need to reimplement all of that. The cost is one xUnit pattern oddity: the [Fact] reads from environment variables instead of method parameters, since xUnit doesn't pass arguments to [Fact] methods.

(e) MainWindowAutomationIdRegressionTests exists as a guard. XAML's AutomationProperties.AutomationId="X" syntax is an attached property: when an XML reader inspects the attribute, its LocalName is "AutomationProperties.AutomationId", not "AutomationId". A naive XAML test using attr.LocalName == "AutomationId" would silently miss them all. The regression tests assert each known AutomationId is present using the correct attached-property name. If you add a new AutomationId, add an entry to the regression test.
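The LocalName trap is easy to reproduce with System.Xml.Linq. The sketch below is illustrative only: XamlAutomationIds and its Find method are hypothetical names, and the real regression test may read the XAML differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XamlAutomationIds
{
    // The attribute name an XML reader reports for the attached property.
    public const string AttachedName = "AutomationProperties.AutomationId";

    // Hypothetical helper: returns the values of every attribute in the XAML
    // whose LocalName equals `attributeLocalName`, in document order.
    public static IReadOnlyList<string> Find(string xaml, string attributeLocalName = AttachedName) =>
        XDocument.Parse(xaml)
            .Descendants()
            .Attributes()
            .Where(a => a.Name.LocalName == attributeLocalName)
            .Select(a => a.Value)
            .ToList();

    // A tiny stand-in for MainWindow.xaml, just enough to show the lookup.
    public const string SampleXaml = @"
<Window xmlns=""http://schemas.microsoft.com/winfx/2006/xaml/presentation"">
  <StackPanel>
    <Button AutomationProperties.AutomationId=""ConnectButton"" Content=""Connect"" />
    <TextBlock AutomationProperties.AutomationId=""WorkflowStateText"" />
  </StackPanel>
</Window>";
}
```

With the sample above, Find(SampleXaml) yields ConnectButton and WorkflowStateText, while Find(SampleXaml, "AutomationId") yields an empty list without any error, which is exactly the silent miss the regression tests guard against.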

(f) RecoverButton AutomationId was added during SLICE-1.4 (commit 0f1596a), not during SLICE-1.6 itself. Reason: MainWindow.xaml had no AutomationId on the Recover button — operators clicked it visually, but FlaUI couldn't find it. ChaosMonkey's frequent fault-recovery cycles surfaced the gap. The fix (1 XAML attribute + 1 line in MainWindowAutomationIdRegressionTests) was small but meaningful: every workflow path must have a clickable AutomationId or the rig can't capture under that path.

(g) The four FlaUI hardening fixes (bf32566, 0f1596a, 5462d42, 2108272) all landed during the SLICE-1.4 capture pass, not during SLICE-1.6 implementation. SLICE-1.6 designed for happy-path captures (DemoBaseline, MultiTag); ChaosMonkey was the first profile that broke happy-path assumptions:

  • Connect can fail (ConnectionFailureProbability=0.30) — needs retry-Connect loop
  • Home can be interrupted (AlarmBurst hits during the ~1 s homing window) — needs retry-Home loop
  • Run can end Faulted instead of Completed — needs Faulted→not-Faulted wait before next-Home
  • Profile-specific knobs need to be set BEFORE Connect — needs SIMULATOR_PROFILE env var path

The fixes are scenario-level retry/wait code, not driver-level changes. The driver kept the same API; the scenario got more defensive. This is the right level for the change — driver-level retry would imply blanket retry of every operation, which would mask real bugs in non-chaos profiles.
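A scenario-level retry of this kind can be as small as the sketch below. It is an assumption-laden illustration: ScenarioRetry and UntilSuccessAsync are hypothetical names, and the actual retry-Connect and retry-Home loops in the scenario classes may be written inline instead.

```csharp
using System;
using System.Threading.Tasks;

public static class ScenarioRetry
{
    // Hypothetical helper: runs `attempt` up to `maxAttempts` times and returns
    // the 1-based number of the attempt that succeeded. If every attempt
    // returns false, it throws TimeoutException so the capture fails loudly.
    public static async Task<int> UntilSuccessAsync(Func<Task<bool>> attempt, int maxAttempts)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        for (var n = 1; n <= maxAttempts; n++)
        {
            if (await attempt().ConfigureAwait(false)) return n;
        }
        throw new TimeoutException($"Step did not succeed after {maxAttempts} attempts.");
    }

    // Illustrative retry-Home usage, given some driver exposing the methods
    // named in these notes (shown as a comment because the driver type is
    // not defined in this sketch):
    //
    //   await ScenarioRetry.UntilSuccessAsync(async () =>
    //   {
    //       await driver.ClickByAutomationIdAsync("HomeButton");
    //       return await driver.WaitForTextAsync(
    //           "WorkflowStateText", "Ready", TimeSpan.FromSeconds(30));
    //   }, maxAttempts: 3);
}
```

Because the retry wraps one named step rather than every driver call, a failure in any other step of a non-chaos profile still surfaces on the first occurrence.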

(h) Scenario uses Environment.GetEnvironmentVariable("SIMULATOR_PROFILE") ?? "MultiTag". Originally hardcoded to "MultiTag". The env-var override was added in commit 874946d so the same scenario can capture different profiles without scenario-class duplication. ChaosMonkey, Soak8h, EncoderRate, HighFrameRate all reuse MultiTagSoakFlaUi with different -Profile arguments to the orchestrator. Don't add per-profile scenario classes unless the scenario logic genuinely differs (e.g., a continuous-streaming scenario that never stops the run loop would need its own class).
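Reading the three orchestrator-supplied variables in one place might look like the sketch below. The variable names and the "MultiTag" fallback come from these notes; the CaptureSettings record, the injectable lookup delegate, and the choice to treat APP_PROCESS_ID and DURATION_SECONDS as required are assumptions. Note one deliberate difference from the ?? form quoted above: this sketch also falls back to "MultiTag" when the variable is set but blank.

```csharp
using System;

// Hypothetical settings holder; not a type from the repo.
public sealed record CaptureSettings(int AppProcessId, TimeSpan Duration, string Profile)
{
    public const string DefaultProfile = "MultiTag";

    // `getEnv` is injectable so the parsing can be tested without touching the
    // real process environment. In a scenario, pass a lambda over
    // Environment.GetEnvironmentVariable.
    public static CaptureSettings FromEnvironment(Func<string, string?> getEnv)
    {
        if (!int.TryParse(getEnv("APP_PROCESS_ID"), out var pid) || pid <= 0)
            throw new InvalidOperationException("APP_PROCESS_ID must be a positive integer.");

        if (!int.TryParse(getEnv("DURATION_SECONDS"), out var seconds) || seconds <= 0)
            throw new InvalidOperationException("DURATION_SECONDS must be a positive integer.");

        var profile = getEnv("SIMULATOR_PROFILE");
        if (string.IsNullOrWhiteSpace(profile)) profile = DefaultProfile;

        return new CaptureSettings(pid, TimeSpan.FromSeconds(seconds), profile);
    }
}
```

A scenario would call it as CaptureSettings.FromEnvironment(name => Environment.GetEnvironmentVariable(name)). Failing fast on a missing APP_PROCESS_ID gives a clearer message than a later Process.GetProcessById failure when a capture test is run outside the orchestrator by mistake.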

6. Invariants and traps

FlaUiDriver.Application.Attach requires the app to be fully launched. UIA3 needs the main window to exist and be reachable; if the test attaches before the WPF render loop has spun up, GetMainWindow returns null. Capture-Measurements.ps1 sleeps 3 seconds after Start-Process App.exe to let the window appear. If you change the app's startup sequence (e.g., add a splash screen), the 3-second sleep may need to grow.

AutomationProperties.AutomationId is an attached property, not a regular attribute. When inspecting XAML programmatically, the LocalName is "AutomationProperties.AutomationId". The MainWindowAutomationIdRegressionTests use this; a future test that uses just "AutomationId" will silently report 0 matches. Same trap applies to any future XAML-inspection code.

FlaUI 5.0.0 is pinned in Directory.Packages.props, for both FlaUI.Core and FlaUI.UIA3. FlaUI's APIs have shifted between major versions (the Application.Attach signature changed in 4.x), and even minor version bumps occasionally change the behavior of FindFirstDescendant's timeout semantics. Don't bump FlaUI without re-running every capture scenario.

FlaUiDriver.CaptureScreenshotPngAsync returns Array.Empty<byte>(). The implementation was deferred (see code comment) because FlaUI's Capture() returns System.Drawing.Bitmap which requires extra runtime dependencies on net10.0-windows that the project doesn't currently pull in. Don't rely on screenshots in scenario code — the method is wired but doesn't produce real bytes. If you need it for failure diagnostics, add the deps and implement properly first.

Capture-Measurements.ps1 runs dotnet-counters collect as a background process and kills it via PID at the end. dotnet-counters does have a --duration flag, but it's been unreliable in v9, so the orchestrator doesn't depend on it to stop collection after N seconds. The PID-and-kill pattern is robust but means a crashed PowerShell session leaves orphaned dotnet-counters processes. If you ever see "port already in use" errors, check for orphaned dotnet-counters processes and kill them.

The orchestrator's -AppendToTable flag writes the row block into phase-1-measurements.md, where the order is most-recent-first with rows nested under section headers. Despite the flag's name, the new row goes at the top of its section; Add-Content on its own only appends to the end of a file, so the insert can't be a bare Add-Content call. The logic is straightforward but assumes the file's section structure is stable. If you reorganize the measurements file (e.g., split into per-phase files), the insertion logic needs updating.

SimulatorProfilesOptionsBindingTests exists because of the SLICE-1.2 silent-ignore bug. Pass 1 of TASK-1.2 added 3 new fields to SimulatorProfileOptions (FrameWidth, FrameHeight, BytesPerPixel) but the JSON binder dropped them silently — runtime saw the defaults regardless of appsettings.json. The test guards future regressions by asserting all 9 profile fields bind correctly. It does not currently cover the 7 chaos fields added in SLICE-1.4 — see SLICE-1.4 design notes §5(g) for the same bug pattern that recurred at the hydration-service level. Add coverage if Phase 2 introduces more profile fields.

Application.Attach does NOT release on Dispose. The using var driver = new FlaUiDriver(process) only disposes the UIA3Automation, not the FlaUI Application. The underlying FlaUI documentation is sparse on this; in practice attaching multiple times to the same process from sequential tests works fine, but rapid attach/detach cycles within a single process lifetime have been observed to fail intermittently. Don't run multiple FlaUI scenarios against the same app instance without restarting the process.

7. Test surface

Covered by in-process tests (run in CI):

  • SmokeTests (5 tests): each IUiDriver method via FakeUiDriver. Click logging, text reading, text-wait timeout, ComboBox selection, screenshot stub.
  • MainWindowAutomationIdRegressionTests (2 tests): every known AutomationId is present in MainWindow.xaml, using the correct attached-property LocalName.
  • SimulatorProfilesOptionsBindingTests (4 tests): 9 profile fields round-trip through IConfiguration → IOptions<SimulatorProfilesOptions>.

Covered by capture (manual or orchestrator-only, never in CI):

  • Every Phase 1 row from slice-1-2-real-frame-payloads onward (all 5 rows in phase-1-measurements.md Phase 1 section). The captures themselves are the FlaUI rig's integration test.
  • Each row's pre-existing-row reproducibility check (criterion 16 of SLICE-1.4): the rig produces deterministic happy-path captures within the per-tag accuracy bounds amended in SLICE-1.1's criterion 7.

Not covered (intentional gaps):

  • FlaUiDriver itself has no automated tests. The implementation is a thin wrapper over UIA3; testing it directly would require mocking UIA3 (significant work) or running real FlaUI tests (which need a WPF app to attach to — chicken and egg). The capture sessions themselves verify the driver works; if a driver-level bug surfaces in a capture, debug from the capture log + the failure exception.
  • PowerShell orchestrator has no Pester tests for the orchestration logic. MeasurementExtraction.psm1 has Pester tests for the math helpers; the orchestrator has only manual smoke verification. The orchestrator is short (~245 lines) and mostly composes other tools; the cost-benefit on Pester-testing it is low.
  • AutomationId-presence at runtime. The XAML regression test verifies the attribute is in the XAML source. It doesn't verify FlaUI can actually find it at runtime (e.g., if a control is inside a Visibility=Collapsed container, AutomationId is technically present but UIA3 won't find it). Capture sessions exercise this path; a unit test would require launching the app.
  • Chaos resilience of the rig. The four hardening fixes (retry-Connect, retry-Home, Faulted→not-Faulted wait, RecoverButton AutomationId) all landed reactively during the ChaosMonkey capture. There's no automated test that asserts "rig handles a fault during homing." Phase 2 may want to add a chaos-targeted unit test that exercises the retry loops with a fake driver, but the bar is low — the fixes are simple and the failure mode is loud (capture fails with TimeoutException).

Notably absent test: there is no test for "what happens if the app crashes during a capture?" The orchestrator's cleanup logic (kill processes, write CSV) handles a clean shutdown but not an app crash: dotnet-counters keeps running, the test command launched from PowerShell times out, and the operator has to manually kill orphaned processes. If a capture hangs unexpectedly, check Task Manager for orphaned dotnet-counters and InspectionPrototype.App processes first.
