TASK-2.0: Implement Store Allocation & Contention Profiling

Objective

Add per-call instrumentation to AppStateStore.Update (allocation, lock-wait, calls/sec, caller distribution) and capture a 30-minute MultiTag row that quantifies store-side load. The row's Notes section produces an explicit Phase 2 prioritization recommendation: open SLICE-2.1 / 2.2 / 2.3 / 2.4 conditionally, or defer Phase 2 entirely. Ship measurement only — no architectural refactoring.

Scope

  • AppStateStore.Update instrumented with [CallerMemberName] + [CallerFilePath], Stopwatch for lock-wait, GC.GetAllocatedBytesForCurrentThread() for alloc-delta
  • IAppStateStore.Update signature updated with two optional [Caller*] parameters (no callsite changes required)
  • Three new AppMetrics members: StoreUpdateCalls (Counter<long>, dim caller), StoreUpdateAllocBytes (Counter<long>), StoreUpdateLockWaitMicros (Histogram<double>)
  • Test fakes (RecordingAppStateStore, etc. under tests/.../Stubs/) updated to match the new interface signature
  • New MeasurementExtraction.psm1 helpers: Get-StoreUpdateRate, Get-StoreUpdateAllocShare, Get-StoreUpdateLockWaitP95, Get-StoreUpdateCallerDistribution
  • ConvertTo-MeasurementRow extended to emit four new rows; existing 22 metrics preserved
  • New file docs/reviews/phase-2-measurements.md with slice-2-0-store-profiling row block
  • New runbook §5.1 (replaces the §4.7+ Phase-2 placeholder added in SLICE-1.4 close-out)
  • 30-minute MultiTag capture committed under docs/captures/slice-2-0-store-profiling-<date>.csv
  • Row's Notes section includes a Phase 2 prioritization recommendation with measured-numbers citation
  • Tests: AppStateStoreInstrumentationTests, Pester tests for the four new helpers, regression check that slice-1-1-multi-tag-telemetry reproduces
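
As an illustration of the alloc-delta technique named in the scope, the following minimal sketch brackets a piece of work with GC.GetAllocatedBytesForCurrentThread(). The names AllocDeltaSketch, MeasureAllocatedBytes, and MeasureListOf1024Ints are hypothetical and do not exist in the repository; the real instrumentation lives inside AppStateStore.Update.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the project.
public static class AllocDeltaSketch
{
    // Returns the number of managed bytes the calling thread allocated
    // while running `work`. Allocations on other threads are not counted.
    public static long MeasureAllocatedBytes<T>(Func<T> work, out T result)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        result = work();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Measures a capacity-constructed list: one List<int> object plus one
    // backing int[1024] array (1024 * 4 bytes of data plus object headers).
    public static long MeasureListOf1024Ints()
    {
        long bytes = MeasureAllocatedBytes(() => new List<int>(1024), out List<int> list);
        GC.KeepAlive(list); // keep the result reachable until after the measurement
        return bytes;
    }
}
```

The delta counts only allocations made by the calling thread, which fits this slice because the reducer runs on the thread that called Update. A capacity-constructed `List<int>(1024)` should report slightly more than 4,096 bytes, since the backing array alone holds 4,096 bytes of data.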

Non-Scope

  • Refactoring AppStateStore (slicing, immutable collections, observables, lift-out) — that's Phase 2.1/2.2/2.3/2.4, opened conditionally based on this slice's evidence
  • Instrumenting subscribers of StateChanged — Phase 2.4 measurement, not Phase 2 baseline
  • Per-AppState-field allocation breakdown — out of scope; if 2.0 evidence motivates 2.1, the 2.1 task can add per-field accounting then
  • Stack-walk-based caller resolution — [CallerMemberName] + [CallerFilePath] is sufficient
  • Removing or renaming the StateChanged event — semantics unchanged
  • Adding new profiles — uses existing MultiTag
  • A new IScenario class — MultiTagSoakFlaUi with -Profile MultiTag is the capture path
  • Writing the actual Phase 2 specs (2.1/2.2/2.3/2.4) — out of scope; the row's recommendation names which Phase 2 slice to write next, but writing it is itself a separate slice-opening session
  • Changing IAppStateStore.Update's required arguments — the two new parameters are optional with [Caller*] defaults
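
To illustrate why existing callsites need no changes, here is a minimal sketch of caller-info attributes on an interface method. All type names (DemoState, IDemoStore, DemoStore, DemoCallsite) are hypothetical stand-ins, not the project's AppState or IAppStateStore.

```csharp
using System;
using System.IO;
using System.Runtime.CompilerServices;

// Hypothetical stand-ins for illustration; not the project's real types.
public sealed record DemoState(int Counter);

public interface IDemoStore
{
    // The attributes and defaults sit on the interface because the compiler
    // fills them in from the declaration the callsite is bound against.
    void Update(
        Func<DemoState, DemoState> reducer,
        [CallerMemberName] string callerMember = "",
        [CallerFilePath] string callerFile = "");
}

public sealed class DemoStore : IDemoStore
{
    public DemoState Current { get; private set; } = new DemoState(0);
    public string LastCaller { get; private set; } = "";

    public void Update(
        Func<DemoState, DemoState> reducer,
        [CallerMemberName] string callerMember = "",
        [CallerFilePath] string callerFile = "")
    {
        // Same "<file>.<member>" shape as the caller dimension in this task.
        LastCaller = string.IsNullOrEmpty(callerFile)
            ? callerMember
            : $"{Path.GetFileNameWithoutExtension(callerFile)}.{callerMember}";
        Current = reducer(Current);
    }
}

public static class DemoCallsite
{
    // An unmodified callsite: only the reducer is passed explicitly.
    public static string SaveSettings(DemoStore store)
    {
        IDemoStore viaInterface = store;
        viaInterface.Update(s => s with { Counter = s.Counter + 1 });
        return store.LastCaller; // "<source file name>.SaveSettings"
    }
}
```

Because the values are injected at compile time, a callsite that passes only the reducer still reports its own method and file. Passing the strings by hand at a callsite would override that injection.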

Touched Projects

  • src/InspectionPrototype.Application: AppStateStore.cs (instrumentation), IAppStateStore.cs (signature update), AppMetrics.cs (three new members)
  • tests/InspectionPrototype.Tests: AppStateStoreInstrumentationTests, updates to existing RecordingAppStateStore and any other IAppStateStore test fakes under Stubs/
  • tools/MeasurementExtraction.psm1 — four new helpers, ConvertTo-MeasurementRow extension
  • tests/Tools/MeasurementExtraction.Tests.ps1 — Pester tests for the new helpers
  • docs/runbook/capturing-measurements.md — new §5.1 (replaces §4.7+ placeholder)
  • docs/reviews/phase-2-measurements.md — new file
  • docs/captures/ — new CSV evidence
  • (no changes to) IFrameSource, ITagStream, IEncoderStream, IMachineConnection, IMotionController, any UI code, AppState record, simulator profiles

AI Tool Guidance

Three Copilot passes; one-pass-per-session protocol as in TASK-1.3 / TASK-1.4.

  1. Instrumentation + AppMetrics + interface signature + tests. Update AppStateStore.Update and IAppStateStore.Update signatures with the two [Caller*] parameters. Add three new AppMetrics members. Update test fakes. Tests for the instrumentation including caller-dimension verification, alloc-delta accuracy, and lock-wait distribution. NO measurement-extraction work. NO captures.
  2. MeasurementExtraction helpers + phase-2-measurements.md skeleton. Four new helpers + ConvertTo-MeasurementRow extension + Pester tests. Create docs/reviews/phase-2-measurements.md (Conventions section + Fixed metric set + empty Phase 2 rows section). NO captures.
  3. 30-min MultiTag capture + row + Phase 2 prioritization recommendation + runbook §5.1. Run the capture, append the row, write the recommendation citing measured numbers + the SLICE-2.0 decision rubric, write §5.1, replace §4.7+ placeholder, update CLAUDE.md + roadmap-progress. NO code changes.

Acceptance Criteria Mapping

The implementation must satisfy all acceptance criteria from SLICE-2.0:

  • Pass 1 covers criteria 1, 2, 3, and the C#-test portions of 10
  • Pass 2 covers criteria 4, 5, 6, and the Pester portions of 10
  • Pass 3 covers criteria 7, 8, 9, 11, 12

Copilot Agent Prompts

Pass 1 — Instrumentation + AppMetrics + interface signature + tests

You are implementing Pass 1 of TASK-2.0 in this repository: instrument
AppStateStore.Update with allocation, lock-wait, calls/sec, and caller-
distribution counters, plus update the IAppStateStore interface signature
and any test fakes. NO measurement-extraction work, NO captures.

## Authoritative references

Read these before making changes:
- docs/specs/SLICE-2.0-store-allocation-profiling.md   (the requirements)
- docs/tasks/TASK-2.0-implement-store-allocation-profiling.md   (this task)
- src/InspectionPrototype.Application/Services/AppStateStore.cs
- src/InspectionPrototype.Application/Abstractions/IAppStateStore.cs
- src/InspectionPrototype.Application/Diagnostics/AppMetrics.cs
- tests/InspectionPrototype.Tests/Stubs/   (folder — every IAppStateStore fake here gets updated)
- tests/InspectionPrototype.Tests/AppMetricsTagDimensionTests.cs   (parallel pattern)
- tests/InspectionPrototype.Tests/AppMetricsEncoderDimensionTests.cs   (parallel pattern)

Spec acceptance criteria 1, 2, 3, and the C#-test portions of 10 are the
definition of done for this pass.

## Scope of this pass

Instrumentation in AppStateStore + interface signature update + three new
AppMetrics counters + test-fake updates + new tests. NO MeasurementExtraction.psm1
changes, NO captures, NO new runbook sections.

## Deliverables

1. AppMetrics (src/InspectionPrototype.Application/Diagnostics/AppMetrics.cs):
   Add three new members:
       Counter<long> StoreUpdateCalls       = _meter.CreateCounter<long>("store.update.calls")
       Counter<long> StoreUpdateAllocBytes  = _meter.CreateCounter<long>("store.update.alloc.bytes")
       Histogram<double> StoreUpdateLockWaitMicros = _meter.CreateHistogram<double>(
           "store.update.lock_wait.micros",
           unit: "us",
           description: "Time spent waiting to acquire the AppStateStore lock, in microseconds.")

   Update the XML doc summary to mention the new counters and the `caller`
   dimension on StoreUpdateCalls. The existing 11 members and 1 gauge are
   unchanged.

2. IAppStateStore (src/InspectionPrototype.Application/Abstractions/IAppStateStore.cs):
   Update the Update signature:
       void Update(
           Func<AppState, AppState> reducer,
           [System.Runtime.CompilerServices.CallerMemberName] string callerMember = "",
           [System.Runtime.CompilerServices.CallerFilePath]   string callerFile   = "");

   The two new parameters are optional with empty-string defaults. The C#
   compiler injects the actual values at every callsite via [CallerMemberName] /
   [CallerFilePath]. NO callsite changes required.

3. AppStateStore (src/InspectionPrototype.Application/Services/AppStateStore.cs):
   Replace the current Update implementation with:

       public void Update(
           Func<AppState, AppState> reducer,
           [CallerMemberName] string callerMember = "",
           [CallerFilePath]   string callerFile   = "")
       {
           string caller = string.IsNullOrEmpty(callerFile)
               ? callerMember
               : $"{Path.GetFileNameWithoutExtension(callerFile)}.{callerMember}";

           var sw = Stopwatch.StartNew();
           long allocBefore = GC.GetAllocatedBytesForCurrentThread();
           AppState next;
           lock (_lock)
           {
               // Lock-wait time = elapsed up to lock acquisition.
               _metrics.StoreUpdateLockWaitMicros.Record(sw.Elapsed.TotalMicroseconds);

               next = reducer(_current);
               _current = next;
           }
           long allocBytes = GC.GetAllocatedBytesForCurrentThread() - allocBefore;
           _metrics.StoreUpdateAllocBytes.Add(allocBytes);
           _metrics.StoreUpdateCalls.Add(
               1, new KeyValuePair<string, object?>("caller", caller));

           StateChanged?.Invoke(next);
       }

   Constructor: AppStateStore now takes `AppMetrics metrics` (DI-injected).
   The lock is held for the same logical duration as before plus one
   Histogram.Record call (~50 ns).

4. DI wiring: AppStateStore is currently registered (probably in
   ApplicationServiceCollectionExtensions). Add the AppMetrics dependency
   to the registration if it's not already a constructor parameter.

5. Test fakes (tests/InspectionPrototype.Tests/Stubs/):
   Every IAppStateStore implementation in this folder gets the two new
   optional parameters in its Update method signature. The defaults match
   IAppStateStore (empty strings). Existing test setups continue to call
   Update without passing the new parameters; the C# compiler resolves them.

   Specifically:
   - RecordingAppStateStore (used in EncoderStreamPipelineServiceTests)
   - Any other IAppStateStore fake under Stubs/

6. Tests under tests/InspectionPrototype.Tests/:
   AppStateStoreInstrumentationTests:

   [Fact] Update_IncrementsStoreUpdateCalls_WithCallerDimension:
     Construct AppStateStore with a fresh AppMetrics; attach a MeterListener
     to the InspectionPrototype meter for store.update.calls; call store.Update
     from a method named e.g. CallerOfUpdate(); assert exactly one event
     with caller="AppStateStoreInstrumentationTests.CallerOfUpdate" was emitted.
     (Verify both the file-name and method-name halves of the dimension.)

   [Fact] Update_IncrementsStoreUpdateAllocBytes_WhenReducerAllocates:
     Reducer creates `new List<int>(1024)` (~4 KB allocation in the lock).
     Call GC.Collect() before the test setup; subscribe the listener to
     store.update.alloc.bytes; call Update once; assert the recorded
     increment is in [3 KB, 8 KB] (loose bounds to absorb runtime overhead).
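     Note: the test must NOT use `new List<int> { … }` because the collection
     initialiser grows the backing array through repeated resizes, so the
     total allocated depends on the growth policy; the capacity constructor
     allocates a single backing array of known size and is more deterministic.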

   [Fact] Update_RecordsStoreUpdateLockWaitMicros_UnderNoContention:
     Single thread; call Update 1000 times serially; subscribe to
     store.update.lock_wait.micros via MeterListener that captures every
     measurement; assert p95 of recorded values < 100.0 (under 100 µs).

   [Fact] Update_LockWait_RegistersUnderContention:
     10 threads each call Update 100 times concurrently with reducers that
     pause briefly under the lock (e.g., `Thread.SpinWait(1000)`); assert
     at least one recorded measurement exceeds 100 µs (lock was contended).
     Bound: assert no measurement exceeds 100 ms (1e5 µs) — that would
     indicate a bug, not just contention.

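   [Fact] Update_StateChanged_FiresOutsideTheLock:
     Hook a StateChanged handler that reads AppStateStore.Current from a
     *different* thread (e.g., Task.Run(() => store.Current)) and waits for
     the result with a short timeout (e.g., 1 s); the getter takes the same
     lock. Both Monitor and System.Threading.Lock are reentrant, so a
     same-thread read would succeed even inside the lock and prove nothing.
     If StateChanged fired *inside* the lock, the cross-thread read would
     block until Update returned and the wait would time out; assert that
     the read completes within the timeout. This is a regression check on
     the existing semantics, not new behavior.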

7. AppMetricsTests / AppMetricsStoreUpdateDimensionTests (or extend
   AppMetricsTagDimensionTests):
   Verify the three new counters/histogram exist on the InspectionPrototype
   meter with the names "store.update.calls", "store.update.alloc.bytes",
   "store.update.lock_wait.micros".

## Constraints

- Do NOT change the AppState record, IAppStateStore.Current, IAppStateStore.StateChanged,
  or AppStateStore.StateChanged firing semantics (still fires outside the lock).
- Do NOT add a new opt-in flag for the instrumentation. It is always-on.
  The performance budget (~250 ns/call × 50 Hz = 12.5 µs/sec total) is
  below the noise floor.
- Do NOT use stack-walk to derive caller. [CallerMemberName] +
  [CallerFilePath] is the spec's chosen mechanism.
- Do NOT add per-field allocation accounting (out of scope for SLICE-2.0).
- Do NOT modify any callsite of AppStateStore.Update outside the test fakes.
  The compiler injects [Caller*] values; explicit callsite changes would
  defeat the purpose of the optional parameters.
- StoreUpdateLockWaitMicros uses Histogram<double> (System.Diagnostics.Metrics,
  .NET 6+), and the Update body reads TimeSpan.TotalMicroseconds (.NET 7+).
  If the project targets a TFM where either is not available, escalate
  before proceeding (check Directory.Build.props for the TFM).

## Verification before you report done

  dotnet build --configuration Release
  dotnet test --configuration Release

The encoder-cadence test SimulatedEncoderSourceTests.ProduceAsync_At200Hz_*
may flake on busy hosts (SLICE-1.3 documented issue). If it fails, re-run
once; if it persistently fails, report and continue — it's not blocking
this pass.

Manual smoke test:
  - Launch interactively (no --scenario flag); app starts with Normal profile.
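  - Run dotnet-counters monitor against the running process with the
    InspectionPrototype meter enabled (--counters InspectionPrototype; custom
    meters are not shown by default) and confirm
    "store.update.calls", "store.update.alloc.bytes", and
    "store.update.lock_wait.micros" appear in the output and increment.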
  - Switch to MultiTag profile; confirm the call rate visibly increases.

## Report format when finished

- files created and modified
- confirmation that all existing tests pass plus new tests
- a single commit hash
- commit message: "feat(app): instrument AppStateStore.Update with alloc / lock-wait / caller counters (pass 1/3 of TASK-2.0)"
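
Reviewer aid, not part of the prompt: the caller-dimension test relies on MeterListener to observe tagged measurements. A minimal sketch of that pattern follows, assuming a throwaway meter named Demo.Store (hypothetical; the real tests attach to the InspectionPrototype meter) and a hypothetical helper class MeterListenerSketch.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical helper for illustration only; not part of the project.
public static class MeterListenerSketch
{
    // Records one tagged increment on a throwaway meter and returns every
    // (value, caller) pair the listener observed.
    public static List<(long Value, string Caller)> CaptureOneCall(string caller)
    {
        var seen = new List<(long Value, string Caller)>();
        using var meter = new Meter("Demo.Store"); // assumed meter name
        Counter<long> calls = meter.CreateCounter<long>("store.update.calls");

        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            // Subscribe only to the one instrument under test.
            if (instrument.Meter.Name == "Demo.Store" &&
                instrument.Name == "store.update.calls")
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
        {
            // Pull the "caller" tag out of the measurement's tag span.
            string tagValue = "";
            foreach (KeyValuePair<string, object?> tag in tags)
            {
                if (tag.Key == "caller") tagValue = tag.Value as string ?? "";
            }
            seen.Add((value, tagValue));
        });
        listener.Start(); // also publishes instruments created before Start

        calls.Add(1, new KeyValuePair<string, object?>("caller", caller));
        return seen;
    }
}
```

Counter measurements are delivered synchronously on the calling thread, so the list is populated by the time Add returns. A test can then assert on both the count of events and the exact caller string.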

Pass 2 — MeasurementExtraction helpers + phase-2-measurements.md skeleton

You are implementing Pass 2 of TASK-2.0. Pass 1 (AppStateStore
instrumentation, three new AppMetrics counters, IAppStateStore signature
update, test fakes, instrumentation tests) is already merged. This pass
adds the four MeasurementExtraction helpers, extends ConvertTo-MeasurementRow
to emit four new rows, and creates the phase-2-measurements.md skeleton.

## Authoritative references

Read these before making changes:
- docs/specs/SLICE-2.0-store-allocation-profiling.md   (criteria 4, 5, 6)
- src/InspectionPrototype.Application/Diagnostics/AppMetrics.cs   (Pass 1 counters)
- tools/MeasurementExtraction.psm1
- tests/Tools/MeasurementExtraction.Tests.ps1
- docs/reviews/phase-1-measurements.md   (mirror its structure)

Pass 1 must be merged. Confirm by inspecting AppMetrics.StoreUpdateCalls /
StoreUpdateAllocBytes / StoreUpdateLockWaitMicros before starting.

## Scope of this pass

PowerShell helpers, ConvertTo-MeasurementRow extension, Pester tests, new
phase-2-measurements.md file. NO captures. NO C# changes.

## Deliverables

1. tools/MeasurementExtraction.psm1:

   function Get-StoreUpdateRate {
       [CmdletBinding()]
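       # AllowNull/AllowEmptyCollection: a Mandatory array parameter otherwise
       # rejects $null and @() at bind time, before the guard below can run.
       param([Parameter(Mandatory)][AllowNull()][AllowEmptyCollection()][object[]] $Csv,
             [Parameter(Mandatory)][double]   $DurationSeconds)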
       if (-not $Csv -or $Csv.Count -eq 0 -or $DurationSeconds -le 0) { return $null }
       $rows = @($Csv | Where-Object {
           $_.'Counter Name' -match 'store\.update\.calls'
       })
       if ($rows.Count -eq 0) { return $null }
       $total = ($rows | Measure-Object -Property 'Mean/Increment' -Sum).Sum
       return [math]::Round([double]$total / $DurationSeconds, 1)
   }

   function Get-StoreUpdateAllocShare {
       [CmdletBinding()]
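       # AllowNull/AllowEmptyCollection: a Mandatory array parameter otherwise
       # rejects $null and @() at bind time, before the guard below can run.
       param([Parameter(Mandatory)][AllowNull()][AllowEmptyCollection()][object[]] $Csv)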
       if (-not $Csv -or $Csv.Count -eq 0) { return $null }
       $storeRows = @($Csv | Where-Object {
           $_.'Counter Name' -match 'store\.update\.alloc\.bytes'
       })
       $totalRows = @($Csv | Where-Object {
           $_.'Counter Name' -match 'dotnet\.gc\.heap\.total_allocated'
       })
       if ($storeRows.Count -eq 0 -or $totalRows.Count -eq 0) { return $null }
       $storeBytes = ($storeRows | Measure-Object -Property 'Mean/Increment' -Sum).Sum
       $totalBytes = ($totalRows | Measure-Object -Property 'Mean/Increment' -Sum).Sum
       if ($totalBytes -le 0) { return $null }
       return [math]::Round(100.0 * [double]$storeBytes / [double]$totalBytes, 1)
   }

   function Get-StoreUpdateLockWaitP95 {
       [CmdletBinding()]
       param([object[]] $Csv)
       if (-not $Csv -or $Csv.Count -eq 0) { return $null }
       # dotnet-counters emits histogram counters as percentile rows in the
       # Counter Name column, e.g.
       #   "store.update.lock_wait.micros (us)[Percentile=95]"
       # If the runtime emits a different format, this helper falls back to
       # parsing every row matching store.update.lock_wait.micros and taking
       # the 95th percentile of the Mean/Increment values.
       $p95Rows = @($Csv | Where-Object {
           $_.'Counter Name' -match 'store\.update\.lock_wait\.micros.*Percentile=95'
       })
       if ($p95Rows.Count -gt 0) {
           $vals = $p95Rows | ForEach-Object { [double]$_.'Mean/Increment' }
           return [math]::Round(($vals | Measure-Object -Average).Average, 1)
       }
       # Fallback: aggregate all histogram-named rows.
       $allRows = @($Csv | Where-Object {
           $_.'Counter Name' -match 'store\.update\.lock_wait\.micros'
       })
       if ($allRows.Count -eq 0) { return $null }
       # @(...) keeps $vals an array even when only one row matches.
       $vals = @($allRows | ForEach-Object { [double]$_.'Mean/Increment' } | Sort-Object)
       $idx = [int][math]::Ceiling($vals.Count * 0.95) - 1
       if ($idx -lt 0) { $idx = 0 }
       return [math]::Round($vals[$idx], 1)
   }

   function Get-StoreUpdateCallerDistribution {
       [CmdletBinding()]
       param([object[]] $Csv)
       # Returns top 5 callers as [pscustomobject]@{Caller=…; Calls=…; SharePct=…}
       if (-not $Csv -or $Csv.Count -eq 0) { return @() }
       $rows = @($Csv | Where-Object {
           $_.'Counter Name' -match 'store\.update\.calls'
       })
       if ($rows.Count -eq 0) { return @() }
       # Group by caller dimension extracted from Counter Name, e.g.
       #   "store.update.calls (Count / 1 sec)[caller=FramePipelineService.ExecuteAsync]"
       $byCaller = @{}
       foreach ($row in $rows) {
           if ($row.'Counter Name' -match 'caller=([^,\]]+)') {
               $caller = $matches[1]
               $val = [double]$row.'Mean/Increment'
               if ($byCaller.ContainsKey($caller)) {
                   $byCaller[$caller] += $val
               } else {
                   $byCaller[$caller] = $val
               }
           }
       }
       $totalCalls = ($byCaller.Values | Measure-Object -Sum).Sum
       if ($totalCalls -le 0) { return @() }
       $sorted = $byCaller.GetEnumerator() |
           Sort-Object -Property Value -Descending |
           Select-Object -First 5
       return $sorted | ForEach-Object {
           [pscustomobject]@{
               Caller   = $_.Key
               Calls    = [int]$_.Value
               SharePct = [math]::Round(100.0 * $_.Value / $totalCalls, 1)
           }
       }
   }

   Update ConvertTo-MeasurementRow to call all four helpers and emit FOUR
   new rows after the existing 22-metric block:

   | store.update rate (calls/s)         | <Get-StoreUpdateRate or "—"> |
   | store.update alloc share (%)        | <Get-StoreUpdateAllocShare or "—"> |
   | store.update lock-wait p95 (µs)     | <Get-StoreUpdateLockWaitP95 or "—"> |
   | store.update top-caller             | <top entry from Get-StoreUpdateCallerDistribution as "Caller (Calls × Share%)" or "—"> |

   The full Get-StoreUpdateCallerDistribution output (top 5) is NOT written
   to the table; the row's Notes section consumes it.

   Update Export-ModuleMember to include the four new functions.

2. tests/Tools/MeasurementExtraction.Tests.ps1:
   Six new Pester tests:

   - StoreUpdateRate_OnFixtureWith600Calls_Returns20:
       synthetic CSV whose store.update.calls rows sum to 600 in
       Mean/Increment (e.g., 600 rows of value 1) over 30 s; expect 20.0
   - StoreUpdateRate_OnEmptyCsv_ReturnsNull
   - StoreUpdateAllocShare_OnFixture_ComputesPercent:
       synthetic CSV with store.update.alloc.bytes summing to 100 MB and
       dotnet.gc.heap.total_allocated summing to 1000 MB; expect 10.0
   - StoreUpdateLockWaitP95_OnPercentileRow_ReadsP95:
       synthetic CSV with a Percentile=95 row of value 47.5; expect 47.5
   - StoreUpdateLockWaitP95_OnFallbackPath_ComputesFromAllRows:
       synthetic CSV with 100 lock_wait.micros rows of values 1..100 (no
       percentile rows); expect ≈ 95.0 from manual percentile compute
   - StoreUpdateCallerDistribution_OnFixture_ReturnsTopFiveSorted:
       synthetic CSV with 7 distinct caller dimensions with known counts;
       expect 5 entries, sorted descending by Calls, with SharePct summing
       to ≤ 100 (the omitted 2 callers' share is the remainder)

   Also: ConvertTo-MeasurementRow_AppendsFourNewRows_WhenStoreCountersPresent
   and ConvertTo-MeasurementRow_EmitsDashWhenStoreCountersAbsent (mirrors
   the SLICE-1.3 / 1.4 sentinel pattern).

3. docs/reviews/phase-2-measurements.md:
   New file, mirrors phase-1-measurements.md structure:

       # Phase 2 Measurements
       ...one-paragraph header explaining the file's purpose: companion to
       phase-1-measurements.md; rows added as Phase 2 slices land...
       ## Conventions
       ...same conventions text as phase-1-measurements.md...
       ### Fixed metric set
       ...same 22 base metrics + 4 store-side metrics from SLICE-2.0...
       ## Phase 2 rows
       (placeholder; first row added by Pass 3)

   Cross-reference phase-1-measurements.md from the header. The file is
   committed empty-of-rows in this pass; Pass 3 fills the slice-2-0 row.

## Constraints

- Do NOT make any C# changes.
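- Do NOT commit any captures. The only capture allowed in this pass is the
  60-second smoke capture under Verification, and its CSV is deleted before
  commit.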
- The four new helpers must NOT throw on empty / malformed CSV input;
  return $null or @() per the existing convention.
- Pester tests must NOT depend on real captures in docs/captures/. Use
  synthetic in-memory PSCustomObject arrays (existing pattern in the file).
- The histogram-percentile parsing is best-effort. dotnet-counters' CSV
  format for histograms is not fully stable across versions; the
  fallback-to-manual-percentile path covers older releases.

## Verification before you report done

  dotnet build --configuration Release
  dotnet test --configuration Release

  Pester: Invoke-Pester tests/Tools/MeasurementExtraction.Tests.ps1
  All eight new tests (the six helper tests plus the two
  ConvertTo-MeasurementRow tests) pass plus the existing tests.

  Manual smoke capture (60 seconds, MultiTag profile):
    tools/Capture-Measurements.ps1 -Scenario MultiTagSoak `
      -DurationSeconds 60 -Profile MultiTag `
      -OutputCsv docs/captures/_smoke.csv `
      -CommitHash $(git rev-parse --short HEAD) -AllowDirty
  Verify:
    * exit code 0
    * the printed row block has 26 metrics; the four new store.update rows
      are present with non-"—" values
    * store.update top-caller is one of the expected callers
      (FramePipelineService.* / TagStreamPipelineService.* / WorkflowService.* /
      MainViewModel.*)
  Delete the smoke CSV before commit.

## Report format when finished

- files created and modified
- confirmation all C# tests + Pester tests pass
- the smoke-capture printed row block (the 26-metric markdown table)
- a single commit hash
- commit message: "feat(tools): add store.update measurement helpers and phase-2-measurements skeleton (pass 2/3 of TASK-2.0)"
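
Reviewer aid, not part of the prompt: the fallback branch of Get-StoreUpdateLockWaitP95 is a nearest-rank percentile. The same arithmetic is sketched below in C# so the index computation can be checked in isolation; PercentileSketch and NearestRank are hypothetical names, not existing project code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the project.
public static class PercentileSketch
{
    // Nearest-rank percentile: sort ascending, take the element at
    // ceil(n * percentile) - 1, clamped into the valid index range.
    // Returns null for null or empty input, mirroring the helpers' $null.
    public static double? NearestRank(IEnumerable<double> samples, double percentile)
    {
        if (samples is null) return null;
        double[] sorted = samples.OrderBy(v => v).ToArray();
        if (sorted.Length == 0) return null;

        int idx = (int)Math.Ceiling(sorted.Length * percentile) - 1;
        idx = Math.Clamp(idx, 0, sorted.Length - 1);
        return Math.Round(sorted[idx], 1);
    }
}
```

For the values 1..100 and a percentile of 0.95, the index is ceil(100 × 0.95) − 1 = 94, which selects the value 95 and matches the expectation in the fallback-path Pester fixture.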

Pass 3 — 30-min MultiTag capture + row + Phase 2 prioritization recommendation + runbook §5.1

You are implementing Pass 3 of TASK-2.0, the final pass. Passes 1 and 2
are merged. This pass runs the 30-minute MultiTag capture, appends the
slice-2-0-store-profiling row to phase-2-measurements.md, writes the
Phase 2 prioritization recommendation in the row's Notes section, writes
runbook §5.1 (replacing the §4.7+ placeholder added in SLICE-1.4 close-out),
and updates session-handoff documents. NO code changes.

## Authoritative references

Read these before making changes:
- docs/specs/SLICE-2.0-store-allocation-profiling.md   (criteria 7, 8, 9, 11, 12 + decision rubric)
- docs/runbook/capturing-measurements.md   (existing §3a, §4.x, current §4.7+ placeholder)
- docs/reviews/phase-2-measurements.md   (Pass 2 skeleton)
- docs/reviews/phase-1-measurements.md   (slice-1-1-multi-tag-telemetry baseline)
- CLAUDE.md, docs/reviews/roadmap-progress.md
- tools/Capture-Measurements.ps1

## Scope of this pass

Capture, table edit, runbook §5.1, session-handoff updates. No code.

## Deliverables

1. Disable system sleep before capture (TASK-1.5.1 follow-up runbook
   discipline; reaffirmed across SLICE-1.2/1.3/1.4):
       powercfg /change standby-timeout-ac 0
       powercfg /change monitor-timeout-ac 0
   Note the previous values before running these commands; step 8 restores
   them.

2. Run the 30-minute MultiTag capture:
       $date = Get-Date -Format 'yyyy-MM-dd'
       tools/Capture-Measurements.ps1 -Scenario MultiTagSoak `
         -DurationSeconds 1800 -Profile MultiTag `
         -OutputCsv "docs/captures/slice-2-0-store-profiling-$date.csv" `
         -CommitHash $(git rev-parse --short HEAD) `
         -SliceTag slice-2-0-store-profiling

   Verify:
       * exit code 0; CSV span ≥ 1700 s
       * the printed row block has 26 metrics; the four new store.update
         rows have non-"—" values
       * store.update top-caller is one of the expected callers
       * regression check: the first 22 metrics (frames/tags/encoder/GC/CPU)
         are within ±10% of slice-1-1-multi-tag-telemetry's values
         (criterion 11). If they aren't, the instrumentation is too heavy;
         STOP and file a follow-up.

3. Append the row block to docs/reviews/phase-2-measurements.md (in the
   Phase 2 rows section that Pass 2 left empty):

   ### Row — slice-2-0-store-profiling

   - **Scenario:** §5.1 Store-allocation profiling (30 min, `MultiTag`) via `MultiTagSoakFlaUi`
   - **Capture:** `docs/captures/slice-2-0-store-profiling-<date>.csv` (<span> s)
   - **Profile:** `MultiTag`
   - **Commit:** <hash>
   - **Date:** <today>

   26-metric table mirroring the slice-1-1-multi-tag-telemetry format:
   Slice | Metric | Baseline | After | Delta | Capture method | Date
   - Baseline column = slice-1-1-multi-tag-telemetry values for the 22 base
     metrics; "—" for the four new SLICE-2.0 metrics.

   ### Notes on slice-2-0-store-profiling

   Mandatory subsections:

   (a) **Why slice-1-1 is the baseline.** Both rows use the `MultiTag`
       profile and the same scenario (`MultiTagSoakFlaUi`); the only
       intentional difference is SLICE-2.0's instrumentation. Reproducibility
       is criterion 11.

   (b) **Caller distribution (top 5).** A markdown table with columns
       Caller / Calls / Share% from Get-StoreUpdateCallerDistribution. Walk
       through what each top caller does and why it appears at that rank.

   (c) **Phase 2 prioritization recommendation.** Apply the SLICE-2.0
       decision rubric using the captured numbers. Cite each measured value
       and the matching rubric line. Pick exactly one outcome:
         - "SLICE-2.1 opens (alloc share = X%, ≥ 10% threshold)"
         - "SLICE-2.3 (tags) opens (top caller = TagStreamPipelineService.*, share = X%)"
         - "SLICE-2.3 (frames) opens (top caller = FramePipelineService.*, share = X%)"
         - "SLICE-2.4 opens (lock-wait p95 = X µs, ≥ 100 µs threshold)"
         - "Phase 2 deferred entirely (alloc share = X%, lock-wait p95 = Y µs;
            store is doing exactly what it was designed for)"
       The recommendation must be reproducible from the row's data alone —
       a reviewer reading only the Notes section should be able to derive
       the same conclusion.

   (d) **What surprised, if anything.** Working-set drift over the 30 min,
       any unexpected callers in the top-5, anomalous lock-wait outliers, etc.

4. Add §5.1 to docs/runbook/capturing-measurements.md:
   Replace the existing "### 4.7+ — pending Phase 2 scenarios" placeholder
   with a new "## §5 Phase 2 scenarios" section heading and "### 5.1 Store-
   allocation profiling — SLICE-2.0, MultiTag profile" subsection:
       - one-paragraph rationale linking back to SLICE-2.0
       - 30-minute step list mirroring §4.5/§4.6 with profile = MultiTag
       - sanity checks: 26-metric row block, store.update top-caller is
         identifiable, alloc share is in [0, 100]
       - what the captured numbers MEAN (cross-reference SLICE-2.0 decision
         rubric)
       - Implemented by: `MultiTagSoakFlaUi` with `--profile MultiTag`
   Add a new "### 5.2+ pending Phase 2 scenarios" placeholder at the end
   listing the slices the Pass 3 recommendation pointed at (or "Reserved
   for future Phase 2 slices" if Phase 2 was deferred).

5. Update CLAUDE.md "Current position" block:
   - Phase: 2 (Store under pressure) — SLICE-2.0 complete; <decision-outcome>
   - Last completed action: TASK-2.0 Pass 3 — captured 30-min MultiTag
     row (alloc share = X%, lock-wait p95 = Y µs, top caller = Z). Phase 2
     prioritization recommendation: <one-line summary>. Commit <hash>.
   - Next action: <if a Phase 2 slice opens> Open SLICE-2.X (next slice
     name) — write spec + task. <or if deferred> Phase 2 deferred; consider
     opening Phase 3 evaluation.
   - Blocked on: nothing
   - Last updated: <today>

6. Append a session-log entry to docs/reviews/roadmap-progress.md:
       ### <today's date> — TASK-2.0 Pass 3 closed; Phase 2 prioritization decided
       - SLICE-2.0 closed; CSV at docs/captures/slice-2-0-store-profiling-<date>.csv
         (<span> s).
       - Headline: store.update rate = X calls/s; alloc share = Y%;
         lock-wait p95 = Z µs; top caller = <caller> (<share>%).
       - Phase 2 prioritization recommendation: <decision per rubric>.
       - Reproducibility check vs slice-1-1-multi-tag-telemetry: passed
         (≤ 10% drift on each base metric).
       - Commit <hash>. Next: <next slice or deferral>.

7. Mark SLICE-2.0 row in the progress table (Phase 2 section) as Completed.
   Add the SLICE-2.X row(s) the recommendation points at as Proposed (or
   leave Phase 2 section empty + add a "Phase 2 deferred" banner if that
   was the decision).

8. Restore powercfg settings:
       powercfg /change standby-timeout-ac <previous-minutes>
       powercfg /change monitor-timeout-ac <previous-minutes>

## Constraints

- Do NOT make any code or test changes.
- Do NOT skip the 30-minute capture.
- Do NOT capture without disabling sleep first.
- Do NOT pre-decide the Phase 2 prioritization. Apply the rubric to the
  measured numbers; write whichever outcome the data supports.
- Do NOT proceed to step 3 (table edit) if the reproducibility check
  (criterion 11) fails. If the instrumentation perturbs the data plane
  by more than 10%, STOP and file a follow-up.

## Verification before you report done

  dotnet build --configuration Release
  dotnet test --configuration Release

Plus:
  - docs/captures/slice-2-0-store-profiling-<date>.csv exists and is committed
  - the slice-2-0-store-profiling row block is in phase-2-measurements.md
    with all 26 metrics filled
  - the row's Notes section has all four mandatory subsections
  - §5.1 renders correctly (no broken markdown / links)
  - CLAUDE.md current-position reflects SLICE-2.0 closure and the Phase 2
    decision

## Report format when finished

- files created and modified
- the captured 26-metric row block included verbatim
- the four headline numbers (rate, alloc share, lock-wait p95, top caller)
  with the matching decision-rubric outcome
- a single commit hash
- commit message: "feat(measurements): SLICE-2.0 row + Phase 2 prioritization recommendation; runbook §5.1 (pass 3/3 of TASK-2.0)"
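
Reviewer aid, not part of the prompt: the ±10% regression check against slice-1-1-multi-tag-telemetry can be expressed as a per-metric relative-drift comparison. The sketch below is one possible reading, under the assumptions that drift is measured relative to the baseline value and that a zero baseline passes only when the new value is also zero. DriftSketch, MetricsOutsideTolerance, and any metric names passed to it are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the project.
public static class DriftSketch
{
    // Returns the names of baseline metrics whose new value is missing or
    // drifts from the baseline by more than `tolerance` (0.10 = 10%).
    public static List<string> MetricsOutsideTolerance(
        IReadOnlyDictionary<string, double> baseline,
        IReadOnlyDictionary<string, double> after,
        double tolerance = 0.10)
    {
        var failures = new List<string>();
        foreach (var (name, baseValue) in baseline)
        {
            if (!after.TryGetValue(name, out double afterValue))
            {
                failures.Add(name); // metric disappeared from the new capture
                continue;
            }
            if (baseValue == 0.0)
            {
                // Assumption: relative drift is undefined at zero, so only
                // an exact zero counts as reproducing the baseline.
                if (afterValue != 0.0) failures.Add(name);
                continue;
            }
            double drift = Math.Abs(afterValue - baseValue) / Math.Abs(baseValue);
            if (drift > tolerance) failures.Add(name);
        }
        return failures;
    }
}
```

An empty result corresponds to the reproducibility check passing. Any returned name is a metric to cite when stopping and filing the follow-up.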

Operator notes

  • One pass per Copilot session. Same protocol as TASK-1.3 / TASK-1.4.
  • Pass 1's load-bearing detail is the [Caller*] parameters on the interface. If those are missed, every existing callsite breaks at compile time and Copilot will instinctively "fix" by passing strings explicitly at every callsite. That defeats the slice's purpose and pollutes the diff. Reviewers must confirm IAppStateStore.Update carries the two optional parameters and no callsite outside tests/.../Stubs/ was modified.
  • Pass 2's histogram parsing is fragile across dotnet-counters versions. The fallback-to-manual-percentile path is intentional. If the captured CSV contains neither Percentile=95 rows nor regular lock_wait.micros rows, the helper returns $null and the row reads "—" — that's a runtime issue (instrumentation didn't fire) not a parser issue.
  • Pass 3's prioritization recommendation is the slice's whole point. A row that documents the numbers but defers the decision back to "future judgment" is not done. The decision rubric in the SLICE-2.0 spec is deliberately mechanical so the recommendation reads from the data, not from intuition.
  • The deferred-Phase-2 outcome is a first-class success. If alloc share < 10% AND lock-wait p95 < 100 µs AND top caller is workflow-state-machine code, the right Phase 2 plan is "do nothing, open Phase 3." That outcome saves weeks of work the roadmap hadn't budgeted for. Don't try to manufacture a refactor.
  • Update the index files only at the end of the phase, not per-slice. Same rationale as earlier tasks. Phase 2's exit-gate banner (or deferral banner) goes into roadmap-progress.md after Pass 3 lands.
