TASK-3.3: Implement SQLite Persistence + Schema Versioning
- Status: Proposed (no passes started)
- Date: 2026-05-07
- Spec: SLICE-3.3: SQLite Persistence
- Depends on: SLICE-002: Persistent Run History (the JSON store being replaced), SLICE-1.6: FlaUI Capture (the rig that captures the row block)
Objective
Replace the JSON file-backed run history with a SQLite-backed implementation using Dapper (not EF Core), introduce a forward-only migration infrastructure that future Phase 3 slices can extend, add alarm-history persistence (currently in-memory only), and produce a measurement row that quantifies persistence pressure under a 10 000-row pre-populated database. One-time import of existing run-history.json content runs on first SQLite-mode startup.
Scope
- Add
Dapper+Microsoft.Data.SqlitetoDirectory.Packages.props - New
Infrastructure/Data/Migrations/folder withM001_initial_schema.sql(embedded resource) andMigrationRunner(IHostedService) - New
SqlitePersistenceOptions(Persistence:Sqliteconfig block) IRunHistoryStoreinterface evolution:LoadAsync()removed;LoadRecentAsync(int),LoadPageAsync(int, int),CountAsync(),GetAsync(Guid)addedSqliteRunHistoryStore(Dapper) replacesJsonRunHistoryStore(the old file is deleted)- New
IAlarmHistoryStore+SqliteAlarmHistoryStore;AlarmHistoryEntryrecord WorkflowServiceintegration with the alarm-history store (OnFaultInjected/OnFaultCleared/AcknowledgeFault)JsonImportService(IHostedService) — one-time idempotent import of existing JSON fileAppState.TotalRunCountfield added;HistoryHydrationServiceandWorkflowService.RunLoopAsync.finallyupdated- New
MeasurementExtraction.psm1helpers:Get-RunsPersistedCount,Get-AlarmsPersistedCount,Get-RecentHistoryLoadP95 - New file
docs/reviews/phase-3-measurements.md; new runbook §5.2 entry - 30-min
MultiTagcapture against a 10K-row pre-populated database; row block appended
Non-Scope
- Per-run
TagSamplesnapshot persistence (deferred to a follow-up after SLICE-3.1 ships rich defects) - Rich defect persistence (
Defect(Id, FrameId, BoundingBox, Classification, Confidence, ImageRef)) — SLICE-3.1's whole point InspectionResultper-frame persistence — in-memory only continues- EF Core or any other ORM — explicitly Dapper
- Concurrent-writer scenarios across multiple WPF processes — single-instance mutex (SLICE-006) prevents this
- Backup / restore tooling —
inspection.dbcopy is the entire procedure - Schema rollback — migrations are forward-only
- UI changes to display
TotalRunCountor paginate older history — Phase 3.1 / 3.2 concern - Removing
RunHistoryStoreOptions— kept as the import source path; removal in a follow-up - Encryption-at-rest, role-based access — Phase 3.4 / Phase 4 territory
Touched Projects
src/InspectionPrototype.Application—IRunHistoryStoreinterface evolution; newIAlarmHistoryStore+AlarmHistoryEntry;AppState.TotalRunCount;HistoryHydrationServiceupdated;WorkflowServicealarm-history callssrc/InspectionPrototype.Infrastructure— newData/Migrations/MigrationRunner.cs,Data/Migrations/M001_initial_schema.sql(embedded),Data/SqlitePersistenceOptions.cs,Data/SqliteRunHistoryStore.cs,Data/SqliteAlarmHistoryStore.cs,Data/JsonImportService.cs; deleteData/JsonRunHistoryStore.cs; DI wiring inInfrastructureServiceCollectionExtensionssrc/InspectionPrototype.Application/State/AppState.cs— addlong TotalRunCounttests/InspectionPrototype.Tests—MigrationRunnerTests,SqliteRunHistoryStoreTests,SqliteRunHistoryStorePerformanceTests,SqliteAlarmHistoryStoreTests,JsonImportServiceTests,WorkflowServiceAlarmHistoryIntegrationTests; update existing tests that referencedLoadAsync()tools/MeasurementExtraction.psm1— three new helpers;ConvertTo-MeasurementRowextensiontests/Tools/MeasurementExtraction.Tests.ps1— Pester tests for the new helpersdocs/runbook/capturing-measurements.md— new §5.2 entrydocs/reviews/phase-3-measurements.md— new filedocs/captures/— new CSV evidenceDirectory.Packages.props— addDapperandMicrosoft.Data.Sqlite- (no changes to)
IFrameSource,ITagStream,IEncoderStream, simulator profiles, FlaUI rig, any UI code
AI Tool Guidance
Three Copilot passes; one-pass-per-session protocol as in TASK-1.3 / TASK-1.4.
- SQLite + migration runner + run history. Add packages; write M001 SQL; implement
MigrationRunner; implementSqliteRunHistoryStore(Dapper); evolveIRunHistoryStore; deleteJsonRunHistoryStore; implementJsonImportService; updateAppState,HistoryHydrationService,WorkflowService.RunLoopAsync.finally; tests including the 10K-row pagination benchmark. NO alarm-history work. NO captures. - Alarm history persistence. Implement
IAlarmHistoryStore+SqliteAlarmHistoryStore; integrateWorkflowService.OnFaultInjected/OnFaultCleared/AcknowledgeFault; tests including a no-op verification on missing rows. NO captures. - Capture + row block + runbook §5.2. Pre-populate the database to 10K rows; run 30-min MultiTag capture; append row block to
phase-3-measurements.md(new file); write runbook §5.2; update CLAUDE.md + roadmap-progress; assess whether the row triggers any Phase 2 slice. NO code changes.
Acceptance Criteria Mapping
The implementation must satisfy all acceptance criteria from SLICE-3.3:
- Pass 1 covers criteria 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17 (the run-history portions)
- Pass 2 covers criteria 6, 7, and the alarm-history portions of 17
- Pass 3 covers criteria 14, 15, 16
Copilot Agent Prompts
Pass 1 — SQLite + migration runner + run history
You are implementing Pass 1 of TASK-3.3 in this repository: replace the JSON
file-backed run history with a SQLite-backed implementation using Dapper,
introduce a forward-only migration infrastructure, and update the AppState
shape + WorkflowService integration to use pagination instead of loading
all history into memory.
NO alarm-history work (Pass 2). NO captures (Pass 3).
## Authoritative references
Read these before making changes:
- docs/specs/SLICE-3.3-sqlite-persistence.md (the requirements)
- docs/tasks/TASK-3.3-implement-sqlite-persistence.md (this task)
- src/InspectionPrototype.Application/Abstractions/IRunHistoryStore.cs
- src/InspectionPrototype.Infrastructure/Data/JsonRunHistoryStore.cs (the file being replaced)
- src/InspectionPrototype.Infrastructure/Data/RunHistoryStoreOptions.cs (kept; import-source path)
- src/InspectionPrototype.Application/Services/HistoryHydrationService.cs
- src/InspectionPrototype.Application/Services/WorkflowService.cs (RunLoopAsync.finally)
- src/InspectionPrototype.Application/State/AppState.cs
- src/InspectionPrototype.Domain/Contracts/RunSummary.cs
- src/InspectionPrototype.Infrastructure/InfrastructureServiceCollectionExtensions.cs
- Directory.Packages.props
## Scope of this pass
Migration infrastructure + SQLite run-history store + interface evolution +
JSON import + AppState/HydrationService/WorkflowService updates + tests.
NO alarm-history. NO captures. NO new measurement file.
## Deliverables
1. Directory.Packages.props:
Add (alphabetical placement appropriate):
<PackageVersion Include="Dapper" Version="2.1.66" />
<PackageVersion Include="Microsoft.Data.Sqlite" Version="10.0.0" />
Reference both from src/InspectionPrototype.Infrastructure/InspectionPrototype.Infrastructure.csproj
and from tests/InspectionPrototype.Tests/InspectionPrototype.Tests.csproj.
2. Embedded migration SQL:
src/InspectionPrototype.Infrastructure/Data/Migrations/M001_initial_schema.sql:
(the schema from spec §"Tables (initial migration M001)")
- schema_version table + initial row insert (version 1)
- run_summaries table + idx_run_summaries_started_at_utc index
- alarm_history table + 2 indexes (idx_alarm_history_raised_at_utc,
idx_alarm_history_run_id)
Configure the .csproj to embed the .sql file as a resource:
<ItemGroup>
<EmbeddedResource Include="Data\Migrations\*.sql" />
</ItemGroup>
3. SqlitePersistenceOptions
(src/InspectionPrototype.Infrastructure/Data/SqlitePersistenceOptions.cs):
public sealed class SqlitePersistenceOptions {
public const string SectionName = "Persistence:Sqlite";
public string DatabasePath { get; set; } = string.Empty;
public string JournalMode { get; set; } = "WAL";
}
4. MigrationRunner
(src/InspectionPrototype.Infrastructure/Data/Migrations/MigrationRunner.cs):
- Implements IHostedService.
- StartAsync:
1. Build connection string from SqlitePersistenceOptions.DatabasePath.
Format: "Data Source={path};Mode=ReadWriteCreate;Foreign Keys=True;Cache=Shared"
2. Open SqliteConnection; PRAGMA journal_mode = <opts.JournalMode>.
3. Ensure schema_version table exists (CREATE TABLE IF NOT EXISTS) —
this is the bootstrap step before applying M001 itself.
4. Read MAX(version) FROM schema_version, default 0.
5. Enumerate embedded resources matching "Migrations/M*.sql" pattern,
sort by file name (M001 before M002 etc.), parse out the version
number from the file name.
6. For each migration with version > current:
a. BeginTransaction
b. ExecuteAsync(sqlText, transaction)
c. INSERT INTO schema_version (version, applied_at) VALUES (@v, datetime('now'))
d. Commit
e. Log Info "Applied migration M{NNN}"
7. On any exception during a migration: rollback that migration's
transaction, log Critical with the migration name + exception,
and rethrow so host startup aborts.
- StopAsync: Task.CompletedTask.
5. IRunHistoryStore (interface evolution):
src/InspectionPrototype.Application/Abstractions/IRunHistoryStore.cs:
Task<IReadOnlyList<RunSummary>> LoadRecentAsync(int count = 50, CancellationToken ct = default);
Task<IReadOnlyList<RunSummary>> LoadPageAsync(int skip, int take, CancellationToken ct = default);
Task<long> CountAsync(CancellationToken ct = default);
Task<RunSummary?> GetAsync(Guid runId, CancellationToken ct = default);
Task SaveAsync(RunSummary summary, CancellationToken ct = default);
The previous LoadAsync() (returning all) is REMOVED. Update all
callers (HistoryHydrationService is the only production caller; one
or two test fakes may also need updating).
6. SqliteRunHistoryStore
(src/InspectionPrototype.Infrastructure/Data/SqliteRunHistoryStore.cs):
- Implements IRunHistoryStore.
- Constructor: IOptions<SqlitePersistenceOptions> options, ILogger<SqliteRunHistoryStore> logger
- Each method opens a fresh SqliteConnection, awaits, disposes via using.
- SaveAsync: parameterized INSERT OR REPLACE INTO run_summaries (...) VALUES (...).
Serialize MajorAlarms list to JSON via System.Text.Json.JsonSerializer.Serialize.
- LoadRecentAsync(count): SELECT ... ORDER BY started_at_utc DESC LIMIT @count
- LoadPageAsync(skip, take): SELECT ... ORDER BY started_at_utc DESC LIMIT @take OFFSET @skip
- CountAsync: SELECT COUNT(*) FROM run_summaries
- GetAsync(runId): SELECT ... WHERE run_id = @runId
- Dapper mapping: define a private DTO record matching the column shape;
the mapper deserializes major_alarms_json back to IReadOnlyList<string>
and converts terminal_status string back to RunTerminalStatus enum.
7. JsonImportService
(src/InspectionPrototype.Infrastructure/Data/JsonImportService.cs):
- IHostedService. Constructor takes IRunHistoryStore + IOptions<RunHistoryStoreOptions>
+ IOptions<SqlitePersistenceOptions> + ILogger.
- StartAsync:
1. Check if run_summaries already has rows: var count = await _store.CountAsync()
If count > 0: log Debug "Skipping JSON import (SQLite already has rows)"; return.
2. Check if RunHistoryStoreOptions.FilePath exists: if no, return.
3. Try to JsonSerializer.DeserializeAsync<List<RunSummary>>(stream).
On JsonException: rename file to FilePath + ".malformed-{utcTs}";
log Warning; return.
On any other exception: rename file to FilePath + ".error-{utcTs}";
log Warning; return.
4. Iterate (in original order — JsonRunHistoryStore writes newest-first;
we want oldest first into SQLite so default ordering matches):
foreach (var summary in items.AsEnumerable().Reverse())
await _store.SaveAsync(summary);
5. Rename FilePath to FilePath + ".imported-{utcTs}". Log Info
"Imported {N} run summaries from legacy JSON store".
- StopAsync: Task.CompletedTask.
8. DI wiring (InfrastructureServiceCollectionExtensions):
Replace the current JsonRunHistoryStore registration:
services.AddSingleton<IRunHistoryStore, JsonRunHistoryStore>();
With:
services.AddOptions<SqlitePersistenceOptions>().Configure(o => {
o.DatabasePath = Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
"LcnWaferInspection",
"inspection.db");
o.JournalMode = "WAL";
});
services.AddSingleton<IRunHistoryStore, SqliteRunHistoryStore>();
services.AddHostedService<MigrationRunner>(); // priority: very high
services.AddHostedService<JsonImportService>(); // priority: between MigrationRunner and HistoryHydrationService
IHostedService priority is determined by REGISTRATION ORDER. Verify the
sequence: AddInfrastructure → AddApplication. Inside AddInfrastructure,
register MigrationRunner BEFORE HistoryHydrationService. Confirm by
tracing the AddInfrastructure / AddApplication call sites in
App.OnStartup.cs (or wherever the DI is composed).
9. Delete file: src/InspectionPrototype.Infrastructure/Data/JsonRunHistoryStore.cs
Verify no remaining references via solution-wide search before commit.
10. AppState evolution:
src/InspectionPrototype.Application/State/AppState.cs:
Add a new property at the end of the record (after CrashBanner):
long TotalRunCount = 0L
Update AppState.Initial accordingly.
11. HistoryHydrationService (Application.Services):
Replace the LoadAsync() call:
var history = await _historyStore.LoadAsync(cancellationToken);
_appState.Update(s => s with {
RunHistory = history,
LastRunSummary = history.Count > 0 ? history[0] : s.LastRunSummary,
});
With:
var recent = await _historyStore.LoadRecentAsync(50, cancellationToken);
var total = await _historyStore.CountAsync(cancellationToken);
_appState.Update(s => s with {
RunHistory = recent,
TotalRunCount = total,
LastRunSummary = recent.Count > 0 ? recent[0] : s.LastRunSummary,
});
Update the log line to include TotalRunCount.
12. WorkflowService.RunLoopAsync.finally:
Find the `_store.Update` call that builds `updatedHistory`:
var updatedHistory = s.RunHistory
.Where(r => r.RunId != runId)
.Prepend(summary)
.ToList();
Replace with a sliding window of 50:
var updatedHistory = s.RunHistory
.Where(r => r.RunId != runId)
.Prepend(summary)
.Take(50)
.ToList();
And add the TotalRunCount increment in the same `s with { ... }`:
TotalRunCount = s.TotalRunCount + 1,
13. Tests under tests/InspectionPrototype.Tests/:
- MigrationRunnerTests:
* AppliesM001OnFreshDatabase: temp file path; run; assert
schema_version contains version=1; sqlite_master query confirms
run_summaries + alarm_history + indexes.
* IsIdempotentOnPopulatedDatabase: run twice; assert version count
stays 1 (no double-apply).
* StubM002_AppliedInOrder: write a temp M002 SQL file; assert
MAX(version) = 2 after run; rerun and confirm idempotent.
* RollsBackOnFailure: write a malformed M002 SQL; assert version
stays at 1; runner throws; host startup would abort.
- SqliteRunHistoryStoreTests:
* SaveAsync_Then_GetAsync_RoundTrips: persist 1 summary; GetAsync
returns equivalent record.
* SaveAsync_Upserts_OnDuplicateRunId: persist same RunId twice;
CountAsync returns 1; second values are stored.
* LoadRecentAsync_Returns_NewestFirst: persist 5 summaries with
increasing started_at_utc; LoadRecentAsync(3) returns the 3
newest in descending order.
* LoadPageAsync_Pages_Correctly: 10 summaries; LoadPageAsync(0, 3)
and LoadPageAsync(3, 3) and LoadPageAsync(6, 3) cover all rows.
* MajorAlarmsRoundtrips: persist a summary with MajorAlarms = ["A","B","C"];
GetAsync returns the same list in order.
* EmptyDatabase_LoadRecent_ReturnsEmpty / CountAsync_ReturnsZero.
- SqliteRunHistoryStorePerformanceTests:
[Fact]
[Trait("Category", "Performance")]
public async Task LoadRecent50_AfterPopulating10K_CompletesUnder200Ms()
{
// 1. Create temp DB; run migrations.
// 2. Populate 10 000 RunSummary rows (vary started_at_utc so
// the index is exercised; vary RunId via Guid.NewGuid()).
// Use a single transaction for bulk insert: ~3-5s acceptable.
// 3. Create a fresh SqliteRunHistoryStore instance (cold start).
// 4. var sw = Stopwatch.StartNew();
// var page = await store.LoadRecentAsync(50);
// sw.Stop();
// 5. Assert page.Count == 50.
// 6. Assert sw.Elapsed.TotalMilliseconds < 200.
}
Note: this test runs Release-only (or in CI it's skipped via
Trait filter). Document the trait in the Phase 1 §3a/§3b runbook
the same way the `Category=Capture` trait works for FlaUI tests.
- JsonImportServiceTests:
* ImportsFromJson_ToEmptyDatabase: write a fixture JSON file with
3 summaries; run service; SqliteRunHistoryStore.CountAsync = 3;
JSON file is renamed to *.imported-*.
* SkipsImport_WhenDatabaseHasRows: pre-populate SQLite; run; JSON
file is unchanged; CountAsync still equals pre-existing count.
* SkipsImport_WhenJsonMissing: no JSON file; service returns
cleanly; no log error.
* HandlesMalformedJson: write a malformed JSON file; service
renames to *.malformed-*; SQLite stays empty; logged Warning.
- Update existing tests that referenced LoadAsync():
* Any RecordingRunHistoryStore or fake under Stubs/ — update its
interface implementation to match the new shape.
* HistoryHydrationServiceTests — update to mock LoadRecentAsync
and CountAsync; assert TotalRunCount is set.
14. Verification before you report done:
dotnet build --configuration Release
dotnet test --configuration Release --filter "Category!=Performance"
dotnet test --configuration Release --filter "Category=Performance"
Expected: both runs green. Performance test must consistently complete
under 200 ms; if it flakes, the index `idx_run_summaries_started_at_utc`
is the prime suspect (ensure ORDER BY uses it).
Manual smoke test:
- Delete %LocalAppData%\LcnWaferInspection\inspection.db if present.
- If %LocalAppData%\LcnWaferInspection\run-history.json exists with
prior runs, the import path runs on first launch.
- Launch app; observe in logs: "Applied migration M001",
"Imported N run summaries" (if applicable), "Run history hydrated:
N record(s) loaded; total persisted: N".
- Run a couple of complete cycles (Connect → Home → Run → ... → terminal).
- Confirm TotalRunCount increments in AppState (peek via diagnostics
timeline or restart and observe the hydration log).
- Inspect inspection.db with a sqlite client; verify
`SELECT COUNT(*) FROM run_summaries` matches expected.
## Constraints
- Do NOT touch the alarm-history concept in this pass. The M001 SQL
creates the alarm_history table for Pass 2; Pass 1 only writes to
run_summaries.
- Do NOT break the existing FaceUiDriver / capture rig — it depends on
WorkflowService behavior, not on the storage backend.
- Do NOT change the AppState.RunHistory TYPE (still IReadOnlyList<RunSummary>).
Only its semantic meaning narrows ("recent N").
- Do NOT introduce concurrent transactions. SqliteConnection is per-call
(Dapper's default); WAL mode handles reader/writer concurrency
internally.
- Do NOT add `using Dapper;` to the Application layer. Dapper stays in
Infrastructure only — the abstraction (IRunHistoryStore) keeps the
Application layer pure.
- Do NOT change any UI-side code. The CurrentFrame binding, the
diagnostics timeline projection, the recipe combo all stay as-is.
- One-time JSON import MUST be idempotent. Re-running on a populated
SQLite DB is a no-op (verified by JsonImportServiceTests).
## Report format when finished
- files created / modified / deleted (especially: deletion of JsonRunHistoryStore.cs)
- confirmation all tests pass including the performance suite
- the actual measured `LoadRecentAsync(50)` time after 10K-row populate
(e.g., "12.3 ms; well under the 200 ms ceiling")
- a single commit hash
- commit message: "feat(persistence): replace JSON run-history with SQLite + Dapper; add migration runner (pass 1/3 of TASK-3.3)"Pass 2 — Alarm history persistence
You are implementing Pass 2 of TASK-3.3. Pass 1 (SQLite + Dapper + run history)
is already merged. The alarm_history table already exists in M001. This pass
adds the IAlarmHistoryStore abstraction, the SQLite implementation, and wires
WorkflowService to persist alarm raise / clear / acknowledge events.
NO captures (Pass 3).
## Authoritative references
Read these before making changes:
- docs/specs/SLICE-3.3-sqlite-persistence.md (criteria 6, 7)
- src/InspectionPrototype.Infrastructure/Data/SqliteRunHistoryStore.cs (Pass 1 — same patterns)
- src/InspectionPrototype.Application/Services/WorkflowService.cs (OnFaultInjected/OnFaultCleared/AcknowledgeFault)
- src/InspectionPrototype.Domain/Contracts/Alarm.cs (existing record)
- src/InspectionPrototype.Application/Services/AlarmBursterService.cs (parallel pattern: fire-and-forget tolerant of failures)
Pass 1 must be merged. Confirm: SqliteRunHistoryStore exists,
JsonRunHistoryStore is deleted, AppState.TotalRunCount exists, M001 SQL
includes alarm_history table.
## Scope of this pass
IAlarmHistoryStore + AlarmHistoryEntry + SqliteAlarmHistoryStore + WorkflowService
integration + tests. NO captures. NO measurement-extraction work.
## Deliverables
1. AlarmHistoryEntry
(src/InspectionPrototype.Application/State/AlarmHistoryEntry.cs):
public record AlarmHistoryEntry(
long Id,
string AlarmCode,
AlarmSeverity Severity,
string Message,
DateTimeOffset RaisedAtUtc,
DateTimeOffset? ClearedAtUtc,
DateTimeOffset? AcknowledgedAtUtc,
Guid? RunId);
2. IAlarmHistoryStore
(src/InspectionPrototype.Application/Abstractions/IAlarmHistoryStore.cs):
public interface IAlarmHistoryStore {
Task<long> SaveAsync(Alarm alarm, DateTimeOffset raisedAt, Guid? runId, CancellationToken ct = default);
Task MarkClearedAsync(string alarmCode, DateTimeOffset clearedAt, CancellationToken ct = default);
Task MarkAcknowledgedAsync(string alarmCode, DateTimeOffset acknowledgedAt, CancellationToken ct = default);
Task<IReadOnlyList<AlarmHistoryEntry>> LoadRecentAsync(int count = 100, CancellationToken ct = default);
Task<long> CountAsync(CancellationToken ct = default);
}
3. SqliteAlarmHistoryStore
(src/InspectionPrototype.Infrastructure/Data/SqliteAlarmHistoryStore.cs):
- Implements IAlarmHistoryStore; constructor takes IOptions<SqlitePersistenceOptions> + ILogger.
- Same connection pattern as SqliteRunHistoryStore (per-call connection).
- SaveAsync: INSERT INTO alarm_history (alarm_code, severity, message,
raised_at_utc, cleared_at_utc, acknowledged_at_utc, run_id)
VALUES (...)
RETURNING id;
Use ExecuteScalarAsync<long> to get the assigned id.
- MarkClearedAsync: UPDATE alarm_history
SET cleared_at_utc = @clearedAt
WHERE id = (
SELECT id FROM alarm_history
WHERE alarm_code = @alarmCode AND cleared_at_utc IS NULL
ORDER BY raised_at_utc DESC LIMIT 1
);
ExecuteAsync returns affected-row count. If 0: log Debug
"MarkClearedAsync: no open occurrence of {AlarmCode} found"; do not
throw.
- MarkAcknowledgedAsync: same shape, replacing cleared_at_utc with
acknowledged_at_utc.
- LoadRecentAsync(count): SELECT ... ORDER BY raised_at_utc DESC LIMIT @count.
- CountAsync: SELECT COUNT(*) FROM alarm_history.
4. DI wiring (InfrastructureServiceCollectionExtensions):
Add right after the SqliteRunHistoryStore registration:
services.AddSingleton<IAlarmHistoryStore, SqliteAlarmHistoryStore>();
5. WorkflowService integration
(src/InspectionPrototype.Application/Services/WorkflowService.cs):
- Add IAlarmHistoryStore _alarmHistoryStore as a constructor parameter
and field.
- In OnFaultInjected, AFTER the existing _store.Update call:
_ = SafeAlarmHistoryAsync(() => _alarmHistoryStore.SaveAsync(
alarm, DateTimeOffset.UtcNow, _store.Current.ActiveRun?.RunId));
- In OnFaultCleared, AFTER the existing _store.Update call:
_ = SafeAlarmHistoryAsync(() => _alarmHistoryStore.MarkClearedAsync(
alarmCode, DateTimeOffset.UtcNow));
- In AcknowledgeFault, AFTER the existing _store.Update call:
_ = SafeAlarmHistoryAsync(() => _alarmHistoryStore.MarkAcknowledgedAsync(
alarmCode, DateTimeOffset.UtcNow));
Implement private async Task SafeAlarmHistoryAsync(Func<Task> op):
try { await op(); }
catch (Exception ex) { _logger.LogWarning(ex,
"Alarm history persistence failed; in-memory state is unaffected."); }
The _ = ... discards the Task — fire-and-forget. Exceptions inside are
swallowed by SafeAlarmHistoryAsync. This is the same swallow-and-log
pattern AlarmBursterService uses.
6. Tests under tests/InspectionPrototype.Tests/:
- SqliteAlarmHistoryStoreTests:
* SaveAsync_Returns_AssignedId: persist 1; assert returned id > 0.
* SaveAsync_Then_LoadRecent_RoundTrips: persist 3; LoadRecentAsync(10)
returns all 3 newest-first.
* MarkClearedAsync_Updates_LatestOpenRow: insert 2 entries with
same alarm_code; MarkClearedAsync; assert only the second row
(latest) has non-null cleared_at_utc.
* MarkClearedAsync_NoOp_OnMissing: empty DB; MarkClearedAsync
doesn't throw; logs Debug.
* MarkClearedAsync_NoOp_OnAlreadyCleared: insert + clear + clear
again; second clear is no-op (cleared_at_utc unchanged).
* MarkAcknowledgedAsync_*: same three cases as MarkClearedAsync.
* LoadRecentAsync_Returns_NewestFirst.
* CountAsync_ReturnsZero_OnEmpty.
- WorkflowServiceAlarmHistoryIntegrationTests:
* OnFaultInjected_WritesToHistory: inject; await Task.Delay(50)
to let the fire-and-forget complete; LoadRecentAsync(10)
returns the entry with matching alarm_code, raised_at_utc set,
cleared_at_utc/acknowledged_at_utc null.
* OnFaultCleared_UpdatesHistory: inject + clear; LoadRecentAsync
shows cleared_at_utc set.
* AcknowledgeFault_UpdatesHistory: inject + acknowledge;
LoadRecentAsync shows acknowledged_at_utc set.
* StoreFailure_DoesNotPropagate: stub IAlarmHistoryStore that
throws on SaveAsync; OnFaultInjected does NOT throw to caller;
AppState.ActiveAlarms still has the alarm (in-memory state unchanged).
## Constraints
- Do NOT add IAlarmHistoryStore to the Application layer's csproj as a
hard dependency — it's an interface in Application.Abstractions, just
like IRunHistoryStore. Application/Services/WorkflowService.cs imports
it via the same namespace pattern.
- Do NOT add Dapper to the Application layer. Dapper stays in
Infrastructure (SqliteAlarmHistoryStore is the only file that uses it).
- Do NOT block WorkflowService event handlers on alarm-history persistence.
The fire-and-forget (`_ = ...`) pattern + SafeAlarmHistoryAsync's
exception swallow are load-bearing — workflow state transitions must
not depend on persistence success.
- Do NOT add any new SQL migration. M001 from Pass 1 already creates
alarm_history. If Pass 2 needs to alter the schema (it shouldn't),
that's an M002 — but the spec scope keeps the schema fixed.
## Verification before you report done
dotnet build --configuration Release
dotnet test --configuration Release --filter "Category!=Performance"
Manual smoke:
- Run app; trigger a fault from the engineering panel; clear it; ack.
- Inspect inspection.db: SELECT * FROM alarm_history;
Expect 1 row with all three timestamps set.
- Trigger ChaosMonkey (manually flip FlakySdk Enabled=true, switch to
ChaosMonkey profile) for 60 seconds — observe SELECT COUNT(*) FROM
alarm_history grows by ~1-2 entries (one cycle takes ~46 s under
AlarmBurstEveryMs=45000).
## Report format when finished
- files created / modified
- confirmation all tests pass
- a single commit hash
- commit message: "feat(persistence): add alarm history store; wire WorkflowService (pass 2/3 of TASK-3.3)"Pass 3 — Capture + row block + runbook §5.2
You are implementing Pass 3 of TASK-3.3, the final pass. Passes 1 and 2 are
merged. This pass pre-populates the database to 10 000 rows, runs a 30-min
MultiTag capture under the SQLite store, appends the slice-3-3-sqlite-
persistence row to a new phase-3-measurements.md file, writes runbook
§5.2, and assesses whether the row triggers any Phase 2 slice.
NO code changes — Passes 1 and 2 own those.
## Authoritative references
Read these before making changes:
- docs/specs/SLICE-3.3-sqlite-persistence.md (criteria 14, 15, 16)
- docs/runbook/capturing-measurements.md (§5.1 placeholder added by SLICE-2.0;
§5.2 is this pass's contribution)
- docs/reviews/phase-1-measurements.md (mirror its structure for phase-3)
- docs/reviews/phase-1-capabilities-and-limits.md (claims-table style for the new doc's notes)
- tools/Capture-Measurements.ps1
- tools/MeasurementExtraction.psm1
## Scope of this pass
Database pre-population helper, 30-min capture, new phase-3-measurements.md
file, three new MeasurementExtraction helpers, runbook §5.2, session-handoff
updates. No code or test changes.
## Deliverables
1. Pre-populate the database. A small one-off helper script under
tools/Populate-SyntheticHistory.ps1:
- Parameter: -DatabasePath, -RowCount (default 10000)
- Connects via System.Data.SQLite or sqlite3 CLI
- Inserts {RowCount} synthetic RunSummary rows with:
run_id = (new Guid each row)
recipe_name = "Synthetic Recipe N" cycling N=1..50
started_at_utc / ended_at_utc spread across the past 30 days
terminal_status alternating Completed / Stopped / Aborted / Faulted
defect counts: random 0-50
major_alarms_json: '[]'
- Uses a single transaction; ~3-5 s for 10K rows.
- Idempotent: if RowCount rows already exist, no-op.
Store the helper in tools/ next to Capture-Measurements.ps1; it's a
capture-prep utility, not part of the production app.
2. tools/MeasurementExtraction.psm1:
Add and export three helpers:
function Get-RunsPersistedCount {
[CmdletBinding()]
param([Parameter(Mandatory)][string]$DatabasePathBefore,
[Parameter(Mandatory)][string]$DatabasePathAfter)
# Open both DBs; SELECT COUNT(*) FROM run_summaries;
# return after - before.
}
function Get-AlarmsPersistedCount {
[CmdletBinding()]
param([Parameter(Mandatory)][string]$DatabasePathBefore,
[Parameter(Mandatory)][string]$DatabasePathAfter)
# Same shape; SELECT COUNT(*) FROM alarm_history.
}
function Get-RecentHistoryLoadP95 {
[CmdletBinding()]
param([Parameter(Mandatory)][string]$LogPath)
# Read app log; find lines matching "Run history hydrated:"
# extract the TotalMilliseconds value (must be added to
# HistoryHydrationService.StartAsync's log line in Pass 1).
# If multiple log lines (e.g., across restarts within the capture
# window), take p95.
}
The HistoryHydrationService log line should be enriched in Pass 1 to
include the wall-clock cost. If Pass 1 didn't add it, this pass adds
the log enrichment minimally — but the spec criterion expects Pass 1
to have it. Verify before adding.
Update ConvertTo-MeasurementRow to optionally accept -DatabaseBefore
and -DatabaseAfter parameters (for the persisted-count metrics) and
to look up the recent-history-load p95 from the log. Append three
new rows to the markdown output:
| runs.persisted (count) | <Get-RunsPersistedCount> |
| alarms.persisted (count) | <Get-AlarmsPersistedCount> |
| recent-history-load p95 (ms) | <Get-RecentHistoryLoadP95> |
Use "—" sentinel when the helper returns null.
3. tests/Tools/MeasurementExtraction.Tests.ps1:
Three new Pester tests using synthetic SQLite databases (created via
sqlite3 CLI in the test setup). Tests cover:
- Get-RunsPersistedCount on (5 rows before, 12 rows after) returns 7.
- Get-AlarmsPersistedCount on parallel fixture returns expected count.
- Get-RecentHistoryLoadP95 parses a synthetic log line and returns the value.
4. Disable system sleep before capture (Phase 1 discipline reaffirmed):
powercfg /change standby-timeout-ac 0
powercfg /change monitor-timeout-ac 0
5. Capture procedure:
$date = Get-Date -Format 'yyyy-MM-dd'
$dbPath = "$env:LOCALAPPDATA\LcnWaferInspection\inspection.db"
# Step 1: backup any pre-existing database
if (Test-Path $dbPath) { Move-Item $dbPath "$dbPath.preCapture-$date" }
# Step 2: pre-populate to 10K rows (the migration runner creates
# the schema on first app launch; we need it created before populate.
# Workaround: launch app briefly to trigger migration, exit, then
# populate. OR — better — copy a seed DB from a known-good fixture
# if one is committed. For this pass, the launch-to-trigger-migration
# path is the pragmatic option.)
# Build first; launch app; observe "Applied migration M001" in log;
# close app; then run:
tools/Populate-SyntheticHistory.ps1 -DatabasePath $dbPath -RowCount 10000
# Step 3: capture before-state
Copy-Item $dbPath "$env:TEMP\inspection.db.before-$date"
# Step 4: run the capture
tools/Capture-Measurements.ps1 -Scenario MultiTagSoak `
-DurationSeconds 1800 -Profile MultiTag `
-OutputCsv "docs/captures/slice-3-3-sqlite-persistence-$date.csv" `
-CommitHash $(git rev-parse --short HEAD) `
-SliceTag slice-3-3-sqlite-persistence
# Step 5: capture after-state
Copy-Item $dbPath "$env:TEMP\inspection.db.after-$date"
Verify:
* exit code 0; CSV span >= 1700 s
* inspection.db has 10 000 + (runs from this capture) rows
* alarm_history has > 0 rows (under MultiTag the AlarmBurstEveryMs=0
so 0 chaos alarms; but ConnectionFailureProbability=0.20 may
still produce some workflow-driven faults — confirm)
* The first 22 metrics (frames / tags / encoder / GC / CPU /
working-set drift) are within ±10% of slice-1-1-multi-tag-telemetry
(criterion 13 reproducibility). If they aren't, the persistence
change perturbed the data plane materially — STOP and investigate.
6. Append the row block to a NEW file docs/reviews/phase-3-measurements.md:
Mirror phase-1-measurements.md's structure exactly:
- Conventions section
- Fixed metric set section (the standard 22 + 4 from SLICE-1.4 + 3
new persistence metrics introduced here = 29 total available;
individual rows can omit any not applicable)
- "## Phase 3 rows" heading
- Row block tagged slice-3-3-sqlite-persistence
The row's Notes section must include:
(a) Baseline reference: slice-1-1-multi-tag-telemetry (most recent
MultiTag-profile capture); the persistence change is the only
intentional difference.
(b) Persistence evidence: runs.persisted / alarms.persisted / recent-
history-load p95 numbers, with interpretation.
(c) Reproducibility check: confirm the first 22 metrics are within
±10% of slice-1-1; cite specific deltas.
(d) Phase 2 trigger assessment: compare cpu-usage avg, working-set
growth, alloc-rate avg vs slice-2-0-store-profiling (if 2.0's row
has landed). If alloc-rate grew materially under SQLite persistence,
that's potentially a SLICE-2.1 trigger; if not, persistence pressure
is being absorbed without store-side cost. The recommendation:
open or defer specific Phase 2 slices, citing measured numbers.
7. Add §5.2 to docs/runbook/capturing-measurements.md:
- Title: "### 5.2 SQLite-backed persistence row — SLICE-3.3, MultiTag profile"
- Placement: after §5.1 (SLICE-2.0)
- Content covers:
* one-paragraph rationale linking back to SLICE-3.3 spec
* Database pre-population step with the new helper script
* Standard MultiTag 30-min capture procedure
* Sanity checks (row block, persistence counts, reproducibility)
* Phase 2 trigger assessment as the row's expected output
- Implemented by: MultiTagSoakFlaUi with --profile MultiTag, plus
tools/Populate-SyntheticHistory.ps1 pre-step.
8. Update CLAUDE.md "Current position" block:
- Phase: 1 complete; Phase 2/3 in parallel per 2026-05-07 strategy;
SLICE-3.3 complete (criterion 12 met: 10K-row pagination < 200 ms;
reproducibility passed)
- Last completed action: TASK-3.3 Pass 3 — captured slice-3-3-sqlite-
persistence row; runs.persisted = X, alarms.persisted = Y, recent-
history-load p95 = Z ms. Phase 2 trigger assessment: <decision>.
Commit <hash>.
- Next action: <if Phase 2 trigger fired> Open SLICE-2.1 or whichever
<else> Open SLICE-3.1 (rich defect model) — write spec + task.
9. Append session-log entry to docs/reviews/roadmap-progress.md.
Mark SLICE-3.3 row as Completed (or Proposed → Completed if not yet
in the table; add a Phase 3 progress table mirroring Phase 1's).
10. Restore powercfg settings.
## Constraints
- Do NOT make any code or test changes in this pass.
- Do NOT skip the 10K-row pre-population — the persistence-pressure
evidence depends on it.
- Do NOT proceed to step 6 (table edit) if criterion 13 reproducibility
fails. The pre-existing rows must reproduce within ±10%.
- The Phase 2 trigger assessment in Notes (d) is the row's whole
decision-making purpose. Don't punt it to "future analysis"; apply
the SLICE-2.0 decision rubric to whatever data this row produces and
state the conclusion explicitly.
## Verification before you report done
dotnet build --configuration Release
dotnet test --configuration Release --filter "Category!=Performance"
Plus:
- docs/captures/slice-3-3-sqlite-persistence-<date>.csv exists and is committed
- docs/reviews/phase-3-measurements.md exists with the row block
- §5.2 renders correctly in the runbook
- CLAUDE.md current-position reflects SLICE-3.3 closure and the
Phase 2 trigger decision
## Report format when finished
- files created / modified
- the captured row block (markdown table) included verbatim
- the three new persistence metrics' values + one-sentence interpretation each
- the Phase 2 trigger assessment outcome
- a single commit hash
- commit message: "feat(measurements): SLICE-3.3 row + Phase 2 trigger assessment; runbook §5.2 (pass 3/3 of TASK-3.3)"Operator notes
- One pass per Copilot session. Same protocol as TASK-1.3 / TASK-1.4 / TASK-2.0.
- Pass 1's load-bearing detail is the migration runner's idempotency. A migration that re-applies because the runner mis-reads
MAX(version)will corrupt the schema. TheMigrationRunnerTestscover this; reviewers must confirm the test suite has bothIsIdempotentOnPopulatedDatabaseandRollsBackOnFailurecases passing. - Pass 1's other load-bearing detail is the JSON import path. It must be idempotent (re-running on a populated SQLite is no-op) AND fail-soft (malformed JSON renames the file and continues). The combination of
JsonImportServiceTestscases verifies both. - Pass 2's fire-and-forget pattern is intentional. Workflow state transitions must NOT depend on alarm-history persistence success. If a future review asks "why isn't this
await-ed?", the answer is in the spec § "WorkflowService integration" — alarm history is an audit-trail concern, not a workflow-correctness one. - Pass 3's reproducibility bar is the gate. Criterion 13 (first 22 metrics within ±10% of
slice-1-1-multi-tag-telemetry) must hold. If the persistence change perturbs the data plane materially, that's a real architectural finding — likely the SQLite write path on the run-summary thread is not as cheap as the JSON write was, and something needs to move off the workflow'sfinallythread. Don't bypass this check. - Phase 2 trigger assessment in Pass 3's Notes section is the slice's whole strategic value. This row block is the first concrete Phase 3 evidence under the 2026-05-07 strategy doc's broadened trigger conditions. The assessment in the row's Notes section determines whether SLICE-2.1 / 2.2 open next or stay deferred. Apply the rubric mechanically.
- Update the index files only at the end of the phase, not per-slice. Same rationale as earlier tasks. When Phase 3 is done (3.3 + 3.1 + 3.2 + 3.4), do an index sweep across
docs/specs/index.mdanddocs/tasks/index.md.