
SLICE-3.2: Wafer Loop / Cassette Cadence

Goal

Compose SLICE-3.3 (run history) + SLICE-3.1 (defect persistence) + the existing workflow state machine into a wafer-cassette scheduler that runs N wafers back-to-back through a Load → Align → Run → Unload phase sequence. Each wafer gets a unique WaferId; each cassette gets a LotId; both flow through RunSummary. The slice's exit gate is a 25-wafer cassette completing under the Soak8h profile with correct per-wafer records persisted, defects correlated to the right wafer, and no FK / orphan-row issues in the SQLite schema. This slice closes the FK band-aid filed in the SLICE-3.1 follow-ups by introducing a stub-row-at-run-start pattern so the defects.run_id foreign key relationship can be enforced.

This is the third Phase 3 slice opened under the 2026-05-07 strategy and the slice where the prototype starts to genuinely look like a wafer inspection tool. Each cassette is a long-duration scenario (25 wafers × ~30 s/wafer ≈ 12-15 min wall-clock) that exercises the full Phase 1 + Phase 3 stack at production-shaped cadence. The row's Phase 2 trigger assessment is informed by the cumulative store-write rate of an entire cassette, not just a single run.

Why This Slice

Today the workflow state machine treats each run as an independent operator-driven event: operator clicks Load Recipe, Home, Start Run; one run completes; operator decides what to do next. Real wafer inspection tools batch wafers into cassettes (typically 25-wafer FOUPs or 13-wafer carriers) and the tool processes each wafer through a fixed phase sequence without operator intervention between wafers:

  1. Load — wafer transfer from cassette slot to chuck (motion + vacuum sequence)
  2. Align — locate the wafer notch / pre-aligner stage
  3. Run — execute the recipe (the existing WorkflowService.StartRunAsync path)
  4. Unload — return the wafer to its cassette slot

The current WorkflowService covers only step 3 in isolation. Steps 1, 2, 4 don't exist as concepts; the operator manually re-clicks Home / Load Recipe / Start Run for each wafer. That works for a demo; it doesn't model how real tools run.

This slice introduces a cassette scheduler (ICassetteScheduler, implemented by SimulatedCassetteScheduler) that drives the loop: pull a wafer from the cassette, transition through the Load/Align/Run/Unload phases, advance to the next wafer, and repeat until the cassette is empty or aborted. WaferId and LotId are added to RunSummary so the run history is queryable per-wafer and per-lot. The simulator gets stub Load/Align/Unload phases (each a Task.Delay plus a state transition); real machine integration is deferred to Phase 4.1.

The slice also closes the FK band-aid filed in SLICE-3.1's follow-ups (item 3 of the 2026-05-08 follow-up entry in roadmap-progress.md). Currently SqliteDefectStore.OpenAsync disables FK enforcement because defects are persisted before their parent run_summary row exists. SLICE-3.2 introduces a stub-row pattern: at run-start, a run_summary row is inserted with TerminalStatus = "Pending"; defects can then be persisted with FK enforcement on; at run-end the existing SaveAsync upsert (INSERT OR REPLACE) updates the row to its terminal status. Same architectural fit as the cassette scheduler — both create run_summary rows at known lifecycle points.

Requirements Coverage

In Scope

New domain shapes

csharp
namespace InspectionPrototype.Domain.Contracts;

public enum WaferPhase
{
    Idle,           // no wafer in progress
    Loading,        // wafer being moved to chuck
    Aligning,       // pre-aligner finding the notch
    Running,        // executing the recipe (composes existing WorkflowState.Running)
    Unloading,      // wafer returning to cassette slot
    Complete,       // wafer finished; ready for next
}

public record WaferId(string Value)
{
    public static WaferId Generate(string lotId, int slotIndex) =>
        new($"{lotId}-W{slotIndex:D2}");
}

public record LotId(string Value)
{
    public static LotId GenerateForToday() =>
        new($"LOT-{DateTimeOffset.UtcNow:yyyyMMdd-HHmmss}");
}

public record CassetteSlot(int Index, WaferId? WaferId);

public record CassetteState(
    LotId             LotId,
    int               TotalSlots,                      // 25 for FOUP
    int               CurrentSlotIndex,                // 0-based
    WaferPhase        CurrentPhase,
    IReadOnlyList<CassetteSlot> Slots);

WaferId is deliberately a dedicated record type (a reference type with value-based equality), not a bare string, so the type system prevents mixing wafer identifiers with lot identifiers or run identifiers (Guid).
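
A minimal sketch of what the dedicated record type buys, assuming the record shapes above. IdentifierDemo and its members are hypothetical names used only for illustration; they are not part of the slice.

```csharp
using System;

public record WaferId(string Value)
{
    public static WaferId Generate(string lotId, int slotIndex) =>
        new($"{lotId}-W{slotIndex:D2}");
}

public record LotId(string Value);

// Hypothetical helper, for illustration only.
public static class IdentifierDemo
{
    // Accepts only a WaferId: passing a LotId, a Guid, or a raw string
    // fails at compile time instead of producing a mislabelled row.
    public static string Describe(WaferId wafer) => $"wafer {wafer.Value}";

    // Records compare by value, so two WaferIds built from the same
    // lot and slot are equal even though they are distinct objects.
    public static bool SameWafer(WaferId a, WaferId b) => a == b;
}
```

With these definitions, WaferId.Generate("LOT-20260508-101500", 7).Value is "LOT-20260508-101500-W07", and calling Describe with a LotId does not compile.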

RunSummary extension

csharp
public record RunSummary(
    Guid                 RunId,
    string               RecipeName,
    DateTimeOffset       StartedAtUtc,
    DateTimeOffset       EndedAtUtc,
    RunTerminalStatus    TerminalStatus,
    int                  DefectCount,
    IReadOnlyList<string> MajorAlarms,
    int                  CompletedScanPoints,
    int                  TotalScanPoints,
    string?              SimulatorProfileName = null,
    int                  DefectsMinor = 0,
    int                  DefectsMajor = 0,
    int                  DefectsCritical = 0,
    string?              WaferId = null,                // ◀ new
    string?              LotId  = null);                // ◀ new

WaferId and LotId are nullable strings (not the records) so non-cassette-mode runs (operator-driven single runs from the existing UI) can still produce RunSummary records without manufacturing fake wafer identities. The cassette scheduler always populates them; legacy operator-driven flow leaves them null.

M003 migration

sql
-- M003_cassette_columns.sql
ALTER TABLE run_summaries ADD COLUMN wafer_id TEXT;
ALTER TABLE run_summaries ADD COLUMN lot_id TEXT;
CREATE INDEX idx_run_summaries_lot_id ON run_summaries(lot_id);
CREATE INDEX idx_run_summaries_wafer_id ON run_summaries(wafer_id);
INSERT INTO schema_version (version, applied_at) VALUES (3, datetime('now'));

Existing rows have NULL for both columns — that's correct (they predate cassette mode).

M004 migration — close the FK band-aid

sql
-- M004_enable_defects_fk.sql
-- (Schema already declares REFERENCES; this migration is a no-op for schema
--  but documents that the application-level invariant is now enforced via the
--  stub-row pattern introduced in SLICE-3.2's WorkflowService changes.)
INSERT INTO schema_version (version, applied_at) VALUES (4, datetime('now'));

The actual FK fix is in code, not SQL: SqliteDefectStore.OpenAsync removes the explicit PRAGMA foreign_keys = OFF. The new WorkflowService.StartRunAsync (cassette-aware) inserts a stub run_summary before the run loop starts; defects can then be persisted with FK enforcement on. The existing SaveAsync upsert at run-end updates the row to its terminal status.

A simpler alternative (no migration) is to just drop the PRAGMA foreign_keys = OFF and add the stub-row code change. The M004 migration exists to make the version-history record explicit ("at version 4, the FK band-aid was retired"); reviewers querying schema_version see the audit trail.

ICassetteScheduler + SimulatedCassetteScheduler

csharp
namespace InspectionPrototype.Application.Abstractions;

public interface ICassetteScheduler
{
    /// <summary>The current cassette state, or null if no cassette is loaded.</summary>
    CassetteState? Current { get; }

    /// <summary>Loads a fresh cassette with the given LotId and slot count. Rejects the call without changing state if a cassette is already loaded.</summary>
    Task LoadCassetteAsync(LotId lotId, int slotCount = 25, CancellationToken ct = default);

    /// <summary>Starts the wafer-loop scheduler. Drives Load → Align → Run → Unload for each slot until the cassette is empty or stopped.</summary>
    Task StartCassetteRunAsync(CancellationToken ct = default);

    /// <summary>Cooperative stop — finishes the current wafer's phase, then halts.</summary>
    void RequestCassetteStop();

    /// <summary>Immediate abort — cancels the current phase's CTS.</summary>
    Task AbortCassetteAsync();

    /// <summary>Unloads the cassette (resets state). Only allowed when scheduler is Idle or Complete.</summary>
    Task UnloadCassetteAsync(CancellationToken ct = default);

    /// <summary>Raised when the cassette state transitions (slot index advances, phase changes, etc.). Subscribers project state into the UI.</summary>
    event Action<CassetteState>? CassetteStateChanged;
}

SimulatedCassetteScheduler (Infrastructure.Simulator):

  • Constructor takes IWorkflowService, IRunHistoryStore, ISimulatorProfileProvider, IAppStateStore, ILogger.
  • LoadCassetteAsync builds the initial CassetteState with TotalSlots = slotCount (25 by default), CurrentSlotIndex = 0, and every CassetteSlot.WaferId populated via WaferId.Generate(lotId.Value, i). Updates AppState.Cassette = newState.
  • StartCassetteRunAsync starts the scheduler loop on a background task. The loop visits each slot in order:
    1. Transition to Loading; Task.Delay(profile.WaferLoadMs); advance.
    2. Transition to Aligning; Task.Delay(profile.WaferAlignMs); advance.
    3. Transition to Running; call WorkflowService.StartRunAsync() with the wafer's WaferId + LotId (the workflow's internal run-loop runs as before); wait for terminal state.
    4. Transition to Unloading; Task.Delay(profile.WaferUnloadMs); advance.
    5. Slot complete; advance CurrentSlotIndex; loop until cassette empty or stop/abort requested.
  • Each phase transition fires CassetteStateChanged so the UI can update.
  • On stop: finishes current phase, transitions cassette to Complete. On abort: cancels current phase's CTS; cassette state goes to Complete with current slot in Idle (didn't finish).
  • Catches and logs each wafer's run faults but does not abort the cassette on a single-wafer fault; advances to the next slot. Operator can manually abort if cascading faults indicate a bigger problem.
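
The loop described in the bullets above can be sketched as follows. This is a simplified illustration under stated assumptions, not the SimulatedCassetteScheduler itself: CassetteLoopSketch, PhaseTimings, and the runWafer delegate (standing in for StartRunAsync plus waiting for the terminal state) are hypothetical, and state changes are recorded in a list instead of being raised through CassetteStateChanged.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public enum WaferPhase { Idle, Loading, Aligning, Running, Unloading, Complete }

// Hypothetical stand-in for the three SimulatorProfile timing fields.
public record PhaseTimings(int LoadMs, int AlignMs, int UnloadMs);

public sealed class CassetteLoopSketch
{
    private readonly PhaseTimings _timings;
    // Assumption: stands in for StartRunAsync plus waiting for the terminal state.
    private readonly Func<int, CancellationToken, Task> _runWafer;
    private volatile bool _stopRequested;

    public List<(int Slot, WaferPhase Phase)> Transitions { get; } = new();
    public List<int> FaultedSlots { get; } = new();

    public CassetteLoopSketch(PhaseTimings timings, Func<int, CancellationToken, Task> runWafer)
    {
        _timings = timings;
        _runWafer = runWafer;
    }

    // Cooperative stop: the loop halts after the phase currently in flight.
    public void RequestStop() => _stopRequested = true;

    public async Task RunAsync(int slotCount, CancellationToken abort = default)
    {
        var slot = 0;
        try
        {
            for (; slot < slotCount; slot++)
            {
                Enter(slot, WaferPhase.Loading);
                await Task.Delay(_timings.LoadMs, abort);
                if (_stopRequested) break;

                Enter(slot, WaferPhase.Aligning);
                await Task.Delay(_timings.AlignMs, abort);
                if (_stopRequested) break;

                Enter(slot, WaferPhase.Running);
                try
                {
                    await _runWafer(slot, abort);
                }
                catch (Exception ex) when (ex is not OperationCanceledException)
                {
                    // A run-level fault is recorded, not escalated to the cassette.
                    FaultedSlots.Add(slot);
                }
                if (_stopRequested) break;

                Enter(slot, WaferPhase.Unloading);
                await Task.Delay(_timings.UnloadMs, abort);
                if (_stopRequested) break;
            }
        }
        catch (OperationCanceledException)
        {
            // Abort: the in-flight phase was cancelled; fall through to Complete.
        }
        Enter(Math.Min(slot, slotCount - 1), WaferPhase.Complete);
    }

    // The real scheduler would raise CassetteStateChanged here.
    private void Enter(int slot, WaferPhase phase) => Transitions.Add((slot, phase));
}
```

In this sketch a fault thrown by runWafer for one slot still lets that slot reach Unloading and the loop continue, while cancelling the abort token interrupts whichever Task.Delay or run is in flight.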

WorkflowService.StartRunAsync extension

csharp
public Task StartRunAsync(WaferId? waferId = null, LotId? lotId = null)
{
    // ... existing guard checks ...

    var summary = new RunSummary(
        // ... existing fields ...
        WaferId: waferId?.Value,
        LotId:  lotId?.Value);

    // NEW: stub-row insert at run start; closes the SLICE-3.1 FK band-aid.
    // Persists a Pending row so defects can FK-reference run_summaries(run_id) safely.
    var stubSummary = summary with {
        EndedAtUtc = DateTimeOffset.MinValue,
        TerminalStatus = RunTerminalStatus.Pending,
    };
    _ = SafeRunHistoryPersistAsync(() => _runHistoryStore.SaveAsync(stubSummary));

    // ... existing run-loop start ...
}

The RunTerminalStatus enum gains a Pending value:

csharp
public enum RunTerminalStatus
{
    Pending,        // ◀ new — stub row at run start; updated to one of the below at run end
    Completed,
    Stopped,
    Aborted,
    Faulted,
}

At run-end, the existing _runHistoryStore.SaveAsync(summary) call updates the row to its terminal status (INSERT OR REPLACE on run_id primary key handles the upsert).
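
The stub-row lifecycle can be illustrated without a database. The sketch below is an in-memory stand-in, not the SQLite stores: InMemoryRunStore and its members are hypothetical, a dictionary assignment plays the role of the upsert keyed on run_id, and an explicit parent check plays the role of FK enforcement.

```csharp
using System;
using System.Collections.Generic;

public enum RunTerminalStatus { Pending, Completed, Stopped, Aborted, Faulted }

// Hypothetical in-memory stand-in for run_summaries + defects with FK enforcement on.
public sealed class InMemoryRunStore
{
    private readonly Dictionary<Guid, RunTerminalStatus> _runs = new();
    private readonly List<(Guid RunId, string Description)> _defects = new();

    // Upsert keyed on run_id: inserts the row if absent, overwrites it if present.
    public void SaveRun(Guid runId, RunTerminalStatus status) => _runs[runId] = status;

    // Mimics a defects insert under FK enforcement: the parent row must already exist.
    public void SaveDefect(Guid runId, string description)
    {
        if (!_runs.ContainsKey(runId))
            throw new InvalidOperationException("FOREIGN KEY constraint failed");
        _defects.Add((runId, description));
    }

    public RunTerminalStatus StatusOf(Guid runId) => _runs[runId];
    public int RunCount => _runs.Count;
    public int DefectCount => _defects.Count;
}

public static class StubRowDemo
{
    // Walks one run through the lifecycle: stub row, defects, terminal upsert.
    public static InMemoryRunStore RunOnce(Guid runId)
    {
        var store = new InMemoryRunStore();
        store.SaveRun(runId, RunTerminalStatus.Pending);    // stub row at run start
        store.SaveDefect(runId, "scratch");                 // parent exists, so this succeeds
        store.SaveRun(runId, RunTerminalStatus.Completed);  // run-end upsert, same key
        return store;
    }
}
```

After RunOnce the store holds one run row with status Completed and one defect. One SQLite detail the sketch does not model: INSERT OR REPLACE resolves the conflict by deleting the existing row before inserting the new one, so if the defects foreign key is declared with ON DELETE CASCADE, the run-end upsert could remove that run's defects. Whether that applies depends on the schema's FK action, which is not shown here.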

SqliteDefectStore.OpenAsync cleanup

The explicit PRAGMA foreign_keys = OFF is removed. With the stub-row pattern in place, FK enforcement now works correctly: defects are always inserted after their parent run_summary row exists. The connection string keeps Cache=Shared (matches the other stores; pooling is fine because every connection now expects FK enforcement).

AppState evolution

csharp
public record AppState(
    // ... existing fields ...
    CassetteState? Cassette,                   // ◀ new
    // ... existing fields ...
);

Cassette is null when no cassette is loaded. MainViewModel projects it to a UI panel showing the current slot index and phase.

Cassette UI

A new CassetteStatusPanel (UserControl) shows:

  • LotId
  • Progress: "Wafer 7 / 25"
  • Current phase: "Running" (with a phase-specific icon or text)
  • A 25-cell grid showing each slot's status (empty / pending / done / current)

AutomationIds: CassetteStatusPanel, CassetteCurrentSlotText, CassetteCurrentPhaseText. New buttons in MainWindow.xaml: LoadCassetteButton, StartCassetteRunButton, StopCassetteButton, UnloadCassetteButton.

Profile fields

SimulatorProfile gains three new fields for cassette-mode timing:

Field            Default     Purpose
WaferLoadMs      2 000 ms    Simulated wafer-load duration
WaferAlignMs     1 500 ms    Simulated pre-align duration
WaferUnloadMs    1 500 ms    Simulated wafer-unload duration

For a 25-wafer cassette: 5 s of overhead per wafer × 25 = 125 s of cassette-overhead. With ~30 s of recipe-run time per wafer, total cassette wall-clock ≈ 14-15 min.
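
The arithmetic above can be checked with a few lines of code. CassetteTiming is a hypothetical helper, and the estimate assumes the phases run strictly in sequence with no other overhead between wafers.

```csharp
using System;

// Hypothetical helper for sanity-checking the wall-clock estimate.
public static class CassetteTiming
{
    // Total time for one cassette: wafers x (load + align + unload + run).
    public static TimeSpan Estimate(int wafers, int loadMs, int alignMs, int unloadMs, int runMs)
    {
        long perWaferMs = (long)loadMs + alignMs + unloadMs + runMs;
        return TimeSpan.FromMilliseconds(wafers * perWaferMs);
    }
}
```

With the default timings and a 30 s run, Estimate(25, 2000, 1500, 1500, 30000) is 875 s, about 14.6 min: 125 s of phase overhead plus 750 s of recipe time.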

Measurement

A new runbook §5.4 entry: "Cassette cadence capture — SLICE-3.2, 25-wafer cassette under Soak8h profile." 25-wafer run takes ~14 min wall-clock; capture covers Connect → Load Cassette → StartCassetteRun → wait for completion → Unload Cassette → Disconnect.

MeasurementExtraction.psm1 gains:

  • Get-WafersCompletedCount -DatabasePath $db -LotId $lotId returns the count from SELECT COUNT(*) FROM run_summaries WHERE lot_id = @lotId AND terminal_status != 'Pending'.
  • Get-CassetteWallClockSeconds -DatabasePath $db -LotId $lotId returns MAX(ended_at_utc) - MIN(started_at_utc) for the lot, in seconds.

ConvertTo-MeasurementRow adds two rows when -LotId is provided: wafers.completed (count) and cassette.wall-clock (s).
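
For cross-checking the two metrics without a database, the same calculations can be expressed over in-memory rows. This is a sketch under assumptions: RunRow and CassetteMetrics are hypothetical, and excluding Pending rows from the wall-clock calculation is a choice made here, not something the helper definitions state. If the timestamp columns are stored as ISO-8601 text, the SQL side would need a conversion such as julianday() or strftime('%s', ...) before subtracting; the storage format is not stated here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical reduced row: only the columns the two metrics read.
public record RunRow(string LotId, string TerminalStatus,
                     DateTimeOffset StartedAtUtc, DateTimeOffset EndedAtUtc);

public static class CassetteMetrics
{
    // Counts rows for the lot whose status is no longer the Pending stub.
    public static int WafersCompleted(IEnumerable<RunRow> rows, string lotId) =>
        rows.Count(r => r.LotId == lotId && r.TerminalStatus != "Pending");

    // Latest end minus earliest start for the lot, in seconds.
    // Assumption: Pending rows are skipped because their end time is a placeholder.
    public static double WallClockSeconds(IEnumerable<RunRow> rows, string lotId)
    {
        var done = rows.Where(r => r.LotId == lotId && r.TerminalStatus != "Pending").ToList();
        if (done.Count == 0) return 0;
        return (done.Max(r => r.EndedAtUtc) - done.Min(r => r.StartedAtUtc)).TotalSeconds;
    }
}
```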

Row block tagged slice-3-2-cassette-cadence in phase-3-measurements.md. Baseline is slice-3-1-rich-defect-model for the 32 overlapping metrics; the two new cassette-specific metrics have no baseline.

Out of Scope

  • Multi-cassette / lot-batching beyond 25 wafers. The scheduler handles one cassette at a time. A lot-batching layer is a Phase 4 concern.
  • Real wafer-handler robotics. Load / Align / Unload are Task.Delay simulations; real motion sequences live in the SDK swap (Phase 4.1).
  • Pre-align failure modes (notch not found, wafer mis-clamp). Simulated phases always succeed.
  • Mid-cassette pause / resume. The scheduler supports stop and abort; pause-and-resume across cassette boundaries is deferred.
  • Wafer-level retry policy. A faulted wafer's run advances to the next slot; per-wafer retry is operator-initiated by re-loading the same slot, not by automatic retry.
  • OperatorId — Phase 3.4's territory. SLICE-3.2 leaves operator identity off RunSummary; SLICE-3.4 adds it.
  • Defect correlation across wafers — defects link to a single run_id; cross-wafer pattern detection is Phase 4 ML work.
  • Cassette UI virtualization. 25 slots is small, so no virtualization is needed. Smaller cassettes (FOSB 13, sample 4) work without changes.

Runtime Behavior

Cassette flow

Operator clicks "Load Cassette" button → ICassetteScheduler.LoadCassetteAsync(LotId.GenerateForToday(), 25)
  AppState.Cassette = new CassetteState(lotId, 25, 0, WaferPhase.Idle, [25 slots])

Operator clicks "Start Cassette Run" button → ICassetteScheduler.StartCassetteRunAsync()
  Background loop starts:

    For each slot in 0..24:
      AppState.Cassette transitions phase: Loading
      await Task.Delay(profile.WaferLoadMs)                    [2 s default]

      AppState.Cassette transitions phase: Aligning
      await Task.Delay(profile.WaferAlignMs)                   [1.5 s default]

      AppState.Cassette transitions phase: Running
      await _workflow.StartRunAsync(slot.WaferId, cassette.LotId)

        WorkflowService inserts stub run_summary (TerminalStatus = Pending)

        Run loop executes — defects persisted with FK enforcement on

        WorkflowService updates run_summary (terminal status, defects, etc.)
      await wait-for-terminal-state                            [~30 s typical]

      AppState.Cassette transitions phase: Unloading
      await Task.Delay(profile.WaferUnloadMs)                  [1.5 s default]

      AppState.Cassette.CurrentSlotIndex++

    AppState.Cassette transitions phase: Complete

Cassette-vs-direct workflow distinction

The existing direct-workflow path (Connect → Home → Load Recipe → Start Run → ...) continues to work without the cassette scheduler. MainViewModel distinguishes cassette mode from direct mode:

  • Cassette mode: AppState.Cassette is non-null. Direct-mode buttons (Start Run, Stop, etc.) are visually de-emphasised; cassette-control buttons are foregrounded.
  • Direct mode: AppState.Cassette is null. Cassette panel hides; direct buttons foregrounded.

Switching between modes requires the workflow to be Idle (MainViewModel enables / disables Load Cassette accordingly).
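
The mode rules above reduce to a few boolean checks. The sketch below is illustrative only: UiMode and ModeRules are hypothetical names, and the two inputs stand in for "workflow is Idle" and "AppState.Cassette is non-null" as MainViewModel would observe them.

```csharp
public enum UiMode { Direct, Cassette }

// Hypothetical helper capturing the mode-switch rules.
public static class ModeRules
{
    // Cassette mode is simply "a cassette is loaded".
    public static UiMode ModeOf(bool cassetteLoaded) =>
        cassetteLoaded ? UiMode.Cassette : UiMode.Direct;

    // Entering cassette mode requires an idle workflow and no cassette yet.
    public static bool CanLoadCassette(bool workflowIdle, bool cassetteLoaded) =>
        workflowIdle && !cassetteLoaded;

    // Leaving cassette mode requires an idle workflow and a loaded cassette.
    public static bool CanUnloadCassette(bool workflowIdle, bool cassetteLoaded) =>
        workflowIdle && cassetteLoaded;
}
```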

Acceptance Criteria

This slice is satisfied only if all of the following are true:

  1. New domain shapes (WaferId, LotId, CassetteSlot, CassetteState, WaferPhase enum) exist in InspectionPrototype.Domain.Contracts. RunTerminalStatus enum gains Pending.
  2. RunSummary gains nullable WaferId and LotId string properties (defaults null).
  3. M003 migration adds wafer_id and lot_id columns to run_summaries plus the two indexes; existing rows have NULL for both. Idempotent re-run is no-op.
  4. M004 migration records the FK-band-aid retirement in schema_version (no schema change). Idempotent.
  5. SqliteRunHistoryStore.SaveAsync parameter list includes the two new columns; INSERT OR REPLACE upserts on run_id.
  6. SqliteDefectStore.OpenAsync no longer issues PRAGMA foreign_keys = OFF. The connection-string default Foreign Keys=True (or unspecified, which defaults to enforced) is preserved.
  7. ICassetteScheduler + SimulatedCassetteScheduler exist per the In Scope shapes. Loop drives Load → Align → Run → Unload for each slot. Single-wafer faults log Warning and advance to next slot; cassette-level abort cancels the current phase's CTS and transitions to Complete.
  8. WorkflowService.StartRunAsync(WaferId?, LotId?) accepts optional wafer/lot identifiers; persists a stub run_summary row at run start with TerminalStatus = Pending; updates the row at run end with the terminal values.
  9. AppState.Cassette field exists; the cassette scheduler (SimulatedCassetteScheduler) updates it on each phase transition; MainViewModel projects it.
  10. SimulatorProfile gains WaferLoadMs, WaferAlignMs, WaferUnloadMs (all default per spec).
  11. CassetteStatusPanel UserControl shows LotId, progress (current / total), current phase, 25-slot grid. AutomationIds present (CassetteStatusPanel, CassetteCurrentSlotText, CassetteCurrentPhaseText). New buttons LoadCassetteButton, StartCassetteRunButton, StopCassetteButton, UnloadCassetteButton in MainWindow.xaml with AutomationIds.
  12. 25-wafer cassette criterion (slice's exit gate): under Soak8h profile, a single cassette completes 25 wafers in ≤ 20 min wall-clock with 25 distinct RunSummary rows persisted (SELECT COUNT(*) FROM run_summaries WHERE lot_id = @lotId AND terminal_status != 'Pending' returns 25). Each wafer's wafer_id follows the LOT-yyyyMMdd-HHmmss-W## pattern produced by WaferId.Generate. No FK errors in the application log; defects (if any) correlate to the correct run_id via the foreign key.
  13. New MeasurementExtraction.psm1 helpers Get-WafersCompletedCount, Get-CassetteWallClockSeconds plus Pester tests.
  14. New runbook §5.4 entry describing the cassette capture procedure.
  15. New row block slice-3-2-cassette-cadence in phase-3-measurements.md covering the 32 standard metrics + 2 new cassette-specific metrics. Baseline is slice-3-1-rich-defect-model.
  16. Phase 2 trigger assessment in row's Notes section: cassette mode runs ~25× the workflow-state-transition rate of single-run mode (each cassette wafer drives Idle → Loading → Aligning → Running → Unloading → ...). Apply the SLICE-2.0 rubric to the captured numbers; document the outcome.
  17. Existing tests pass; FK band-aid removal verified by an integration test that runs a cassette and asserts (a) defects exist for each wafer, (b) SELECT COUNT(*) FROM defects d LEFT JOIN run_summaries r ON d.run_id = r.run_id WHERE r.run_id IS NULL = 0 (no orphans).
  18. Pre-existing direct-mode workflow path continues to work without regressions; Capture-Measurements.ps1 -Profile MultiTag still produces a row equivalent to slice-1-1-multi-tag-telemetry.

Verification Notes

  • Stub-row insertion uses the same fire-and-forget pattern as defect persistence. The stub _runHistoryStore.SaveAsync(stubSummary) call is not awaited; if it fails, the run still proceeds in memory, but subsequent defect inserts may fail FK validation. A test verifies that the stub row is persisted early enough for subsequent same-thread defect inserts to find the parent row.
  • RunTerminalStatus.Pending is never reported as a terminal state to operators. The UI's WorkflowState projection ignores Pending rows; only the database sees them transiently.
  • Cassette stop semantics: "Stop" finishes the current wafer's phase but does not advance to next; "Abort" cancels the current phase's CTS immediately. Same pattern as WorkflowService's Stop / Abort distinction (the cassette layer wraps the run-level distinctions consistently).
  • Multi-wafer fault tolerance: if Wafer 7 faults but Wafer 8 should proceed, the scheduler logs the Wafer-7 fault, transitions to Unloading, and continues to Wafer 8. A run-level fault is NOT a cassette-level fault. This matches real machine behavior.
  • The 20 min wall-clock limit in criterion 12 has slack: a typical cassette takes ~14 min. The 6-minute margin absorbs slow-host noise and any per-wafer phase variability under chaos.
  • Pre-existing reproducibility check (criterion 18) is the SLICE-1.4 / SLICE-2.0 pattern — slice-1-1-multi-tag-telemetry row reproduces against the merged commit. The cassette changes affect run-startup and run-end paths; if a non-cassette MultiTag capture produces materially different numbers, the cassette-aware code path is bleeding into the direct-mode path somewhere unintended.
  • The FK band-aid removal is the load-bearing architectural fix. Verify by:
    1. Code: SqliteDefectStore.OpenAsync no longer contains PRAGMA foreign_keys = OFF.
    2. Capture: 25-wafer cassette with HighDefect or chaos-mode defect rates produces zero SqliteException Error 19 log entries.
    3. Schema: PRAGMA foreign_keys reports 1 (enabled) on a fresh connection from the post-fix code.
  • Cassette UI is the second material UI surface expansion (after SLICE-3.1's WaferMapView). Same dispatcher-hop discipline applies — OnStateChanged runs on the producer thread; UI projections marshal via _dispatcher.InvokeAsync.
