03. Stop, Abort, Fault, and Cancellation

Why These Semantics Matter

One of the best runtime lessons in the repo is that not every way of ending work means the same thing.

The system distinguishes:

  • graceful operator stop
  • immediate operator abort
  • unsafe fault-driven interruption

That distinction is essential in machine-oriented software, where the way work ends determines what state the hardware is left in.

Stop

Stop is graceful.

Conceptually:

  • the operator wants the run to end cleanly
  • no new scan point should begin after the request is accepted
  • the current safe step may complete
  • terminal status becomes Stopped

In the implementation, RequestStop() records the intent and transitions the workflow into Stopping.

That is important because it preserves a safe boundary rather than cutting work arbitrarily.
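
The stop behavior described above can be sketched as a loop that checks for the request only between scan points. This is a minimal Python illustration, not the repo's code: ScanRun, measure, and the COMPLETED state are hypothetical names introduced here, and only the Stopping and Stopped states come from the text.

```python
from enum import Enum, auto


class WorkflowState(Enum):
    RUNNING = auto()
    STOPPING = auto()
    STOPPED = auto()
    COMPLETED = auto()  # assumed name for a run that finishes on its own


class ScanRun:
    """Hypothetical stand-in for the part of the workflow that walks scan points."""

    def __init__(self):
        self.state = WorkflowState.RUNNING

    def request_stop(self):
        # Record the intent only; the step in progress is not interrupted.
        if self.state is WorkflowState.RUNNING:
            self.state = WorkflowState.STOPPING

    def run(self, scan_points, measure):
        results = []
        for point in scan_points:
            # The request is honored only at the boundary between points.
            if self.state is WorkflowState.STOPPING:
                break
            results.append(measure(point))
        stopped = self.state is WorkflowState.STOPPING
        self.state = WorkflowState.STOPPED if stopped else WorkflowState.COMPLETED
        return results


run = ScanRun()


def measure(point):
    if point == 2:
        run.request_stop()  # operator presses Stop while point 2 is in progress
    return point * 10


results = run.run([1, 2, 3, 4], measure)
print(results, run.state.name)  # [10, 20] STOPPED
```

Point 2 still completes because the request arrived while it was in progress; points 3 and 4 never begin.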

Abort

Abort is immediate.

Conceptually:

  • the operator wants in-flight work interrupted now
  • cancellation should be observed as quickly as possible
  • terminal status becomes Aborted

That makes Abort intentionally harsher than Stop.

Newcomers often blur those two actions together. This repo makes the difference visible in both code and UI behavior.
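
To contrast with Stop, here is a hedged Python sketch of abort using asyncio task cancellation. The function names and the log are hypothetical, and asyncio stands in for whatever cancellation mechanism the repo actually uses. The point it illustrates is that the in-flight step is interrupted rather than allowed to finish.

```python
import asyncio


async def scan(points, log, started):
    for point in points:
        log.append(f"start {point}")
        started.set()
        await asyncio.sleep(10)  # stands in for a long motion or acquisition step
        log.append(f"done {point}")


async def abort_demo():
    log = []
    started = asyncio.Event()
    task = asyncio.create_task(scan([1, 2, 3], log, started))
    await started.wait()  # the first step is now in flight
    task.cancel()         # operator abort: interrupt immediately
    try:
        await task
        status = "Completed"
    except asyncio.CancelledError:
        status = "Aborted"
    return status, log


status, log = asyncio.run(abort_demo())
print(status, log)  # Aborted ['start 1']
```

Unlike the graceful path, "done 1" is never logged: the step that was running when the abort arrived does not complete.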

Fault

Fault is not simply another cancellation path.

A critical fault means:

  • an unsafe condition is active
  • the workflow moves to Faulted
  • commands blocked by the fault remain blocked
  • acknowledgment records operator awareness but does not clear the unsafe condition
  • explicit recovery is required after clearance

This is what gives the runtime a more realistic operational character. The system distinguishes operator intent from machine safety.
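
The separation between acknowledgment, clearance, and recovery can be sketched as follows. This is an illustrative Python model under stated assumptions: FaultGate and its method names are hypothetical, recovery is assumed to return to Idle, and whether the real runtime also requires acknowledgment before recovery is not stated in the text, so the sketch does not require it.

```python
class FaultGate:
    """Hypothetical model of fault handling; not the repo's API."""

    def __init__(self):
        self.state = "Running"
        self.condition_active = False
        self.acknowledged = False

    def raise_fault(self):
        self.condition_active = True
        self.acknowledged = False
        self.state = "Faulted"

    def acknowledge(self):
        # Records operator awareness only; the unsafe condition is untouched.
        self.acknowledged = True

    def clear_condition(self):
        # The unsafe condition itself is gone, but the workflow stays Faulted.
        self.condition_active = False

    def can_start(self):
        return self.state in ("Idle", "Ready")

    def recover(self):
        # Explicit recovery is refused while the condition is still active.
        if self.state != "Faulted" or self.condition_active:
            return False
        self.state = "Idle"
        return True


gate = FaultGate()
gate.raise_fault()
gate.acknowledge()
startable_after_ack = gate.can_start()      # False: acknowledged, still unsafe
recovered_while_unsafe = gate.recover()     # False: condition still active
gate.clear_condition()
startable_after_clear = gate.can_start()    # False: cleared, not yet recovered
recovered = gate.recover()                  # True
print(startable_after_ack, recovered_while_unsafe, startable_after_clear, recovered, gate.state)
# False False False True Idle
```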

Cancellation Scope

The runtime uses cancellation token sources inside WorkflowService to control home and run lifetimes.

That is a good teaching example because cancellation here is part of operational semantics:

  • abort cancels immediately
  • fault cancels active work and preserves fault semantics
  • stop does not simply cancel everything right away

This is coordination, not just syntax.
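
One way to picture the mapping from the three actions onto cancellation is the Python sketch below. Only the name WorkflowService and the idea of separate home and run lifetimes come from the text; the class, its methods, and the use of threading.Event as a stand-in for a cancellation token source are assumptions made for illustration.

```python
import threading


class WorkflowServiceSketch:
    """Hypothetical outline; only the name WorkflowService comes from the text."""

    def __init__(self):
        self.state = "Running"  # assume work is already active
        self.stop_requested = False
        self._tokens = {}  # one cancellation flag per operation: "home" or "run"

    def begin(self, operation):
        token = threading.Event()  # stands in for a cancellation token source
        self._tokens[operation] = token
        return token

    def request_stop(self):
        # Graceful: no token is cancelled; the run loop checks the flag itself.
        self.stop_requested = True
        self.state = "Stopping"

    def abort(self):
        self._cancel_all()
        self.state = "Aborted"

    def fault(self):
        # Same cancellation mechanism as abort, different terminal meaning.
        self._cancel_all()
        self.state = "Faulted"

    def _cancel_all(self):
        for token in self._tokens.values():
            token.set()


service = WorkflowServiceSketch()
run_token = service.begin("run")
service.request_stop()
cancelled_by_stop = run_token.is_set()    # False
service.abort()
cancelled_by_abort = run_token.is_set()   # True

faulted = WorkflowServiceSketch()
home_token = faulted.begin("home")
faulted.fault()
print(cancelled_by_stop, cancelled_by_abort, home_token.is_set(), faulted.state)
# False True True Faulted
```

Abort and fault share the cancellation mechanism but leave the workflow in different states, which is why cancellation alone cannot carry the semantics.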

Why This Design Was Chosen

The design fits the explicit workflow-state model:

  • Stopping exists because graceful stop is distinct
  • Aborted exists because immediate interruption is distinct
  • Faulted exists because unsafe conditions require stronger semantics

Without those distinctions, the app would be easier to implement badly and harder to teach honestly.
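
As a rough illustration of what an explicit workflow-state model buys, the Python sketch below encodes the three paths as a transition table that rejects anything not listed. The state names come from this page; the event names ("stop", "step_done", "abort", "fault", "recover", "start") and the choice of Idle as the recovery target are hypothetical, and the real runtime may well allow more transitions than these.

```python
# Allowed (state, event) -> next state pairs; anything else is rejected.
TRANSITIONS = {
    ("Running", "stop"): "Stopping",       # graceful: pass through Stopping
    ("Stopping", "step_done"): "Stopped",  # current safe step has completed
    ("Running", "abort"): "Aborted",       # immediate interruption
    ("Running", "fault"): "Faulted",       # unsafe condition detected
    ("Faulted", "recover"): "Idle",        # only valid after clearance
}


def next_state(state, event):
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state!r}") from None


graceful = next_state(next_state("Running", "stop"), "step_done")
print(graceful)  # Stopped

try:
    next_state("Faulted", "start")
except ValueError as error:
    print(error)  # 'start' is not allowed in state 'Faulted'
```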

Common Mistakes

  • treating stop, abort, and fault as equivalent terminal paths
  • using cancellation as a substitute for explicit workflow semantics
  • re-enabling commands too early after a fault
  • assuming acknowledgment means safe recovery

Diagram Brief

  • Title: Stop, abort, and fault state paths
  • Purpose: Contrast graceful termination, immediate interruption, and unsafe-condition handling
  • Audience: newcomer engineer
  • Nodes: Preparing, Running, Stopping, Stopped, Aborted, Faulted, Recover, Idle, Ready
  • Edges: stop transitions through stopping; abort transitions to aborted; fault transitions to faulted; clearance plus explicit recover returns to idle or ready
  • Grouping: Graceful termination, immediate interruption, fault and recovery
  • Caption: These paths all end work, but they mean different things operationally and therefore need different code paths
  • Destination file path: docs/diagrams/source/course-04-03-stop-abort-fault-and-cancellation.drawio
