Below is the topic write-up for 7. Machine Workflow & Sequencing, aligned to your Domain 1 source of truth: this topic is specifically about step-by-step sequencing, synchronization between subsystems, and deterministic workflow execution in machine control software.
7. Machine Workflow & Sequencing
PART 1 — WHAT A MACHINE WORKFLOW REALLY IS
In industrial machine software, a workflow is the controlled execution of physical operations in a specific order.
That sounds simple, but this is one of the biggest mindset shifts from business software.
In business software, a workflow often means something like:
- approve request
- send email
- update status
- wait for human response
Those workflows are mostly about information moving through systems.
A machine workflow is different. It is about physical actions happening in the real world:
- move a stage
- turn on vacuum
- confirm part present
- trigger image capture
- wait for motion settle
- inspect
- move out
- release part
So a machine workflow is not just “process logic.” It is software coordinating reality.
That is why order matters so much.
In a wafer inspection machine, a simplified flow might be:
- load wafer
- clamp wafer
- align wafer
- move stage to inspection start
- autofocus
- scan and capture images
- inspect data
- move to unload position
- release and unload wafer
In a robotic station, a simplified flow might be:
- check destination clear
- open gripper
- move to pick position
- close gripper
- verify part acquired
- move to place position
- place part
- verify destination occupied
This is not just a list of function calls. Each step depends on real physical completion.
For example:
- MoveToAlignmentPosition() returning does not necessarily mean the stage is already there
- Clamp() returning does not necessarily mean clamp force is stable
- CaptureImage() returning does not necessarily mean image data is valid and committed
- PickPart() returning does not necessarily mean the part is truly held
That is why experienced machine engineers think in this model:
command -> physical execution -> feedback -> confirmation -> next step
Not:
call method -> assume done
That difference is the foundation of machine sequencing.
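The gap between "method returned" and "physically done" can be sketched in Python with a simulated stage. `FakeStage`, `move_abs`, `tick`, and `in_position` are hypothetical names made up for this illustration, not a real motion controller API. The only assumption is that a move command returns as soon as it is accepted, while position feedback arrives later.

```python
from dataclasses import dataclass


@dataclass
class FakeStage:
    """Simulated stage (hypothetical): moves one unit per tick toward its target."""
    position: int = 0
    target: int = 0

    def move_abs(self, target: int) -> bool:
        # Returns as soon as the command is accepted, not when motion finishes.
        self.target = target
        return True

    def tick(self) -> None:
        # One unit of simulated physical time.
        if self.position < self.target:
            self.position += 1
        elif self.position > self.target:
            self.position -= 1

    @property
    def in_position(self) -> bool:
        # Feedback from the simulated hardware.
        return self.position == self.target


stage = FakeStage()
accepted = stage.move_abs(3)         # command
done_at_return = stage.in_position   # False: the stage has not moved yet

ticks = 0
while not stage.in_position:         # physical execution and feedback
    stage.tick()
    ticks += 1

confirmed = stage.in_position        # confirmation: only now is the next step safe
```

Here `accepted` is true immediately, yet `done_at_return` is false. The workflow may only continue once `confirmed` is true.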
PART 2 — STEP-BY-STEP SEQUENCING
A step in machine software is usually a well-defined unit of operational intent.
Examples:
- Move stage to load position
- Wait for vacuum sensor on
- Trigger camera
- Wait for image complete
- Evaluate inspection result
- Move robot to place position
A good machine step usually has four things:
- intent — what this step is trying to achieve
- entry action — what command is issued at the start
- completion condition — how the system knows it is truly done
- failure/timeout rule — what happens if expected completion never comes
So a real step is not just:
DoMoveStage()
It is more like:
- issue move command
- monitor in-position or motion-complete feedback
- verify no motion fault occurred
- verify timeout not exceeded
- only then transition to next step
That is why machine sequencing is usually strict and deterministic. The sequence engine should make it very obvious:
- which step is active
- why it moved to the next step
- what it is waiting for
- what condition caused failure
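As a minimal sketch, the four parts of a step can be carried in one small structure. The `Step` class, the `run_step` helper, and the tick-based timeout are assumptions chosen for illustration; a real sequence engine would measure wall-clock time and read real device feedback.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Step:
    """One workflow step (hypothetical structure for illustration)."""
    intent: str                         # what the step is trying to achieve
    entry_action: Callable[[], None]    # command issued at the start
    is_complete: Callable[[], bool]     # how we know it is truly done
    has_fault: Callable[[], bool]       # abnormal condition check
    timeout_ticks: int                  # limit if completion never comes


def run_step(step: Step, tick: Callable[[], None]) -> str:
    """Run one step and return 'done', 'fault', or 'timeout'."""
    step.entry_action()
    for _ in range(step.timeout_ticks + 1):
        if step.has_fault():
            return "fault"
        if step.is_complete():
            return "done"
        tick()  # let simulated time pass
    return "timeout"


# Simulated stage state: position moves one unit per tick toward the target.
state = {"pos": 0, "target": 0}


def issue_move() -> None:
    state["target"] = 3


def advance_time() -> None:
    if state["pos"] < state["target"]:
        state["pos"] += 1


move = Step(
    intent="Move stage to inspect position",
    entry_action=issue_move,
    is_complete=lambda: state["pos"] == state["target"],
    has_fault=lambda: False,
    timeout_ticks=10,
)
move_result = run_step(move, advance_time)

# A step whose completion condition never arrives ends in a timeout.
stuck = Step("Wait for vacuum sensor on", lambda: None,
             lambda: False, lambda: False, 5)
stuck_result = run_step(stuck, lambda: None)
```

The step advances only through `is_complete`, never because `entry_action` returned.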
Simple step flow
   +--------------------+
   | Start Step         |
   | "MoveToInspectPos" |
   +---------+----------+
             |
             v
+---------------------------+
| Issue command to stage    |
| MoveAbs(InspectPosition)  |
+------------+--------------+
             |
             v
+---------------------------+
| Wait for completion       |
| - InPosition == true      |
| - No motion fault         |
| - Within timeout          |
+------+-----------+--------+
       |           |
       |           |
       v           v
+-------------+ +------------------+
| Step done   | | Step failed      |
| go next     | | alarm / recovery |
+-------------+ +------------------+
What this diagram means
The key point is that the step does not advance because the code line after MoveAbs(...) executed. It advances only when the machine-level completion condition is satisfied.
This is why “continue when method returns” is often wrong in machine software.
In normal application code, method return often means the work is done. In machine control, method return often means only one of these:
- command accepted
- command queued
- command transmitted
- controller acknowledged request
None of those means the physical world has reached the required state.
That misunderstanding causes many beginner mistakes.
PART 3 — SYNCHRONIZATION BETWEEN SUBSYSTEMS
A machine workflow rarely controls only one thing.
Real workflows coordinate multiple subsystems together:
- motion
- sensors
- cameras
- vacuum
- clamps
- doors / interlocks
- robot end effectors
- illumination
- PLC handshakes
So workflow execution is largely about synchronization points.
Examples:
- do not capture image until stage is in position and settled
- do not move robot until vacuum confirms part is held
- do not start process until safety condition is valid
- do not open clamp until stage is parked and robot ready
- do not unload wafer until inspection pipeline has flushed required data
Example sequence diagram: motion + vision synchronization
Sequence: Move to position, then capture image
Workflow         MotionCtrl      Stage         Camera        Vision
|                |               |             |             |
| MoveTo(P1)     |               |             |             |
|--------------->|               |             |             |
|                | command move  |             |             |
|                |-------------> |             |             |
|                |               | moving...   |             |
|                |<------------- | in-position |             |
| wait settle    |               |             |             |
|----time------->|               |             |             |
| TriggerCapture |               |             |             |
|--------------------------------------------->|             |
|                                              | expose      |
|                                              | readout     |
|<---------------------------------------------| image ready |
|----------------------------------------------------------->|
|                                                    inspect |
|<-----------------------------------------------------------|
| next step                                                  |
What this diagram means
The workflow is doing more than just “call move then call camera.”
It is enforcing a dependency chain:
- stage command issued
- motion completes
- settle period passes
- capture triggered
- image is actually ready
- downstream processing can continue
This is machine sequencing in practice: subsystem dependencies turned into explicit wait points.
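The dependency chain can be sketched as explicit wait points over a simulated rig. `SimRig`, `wait_for`, `move_then_capture`, and all timing values (20 ms travel, 5 ms settle, 8 ms exposure plus readout) are illustrative assumptions, not figures from any real machine.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class SimRig:
    """Simulated stage and camera (hypothetical), advanced in 1 ms ticks."""
    now_ms: int = 0
    move_done_at: Optional[int] = None
    image_ready_at: Optional[int] = None
    log: List[Tuple[str, int]] = field(default_factory=list)

    def tick(self) -> None:
        self.now_ms += 1

    def move_to(self, travel_ms: int) -> None:
        # Command is accepted immediately; motion finishes travel_ms later.
        self.move_done_at = self.now_ms + travel_ms

    def trigger_capture(self, duration_ms: int) -> None:
        # Exposure plus readout take duration_ms before the image is usable.
        self.image_ready_at = self.now_ms + duration_ms

    @property
    def in_position(self) -> bool:
        return self.move_done_at is not None and self.now_ms >= self.move_done_at

    @property
    def image_ready(self) -> bool:
        return self.image_ready_at is not None and self.now_ms >= self.image_ready_at


def wait_for(rig: SimRig, condition: Callable[[], bool],
             timeout_ms: int, what: str) -> None:
    """Explicit wait point: advance simulated time until condition or timeout."""
    deadline = rig.now_ms + timeout_ms
    while not condition():
        if rig.now_ms >= deadline:
            raise TimeoutError(f"timed out waiting for {what}")
        rig.tick()


def move_then_capture(rig: SimRig, settle_ms: int = 5) -> List[Tuple[str, int]]:
    rig.move_to(travel_ms=20)
    rig.log.append(("move issued", rig.now_ms))
    wait_for(rig, lambda: rig.in_position, 100, "in-position")
    rig.log.append(("in position", rig.now_ms))
    settled_at = rig.now_ms + settle_ms
    wait_for(rig, lambda: rig.now_ms >= settled_at, settle_ms + 1, "settle")
    rig.trigger_capture(duration_ms=8)
    rig.log.append(("capture triggered", rig.now_ms))
    wait_for(rig, lambda: rig.image_ready, 50, "image ready")
    rig.log.append(("image ready", rig.now_ms))
    return rig.log


events = move_then_capture(SimRig())
```

Each entry in `events` is recorded only after its predecessor's condition was observed, so the capture can never be triggered before the stage is in position and settled.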
Another common example:
- robot picks a part
- vacuum sensor must confirm part present
- only then can robot leave pick location
- only then can destination transfer continue
If you skip that confirmation, the software may behave logically while the machine behaves physically wrong.
That is a very common pattern in industrial systems: the software path looks correct, but the physical assumptions were false.
PART 4 — DETERMINISTIC EXECUTION
Deterministic execution means the workflow behaves in a reproducible, controlled, and explainable way.
Given the same machine state, recipe, inputs, and conditions, you want the sequence to:
- run steps in the same order
- use the same decision rules
- wait on the same conditions
- fail in the same controlled way when something goes wrong
In machine software, determinism matters because the machine is not just producing output data. It is producing motion, energy, force, timing, and product contact.
If the behavior is not deterministic, you get systems that are very hard to trust:
- sometimes camera fires before settle
- sometimes clamp verification is skipped
- sometimes retry path leaves hidden flags set
- sometimes stop request is handled only after two more steps
- sometimes event ordering changes final behavior
That is not just messy engineering. In machine systems, that can become:
- quality drift
- intermittent scrap
- unexplained downtime
- damaged fixtures
- unsafe recovery states
Predictable step logic vs uncontrolled async behavior
Predictable step logic means:
- one orchestrator owns step progression
- conditions are explicit
- transitions are visible
- asynchronous device activity is wrapped into controlled completion signals
Uncontrolled asynchronous behavior means:
- callbacks independently change workflow state
- background tasks mutate flags from many places
- UI buttons directly manipulate device state mid-sequence
- completion comes from whichever event happens first, without a clear contract
That second model is where “ghost bugs” come from.
Experienced engineers design workflows so asynchronous hardware interaction is absorbed into a controlled sequencing model, not sprayed across the codebase.
A good rule is:
devices may be asynchronous, but workflow progression should be explicit
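One way to make progression explicit is a transition table that maps the current step and an incoming signal to the next step. This is a sketch under assumptions: the step names, signal names, and the choice to ignore unexpected signals are all illustrative (a real system might log or alarm on them instead).

```python
from enum import Enum, auto
from typing import Dict, List, Tuple


class StepId(Enum):
    MOVE = auto()
    SETTLE = auto()
    CAPTURE = auto()
    DONE = auto()
    FAULT = auto()


# Every legal transition is listed here and nowhere else.
TRANSITIONS: Dict[Tuple[StepId, str], StepId] = {
    (StepId.MOVE, "in_position"): StepId.SETTLE,
    (StepId.MOVE, "motion_fault"): StepId.FAULT,
    (StepId.MOVE, "timeout"): StepId.FAULT,
    (StepId.SETTLE, "settled"): StepId.CAPTURE,
    (StepId.CAPTURE, "image_ready"): StepId.DONE,
    (StepId.CAPTURE, "timeout"): StepId.FAULT,
}


def advance(current: StepId, signal: str) -> StepId:
    """Return the next step; signals not expected in this step change nothing."""
    return TRANSITIONS.get((current, signal), current)


def replay(signals: List[str]) -> List[StepId]:
    """Feed signals in order and return the sequence of steps visited."""
    step = StepId.MOVE
    history = [step]
    for signal in signals:
        nxt = advance(step, signal)
        if nxt is not step:
            history.append(nxt)
            step = nxt
    return history


# An early 'image_ready' cannot push the workflow forward out of order.
signals = ["image_ready", "in_position", "settled", "image_ready"]
first_run = replay(signals)
second_run = replay(signals)   # same inputs, same path
fault_run = replay(["motion_fault", "in_position"])
```

Because the table is the only source of transitions, the same signal history always produces the same step history.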
PART 5 — WAITING, TIMEOUTS, AND CONDITIONAL PROGRESS
Waiting is central in machine sequencing.
A large percentage of industrial workflow code is really about this:
- wait for motion complete
- wait for sensor on
- wait for pressure stable
- wait for camera ready
- wait for PLC acknowledge
- wait for door closed
- wait for timeout
- wait until either success, failure, or operator interruption occurs
This is why machine workflows are not just “do A, then B, then C.” They are usually:
- issue command
- wait for expected condition
- verify absence of abnormal condition
- enforce timeout
- decide next transition
Timing diagram: waiting with timeout
Time ------------------------------------------------------------->
Workflow:  Issue Move  [ Waiting for InPosition ................ ]
                       |                                         |
                       |                                         +--> timeout -> fault
                       +--> if InPosition=true -> next step
Stage:     moving ---------------------------> in-position
Sensor:    false   false   false   false   false   true
Alarm:     none    none    none    none    none    none
What this diagram means
The workflow is not waiting “for a delay.” It is waiting for a condition within an allowed time window.
That distinction matters.
Bad sequencing code often uses arbitrary sleeps:
- sleep 200 ms after move
- sleep 500 ms after clamp
- sleep 1000 ms before read sensor
This sometimes works in lab conditions and then fails in production because:
- hardware timing varies
- load changes
- temperature changes
- motion path changes
- controller latency changes
- maintenance wear changes
- network jitter exists
Experienced engineers prefer condition-based progress, with timeout protection.
So instead of:
- move
- sleep 500 ms
- continue
They prefer:
- move
- wait until in-position is true
- optionally wait settle time if physically required
- timeout if condition never arrives
- alarm if fault observed
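The preferred pattern can be sketched as a small polling helper. `wait_until` and `FakeClock` are hypothetical names for this illustration. The clock and sleep functions are injectable so the example runs instantly, and the defaults use the standard `time.monotonic` and `time.sleep`.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_s: float,
               fault: Callable[[], bool] = lambda: False,
               poll_s: float = 0.01,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the condition holds; returns 'ok', 'fault', or 'timeout'."""
    deadline = clock() + timeout_s
    while True:
        if fault():
            return "fault"       # abnormal condition beats everything else
        if condition():
            return "ok"          # progress on the condition, not on a delay
        if clock() >= deadline:
            return "timeout"     # the condition never arrived in time
        sleep(poll_s)


class FakeClock:
    """Simulated time source so the example does not really sleep."""
    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


clk = FakeClock()
# Simulated stage reports in-position 0.5 s after the move was issued.
reached = wait_until(lambda: clk.now >= 0.5, timeout_s=2.0, poll_s=0.125,
                     clock=clk.time, sleep=clk.sleep)

clk2 = FakeClock()
never = wait_until(lambda: False, timeout_s=1.0, poll_s=0.125,
                   clock=clk2.time, sleep=clk2.sleep)

clk3 = FakeClock()
faulted = wait_until(lambda: False, timeout_s=1.0, fault=lambda: True,
                     clock=clk3.time, sleep=clk3.sleep)
```

A settle time, where physically required, would be a second `wait_until` or a bounded delay placed after the in-position wait, not a replacement for it.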
Why missing timeout logic is dangerous
Without timeout handling, a step can wait forever.
In a real machine, that causes:
- hung production
- unclear operator state
- blocked recovery
- hidden deadlocks between subsystems
- partial hardware actuation left active
And worse, if engineers add retries blindly on top of missing timeout discipline, they often create inconsistent state rather than resilience.
PART 6 — INTERRUPTION & PARTIAL COMPLETION
This is where machine workflows become much harder than they first appear.
A workflow can be interrupted by:
- stop
- pause
- abort
- alarm
- safety event
- operator intervention
- device disconnect
- controller error
- upstream/downstream station problem
And interruption rarely happens at a convenient boundary.
It often happens mid-step or between physical sub-actions.
Examples:
- robot picked part but stop occurred before placement
- stage reached target but image capture failed
- clamp engaged but process aborted
- vacuum turned on but part-presence confirmation never came
- wafer aligned but inspection recipe load failed
- camera triggered but acquisition callback never arrived
This creates partial completion states.
That is one of the most important realities in machine software:
A machine is often not in a clean “done” or “not done” state. It is in a messy partially executed physical condition.
Example: partial completion risk
Normal:
Pick -> VerifyHeld -> MoveToPlace -> Place -> VerifyReleased
Interrupted:
Pick -> VerifyHeld -> [STOP / ALARM]
Machine reality now:
- robot may still hold part
- destination may still be empty
- vacuum may still be on
- next auto-cycle cannot safely assume clean start
This is why strong engineers do not think only about forward flow. They also think about:
- what physical state may exist if interrupted here?
- what must be made safe?
- what must be preserved for recovery?
- what must be re-verified before resume?
- what requires operator intervention?
That is also why “resume” is much harder than people expect. Resume is only safe if the software knows exactly where the machine really is, not just where the sequence variable says it was.
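A resume decision based on observed state rather than on the sequence variable alone might look like the following sketch. The `Observed` fields, the step names, and the three outcomes are assumptions for illustration; a real machine would have more sensors and more recovery options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observed:
    """Freshly read physical state (hypothetical sensors)."""
    vacuum_on: bool
    part_at_gripper: bool
    part_at_destination: bool


# Steps after which the sequence believes the robot is holding the part.
HOLDING_STEPS = {"VerifyHeld", "MoveToPlace"}


def plan_resume(last_completed_step: str, obs: Observed) -> str:
    """Compare what the sequence believes with what the sensors report."""
    expects_held = last_completed_step in HOLDING_STEPS
    actually_held = (obs.vacuum_on and obs.part_at_gripper
                     and not obs.part_at_destination)
    nothing_in_play = not (obs.vacuum_on or obs.part_at_gripper
                           or obs.part_at_destination)

    if expects_held and actually_held:
        return "resume_forward"        # belief and reality agree
    if not expects_held and nothing_in_play:
        return "start_clean_cycle"     # verified clean baseline
    return "operator_recovery"         # state certainty is lost


agree = plan_resume("VerifyHeld", Observed(True, True, False))
part_lost = plan_resume("VerifyHeld", Observed(True, False, False))
clean = plan_resume("Idle", Observed(False, False, False))
```

In the `part_lost` case the sequence variable still says the part is held, but the sensors disagree, so automatic resume is refused.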
PART 7 — REAL-WORLD FAILURE SCENARIOS
Here are some of the classic production failures in machine sequencing.
1. Sequence proceeds before subsystem is truly ready
What it looks like
- image captured while stage is still settling
- robot leaves pick point before vacuum is stable
- process starts before clamp force is achieved
Why it happens
- completion signal interpreted too early
- command acknowledgment confused with physical completion
- missing settle-time concept
- weak synchronization contract between software layers
How experienced engineers handle it
- define exact “ready” semantics per subsystem
- separate CommandAccepted, Busy, Complete, and ReadyForNextAction
- add explicit settle or stabilization rules only where physically justified
- log command time, completion time, and confirmation source
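The four states named above can be modeled as distinct values so that no layer can confuse an acknowledgment with readiness. The `classify` function and its four boolean inputs are hypothetical; which signals a real controller exposes is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class CommandState(Enum):
    ACCEPTED = "CommandAccepted"
    BUSY = "Busy"
    COMPLETE = "Complete"
    READY_FOR_NEXT_ACTION = "ReadyForNextAction"


def classify(acknowledged: bool, moving: bool,
             in_position: bool, settled: bool) -> Optional[CommandState]:
    """Map raw feedback to exactly one state, strongest evidence first."""
    if in_position and settled:
        return CommandState.READY_FOR_NEXT_ACTION
    if in_position:
        return CommandState.COMPLETE
    if moving:
        return CommandState.BUSY
    if acknowledged:
        return CommandState.ACCEPTED
    return None  # command not yet seen by the controller


def may_advance(state: Optional[CommandState]) -> bool:
    # Only full readiness allows the workflow to move on.
    return state is CommandState.READY_FOR_NEXT_ACTION


just_acked = classify(True, False, False, False)
arrived = classify(True, False, True, False)
ready = classify(True, False, True, True)
```

With this split, "stage reached target" (`COMPLETE`) and "safe to trigger the camera" (`READY_FOR_NEXT_ACTION`) are different facts.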
2. Missed trigger or missed sensor event
What it looks like
- camera never reports capture complete
- part-present sensor changed briefly but software missed it
- PLC handshake pulse was too short for polling loop
Why it happens
- polling interval too slow
- event registration timing incorrect
- edge-triggered signal not latched
- asynchronous event race during step transition
How experienced engineers diagnose it
- compare controller logs, software logs, and timestamps
- inspect whether event came before step entered wait state
- verify signal latching vs transient pulse behavior
- review polling design and event buffering assumptions
How they handle it
- prefer latched acknowledgments for critical signals
- design handshake protocols that survive timing variance
- record raw signal history for diagnosis
- avoid assuming “if it happened, we must have seen it”
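A latched acknowledgment can be sketched as follows. `LatchedSignal` is a hypothetical helper, and the sketch assumes that something fast enough (an I/O layer or the hardware itself) calls `update` on every level change, even if the workflow polls slowly.

```python
class LatchedSignal:
    """Remembers a rising edge until the consumer explicitly clears it."""

    def __init__(self) -> None:
        self._level = False
        self._latched = False

    def update(self, level: bool) -> None:
        # Called by the fast I/O layer on every sample or change.
        if level and not self._level:
            self._latched = True   # rising edge captured
        self._level = level

    @property
    def level(self) -> bool:
        return self._level

    @property
    def latched(self) -> bool:
        return self._latched

    def clear(self) -> None:
        # The workflow clears the latch once it has consumed the event.
        self._latched = False


ack = LatchedSignal()

# The PLC pulses the acknowledge line between two workflow polls.
ack.update(True)
ack.update(False)

seen_by_level_poll = ack.level   # False: a plain level poll missed the pulse
seen_by_latch = ack.latched      # True: the latch still remembers it

ack.clear()
after_clear = ack.latched
```

The latch turns a transient pulse into a fact the workflow can read whenever it enters the wait state.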
3. Step stuck forever waiting
What it looks like
- UI shows “Waiting for ready”
- machine appears idle but cannot continue
- operator repeatedly presses stop/start with no effect
Why it happens
- no timeout
- wrong wait condition
- condition impossible in current physical state
- hidden prerequisite was not satisfied earlier
- event consumed by another component
How experienced engineers handle it
- every wait has a reason, source, and timeout
- logs show exactly what condition is awaited
- timeout alarm message names the missing condition
- workflow snapshot shows active step and dependency states
A strong sequence engine should make “what are we waiting for?” immediately answerable.
4. Wrong step order due to race condition
What it looks like
- step N+1 starts before step N has truly finalized
- retry path overlaps with previous command completion
- stop request and completion callback both try to advance state
Why it happens
- step transitions controlled from multiple threads
- event callback directly modifies orchestrator state
- no serialization of workflow progression
- two completion paths both believe they own the transition
How experienced engineers handle it
- centralize transition authority
- serialize sequence progression through one orchestrator context
- treat callbacks as signals, not decision-makers
- make step transitions atomic and logged
This is a huge practical lesson: events should inform the workflow, not secretly become the workflow.
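A minimal sketch of this idea: callbacks and buttons only post signals into a queue, and a single method owns every state change. The `Orchestrator` class, its step names, and the `complete` and `stop` signal names are assumptions for illustration. The queue is Python's standard `queue.Queue`.

```python
import queue
from typing import List


class Orchestrator:
    STEPS = ["Pick", "VerifyHeld", "MoveToPlace", "Place", "Done"]

    def __init__(self) -> None:
        self.signals: "queue.Queue" = queue.Queue()
        self.index = 0
        self.stopped = False
        self.history: List[str] = []

    @property
    def step(self) -> str:
        return self.STEPS[self.index]

    def post(self, signal: str, for_step: str = "") -> None:
        """Called by device callbacks or UI: enqueues only, never changes state."""
        self.signals.put((signal, for_step))

    def process_pending(self) -> None:
        """The only place where workflow state changes, one signal at a time."""
        while True:
            try:
                signal, for_step = self.signals.get_nowait()
            except queue.Empty:
                return
            if self.stopped:
                self.history.append(f"ignored {signal} for {for_step} (stopped)")
            elif signal == "stop":
                self.stopped = True
                self.history.append(f"stopped in {self.step}")
            elif signal == "complete" and for_step == self.step:
                self.index += 1
                self.history.append(f"{for_step} -> {self.step}")
            else:
                self.history.append(f"ignored stale {signal} for {for_step}")


orc = Orchestrator()
orc.post("complete", "Pick")
orc.post("complete", "Pick")        # duplicate completion from a second path
orc.post("stop")                    # operator stop races with a completion
orc.post("complete", "VerifyHeld")
orc.process_pending()
```

The duplicate completion and the completion that lost the race with `stop` are both recorded and rejected, so they cannot advance the sequence behind the orchestrator's back.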
5. Repeated retries create inconsistent machine state
What it looks like
- second retry fails differently from first
- flags say gripper open but hardware is half-closed
- software thinks part absent, but robot still holds part
- recoveries get worse each time
Why it happens
- retry reissues command without restoring a known baseline
- sequence variable reset, physical state not reset
- idempotency not designed
- fault path does not model partial completion
How experienced engineers handle it
- define safe recovery points
- distinguish retryable steps from non-retryable ones
- re-verify physical state before retry
- require operator-guided recovery when state certainty is lost
In machine software, retries are not always cheap. Sometimes retry means “repeat the same physical action.” That can be dangerous if the world is already partially changed.
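Re-verifying physical state before each retry can be sketched like this. `retry_with_verification` and `FakeGripper` are hypothetical; the gripper's scripted outcomes stand in for real hardware behavior.

```python
from typing import Callable, List, Tuple


def retry_with_verification(command: Callable[[], None],
                            succeeded: Callable[[], bool],
                            at_known_baseline: Callable[[], bool],
                            max_attempts: int = 3) -> Tuple[str, int]:
    """Reissue the command only while the machine is at a verified baseline.

    Returns (outcome, number of commands actually issued).
    """
    for attempt in range(1, max_attempts + 1):
        if not at_known_baseline():
            # The world is partially changed; repeating the action is unsafe.
            return ("operator_recovery", attempt - 1)
        command()
        if succeeded():
            return ("ok", attempt)
    return ("failed", max_attempts)


class FakeGripper:
    """Simulated gripper whose close() ends in a scripted state."""

    def __init__(self, outcomes: List[str]) -> None:
        self.outcomes = list(outcomes)
        self.state = "open"
        self.commands = 0

    def close(self) -> None:
        self.commands += 1
        self.state = self.outcomes.pop(0)


# First close has no effect (still open), second one succeeds.
g1 = FakeGripper(["open", "closed"])
clean_retry = retry_with_verification(
    g1.close, lambda: g1.state == "closed", lambda: g1.state == "open")

# First close jams half-way: the baseline is lost, so no blind second attempt.
g2 = FakeGripper(["half_closed", "closed"])
unsafe_retry = retry_with_verification(
    g2.close, lambda: g2.state == "closed", lambda: g2.state == "open")
```

In the second case only one command reaches the hardware, and the decision is handed to the operator instead of repeating a physical action on an unknown state.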
PART 8 — SOFTWARE DESIGN IMPLICATIONS
Machine workflow logic must be explicit.
This is probably the biggest software design lesson in this topic.
If sequencing is scattered across:
- UI button handlers
- device callback code
- timer events
- random booleans
- background tasks
- PLC message handlers
then the system becomes extremely hard to reason about.
You can no longer answer basic production questions:
- what step are we in?
- why did we advance?
- what are we waiting for?
- who owns the next transition?
- what happens if stop arrives now?
- what physical state assumptions does this step make?
Bad approach
- UI button starts action
- device callback sets IsDone = true
- timer checks IsDone
- another service clears IsDone
- retry path sets CurrentStep = 3
- alarm path jumps to cleanup from inside device wrapper
This creates hidden sequencing and hidden ownership.
Good approach
- one central orchestrator owns workflow progression
- each step has explicit entry, wait condition, timeout, and failure path
- device layers expose commands and status, but do not secretly advance workflow
- all transitions are traceable
- interruptions are first-class workflow events
- partial completion is considered in design
Example control-flow structure
   +---------------------+
   | Operator / Host     |
   | Start / Stop / Pause|
   +----------+----------+
              |
              v
+---------------------------+
| Workflow Orchestrator     |
| - current step            |
| - transition rules        |
| - interruption handling   |
| - timeout supervision     |
+----+-----------+----------+
     |           |
     |           |
     v           v
+----------+ +----------------+
| Motion   | | Device/Sensor  |
| Service  | | Service        |
+----------+ +----------------+
     |           |
     v           v
motion status   signals / ready / fault
      \          /
       \        /
    +--------------+
    | status fed   |
    | back to      |
    | orchestrator |
    +--------------+
What this diagram means
The orchestrator owns the sequence.
Subsystems do their own jobs:
- motion service executes motion commands
- device service talks to hardware
- sensor layer reports conditions
But the decision when to move to the next workflow step belongs to the orchestrator.
That separation is one of the main differences between code that survives real machine complexity and code that collapses into timing bugs.
Practical design principles
Keep these principles in mind:
- sequence state must be visible
- step ownership must be unambiguous
- wait conditions must be explicit
- timeout handling must be built in, not added later
- failure transitions must be designed, not improvised
- resume/recovery must consider actual physical state
- asynchronous signals should be normalized before affecting workflow decisions
PART 9 — INTERVIEW / REAL-WORLD TALKING POINTS
Here is a clean way to explain this topic in interviews or real engineering discussion.
How to explain machine workflow sequencing clearly
Machine workflow sequencing is the logic that coordinates physical operations step by step. Unlike business workflows, the system cannot assume work is complete when a method returns. It must wait for physical completion, sensor confirmation, safety conditions, and subsystem readiness before advancing.
Why machine workflows are different from business workflows
Business workflows mostly move information. Machine workflows coordinate motion, devices, timing, and physical state. That makes them long-running, asynchronous, interruption-prone, and dependent on real-world confirmation.
Common mistakes software engineers make when entering machine software
They often:
- assume command return means done
- scatter sequencing across callbacks and UI code
- use arbitrary delays instead of condition-based waits
- ignore partial completion during stop/fault paths
- treat retries as free
- let multiple components advance workflow state
What strong engineers understand about deterministic sequencing
Strong engineers understand that:
- devices are asynchronous, but workflow progression must still be controlled
- every step needs explicit completion and failure semantics
- synchronization points are where correctness lives
- interruption handling is part of the normal design, not an edge case
- physical state matters more than software intent
- the system must make “what is happening now?” and “why?” easy to answer
A strong summary sentence
A good machine workflow engine is not just a step runner. It is a controlled orchestrator of physical reality, where progress happens only when the machine has truly reached the expected condition.
Final mental model
If you remember only one thing, remember this:
Machine sequencing is the discipline of turning asynchronous, failure-prone physical behavior into explicit, deterministic, and diagnosable software progression.
That is why this topic sits at the center of industrial machine software. It connects motion, devices, sensors, timing, faults, and operational safety into one controlled execution model. That focus is directly aligned with your Domain 1 source of truth for Machine Workflow & Sequencing.