

Operator Workflows & Screen Design

Industrial HMI screen design is not about making a “nice desktop app.” It is about helping a human operate a physical machine correctly under time pressure, uncertainty, noise, alarms, partial failures, and production targets.

Your source roadmap correctly frames HMI as a high-priority domain because industrial UI must expose machine state, alarms, controls, recipes, device health, live images, and workflow progress clearly and safely; bad HMI design causes mistakes, slow troubleshooting, poor recovery, and downtime.


PART 1 — Why Screen Design Must Follow Operator Workflow

In normal business software, screens are often organized around data: Customers, Orders, Products, Reports, Settings.

In industrial machine software, that is usually the wrong mental model.

A machine operator does not think:

“I need to open the Axis module, then the Camera module, then the Recipe entity, then the Result table.”

They think:

“Can I start the run?” “Is the machine ready?” “Which recipe is active?” “Why did it stop?” “What should I do next?” “Is it safe to restart?”

That means the UI should be organized around work, not around internal software modules.

A wafer inspection machine, for example, may internally have motion control, camera acquisition, recipe management, alignment, inspection, storage, alarms, and host communication. But the operator workflow is simpler:

text
Load job

Confirm recipe

Check machine readiness

Start inspection

Monitor progress

Review result / handle alarm

End run or recover

The screen design should make that workflow obvious.

The most important question is not:

“Which screen shows the axis status?”

The better question is:

“At this point in the operator’s work, what decision must they make, and what information do they need to make it safely?”

A production operator and a service engineer use the same machine very differently.

A production operator usually needs:

text
Is the machine ready?
What job is running?
Is production healthy?
What should I do if it stops?
Can I safely resume?

A service engineer usually needs:

text
Which device is failing?
Can I inspect low-level state?
Can I jog an axis?
Can I test a camera trigger?
Can I override or reset a subsystem safely?

If both user groups are forced through the same screens, the UI becomes dangerous. Production operators see too many technical controls. Service engineers cannot find detailed diagnostics quickly. The result is confusion on both sides.

Good HMI design reduces decision-making under stress. During a normal run, the UI should guide. During a fault, it should narrow choices. During recovery, it should tell the operator what state the machine is in, what is safe, and what the next valid actions are.


PART 2 — Common Operator Workflows

A strong industrial HMI starts by modeling workflows before drawing screens.

1. Startup / Login

Startup is not just “open the app.”

The machine may need to initialize devices, check safety state, connect controllers, load configuration, verify calibration status, and determine operator role.

The UI should answer:

text
Who is using the machine?
What permissions do they have?
Is the machine initialized?
Are required subsystems online?
Is the machine safe to operate?

Bad design shows the main screen before readiness is known.

Good design shows a clear startup/readiness state:

text
Application Started

User Login

Load Configuration

Connect Devices

Check Safety / Interlocks

Machine Ready or Not Ready

2. Machine Readiness Check

Before production starts, the operator needs to know whether the machine is ready.

Readiness is not one boolean. It may include:

text
Safety OK
Motion system homed
Recipe loaded
Devices connected
No blocking alarms
Required consumables available
Wafer/job loaded
Host communication ready

The screen should not force the operator to inspect ten different technical pages. A readiness panel should summarize what blocks production.

Example:

text
READY TO RUN: NO

Blocking items:
- X/Y stage not homed
- No active recipe
- Door interlock open

This is much better than showing raw device states and expecting the operator to infer readiness.
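A minimal sketch of how such a readiness summary could be computed, assuming each subsystem reports one pass/fail check with operator-facing wording. The names `ReadinessCheck`, `summarize_readiness`, and `render_panel` are hypothetical, not part of any specific HMI framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadinessCheck:
    """One condition that must hold before a run may start (hypothetical model)."""
    name: str
    ok: bool
    blocking_text: str  # operator-facing wording shown when the check fails


def summarize_readiness(checks: list[ReadinessCheck]) -> tuple[bool, list[str]]:
    """Return (ready, blocking items) instead of a single opaque boolean."""
    blocking = [c.blocking_text for c in checks if not c.ok]
    return (not blocking, blocking)


def render_panel(checks: list[ReadinessCheck]) -> str:
    """Build the panel text: a verdict line, then the blocking items if any."""
    ready, blocking = summarize_readiness(checks)
    lines = [f"READY TO RUN: {'YES' if ready else 'NO'}"]
    if blocking:
        lines += ["", "Blocking items:"]
        lines += [f"- {item}" for item in blocking]
    return "\n".join(lines)


checks = [
    ReadinessCheck("safety", ok=False, blocking_text="Door interlock open"),
    ReadinessCheck("motion_homed", ok=False, blocking_text="X/Y stage not homed"),
    ReadinessCheck("recipe_loaded", ok=False, blocking_text="No active recipe"),
    ReadinessCheck("devices_connected", ok=True, blocking_text="Device offline"),
]
print(render_panel(checks))
```

The point of the design is that the panel text is derived from the same checks that gate the Start command, so the summary and the actual rule cannot drift apart.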

3. Recipe / Job Selection

Recipe selection is a high-risk workflow.

The operator is not just opening a file. They are selecting the parameters that control inspection behavior, motion positions, thresholds, exposure, alignment rules, and possibly product-specific tolerances.

The UI should make the active recipe extremely visible:

text
Current Job: LOT-2026-0426-A
Product: WAFER_TYPE_300MM_A
Active Recipe: INSPECT_TOPSIDE_V12
Recipe Status: Validated

The operator should not be able to quietly edit or switch critical recipe parameters while a run is active.

4. Run Start

Starting a run is a transition from preparation into controlled machine execution.

The UI should clearly show:

text
What will run?
Which recipe?
Which lot/job?
Is the machine ready?
What will happen after pressing Start?

A good production start flow looks like this:

text
Select Job

Select / Confirm Recipe

Readiness Check

Start Confirmation

Run Active

The confirmation should not be generic. “Are you sure?” is weak.

Better:

text
Start inspection for:
Lot: LOT-2026-0426-A
Recipe: INSPECT_TOPSIDE_V12
Wafers: 25
Mode: Auto Production

5. Live Monitoring

During production, the operator should not need to navigate everywhere.

The run screen should show the operating picture:

text
Machine state
Current run/job
Current step
Progress
Throughput
Active alarms
Recent results
Next expected action

The operator should be able to answer within seconds:

text
Is it running?
Is it healthy?
Where is it in the process?
Has anything failed?
Do I need to intervene?

6. Alarm Response

Alarm response is a workflow, not just a popup.

The UI must help the operator move from:

text
Something happened

What happened?

Is the machine safe?

What caused it?

What can I do?

Can I resume, retry, abort, or call service?

We will not dive deep into alarm system internals here, but screen design must make alarms visible and connected to the current workflow.

7. Result Review

Result review should be separated from production control.

The operator may need to inspect:

text
Pass/fail result
Defect count
Wafer/map status
Inspection summary
Recent failed items
Historical run data

But result review should not accidentally expose dangerous machine controls.

A common mistake is putting review tools, recipe editing, manual controls, and production commands into one overloaded screen.

8. Stop / End Run

Stopping is not one thing.

There may be:

text
Pause
Stop after current wafer
Stop after current step
Abort immediately
Emergency stop
End lot
Complete run

The UI must use precise language because each action has different physical meaning.

A bad button label:

text
Stop

A better design:

text
Pause after current step
Stop after current wafer
Abort run

Each command should be enabled only when valid for the current machine state.
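One way to express this is a lookup from run state to the stop commands that are valid in it. The sketch below assumes a simplified four-state model; the state names and the policy table are illustrative only. Emergency stop is left out on purpose, since it is typically a hardwired safety function rather than a software button.

```python
from enum import Enum, auto


class RunState(Enum):
    IDLE = auto()
    RUNNING = auto()
    PAUSED = auto()
    STOPPING = auto()


class StopCommand(Enum):
    PAUSE_AFTER_STEP = "Pause after current step"
    STOP_AFTER_WAFER = "Stop after current wafer"
    ABORT_RUN = "Abort run"


# Assumed policy, for illustration only: which stop commands are valid per state.
VALID_STOP_COMMANDS: dict[RunState, frozenset[StopCommand]] = {
    RunState.IDLE: frozenset(),
    RunState.RUNNING: frozenset(StopCommand),
    RunState.PAUSED: frozenset({StopCommand.ABORT_RUN}),
    RunState.STOPPING: frozenset({StopCommand.ABORT_RUN}),
}


def enabled_labels(state: RunState) -> list[str]:
    """Button labels to enable, in the enum's declaration order."""
    valid = VALID_STOP_COMMANDS[state]
    return [cmd.value for cmd in StopCommand if cmd in valid]


print(enabled_labels(RunState.RUNNING))
```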

9. Recovery and Restart

Recovery is where many HMIs fail.

After an alarm, abort, or manual intervention, the operator needs help understanding:

text
What is the current physical state?
Was the part moved?
Was inspection partially completed?
Is the machine homed?
Is the recipe still valid?
Can the run resume?
Must the wafer be reloaded?

The recovery screen should provide a guided path, not just leave the operator on the alarm page.


PART 3 — Screen Hierarchy

A common industrial HMI screen hierarchy looks like this:

text
+------------------------------------------------+
|                 HMI Application                |
+------------------------------------------------+
                    |
                    v
+------------------------------------------------+
| Home / Machine Overview                        |
+------------------------------------------------+
   |             |              |             |
   v             v              v             v
+---------+   +---------+   +----------+   +----------+
| Run /   |   | Alarms  |   | Recipes  |   | Results  |
| Produce |   |         |   | Config   |   | Review   |
+---------+   +---------+   +----------+   +----------+
   |                            |              |
   v                            v              v
+---------+                +----------+   +----------+
| Recovery|                | Setup    |   | History  |
| Flow    |                | Validate |   | Trace    |
+---------+                +----------+   +----------+

                    |
                    v
+------------------------------------------------+
| Service / Manual / Diagnostics                 |
| Restricted by role and machine mode            |
+------------------------------------------------+

The hierarchy should reflect operator work.

Home / Machine Overview

The home screen is the operator’s “where am I?” screen.

It should show:

text
Machine state
Current mode
Readiness
Current job/run
Active recipe
Blocking alarms
High-level device health
Next recommended action

It should not show every low-level parameter, every axis position, every camera register, or every internal service status.

The home screen is for orientation, not engineering debugging.

Production / Run Screen

The run screen is the main screen during automatic operation.

It should show:

text
Current run
Current workflow step
Progress
Cycle status
Throughput summary
Recent result summary
Start / pause / stop actions
Blocking status

It should not contain recipe editing, service jog controls, detailed logs, or deep diagnostics.

Alarm Screen

The alarm screen should show what is currently blocking operation and what requires attention.

It should show:

text
Active alarms
Affected subsystem
Machine state impact
Operator guidance
Allowed recovery actions
History/context link

It should not become a generic log viewer with thousands of messages.

Recipe / Configuration Screen

This screen is for preparing and validating machine behavior before production.

It should show:

text
Recipe list
Recipe details
Validation status
Version
Compatibility
Change permissions
Activation flow

It should not allow uncontrolled edits during active production.

Review / Results Screen

This screen is for understanding what happened in production.

It should show:

text
Run summary
Inspection result
Pass/fail status
Defect summary
Historical results
Traceability data

It should not mix result review with physical machine controls.

Manual / Service Screen

Access to this screen is restricted and must be intentional.

It may include:

text
Manual device commands
Axis jog
Device test actions
Calibration utilities
Maintenance procedures

It should not be casually accessible to production operators during normal auto mode.

Diagnostics Screen

Diagnostics are for engineers, service, and advanced troubleshooting.

It should show:

text
Device health
Communication state
Internal service status
Logs
Counters
Timing data
Subsystem status

It should not replace operator-facing guidance. Raw diagnostics are not operator workflow.


PART 4 — Navigation Design

Navigation in industrial HMI is not just a menu problem.

It is a control and safety problem.

Critical screens must be reachable quickly. Dangerous screens must require intentional access. Current run context must not be lost when navigating.

A good navigation design supports this:

text
                  +------------------+
                  | Machine Overview |
                  +------------------+
                         |
        +----------------+----------------+
        |                |                |
        v                v                v
+---------------+ +---------------+ +---------------+
| Production    | | Active Alarms | | Result Review |
| Run Screen    | | Screen        | | Screen        |
+---------------+ +---------------+ +---------------+
        |                |                |
        v                v                v
+---------------+ +---------------+ +---------------+
| Pause / Stop  | | Recovery Flow | | Run History   |
| Actions       | | Wizard        | | Detail        |
+---------------+ +---------------+ +---------------+


Restricted / Intentional Access:

+------------------+
| Login / Role     |
+------------------+
        |
        v
+------------------+
| Service Mode     |
+------------------+
        |
        v
+------------------+
| Manual Controls  |
| Diagnostics      |
| Calibration      |
+------------------+

Mode-Aware Navigation

The machine mode should influence available screens and actions.

Example:

text
Auto Production Mode:
- Run screen available
- Review screen available
- Recipe view available
- Manual control disabled
- Dangerous setup actions disabled

Manual / Service Mode:
- Manual control available
- Diagnostics available
- Auto run disabled or restricted
- Service actions logged

Recovery Mode:
- Recovery screen prioritized
- Normal start disabled
- Resume/retry/abort options controlled by workflow state

The UI should not pretend all screens are equally valid at all times.
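The mode rules above can be sketched as a table that the navigation layer consults before showing a screen. The screen identifiers and the exact sets are assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    AUTO_PRODUCTION = "auto_production"
    SERVICE = "service"
    RECOVERY = "recovery"


# Hypothetical screen identifiers; the per-mode sets mirror the example above.
SCREENS_BY_MODE: dict[Mode, frozenset[str]] = {
    Mode.AUTO_PRODUCTION: frozenset({"overview", "run", "alarms", "review", "recipe_view"}),
    Mode.SERVICE: frozenset({"overview", "alarms", "manual", "diagnostics"}),
    Mode.RECOVERY: frozenset({"overview", "alarms", "recovery"}),
}


def can_navigate(mode: Mode, screen: str) -> bool:
    """True if the screen is valid in the current machine mode."""
    return screen in SCREENS_BY_MODE[mode]


def landing_screen(mode: Mode) -> str:
    """Recovery mode prioritizes the recovery screen; other modes land on overview."""
    return "recovery" if mode is Mode.RECOVERY else "overview"
```

Keeping the policy in one table, rather than scattered across screen code, makes it reviewable by people who are not UI developers.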

Role-Aware Navigation

Different roles need different access.

Example:

text
Operator:
- Start approved job
- Monitor production
- Acknowledge allowed alarms
- Follow recovery instructions
- View results

Engineer:
- Edit recipes
- Access diagnostics
- Perform validation
- Review logs

Service:
- Manual controls
- Calibration
- Device-level tests
- Maintenance procedures

Admin:
- User management
- Permission setup
- System configuration

Role-aware navigation reduces accidental misuse.
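A small sketch of the role list as explicit permission sets. The action names are hypothetical. Grants are modeled as sets rather than a strict ranking, on the assumption that a service technician is not automatically allowed to edit recipes.

```python
from enum import Enum


class Role(Enum):
    OPERATOR = "operator"
    ENGINEER = "engineer"
    SERVICE = "service"
    ADMIN = "admin"


# Hypothetical action names, grouped as in the role list above.
PERMISSIONS: dict[Role, frozenset[str]] = {
    Role.OPERATOR: frozenset({
        "start_approved_job", "monitor_production", "acknowledge_alarm",
        "follow_recovery", "view_results",
    }),
    Role.ENGINEER: frozenset({
        "edit_recipe", "access_diagnostics", "validate_recipe", "review_logs",
    }),
    Role.SERVICE: frozenset({
        "manual_control", "calibrate", "device_test", "maintenance",
    }),
    Role.ADMIN: frozenset({
        "manage_users", "setup_permissions", "configure_system",
    }),
}


def is_allowed(roles: set[Role], action: str) -> bool:
    """A user may hold several roles; an action is allowed if any role grants it."""
    return any(action in PERMISSIONS[r] for r in roles)


def visible_actions(roles: set[Role]) -> list[str]:
    """Sorted list of actions to offer in navigation for this user."""
    return sorted(set().union(*(PERMISSIONS[r] for r in roles)))
```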

Context-Preserving Navigation

This is extremely important.

If the operator navigates from the run screen to the alarm screen, the UI must preserve the current run context.

Bad:

text
Operator opens Alarm screen
Alarm screen has no job/run context
Operator does not know which wafer/step failed
Operator navigates back and loses current selection

Good:

text
Current Run Context:
- Lot: LOT-2026-0426-A
- Wafer: W12
- Step: Alignment
- Recipe: INSPECT_TOPSIDE_V12
- Machine State: PausedByAlarm

Visible across Run, Alarm, Recovery, and Review screens

Navigation should not erase the operator’s mental model.


PART 5 — Contextual UI Design

Every important screen should show enough context to prevent wrong actions.

The minimum shared context usually includes:

text
Machine state
Machine mode
Current job/run
Active recipe
Current workflow step
Active alarms
User role
Selected unit/wafer/part

A production screen without context is dangerous.

For example, imagine a recipe screen that lets the operator activate a recipe but does not show whether a run is active. That invites mistakes.

A better screen shows:

text
Machine State: Running
Current Recipe: INSPECT_TOPSIDE_V12
Recipe editing: Locked during active run
Allowed action: View only

Context prevents wrong assumptions.

A common HMI failure is that each screen maintains its own local state. The run screen knows the current job. The recipe screen has another selected recipe. The result screen has another selected wafer. The alarm screen only knows alarm IDs.

This creates inconsistency.

A better design uses a shared context model:

text
+------------------------+
| Machine Context        |
+------------------------+
| MachineState           |
| Mode                   |
| CurrentRun             |
| ActiveRecipe           |
| CurrentStep            |
| ActiveAlarms           |
| SelectedWafer/Part     |
| UserRole               |
+------------------------+
        |
        +------> Run Screen
        |
        +------> Alarm Screen
        |
        +------> Recipe Screen
        |
        +------> Review Screen
        |
        +------> Recovery Screen

The screen may have local UI state, but it should not invent its own truth about the machine.
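A minimal sketch of the shared context idea, assuming a single in-process store that screens subscribe to. `MachineContext` and `ContextStore` are hypothetical names, the field values are plain strings for brevity, and a real HMI would also need to marshal updates onto the UI thread.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class MachineContext:
    """Immutable snapshot of the fields listed in the diagram above."""
    machine_state: str = "Idle"
    mode: str = "AutoProduction"
    current_run: Optional[str] = None
    active_recipe: Optional[str] = None
    current_step: Optional[str] = None
    active_alarms: tuple[str, ...] = ()
    selected_wafer: Optional[str] = None
    user_role: str = "Operator"


class ContextStore:
    """Single owner of the context; screens subscribe instead of keeping copies."""

    def __init__(self) -> None:
        self._context = MachineContext()
        self._listeners: list[Callable[[MachineContext], None]] = []

    @property
    def context(self) -> MachineContext:
        return self._context

    def subscribe(self, listener: Callable[[MachineContext], None]) -> None:
        self._listeners.append(listener)
        listener(self._context)  # deliver the current snapshot immediately

    def update(self, **changes) -> None:
        # Build a new snapshot and push the same object to every screen.
        self._context = replace(self._context, **changes)
        for listener in self._listeners:
            listener(self._context)


store = ContextStore()
seen_by_alarm_screen: list[MachineContext] = []
seen_by_review_screen: list[MachineContext] = []
store.subscribe(seen_by_alarm_screen.append)
store.subscribe(seen_by_review_screen.append)
store.update(machine_state="PausedByAlarm", current_run="LOT-2026-0426-A",
             selected_wafer="W12", current_step="Alignment")
```

Both stand-in screens end up holding the same snapshot object, so the alarm screen and the review screen cannot disagree about which wafer is selected.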


PART 6 — Designing for Recovery Flows

Recovery is often more important than the happy path.

In demos, people focus on:

text
Select recipe
Start run
Show progress
Complete successfully

In production, much of the operator's most stressful time is spent in abnormal states:

text
Alarm occurred
Run paused
Wafer partially inspected
Device disconnected
Motion failed
Door opened
Abort pressed
Manual intervention performed
Machine state uncertain

The recovery UI must answer:

text
What happened?
Where did it happen?
What is the current state?
What is safe now?
What options are allowed?
What action should the operator take next?

A good recovery workflow is explicit:

text
+-------------------+
| Run Interrupted   |
+-------------------+
          |
          v
+-------------------+
| Determine Cause   |
| Alarm / Abort /   |
| Device Fault      |
+-------------------+
          |
          v
+-------------------+
| Show Current      |
| Machine Context   |
+-------------------+
          |
          v
+-------------------+
| Evaluate Recovery |
| Options           |
+-------------------+
   |          |          |
   v          v          v
+------+   +------+   +------+
|Retry |   |Resume|   |Abort |
+------+   +------+   +------+
   |          |          |
   v          v          v
+-------------------+
| Confirm Safe      |
| Transition        |
+-------------------+
          |
          v
+-------------------+
| Continue / End    |
+-------------------+

The key is that recovery actions should not be random buttons. They should be derived from workflow state.

Example:

text
State: PausedDuringInspection
Allowed:
- Resume inspection
- Abort current wafer
- Stop lot after current wafer

Not allowed:
- Edit active recipe
- Start new run
- Jog an axis manually without entering the service flow

Recovery screens should be more guided than production screens because the operator is under more stress.
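Deriving recovery options from workflow state could look like the sketch below. The state names, the option wording, and the fallback policy are all assumptions; a real machine defines them per failure mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOptions:
    allowed: tuple[str, ...]
    not_allowed: tuple[str, ...]
    guidance: str


# Hypothetical policy table keyed by workflow state.
RECOVERY_POLICY: dict[str, RecoveryOptions] = {
    "PausedDuringInspection": RecoveryOptions(
        allowed=("Resume inspection", "Abort current wafer",
                 "Stop lot after current wafer"),
        not_allowed=("Edit active recipe", "Start new run", "Manual jog"),
        guidance="Inspection is paused. Choose how to continue.",
    ),
    "AbortedWithMaterialInProcess": RecoveryOptions(
        allowed=("Unload wafer",),
        not_allowed=("Resume inspection", "Start new run"),
        guidance="A wafer is still in the machine. Unload it before restarting.",
    ),
}

# Unknown states fall back to the most conservative option.
FALLBACK = RecoveryOptions(
    allowed=("Call service",),
    not_allowed=(),
    guidance="Machine state is uncertain. Do not restart.",
)


def recovery_options(state: str) -> RecoveryOptions:
    """Look up what the recovery screen may offer in the given state."""
    return RECOVERY_POLICY.get(state, FALLBACK)


print(recovery_options("PausedDuringInspection").allowed)
```

The conservative fallback matters: if the software reaches a state nobody planned for, the screen should offer less, not more.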


PART 7 — Real-World Failure Scenarios

Scenario 1 — Operator Cannot Find Needed Action Under Alarm

What it looks like:

text
Machine stops.
Alarm popup appears.
Operator closes popup.
They do not know where to go next.
They search through menus.
Production is down.
Eventually they call service for a simple recovery.

Why it happens:

The alarm is treated as a message, not as part of a workflow. The UI says what happened but does not guide what to do.

How experienced engineers fix it:

They connect alarm state to workflow navigation.

text
Alarm occurs

Alarm banner remains visible

Operator clicks alarm

Alarm screen shows affected run/step

Recovery action is shown if available

The UI should lead the operator from alarm to recovery.


Scenario 2 — Screen Shows Controls Without Enough Machine Context

What it looks like:

text
Operator sees a Start button.
They press it.
Machine rejects command.
Or worse, command is accepted in a bad context.

Why it happens:

The screen exposes commands without showing machine state, mode, readiness, or active recipe.

How experienced engineers fix it:

They make command availability context-driven.

text
Start button enabled only when:
- Machine is Ready
- Auto mode is active
- Recipe is valid
- No blocking alarms
- Required job is selected

The screen also explains why the action is disabled:

text
Start disabled:
- No active recipe
- Door interlock open

This reduces guessing.


Scenario 3 — Production and Service Actions Mixed Together

What it looks like:

text
Production operator sees jog controls, reset buttons, calibration actions, recipe edits, and run commands on the same screen.

Why it happens:

The UI was built around device modules instead of user workflows.

How experienced engineers fix it:

They separate production, setup, service, and diagnostics flows.

text
Production:
- Start, pause, stop
- Monitor
- Basic recovery

Setup:
- Select recipe
- Validate job
- Prepare machine

Service:
- Manual control
- Device tests
- Calibration

Diagnostics:
- Logs
- Health
- Communication status

Service controls should require role and mode changes.


Scenario 4 — Navigation Loses Current Run Context

What it looks like:

text
Operator is monitoring Wafer 12.
An alarm occurs.
They open the alarm screen.
The alarm screen does not show wafer/run/step.
They go to results screen.
The selected wafer resets to Wafer 1.
Now they are unsure what failed.

Why it happens:

Each screen owns isolated state. There is no shared workflow context.

How experienced engineers fix it:

They introduce a central machine/run context.

text
CurrentRunContext:
- Lot
- Wafer
- Recipe
- Step
- Run state
- Active alarm correlation

All relevant screens consume this context.


Scenario 5 — Operator Edits Recipe While Run Is Active

What it looks like:

text
Run is active.
Operator opens recipe screen.
Changes threshold or position.
Machine behavior becomes unclear.
Results are hard to explain later.

Why it happens:

Recipe editing was treated like normal CRUD editing.

How experienced engineers fix it:

They separate:

text
View active recipe
Edit draft recipe
Validate recipe
Activate recipe
Lock active recipe during run
Audit changes

During production, active recipe parameters should be read-only unless the system has a carefully designed controlled-adjustment workflow.
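The separation above can be sketched as a small recipe lifecycle, assuming three statuses and an in-memory audit list. The class and method names are hypothetical, and the validation step is a placeholder.

```python
from dataclasses import dataclass, field
from enum import Enum


class RecipeStatus(Enum):
    DRAFT = "draft"
    VALIDATED = "validated"
    ACTIVE = "active"


class RecipeLockedError(Exception):
    """Raised when a change is attempted on a recipe that must not change."""


@dataclass
class Recipe:
    name: str
    params: dict[str, float]
    status: RecipeStatus = RecipeStatus.DRAFT
    audit_log: list[str] = field(default_factory=list)

    def edit(self, user: str, key: str, value: float) -> None:
        # Only drafts are editable; validated and active recipes are read-only.
        if self.status is not RecipeStatus.DRAFT:
            raise RecipeLockedError(
                f"{self.name} is {self.status.value}; edit a draft copy instead")
        old = self.params.get(key)
        self.params[key] = value
        self.audit_log.append(f"{user}: {key} {old} -> {value}")

    def validate(self) -> None:
        # Placeholder check; real validation is machine-specific.
        if not self.params:
            raise ValueError("recipe has no parameters")
        self.status = RecipeStatus.VALIDATED

    def activate(self, run_active: bool) -> None:
        if run_active:
            raise RecipeLockedError("cannot switch recipes while a run is active")
        if self.status is not RecipeStatus.VALIDATED:
            raise RecipeLockedError("only a validated recipe can be activated")
        self.status = RecipeStatus.ACTIVE
```

With this shape, an edit to an active recipe fails loudly in the model layer even if a screen forgets to disable the field.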


Scenario 6 — Recovery Flow Unclear After Abort

What it looks like:

text
Operator presses Abort.
Machine stops.
UI returns to idle-looking screen.
Operator does not know:
- Is wafer still loaded?
- Was inspection completed?
- Can I restart?
- Should I unload?
- Is the stage homed?

Why it happens:

Abort was treated as the end of a command, not as a transition into a recovery state.

How experienced engineers fix it:

They model abort as a state:

text
Running

Abort Requested

Stopping Safely

AbortedWithMaterialInProcess

Recovery Required

The UI should not show normal “Ready” until recovery is completed.
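A minimal transition table for the abort path, as one possible sketch. The event names are hypothetical, and a real controller would drive these transitions from machine feedback rather than from UI clicks alone.

```python
from enum import Enum, auto


class MachineState(Enum):
    READY = auto()
    RUNNING = auto()
    ABORT_REQUESTED = auto()
    STOPPING_SAFELY = auto()
    ABORTED_WITH_MATERIAL = auto()
    RECOVERY_REQUIRED = auto()


# Allowed transitions, keyed by (state, event). Event names are assumptions.
TRANSITIONS: dict[tuple[MachineState, str], MachineState] = {
    (MachineState.READY, "start"): MachineState.RUNNING,
    (MachineState.RUNNING, "abort_pressed"): MachineState.ABORT_REQUESTED,
    (MachineState.ABORT_REQUESTED, "motion_stopping"): MachineState.STOPPING_SAFELY,
    (MachineState.STOPPING_SAFELY, "stopped"): MachineState.ABORTED_WITH_MATERIAL,
    (MachineState.ABORTED_WITH_MATERIAL, "acknowledged"): MachineState.RECOVERY_REQUIRED,
    (MachineState.RECOVERY_REQUIRED, "recovery_completed"): MachineState.READY,
}


def next_state(state: MachineState, event: str) -> MachineState:
    """Apply an event; reject anything the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state.name}") from None


def shows_ready(state: MachineState) -> bool:
    """The UI may show the normal Ready indication only in READY."""
    return state is MachineState.READY
```

Because there is no direct edge from any aborted state to `READY`, the UI cannot fall back to an idle-looking screen without passing through recovery.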


Scenario 7 — Too Much Information Causes Missed Critical Status

What it looks like:

text
The screen has hundreds of values:
axis positions, temperatures, IO bits, counters, camera states, recipe fields, logs.
A critical warning appears but is visually buried.

Why it happens:

Engineers expose everything they know because it is easy to bind internal state to the UI.

How experienced engineers fix it:

They design screen levels:

text
Level 1: Operator status
Level 2: Troubleshooting summary
Level 3: Detailed diagnostics
Level 4: Engineering/service raw data

The production operator should see the most important operational truth first.


PART 8 — Software Design Implications

Screen design directly affects architecture.

If the UI is workflow-driven, the software cannot just have random screens calling random services.

You need architectural concepts such as:

text
Workflow-aware view models
Central navigation service
Shared machine context
Role/mode-aware access control
Screen contracts
Command gating
Context-aware actions

A useful component structure looks like this:

text
+--------------------------------------------------+
| Machine State / Workflow Context                 |
|--------------------------------------------------|
| MachineState, Mode, CurrentRun, ActiveRecipe,    |
| CurrentStep, ActiveAlarms, UserRole              |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Navigation + Screen Context Service              |
|--------------------------------------------------|
| Determines allowed screens, preserves context,   |
| handles mode/role-aware navigation               |
+--------------------------------------------------+
        |                  |                  |
        v                  v                  v
+---------------+  +---------------+  +---------------+
| Production VM |  | Alarm VM      |  | Recipe VM     |
| Run Screen    |  | Alarm Screen  |  | Config Screen |
+---------------+  +---------------+  +---------------+
        |                  |                  |
        v                  v                  v
+---------------+  +---------------+  +---------------+
| Recovery VM   |  | Review VM     |  | Diagnostics VM|
| Recovery Flow |  | Result Screen |  | Service Views |
+---------------+  +---------------+  +---------------+

Bad Approach

text
Screen = Database table or device module

AxisScreen owns axis state
CameraScreen owns camera state
RecipeScreen owns active recipe state
RunScreen owns run state
AlarmScreen owns alarm state

This causes duplicated state, inconsistent decisions, and poor navigation.

The UI becomes a collection of technical panels.

Good Approach

text
Screen = Operator task boundary

Production screen supports run execution
Recipe screen supports preparation
Alarm screen supports interruption handling
Recovery screen supports safe restart
Review screen supports result understanding
Service screen supports restricted maintenance

The UI becomes a workflow tool.

Screen Contract Thinking

Each screen should have a clear contract.

Example:

text
Production Run Screen

Purpose:
- Monitor and control active production run

Inputs:
- MachineContext
- RunContext
- ActiveRecipeSummary
- AlarmSummary

Allowed actions:
- Start
- Pause
- Stop after current step
- Abort, if permitted
- Navigate to alarms/recovery/review

Forbidden responsibilities:
- Edit recipe
- Manual jog
- Device calibration
- Raw log browsing

This kind of contract keeps screens clean.
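A contract like this can also live in code, so that a view model exposing a forbidden action fails early. The sketch below is one possible encoding; `ScreenContract` and the action identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenContract:
    name: str
    purpose: str
    inputs: frozenset[str]
    allowed_actions: frozenset[str]
    forbidden: frozenset[str]

    def __post_init__(self) -> None:
        # A contract that both allows and forbids an action is a design error.
        overlap = self.allowed_actions & self.forbidden
        if overlap:
            raise ValueError(
                f"{self.name}: both allowed and forbidden: {sorted(overlap)}")

    def check(self, action: str) -> None:
        """Raise if a view model tries to expose an action outside the contract."""
        if action not in self.allowed_actions:
            raise PermissionError(f"{self.name} must not expose {action!r}")


PRODUCTION_RUN = ScreenContract(
    name="Production Run Screen",
    purpose="Monitor and control active production run",
    inputs=frozenset({"MachineContext", "RunContext",
                      "ActiveRecipeSummary", "AlarmSummary"}),
    allowed_actions=frozenset({"start", "pause", "stop_after_step",
                               "abort", "navigate"}),
    forbidden=frozenset({"edit_recipe", "manual_jog",
                         "calibrate", "browse_raw_logs"}),
)

PRODUCTION_RUN.check("pause")  # passes silently
```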

Workflow-Aware ViewModel

A workflow-aware view model does not simply expose button commands. It knows whether commands are valid for the current state.

Conceptual example:

text
CanStart =
    MachineState == Ready
    && Mode == AutoProduction
    && ActiveRecipe.IsValid
    && !ActiveAlarms.HasBlockingAlarm
    && CurrentRun == null

CanEditRecipe =
    UserRole >= Engineer
    && MachineState != Running
    && Mode != AutoProduction

CanResume =
    MachineState == Paused
    && RecoveryContext.ResumeAllowed

The UI should not be the only protection, but it should reflect the real rules.
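The conceptual rules above translate fairly directly into a runnable sketch. It assumes string-valued states, a single `AutoProduction` mode name, and a ranked role enum so that `>=` comparisons work; the ranking of Service above Engineer is an assumption made only for this example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Role(IntEnum):
    # Ranked so that `role >= Role.ENGINEER` works, as in the rules above.
    OPERATOR = 1
    ENGINEER = 2
    SERVICE = 3
    ADMIN = 4


@dataclass(frozen=True)
class Context:
    machine_state: str
    mode: str
    recipe_valid: bool
    has_blocking_alarm: bool
    current_run: Optional[str]
    user_role: Role
    resume_allowed: bool = False


class RunViewModel:
    """Derives command availability from the shared context on every read."""

    def __init__(self, ctx: Context) -> None:
        self.ctx = ctx

    @property
    def can_start(self) -> bool:
        c = self.ctx
        return (c.machine_state == "Ready" and c.mode == "AutoProduction"
                and c.recipe_valid and not c.has_blocking_alarm
                and c.current_run is None)

    @property
    def can_edit_recipe(self) -> bool:
        c = self.ctx
        return (c.user_role >= Role.ENGINEER and c.machine_state != "Running"
                and c.mode != "AutoProduction")

    @property
    def can_resume(self) -> bool:
        return self.ctx.machine_state == "Paused" and self.ctx.resume_allowed


vm = RunViewModel(Context("Ready", "AutoProduction", True, False, None, Role.OPERATOR))
print(vm.can_start, vm.can_edit_recipe, vm.can_resume)
```

The properties are computed on each read rather than cached, so a button bound to them reflects the current context instead of the state at the time the screen was opened.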


PART 9 — Interview / Real-World Talking Points

A strong interview answer could sound like this:

In industrial HMI design, I would not start by listing screens. I would start by modeling operator workflows: startup, readiness, recipe selection, run start, live monitoring, alarm response, review, stop, and recovery. The screens should follow these workflows, not the internal code modules. Operators need to make correct decisions under pressure, so the UI must preserve machine context, show what state the machine is in, make valid actions obvious, and hide or restrict actions that are unsafe for the current mode or role.

Another strong version:

A common mistake is building the HMI around devices: axis screen, camera screen, IO screen, recipe table, log viewer. That may reflect the software structure, but it does not reflect how operators work. A production operator wants to know whether the machine is ready, what job is running, whether there are alarms, what step is active, and what action is safe next. So I would design around task-oriented screens: overview, production, alarm/recovery, recipe preparation, result review, service, and diagnostics.

Common mistakes software engineers make when entering industrial UI:

text
They design screens around data models instead of workflows.
They expose too many low-level controls to operators.
They duplicate machine state in each screen.
They treat alarms as popups instead of workflow interruptions.
They forget recovery flows.
They allow editing configuration during active production.
They lose current run context during navigation.
They make disabled commands mysterious.
They mix production, setup, service, and diagnostics into one screen.

What strong engineers understand:

text
Operators work under stress.
Machine state matters everywhere.
Navigation must preserve context.
Recovery is not an edge case.
Role and mode restrictions are part of safety.
Screen design affects architecture.
The UI should guide valid action, not just display data.

The core principle:

Industrial HMI design is workflow architecture. The screen is not just a view; it is the operator-facing expression of machine state, workflow state, permissions, and safe next actions.
