
HMI System Architecture

In industrial machines, the HMI is not “just the frontend.” It is the operator-facing control surface of a cyber-physical system. The roadmap correctly places UI / HMI / Operator Experience as a high-priority domain because bad HMI design can cause operator mistakes, poor troubleshooting, slow recovery, and production downtime.


PART 1 — WHY HMI IS PART OF THE MACHINE, NOT JUST UI

In a business application, the UI usually helps users view data, submit forms, and trigger business actions. If the UI is slow or confusing, the result may be frustration, wrong data entry, or support tickets.

In an industrial machine, the HMI is connected to physical behavior.

The operator may use the HMI to:

  • start a production run
  • pause a wafer inspection sequence
  • jog an axis manually
  • acknowledge alarms
  • load a recipe
  • enable service mode
  • retry a failed step
  • inspect machine health
  • understand whether the machine is safe to operate

So the HMI participates in machine operation, even though it should never control hardware directly.

A wrong UI decision can become a wrong machine decision.

Example:

text
Operator sees: "Stage Ready"
Actual machine state: stage is still settling
Operator clicks: Start Inspection
Result: image captured during vibration → bad inspection result

Another example:

text
Button enabled: "Move To Load Position"
Actual condition: door is open, clamp is not released
Operator clicks button
Expected result: the command is rejected before it reaches motion control

The key mindset is this:

text
The HMI is not the machine brain.
But it is part of the machine nervous system.

It must reflect the true machine state, guide the operator, and send commands through controlled architectural boundaries.


PART 2 — CORE ARCHITECTURE LAYERS

A typical industrial machine architecture looks like this:

text
+--------------------------------------------------+
|                    UI / HMI Layer                |
|--------------------------------------------------|
| - Displays machine state                         |
| - Shows alarms, recipes, run status, results     |
| - Accepts operator commands                      |
| - Does NOT directly control hardware             |
+--------------------------+-----------------------+
                           |
                           v
+--------------------------------------------------+
|             Application / Workflow Layer         |
|--------------------------------------------------|
| - Owns use cases                                 |
| - Runs machine workflows                         |
| - Validates operator commands                    |
| - Coordinates recipes, alarms, jobs, modes       |
+--------------------------+-----------------------+
                           |
                           v
+--------------------------------------------------+
|                Machine Control Layer             |
|--------------------------------------------------|
| - Owns machine state                             |
| - Executes motion/device commands                |
| - Applies interlocks and permissives             |
| - Coordinates machine subsystems                 |
+--------------------------+-----------------------+
                           |
                           v
+--------------------------------------------------+
|                     Device Layer                 |
|--------------------------------------------------|
| - Cameras                                        |
| - Motion controllers                             |
| - PLCs                                           |
| - Sensors                                        |
| - IO modules                                     |
| - Light controllers                              |
+--------------------------------------------------+

This layering matters because machine software is not only about clean code. It is about preventing the wrong part of the system from making decisions it should not own.

The UI layer should answer:

text
What should the operator see?
What command is the operator requesting?

The workflow layer should answer:

text
Is this command valid in the current operation?
What sequence should run?
What should happen next?

The machine control layer should answer:

text
Can the machine physically and safely execute this?
What is the authoritative machine state?

The device layer should answer:

text
How do I communicate with this specific hardware?
What did the hardware report?

This matches the broader roadmap principle that industrial architecture must separate UI, workflow, and device logic. The separation is needed because real machines combine UI, workflows, device control, alarms, data pipelines, and long-running behavior in one system.


PART 3 — DATA FLOW: STATE VS COMMAND

There are two major flows in an HMI system.

1. Machine-to-UI state flow

text
+-------------+      +-----------------+      +------------------+      +---------+
|  Devices    | ---> | Machine Control | ---> | Workflow/App     | ---> |  HMI    |
|             |      | State Model     |      | View State Model |      |         |
+-------------+      +-----------------+      +------------------+      +---------+

This flow tells the operator what is happening.

Examples:

text
MachineState = Running
CurrentStep = AlignWafer
StagePosition = X: 120.5, Y: 80.2
CameraState = Acquiring
DoorState = Closed
AlarmState = None

The UI should not invent this state. It should project state from the application and machine layers.

A strong HMI is state-driven.

That means screens are mostly a reflection of authoritative state:

text
CanStartButton = MachineState == Idle
CanAbortButton = MachineState == Running || MachineState == Paused
CanJogAxis = Mode == Manual && SafetyState == Safe
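
The three rules above can be written as one pure projection function. This is a minimal sketch in Python, chosen only for brevity; the type names (`MachineSnapshot`, `ButtonState`, `project_buttons`) and the enum members are assumptions made for illustration, not part of any real HMI framework.

```python
from dataclasses import dataclass
from enum import Enum, auto


class MachineState(Enum):
    IDLE = auto()
    RUNNING = auto()
    PAUSED = auto()
    FAULTED = auto()


class Mode(Enum):
    OPERATOR = auto()
    MANUAL = auto()


class SafetyState(Enum):
    SAFE = auto()
    UNSAFE = auto()
    UNKNOWN = auto()


@dataclass(frozen=True)
class MachineSnapshot:
    """Immutable snapshot published by the machine/application layers (hypothetical)."""
    machine_state: MachineState
    mode: Mode
    safety_state: SafetyState


@dataclass(frozen=True)
class ButtonState:
    can_start: bool
    can_abort: bool
    can_jog_axis: bool


def project_buttons(s: MachineSnapshot) -> ButtonState:
    """Pure projection: button state is derived on demand, never stored by the screen."""
    return ButtonState(
        can_start=s.machine_state is MachineState.IDLE,
        can_abort=s.machine_state in (MachineState.RUNNING, MachineState.PAUSED),
        can_jog_axis=s.mode is Mode.MANUAL and s.safety_state is SafetyState.SAFE,
    )
```

Because the function has no side effects and no UI dependency, the same rules can be unit-tested without opening a screen.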

2. UI-to-machine command flow

text
+---------+      +------------------+      +-----------------+      +----------+
|  HMI    | ---> | Workflow/App     | ---> | Machine Control | ---> | Devices  |
| Command |      | Validation       |      | Execution       |      |          |
+---------+      +------------------+      +-----------------+      +----------+

The UI does not say:

text
Motor.MoveTo(100)

It should say something like:

text
RequestMoveToLoadPosition()

Then the application/workflow layer validates:

text
Is the machine in manual mode?
Is the door closed?
Is the axis homed?
Is there an active alarm?
Is another workflow running?
Is the operator authorized?

Only after that should the command reach machine control.

This distinction is critical:

text
State flows upward.
Commands flow downward.
Validation sits between operator intent and physical action.
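
The validation questions listed for RequestMoveToLoadPosition could be sketched as a single function that returns every failed precondition. The `MachineContext` fields, the `ValidationResult` type, and the reason strings are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MachineContext:
    """Hypothetical snapshot of the facts the validator needs."""
    manual_mode: bool
    door_closed: bool
    axis_homed: bool
    alarm_active: bool
    workflow_running: bool
    operator_authorized: bool


@dataclass(frozen=True)
class ValidationResult:
    accepted: bool
    reasons: Tuple[str, ...] = ()


def validate_move_to_load_position(ctx: MachineContext) -> ValidationResult:
    """Collect every failed precondition so the HMI can show all of them at once."""
    checks = [
        (ctx.manual_mode, "Machine is not in manual mode"),
        (ctx.door_closed, "Door is open"),
        (ctx.axis_homed, "Axis is not homed"),
        (not ctx.alarm_active, "An alarm is active"),
        (not ctx.workflow_running, "Another workflow is running"),
        (ctx.operator_authorized, "Operator is not authorized"),
    ]
    reasons = tuple(message for ok, message in checks if not ok)
    return ValidationResult(accepted=not reasons, reasons=reasons)
```

Returning the full list of reasons, rather than stopping at the first failure, lets the HMI tell the operator everything that must be fixed before retrying.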

PART 4 — WHY UI MUST BE DECOUPLED

The UI must not directly call devices.

Bad design:

text
+---------+        +----------------+
|  HMI    | -----> | Motion Axis SDK|
+---------+        +----------------+

This looks simple at first. It is dangerous later.

Why?

Because the UI does not have the full machine context.

The screen may know that the operator clicked “Move,” but it may not know:

  • whether the axis is homed
  • whether another sequence is active
  • whether a safety door is open
  • whether the wafer is clamped
  • whether the machine is in auto mode
  • whether the command conflicts with an inspection step
  • whether a recovery operation is in progress

Good design:

text
+---------+      +----------------------+      +------------------+
|  HMI    | ---> | Application Command  | ---> | Machine Control  |
|         |      | Handler / Workflow   |      | Layer            |
+---------+      +----------------------+      +------------------+

The UI expresses intent. The application layer decides whether that intent is allowed. Machine control executes only validated actions.

The architectural rule is:

text
The UI may request.
The workflow may decide.
The machine control layer may execute.
The device layer may communicate.
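
One way to picture the request / decide / execute / communicate rule is a chain of four small classes, each holding a reference only to the layer directly below it. Everything here is an illustrative assumption: the class names, the simulated axis, and the load position value are invented for the sketch.

```python
class FakeAxisAdapter:
    """Device layer: only knows how to talk to (simulated) hardware."""

    def __init__(self) -> None:
        self.position_mm = 0.0

    def move_to(self, target_mm: float) -> None:
        self.position_mm = target_mm


class MachineControl:
    """Machine control layer: enforces interlocks and executes validated actions."""

    LOAD_POSITION_MM = 250.0  # hypothetical value

    def __init__(self, axis: FakeAxisAdapter) -> None:
        self._axis = axis
        self.door_closed = True

    def move_to_load_position(self) -> bool:
        if not self.door_closed:  # the interlock lives here, not in the UI
            return False
        self._axis.move_to(self.LOAD_POSITION_MM)
        return True


class MoveToLoadPositionHandler:
    """Application layer: decides whether the operator's intent is allowed."""

    def __init__(self, machine: MachineControl) -> None:
        self._machine = machine
        self.manual_mode = False

    def handle(self) -> str:
        if not self.manual_mode:
            return "rejected: machine is not in manual mode"
        if not self._machine.move_to_load_position():
            return "rejected: interlock not satisfied"
        return "accepted"


class LoadScreenViewModel:
    """UI layer: expresses intent only; it holds no reference to the axis."""

    def __init__(self, handler: MoveToLoadPositionHandler) -> None:
        self._handler = handler
        self.last_feedback = ""

    def on_move_to_load_clicked(self) -> None:
        self.last_feedback = self._handler.handle()
```

The view model cannot reach `FakeAxisAdapter` even by accident, because it was never given one. That structural constraint is what keeps a hidden button from bypassing validation.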

When this boundary is broken, the system becomes fragile.

You see problems like:

text
Screen A allows a command.
Screen B blocks the same command.
Service screen bypasses safety validation.
A hidden button calls the device directly.
A maintenance tool changes hardware state without updating machine state.

That is how machines become unpredictable.


PART 5 — LONG-RUNNING UI SYSTEM CONSTRAINTS

Industrial HMIs are usually long-running applications.

They may run:

text
8 hours
24 hours
several days
several weeks

This is very different from many business apps, where users open a page, perform an action, and close it.

An industrial HMI must survive:

  • continuous status updates
  • alarm bursts
  • image/result streaming
  • device reconnects
  • operator login/logout
  • mode switching
  • recipe changes
  • production runs
  • service operations
  • partial hardware failures
  • network instability
  • memory pressure
  • UI thread overload

A common beginner mistake is designing the HMI like a normal desktop CRUD application.

But an HMI is closer to a live operations console.

The architecture must handle:

text
continuous state updates
bounded event streams
stable subscriptions
controlled resource ownership
clear cleanup on screen changes
safe reconnect behavior
observable command execution

For example, if a status panel subscribes to machine events but never unsubscribes, the system may look fine during a demo. After three days of continuous operation, memory has grown, the UI has slowed down, duplicate event handlers fire, and operators see delayed updates.

That kind of bug is very common in long-running HMI systems.
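
A common guard against the leaked-subscription problem is to make `subscribe` return an unsubscribe handle and have each screen release it when it closes. The sketch below assumes a hypothetical in-process `MachineEventBus`; a real system would use whatever event mechanism its UI framework provides.

```python
from typing import Callable, List

Handler = Callable[[str], None]


class MachineEventBus:
    """Minimal in-process event source (hypothetical stand-in for the real one)."""

    def __init__(self) -> None:
        self._handlers: List[Handler] = []

    def subscribe(self, handler: Handler) -> Callable[[], None]:
        """Register a handler and return a function that removes it again."""
        self._handlers.append(handler)

        def unsubscribe() -> None:
            if handler in self._handlers:
                self._handlers.remove(handler)

        return unsubscribe

    def publish(self, status: str) -> None:
        # Iterate over a copy so handlers may unsubscribe during dispatch.
        for handler in list(self._handlers):
            handler(status)

    @property
    def handler_count(self) -> int:
        return len(self._handlers)


class StatusPanel:
    """A screen that owns its subscription and releases it on close."""

    def __init__(self, bus: MachineEventBus) -> None:
        self.latest = ""
        self._unsubscribe = bus.subscribe(self._on_status)

    def _on_status(self, status: str) -> None:
        self.latest = status

    def close(self) -> None:
        self._unsubscribe()
```

Exposing `handler_count` makes the leak testable: opening and closing a screen many times in a test should leave the count unchanged.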


PART 6 — MULTIPLE UI MODES

Industrial HMIs often support multiple modes.

Typical modes:

text
Operator Mode
Manual / Service Mode
Engineering Mode

Each mode changes what the UI shows and what commands are allowed.

Operator mode

Used during production.

The UI should be simple, safe, and focused:

text
Start
Stop
Pause
Resume
Current job
Current recipe
Machine status
Alarms
Inspection result
Production progress

Operator mode should not expose dangerous low-level controls.

Manual / service mode

Used for setup, recovery, calibration, and troubleshooting.

It may expose:

text
jog axis
turn light on/off
trigger camera
move to service position
reset device
run diagnostic step

This mode is powerful and dangerous. Commands must be gated by permissions, interlocks, and current machine state.

Engineering mode

Used by engineers for tuning, diagnostics, recipe development, and deeper troubleshooting.

It may expose:

text
advanced parameters
logs
raw device status
calibration data
inspection thresholds
debug tools

The architecture should not implement mode behavior as random if statements scattered across screens.

Better:

text
+------------------+
| Machine Mode     |
+------------------+
| Operator         |
| Manual / Service |
| Engineering      |
+------------------+
        |
        v
+-----------------------------+
| Permission + Command Policy |
+-----------------------------+
        |
        v
+-----------------------------+
| UI Enabled State / Commands |
+-----------------------------+

The mode system should be centralized enough that command availability is consistent across screens.
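
A centralized mode policy can be as small as one lookup table plus one query function that every screen calls. The sketch uses the three modes listed earlier in this part; the command names in the table and the `granted` permission set are illustrative assumptions, and interlock and machine-state checks are left out to keep the example short.

```python
from enum import Enum, auto
from typing import Dict, FrozenSet


class MachineMode(Enum):
    OPERATOR = auto()
    MANUAL_SERVICE = auto()
    ENGINEERING = auto()


# Hypothetical policy table; real command names and grouping are project-specific.
COMMANDS_BY_MODE: Dict[MachineMode, FrozenSet[str]] = {
    MachineMode.OPERATOR: frozenset({"Start", "Stop", "Pause", "Resume"}),
    MachineMode.MANUAL_SERVICE: frozenset({"JogAxis", "TriggerCamera", "ResetDevice"}),
    MachineMode.ENGINEERING: frozenset({"EditThresholds", "ViewRawDeviceStatus"}),
}


def is_command_allowed(mode: MachineMode, command: str, granted: FrozenSet[str]) -> bool:
    """Single query used by every screen: the mode must expose the command
    and the logged-in user must hold the permission for it."""
    return command in COMMANDS_BY_MODE[mode] and command in granted
```

Because screens only ask `is_command_allowed`, adding a mode or moving a command between modes is a change to one table rather than to every view.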


PART 7 — REAL-WORLD FAILURE SCENARIOS

Failure 1: UI directly controls a device

What it looks like:

text
Operator clicks "Move Z Up"
UI directly calls ZAxis.MoveRelative(10)

Why it happens:

text
The team wanted a quick manual-control screen.
The device SDK was easy to call.
The workflow layer felt unnecessary.

Why it is dangerous:

text
The command bypasses machine state.
It may run during auto sequence.
It may ignore interlocks.
It may conflict with another command.

How engineers fix it:

text
UI sends ManualMoveCommand
Application layer validates mode and permissions
Machine control checks interlocks
Axis controller executes command
State model updates result

Good flow:

text
HMI
 |
 v
ManualMoveCommand
 |
 v
Command Validator
 |
 v
Machine Control
 |
 v
Axis Device Adapter

Failure 2: UI shows stale cached state

What it looks like:

text
UI shows: Door Closed
Actual signal: Door Open
Operator starts operation
Machine refuses or faults

Why it happens:

text
UI has its own cached state.
Device state changed but UI was not updated.
State source is duplicated.
Polling interval is too slow.
Reconnect logic did not refresh state.

How engineers fix it:

text
Use one authoritative machine state model.
Mark stale state explicitly.
Include timestamp / freshness where needed.
Force state resync after reconnect.
Avoid screen-local copies of critical state.

A better UI state might be:

text
DoorState = Unknown
LastUpdated = 5 seconds ago
MachineReady = false
StartCommandEnabled = false

For machine systems, unknown state should usually be treated as not safe.
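
The "unknown means not safe" rule can be enforced by downgrading any reading that is older than a freshness budget. This is a sketch under assumptions: the `Timestamped` wrapper, the `MAX_AGE_S` value of two seconds, and the function names are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class DoorState(Enum):
    OPEN = "Open"
    CLOSED = "Closed"
    UNKNOWN = "Unknown"


@dataclass(frozen=True)
class Timestamped:
    value: DoorState
    updated_at: float  # seconds on a monotonic clock


MAX_AGE_S = 2.0  # hypothetical freshness budget; tune per signal


def effective_door_state(reading: Timestamped, now: float) -> DoorState:
    """Downgrade a reading to UNKNOWN once it is older than the freshness budget."""
    if now - reading.updated_at > MAX_AGE_S:
        return DoorState.UNKNOWN
    return reading.value


def start_enabled(reading: Timestamped, now: float) -> bool:
    """Only a fresh, positively confirmed CLOSED door enables Start."""
    return effective_door_state(reading, now) is DoorState.CLOSED
```

In production code `now` would typically come from `time.monotonic()`, which is not affected by wall-clock adjustments. Passing it in as a parameter keeps the rule deterministic in tests.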


Failure 3: UI freeze blocks critical feedback

What it looks like:

text
Machine is running.
Alarm occurs.
UI freezes for 10 seconds.
Operator does not see alarm immediately.

Why it happens:

text
Heavy work runs on UI thread.
Too many events are pushed into UI binding.
Image rendering blocks status updates.
Logging or file IO happens in UI flow.

How engineers fix it:

text
Keep UI as a projection layer.
Throttle non-critical updates.
Prioritize alarms and machine state.
Move processing away from UI thread.
Use bounded event pipelines.
Separate live image display from critical status display.

The principle:

text
The UI may be rich.
But critical feedback must remain responsive.
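
A bounded pipeline with alarm priority might look like the following sketch: background producers push into a buffer, non-critical status events are capped so a burst cannot grow without limit, and the UI drains the buffer on its own timer with alarms first. The `UiEventBuffer` class and its method names are assumptions for this example.

```python
import threading
from collections import deque
from typing import Deque, List


class UiEventBuffer:
    """Buffers events between background producers and the UI thread."""

    def __init__(self, max_status_events: int = 100) -> None:
        self._lock = threading.Lock()
        # Alarms are never dropped in this sketch; a real system still needs
        # an overflow strategy for extreme alarm floods.
        self._alarms: Deque[str] = deque()
        # deque(maxlen=...) discards the oldest entry when a new one arrives at capacity.
        self._status: Deque[str] = deque(maxlen=max_status_events)

    def push_alarm(self, alarm: str) -> None:
        with self._lock:
            self._alarms.append(alarm)

    def push_status(self, status: str) -> None:
        with self._lock:
            self._status.append(status)

    def drain(self) -> List[str]:
        """Called from the UI timer tick: alarms first, then surviving status events."""
        with self._lock:
            batch = list(self._alarms) + list(self._status)
            self._alarms.clear()
            self._status.clear()
        return batch
```

The UI thread does a fixed, small amount of work per tick regardless of how fast producers publish, which is what keeps alarm display responsive during a status burst.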

Failure 4: UI triggers command without validation

What it looks like:

text
Start button is enabled.
Operator clicks Start.
Workflow begins with invalid recipe.
Machine faults halfway through.

Why it happens:

text
UI only checked simple conditions.
Real validation lived elsewhere or was missing.
Recipe validation was delayed until execution.

How engineers fix it:

text
Centralize command validation.
Validate recipe before activation.
Expose validation result to UI.
Disable commands based on authoritative command policy.
Return clear rejection reasons.

Better model:

text
StartCommand:
  Allowed = false
  Reason = "Recipe is not validated for current machine configuration"

This is much better than allowing the command and failing later.


Failure 5: UI logic duplicated across screens

What it looks like:

text
Main screen disables Start when alarm is active.
Service screen still allows Start.
Recipe screen allows activation during running state.

Why it happens:

text
Each screen implemented its own enable/disable logic.
There is no shared command policy.
Machine state rules are spread across view models.

How engineers fix it:

text
Create command models.
Centralize command availability rules.
Make screens consume command state instead of recalculating it.
Test command rules independently from UI.

Good design:

text
+--------------------+
| Machine State      |
+--------------------+
          |
          v
+--------------------+
| Command Policy     |
+--------------------+
          |
          v
+--------------------+
| UI Command Model   |
+--------------------+
          |
          v
+--------------------+
| Screens / Buttons  |
+--------------------+

PART 8 — SOFTWARE DESIGN IMPLICATIONS

A strong HMI architecture should be:

text
decoupled
state-driven
command-oriented
observable
safe by default

MVVM, adapted for machine systems

MVVM is useful, especially in WPF-based industrial systems, but only if you adapt it correctly.

Bad MVVM:

text
ViewModel calls camera SDK.
ViewModel starts motion.
ViewModel owns workflow sequence.
ViewModel contains safety rules.
ViewModel stores machine state separately.

Good MVVM:

text
ViewModel displays application state.
ViewModel sends application commands.
Workflow layer owns sequencing.
Machine control owns physical state.
Device layer owns hardware communication.

A good structure:

text
+-----------------------------+
| View                        |
| - XAML / screen             |
+--------------+--------------+
               |
               v
+-----------------------------+
| ViewModel                   |
| - UI state projection       |
| - user intent commands      |
| - no direct hardware calls  |
+--------------+--------------+
               |
               v
+-----------------------------+
| Application Services        |
| - command handlers          |
| - workflow orchestration    |
| - validation                |
+--------------+--------------+
               |
               v
+-----------------------------+
| Machine Services            |
| - state machine             |
| - interlocks                |
| - subsystem coordination    |
+--------------+--------------+
               |
               v
+-----------------------------+
| Device Adapters             |
| - SDKs, PLCs, cameras       |
+-----------------------------+

State models

A machine HMI should display state from explicit models.

Examples:

text
MachineState
WorkflowState
AlarmState
DeviceHealthState
RecipeState
InspectionState
PermissionState
CommandAvailabilityState

This avoids screen-specific interpretation.

Instead of each screen asking:

text
Can I enable this button?

The screen receives:

text
StartCommand.Available = false
StartCommand.DisabledReason = "Machine is not idle"

Command models

Operator actions should be modeled as commands.

Examples:

text
StartRunCommand
PauseRunCommand
AbortRunCommand
LoadRecipeCommand
ActivateRecipeCommand
ManualMoveAxisCommand
ResetAlarmCommand
RetryStepCommand

Each command should have:

text
who requested it
when it was requested
from which mode
with what parameters
whether it was accepted or rejected
why it was rejected
what workflow/machine state existed at the time

This gives you auditability and diagnosability.
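
The fields listed above map naturally onto an immutable record that is written once per command attempt. The `CommandRecord` type and its log-line format below are hypothetical; a real system would choose its own schema and storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class CommandRecord:
    """One audit entry per command attempt, accepted or rejected."""
    name: str                      # e.g. "ManualMoveAxisCommand"
    requested_by: str              # who requested it
    mode: str                      # from which mode
    parameters: Dict[str, Any]     # with what parameters
    machine_state_at_request: str  # workflow/machine state at the time
    accepted: bool
    rejection_reason: Optional[str] = None
    requested_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def to_log_line(self) -> str:
        outcome = "ACCEPTED" if self.accepted else f"REJECTED ({self.rejection_reason})"
        return (
            f"{self.requested_at.isoformat()} {self.name} by={self.requested_by} "
            f"mode={self.mode} state={self.machine_state_at_request} "
            f"params={self.parameters} -> {outcome}"
        )
```

Recording rejected commands as well as accepted ones is the important design choice here: a rejected attempt is often the first clue when diagnosing operator confusion.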

Event-driven updates

Industrial HMIs usually need event-driven state updates, but not every event should directly hit the UI.

Better:

text
Machine Events
     |
     v
State Aggregator
     |
     v
View State Projection
     |
     v
UI Binding

This prevents the UI from becoming a dumping ground for raw device events.

Raw device events are often too noisy, too low-level, or too unstable for direct display.
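
A state aggregator can be sketched as a small class that folds raw events into one view state and notifies listeners only when that state actually changes. The event kinds (`"stage_x"`, `"machine_state"`, `"alarm"`), the `ViewState` fields, and the 0.1 mm display resolution are assumptions chosen for the example.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, List


@dataclass(frozen=True)
class ViewState:
    machine_state: str = "Unknown"
    stage_x_mm: float = 0.0
    alarm: str = "None"


class StateAggregator:
    """Folds raw machine events into one view state and notifies only on change."""

    def __init__(self, position_resolution_mm: float = 0.1) -> None:
        self.state = ViewState()
        self._resolution = position_resolution_mm
        self._listeners: List[Callable[[ViewState], None]] = []

    def subscribe(self, listener: Callable[[ViewState], None]) -> None:
        self._listeners.append(listener)

    def on_event(self, kind: str, value: Any) -> None:
        if kind == "stage_x":
            # Snap to the display resolution so encoder jitter does not reach the UI.
            snapped = round(round(value / self._resolution) * self._resolution, 6)
            new_state = replace(self.state, stage_x_mm=snapped)
        elif kind == "machine_state":
            new_state = replace(self.state, machine_state=value)
        elif kind == "alarm":
            new_state = replace(self.state, alarm=value)
        else:
            return  # unknown raw events are ignored rather than displayed
        if new_state != self.state:
            self.state = new_state
            for listener in self._listeners:
                listener(new_state)
```

With this shape, four jittery position samples around 120.5 mm produce a single UI update instead of four, and vendor-specific debug events never reach a binding.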


PART 9 — INTERVIEW / REAL-WORLD TALKING POINTS

A strong interview answer could sound like this:

In an industrial machine, I would not treat the HMI as a normal frontend. The HMI is the operator-facing part of the machine system, so it must reflect authoritative machine state and send operator intent through validated command paths. I would separate the UI from workflow orchestration, machine control, and device adapters. The UI should not directly call hardware. It should display state, expose allowed commands, and show clear feedback when commands are rejected.

Another strong explanation:

The most important boundary is between operator intent and physical execution. A button click is not a hardware command. It is a request. That request must pass through application validation, workflow rules, machine state checks, interlocks, and permissions before reaching device control.

Common mistakes engineers make:

text
UI directly talks to hardware.
Screens duplicate command rules.
Machine state is cached inconsistently.
Buttons are enabled based on stale assumptions.
Workflow logic is hidden inside view models.
Manual mode bypasses safety checks.
Alarms are displayed but not integrated with command policy.
UI responsiveness is treated as cosmetic instead of operational.

What strong engineers understand:

text
The HMI is part of machine operation.
The UI must be state-driven.
Commands must be validated centrally.
Unknown state should usually disable action.
Mode switching must be explicit and safe.
Long-running stability matters more than demo smoothness.
Operator confusion is a system failure, not just UX weakness.

The simplest architectural principle is:

text
Bad:
UI -> Hardware

Better:
UI -> Application Command -> Workflow -> Machine Control -> Device

And the strongest mental model is:

text
The HMI should make the machine understandable,
but it should not become the machine controller.
