Message Framing & Parsing
This topic sits inside the communication area of your roadmap, specifically “Protocol framing and parsing” in the industrial communication domain.
PART 1 — WHY FRAMING & PARSING MATTER
In industrial systems, the transport layer gives your software bytes, not meaningful business objects.
A serial port does not tell you, “here is one complete status message.” A TCP socket does not tell you, “here is exactly one command response.”
It just gives you whatever bytes happened to arrive at that moment.
That means software has to solve a foundational problem:
Where does one message begin and end? That is framing.
Once I have one whole message, how do I interpret it safely? That is parsing.
If you get either part wrong, the system may appear to work most of the time and then fail in ugly, intermittent ways.
Why this is harder than many engineers expect
A transport read can return:
- half of a message
- one whole message
- one and a half messages
- three messages combined together
- a message plus corrupted junk
- nothing at all, even though more data will come later
That is normal behavior, not an edge case.
Example: serial stream
Imagine a device sends two framed messages one after another:
<STX>CMD,123<ETX><STX>STS,OK<ETX>
Your read operation might return:
Read #1: <STX>CMD,
Read #2: 123<ETX><STX>ST
Read #3: S,OK<ETX>
The transport did nothing wrong. Your software must reconstruct the intended messages.
Example: TCP stream
Suppose the device logically sends:
LEN=05 HELLO
LEN=05 WORLD
Your socket might return:
Read #1: LEN=05 HE
Read #2: LLOLEN=05 WORLD
Again, this is normal. TCP is a byte stream, not a message queue.
Why incorrect parsing is dangerous
Bad framing/parsing causes problems that look like device bugs, random timeouts, or “weird field issues”:
- command interpreted incorrectly
- status field shifted into wrong column
- checksum validated against wrong bytes
- parser waits forever because it lost synchronization
- one bad byte corrupts every subsequent message
- silent data corruption because parser accepted invalid structure
In industrial software, those are not small bugs. They can become:
- wrong setpoints
- false alarms
- missed alarms
- invalid motion parameters
- incorrect measurement data
- hard-to-reproduce production failures
The core lesson is simple:
Communication code is not about reading data. It is about reconstructing truth from imperfect byte streams.
PART 2 — WHAT IS “FRAMING”
Framing means defining how the receiver knows where a message starts and ends.
Without framing, a stream of bytes is just an undifferentiated river.
Common framing techniques
1. Delimiter-based framing
A special byte or sequence marks the end of each message, and sometimes the start as well.
Examples:
- newline-terminated text
- STX/ETX wrappers
- CRLF terminators
<STX>TEMP,42<ETX><STX>STATUS,OK<ETX>
Good for:
- simple protocols
- human-readable protocols
- instrument-style command sets
Risks:
- delimiter may appear inside payload unless escaped
- lost delimiter can misalign the stream
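A minimal sketch of delimiter-based extraction, assuming STX (0x02) and ETX (0x03) wrappers and a payload that never contains either byte (no escaping is handled here). The function name `extract_delimited` is hypothetical. It also shows one possible recovery rule for a lost ETX: if a second STX appears before the next ETX, the unfinished frame is abandoned.

```python
STX = 0x02  # ASCII start-of-text control byte
ETX = 0x03  # ASCII end-of-text control byte


def extract_delimited(buffer: bytearray) -> list[bytes]:
    """Remove every complete STX...ETX frame from buffer and return the payloads.

    Junk before the first STX is dropped. An unfinished trailing frame stays
    in the buffer for the next call.
    """
    frames = []
    while True:
        start = buffer.find(STX)
        if start == -1:
            buffer.clear()            # no frame start anywhere: all junk
            return frames
        del buffer[:start]            # drop junk in front of STX
        end = buffer.find(ETX, 1)
        if end == -1:
            return frames             # frame not finished yet, wait
        restart = buffer.find(STX, 1, end)
        if restart != -1:
            # A second STX before ETX: the earlier frame lost its ETX.
            del buffer[:restart]      # abandon it and resync on the new STX
            continue
        frames.append(bytes(buffer[1:end]))
        del buffer[:end + 1]
```

The caller keeps one `bytearray` per connection, appends each read to it, and calls the function after every append. A real protocol that allows STX or ETX inside the payload would need an escaping rule on top of this.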
2. Fixed-length framing
Every message is exactly N bytes.
[10 bytes][10 bytes][10 bytes]
Good for:
- highly regular device packets
- simple low-level controllers
Risks:
- inflexible
- one-byte shift destroys alignment
- versioning becomes harder
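Fixed-length extraction is the simplest case to sketch. The 10-byte frame size below is taken from the illustration above, and `extract_fixed` is a hypothetical name. Note that nothing in this code can detect a one-byte shift: that needs a header or checksum inside the frame.

```python
FRAME_SIZE = 10  # assumed fixed frame size in bytes


def extract_fixed(buffer: bytearray, frame_size: int = FRAME_SIZE) -> list[bytes]:
    """Cut as many whole frames as possible off the front of the buffer."""
    frames = []
    while len(buffer) >= frame_size:
        frames.append(bytes(buffer[:frame_size]))
        del buffer[:frame_size]       # leftover bytes stay for the next read
    return frames
```

With 25 buffered bytes this returns two frames and leaves 5 bytes waiting for the rest of the third.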
3. Length-prefixed framing
Header says how long the payload is.
[Header][Length=12][12 bytes payload]
Good for:
- binary protocols
- variable-size messages
- efficient stream handling
Risks:
- a corrupted length field makes the parser consume the wrong number of bytes and lose alignment
- parser must defend against absurd lengths
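A sketch of length-prefixed extraction with a sanity limit on the declared length. The layout is an assumption: one type byte, one length byte, then the payload. `MAX_PAYLOAD`, `FrameError`, and `try_extract` are hypothetical names, and 32 is an illustrative limit.

```python
MAX_PAYLOAD = 32  # assumed protocol maximum, in bytes


class FrameError(Exception):
    """Raised when the declared length cannot be trusted."""


def try_extract(buffer: bytearray):
    """Return (msg_type, payload) if a full frame is buffered, else None.

    Assumed layout: [Type:1][Length:1][Payload:Length].
    """
    if len(buffer) < 2:
        return None                   # header itself is incomplete
    msg_type, length = buffer[0], buffer[1]
    if length > MAX_PAYLOAD:
        # Reject absurd lengths before waiting for bytes that may never come.
        raise FrameError(f"declared length {length} exceeds {MAX_PAYLOAD}")
    total = 2 + length
    if len(buffer) < total:
        return None                   # payload still arriving, keep buffer
    payload = bytes(buffer[2:total])
    del buffer[:total]
    return msg_type, payload
```

Returning `None` for "not yet" and raising only for "cannot be valid" keeps the two situations apart. What the caller does after a `FrameError` (drop a byte, flush the buffer, reconnect) depends on the protocol.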
4. Header-based framing
A known header pattern starts the frame, often followed by type, length, and checksum.
[AA][55][Type][Length][Payload][CRC]
Good for:
- binary industrial protocols
- protocols needing resynchronization
- protocols with multiple message types
Risks:
- header pattern may appear accidentally in corrupted data
- parser must know how to recover from false positives
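A sketch of header-based extraction that recovers from false header matches. Everything about the layout is an assumption for illustration: a two-byte header AA 55, one type byte, one length byte, the payload, and a one-byte additive checksum over type, length, and payload (a real protocol would more likely use a CRC). When a candidate frame fails validation, only its first byte is dropped, so a genuine header hiding inside the rejected bytes is still found.

```python
HEADER = b"\xAA\x55"
MAX_PAYLOAD = 32   # assumed protocol maximum
OVERHEAD = 5       # header(2) + type(1) + length(1) + checksum(1)


def checksum(data: bytes) -> int:
    """Assumed integrity check: sum of bytes modulo 256."""
    return sum(data) & 0xFF


def extract_header_frames(buffer: bytearray) -> list[tuple[int, bytes]]:
    frames = []
    while True:
        start = buffer.find(HEADER)
        if start == -1:
            # Keep a trailing 0xAA: it may be the first half of a header.
            keep = 1 if buffer.endswith(HEADER[:1]) else 0
            del buffer[:len(buffer) - keep]
            return frames
        del buffer[:start]                    # discard junk before the header
        if len(buffer) < 4:
            return frames                     # type/length not here yet
        msg_type, length = buffer[2], buffer[3]
        if length > MAX_PAYLOAD:
            del buffer[:1]                    # false positive, rescan
            continue
        total = OVERHEAD + length
        if len(buffer) < total:
            return frames                     # wait for the rest
        body = bytes(buffer[2:total - 1])     # type + length + payload
        if checksum(body) != buffer[total - 1]:
            del buffer[:1]                    # false positive, rescan
            continue
        frames.append((msg_type, body[2:]))
        del buffer[:total]
```

Dropping one byte instead of the whole candidate frame costs a few extra scans but avoids throwing away a valid frame that started inside corrupted data.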
ASCII data stream diagram
Raw byte stream
+---------------------------------------------------------------+
| 7E 01 04 A1 B2 C3 D4 7E 02 02 11 22 7E 03 03 99 88 77 7E ... |
+---------------------------------------------------------------+
Framed as delimiter-based messages
+-------------------+ +-------------+ +----------------+
| 7E 01 04 A1 B2... | | 7E 02 02... | | 7E 03 03 ... |
+-------------------+ +-------------+ +----------------+
Msg 1                 Msg 2           Msg 3
What this diagram means
The stream arrives as one continuous sequence. The parser’s first job is to identify the frame boundaries so the rest of the software can work on complete messages.
PART 3 — BUFFERING & STREAM PROCESSING
Because reads are arbitrary, incoming data must be buffered.
You do not parse directly from each read as if it were a complete message. You append bytes to a buffer, then repeatedly try to extract complete frames from that buffer.
That is the standard mental model.
Core loop
- receive bytes
- append to buffer
- inspect buffer
- if a full frame exists, extract it
- leave incomplete remainder in buffer
- wait for more bytes
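The core loop can be sketched as a small class that owns the buffer and delegates boundary detection to a pluggable extractor. `FrameAssembler`, `Extractor`, and `extract_line` are hypothetical names, and the newline-terminated format is only there to keep the example short.

```python
from typing import Callable, Optional

# An extractor inspects the buffer and either removes and returns one
# complete frame, or returns None when more bytes are needed.
Extractor = Callable[[bytearray], Optional[bytes]]


def extract_line(buffer: bytearray) -> Optional[bytes]:
    """Example extractor for newline-terminated frames (assumed format)."""
    end = buffer.find(b"\n")
    if end == -1:
        return None
    frame = bytes(buffer[:end])
    del buffer[:end + 1]
    return frame


class FrameAssembler:
    """Owns the receive buffer and turns arbitrary chunks into whole frames."""

    def __init__(self, extractor: Extractor) -> None:
        self._buffer = bytearray()
        self._extract = extractor

    def feed(self, chunk: bytes) -> list[bytes]:
        self._buffer += chunk                      # append to buffer
        frames = []
        # Loop until no complete frame remains; the remainder stays buffered.
        while (frame := self._extract(self._buffer)) is not None:
            frames.append(frame)
        return frames

    @property
    def pending(self) -> bytes:
        """Bytes received so far that do not yet form a complete frame."""
        return bytes(self._buffer)
```

A receive loop then reduces to `for frame in assembler.feed(chunk): handle(frame)`, where `handle` is whatever the next layer provides.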
ASCII buffer diagram
After Read #1
Buffer:
+----------------------+
| AA 55 03 10 20 |
+----------------------+
^
incomplete frame, wait for more
After Read #2
Buffer:
+--------------------------------------+
| AA 55 03 10 20 30 AA 55 02 99 88 |
+--------------------------------------+
Extract frame 1:
+------------------+
| AA 55 03 10 20 30|
+------------------+
Remaining buffer:
+------------------+
| AA 55 02 99 88 |
+------------------+
What this diagram means
The parser does not “forget” partial data between reads. It preserves incomplete bytes, extracts only what is complete, and leaves leftovers for the next read.
Partial message handling
Suppose framing is length-prefixed:
[Type][Length][Payload]
If buffer contains:
01 05 A0 B1
That says the payload should contain 5 bytes, but only 2 have arrived so far. The correct behavior is not failure. The correct behavior is:
- keep buffer as-is
- wait for more bytes
- continue once the payload is complete
Leftover handling
If one read contains:
- one complete message
- plus the beginning of the next one
you must extract only the complete message and keep the remainder.
This is where many naive implementations fail. They parse one message correctly but then either discard trailing bytes or accidentally mix them into the next read.
Design implication
Your parser should think in terms of:
- stream buffer
- cursor / consumption position
- frame extraction loop
Not in terms of:
- “one read equals one message”
That assumption is one of the most common bugs in device integrations.
PART 4 — PARSING STRUCTURED MESSAGES
Once framing gives you a complete message, parsing begins.
Framing answers: “Do I have one whole message?”
Parsing answers: “What does this message contain, and is it valid?”
Typical parsing steps
- identify message type
- extract fields
- validate structure
- convert raw values into usable data
- produce a message object or parse result
Example structure
[Header][Type][Length][Payload][Checksum]
Payload might itself contain fields:
[DeviceId][Status][Value]
ASCII message structure diagram
+--------+------+--------+-------------------+----------+
| Header | Type | Length | Payload | Checksum |
+--------+------+--------+-------------------+----------+
| AA55 | 02 | 03 | 10 01 7F | 9C |
+--------+------+--------+-------------------+----------+
Payload decoded as:
+----------+--------+-------+
| DeviceId | Status | Value |
+----------+--------+-------+
| 0x10 | 0x01 | 0x7F |
+----------+--------+-------+
What this diagram means
The parser first validates the outer frame structure. Then it interprets the payload according to the message type.
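Decoding the payload from the diagram could look like the sketch below. The three one-byte fields follow the diagram; the `StatusMessage` class and `decode_status` function are hypothetical names, and real field widths and byte order come from the device's protocol specification.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusMessage:
    device_id: int
    status: int
    value: int


# "<BBB": three unsigned bytes, explicit little-endian mode so no padding is added.
STATUS_LAYOUT = struct.Struct("<BBB")


def decode_status(payload: bytes) -> StatusMessage:
    """Turn an already-framed, already-checksummed payload into typed fields."""
    if len(payload) != STATUS_LAYOUT.size:
        raise ValueError(
            f"status payload must be {STATUS_LAYOUT.size} bytes, got {len(payload)}"
        )
    device_id, status, value = STATUS_LAYOUT.unpack(payload)
    return StatusMessage(device_id, status, value)
```

For the payload `10 01 7F` this yields `device_id=0x10`, `status=0x01`, `value=0x7F`. Range checks on the decoded values belong in a separate step after this one.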
Important distinction
A frame can be structurally complete but semantically invalid.
For example:
- length matches actual bytes
- checksum passes
- but message type is unknown
- or payload field values are out of allowed range
A robust parser treats those as different categories:
- framing valid
- syntax valid
- semantic meaning possibly invalid
That separation matters for debugging.
Practical parser outputs
Good parsers usually return something like:
- success with parsed message
- incomplete frame, need more data
- invalid frame, discard or resync
- unsupported but structurally valid message
- internal parser error
That is much better than throwing random exceptions at every bad input.
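One way to make those outcomes explicit is a status enum plus a small result object. This is a sketch under assumptions: the layout is [Type][Length][Payload] with one-byte fields, and `ParseStatus`, `ParseResult`, `classify`, `KNOWN_TYPES`, and the limit of 32 are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

MAX_PAYLOAD = 32            # assumed protocol maximum
KNOWN_TYPES = {0x01, 0x02}  # assumed set of supported message types


class ParseStatus(Enum):
    OK = auto()
    INCOMPLETE = auto()
    INVALID = auto()
    UNSUPPORTED = auto()
    INTERNAL_ERROR = auto()


@dataclass(frozen=True)
class ParseResult:
    status: ParseStatus
    consumed: int = 0               # bytes the caller should drop from its buffer
    msg_type: Optional[int] = None
    payload: bytes = b""
    reason: str = ""


def classify(data: bytes) -> ParseResult:
    """Classify the front of a buffer without raising on bad input."""
    try:
        if len(data) < 2:
            return ParseResult(ParseStatus.INCOMPLETE, reason="header incomplete")
        msg_type, length = data[0], data[1]
        if length > MAX_PAYLOAD:
            return ParseResult(ParseStatus.INVALID, consumed=1,
                               reason=f"length {length} > {MAX_PAYLOAD}")
        total = 2 + length
        if len(data) < total:
            return ParseResult(ParseStatus.INCOMPLETE, reason="payload incomplete")
        payload = bytes(data[2:total])
        if msg_type not in KNOWN_TYPES:
            return ParseResult(ParseStatus.UNSUPPORTED, total, msg_type, payload,
                               "unknown message type")
        return ParseResult(ParseStatus.OK, total, msg_type, payload)
    except Exception as exc:  # a parser bug, not bad input
        return ParseResult(ParseStatus.INTERNAL_ERROR, reason=repr(exc))
```

The `consumed` field tells the buffer owner how far to advance, so an unsupported but well-formed frame is skipped cleanly while an invalid one triggers a one-byte resync.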
PART 5 — HANDLING PARTIAL & CORRUPTED DATA
In real systems, data is often imperfect.
Not constantly broken, but imperfect often enough that your parser must expect it.
Three common bad states
1. Incomplete data
You do not yet have a full frame.
Example:
AA 55 04 10 20
Length says 4 payload bytes, but only 2 are present.
Correct response:
- keep buffer
- wait for more bytes
2. Misaligned data
Your buffer starts in the middle of a frame or contains junk before a valid frame.
Example:
99 88 FF AA 55 02 11 22 7C
Correct response:
- scan for the next plausible frame start
- discard bytes before it
- continue carefully
This is called resynchronization.
3. Corrupted data
A frame is present, but its structure or integrity check fails.
Examples:
- invalid checksum
- impossible length
- illegal type
- truncated payload
Correct response may be:
- reject frame
- log raw bytes
- discard frame
- attempt resync to next valid header
Checksum / CRC validation
Checksums are not the full answer, but they are a major defense.
They help detect:
- flipped bits
- truncated messages
- accidental concatenation errors
- some kinds of misalignment
They do not guarantee correctness of meaning. They only help validate byte integrity.
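The difference between a weak and a stronger integrity check is easy to demonstrate. The sketch below compares a one-byte additive checksum with a 16-bit CRC computed by the standard library's `binascii.crc_hqx` (CRC-CCITT polynomial). The initial value 0xFFFF is an assumption; the variant a given device uses must be taken from its documentation.

```python
import binascii


def additive_checksum(data: bytes) -> int:
    """Sum of all bytes modulo 256. Cheap, but blind to reordered bytes."""
    return sum(data) & 0xFF


def crc16_ccitt(data: bytes, initial: int = 0xFFFF) -> int:
    """16-bit CRC using the CCITT polynomial, via the standard library."""
    return binascii.crc_hqx(data, initial)


frame = bytes.fromhex("02 03 10 01 7F")
flipped = bytes.fromhex("02 03 10 01 7E")   # one bit changed in the last byte
swapped = bytes.fromhex("02 03 01 10 7F")   # two payload bytes exchanged

# Both checks notice the flipped bit.
flip_caught_by_sum = additive_checksum(flipped) != additive_checksum(frame)
flip_caught_by_crc = crc16_ccitt(flipped) != crc16_ccitt(frame)

# Only the CRC notices the swap, because addition ignores byte order.
swap_caught_by_sum = additive_checksum(swapped) != additive_checksum(frame)
swap_caught_by_crc = crc16_ccitt(swapped) != crc16_ccitt(frame)
```

Neither check says anything about whether the values make sense for the process. A frame with a valid CRC can still carry an out-of-range setpoint.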
Why robustness is critical
In industrial software, parsers sit at the trust boundary between your software and the outside world.
That outside world may include:
- noisy serial lines
- buggy vendor firmware
- device restarts mid-stream
- mixed protocol versions
- partial initialization states
- unexpected diagnostic messages
A parser that assumes perfect input is not production-grade.
A production parser must be able to say:
- this is incomplete
- this is invalid
- this looks like a new frame start
- this cannot be trusted
- this should be dropped without poisoning the rest of the stream
PART 6 — REAL-WORLD FAILURE SCENARIOS
This is where the topic becomes very real.
1. Message split across multiple reads
What it looks like
Read #1: AA 55 05 10 20
Read #2: 30 40 50 9C
Why it happens
Transport read boundaries are unrelated to message boundaries.
How engineers debug it
They inspect raw receive logs and notice the parser assumed each read was a whole packet.
Typical symptom:
- “works sometimes on local machine”
- “fails randomly in production”
- “device responses occasionally timeout”
The actual cause is often: parser discarded partial data instead of buffering it.
2. Multiple messages in one buffer
What it looks like
Read #1: AA 55 02 11 22 7C AA 55 01 33 44
Why it happens
The OS or driver delivered more than one logical message in a single read.
How engineers debug it
They see the first message processed successfully and the second one mysteriously missing.
Typical root cause:
- parser extracted one frame and ignored remaining bytes
- or code returned too early after one successful parse
Strong parsers loop until no more complete frames remain.
3. Lost delimiter causing misalignment
What it looks like
Delimiter-based protocol:
<STX>ABC<ETX><STX>DEF<ETX>
But one ETX is lost:
<STX>ABC<STX>DEF<ETX>
Now parser may think:
ABC<STX>DEF
is one giant invalid frame.
Why it happens
Noise, device bug, or incorrect escaping.
How engineers debug it
They look at raw dumps and compare against expected framing markers.
Good engineers ask:
- can my parser recover after delimiter loss?
- or will one missing delimiter corrupt the whole session?
This is why resynchronization strategy matters.
4. Corrupted data causing incorrect parsing
What it looks like
Length field says 120 bytes when the protocol maximum is 32.
Why it happens
Corruption, stale buffer content, endianness mismatch, or parser bug.
How engineers debug it
They add validation logs:
- raw bytes
- computed length
- expected max length
- message type
- checksum result
Without those, debugging becomes guesswork.
A defensive parser rejects absurd values early.
5. Parser stuck waiting for data that never arrives
What it looks like
Buffer contains header and declared payload length, but device stopped sending.
AA 55 08 10 20 30
Parser keeps waiting forever for remaining bytes.
Why it happens
Device restart, cable issue, transport timeout, sender bug.
How engineers debug it
They correlate:
- receive timestamps
- buffer state
- parser state
- timeout events
A robust design must separate:
- incomplete but still plausible
- incomplete for too long, now stale and invalid
Otherwise the parser can stall the protocol session indefinitely, even though no thread is actually deadlocked.
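One way to separate "still plausible" from "stale" is an idle timer on the buffer. The sketch below assumes an inter-byte timeout policy: if bytes are pending and nothing new has arrived for longer than `timeout_s`, the partial data is handed back for logging and dropped. `PartialFrameWatchdog` is a hypothetical name, frame extraction is left out, and the clock is injectable so the behavior can be tested without sleeping.

```python
import time
from typing import Callable


class PartialFrameWatchdog:
    """Buffers bytes and reports partial data that has waited too long."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout_s = timeout_s
        self._clock = clock
        self._buffer = bytearray()
        self._last_rx = clock()

    def feed(self, chunk: bytes) -> None:
        if chunk:
            self._buffer += chunk
            self._last_rx = self._clock()   # any new byte restarts the timer

    def pending(self) -> bytes:
        return bytes(self._buffer)

    def drop_if_stale(self) -> bytes:
        """Return and clear the stale bytes, or b"" if nothing is stale."""
        idle = self._clock() - self._last_rx
        if self._buffer and idle > self._timeout_s:
            stale = bytes(self._buffer)
            self._buffer.clear()
            return stale
        return b""
```

The receive loop would call `drop_if_stale()` on every poll cycle and log whatever it returns. The right timeout depends on the device's worst-case gap inside a frame, which has to come from the protocol specification or from measurement.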
6. Incorrect assumption about message boundaries
What it looks like
Developer tests against a simulator that always sends one message per read. Real device later batches messages or splits them.
Why it happens
Test environment was too clean.
How engineers debug it
They compare simulator behavior vs field behavior and discover the parser was accidentally built around a transport artifact.
This is a classic commissioning bug.
Strong engineers test with:
- partial frames
- combined frames
- junk prefixes
- corrupted bytes
- delayed completion
Not just happy-path packets.
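A cheap way to stop trusting read boundaries is to replay the same byte stream with every possible split and check that the extracted frames never change. The sketch below does that for a newline-terminated framer, which is only an assumed stand-in for the framer under test. `replay` and `split_invariant` are hypothetical helper names.

```python
def extract_lines(buffer: bytearray) -> list[bytes]:
    """Framer under test: newline-terminated frames (an assumed format)."""
    frames = []
    while (end := buffer.find(b"\n")) != -1:
        frames.append(bytes(buffer[:end]))
        del buffer[:end + 1]
    return frames


def replay(chunks: list[bytes]) -> list[bytes]:
    """Feed chunks through one persistent buffer, as a receive loop would."""
    buffer = bytearray()
    frames = []
    for chunk in chunks:
        buffer += chunk
        frames += extract_lines(buffer)
    return frames


def split_invariant(stream: bytes) -> bool:
    """True if chunk boundaries never change the extracted frames."""
    expected = replay([stream])
    # Every way of cutting the stream into two reads.
    two_way = all(
        replay([stream[:cut], stream[cut:]]) == expected
        for cut in range(len(stream) + 1)
    )
    # Worst case: the transport delivers one byte per read.
    trickle = replay([stream[i:i + 1] for i in range(len(stream))]) == expected
    return two_way and trickle
```

The same harness extends to junk prefixes and corrupted bytes by mutating `stream` before replaying it and asserting on what the framer is expected to salvage.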
PART 7 — DESIGNING ROBUST PARSERS
A good industrial parser is not clever. It is deterministic, boring, defensive, and inspectable.
That is what you want.
Principles
Deterministic parsing logic
Same input buffer should always produce the same decision:
- incomplete
- valid frame extracted
- invalid frame rejected
- resync applied
No hidden heuristics unless the protocol truly requires them.
Clear parser state machine
Even if not formally modeled, the logic should behave like a state machine:
- searching for start
- reading header
- reading declared length
- waiting for remaining bytes
- validating checksum
- emitting frame
- resynchronizing after failure
That mental model keeps the code understandable.
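Written out literally, the state list above becomes a byte-at-a-time machine. The layout is an assumption: header AA 55, one type byte, one length byte, the payload, then a one-byte additive checksum over type, length, and payload. `FrameStateMachine` and the limit of 32 are hypothetical. This simple version does not rescan the bytes of a rejected frame, so a real header buried inside one would be missed.

```python
from enum import Enum, auto


class State(Enum):
    SEEK_HEADER_1 = auto()
    SEEK_HEADER_2 = auto()
    READ_TYPE = auto()
    READ_LENGTH = auto()
    READ_PAYLOAD = auto()
    READ_CHECKSUM = auto()


class FrameStateMachine:
    MAX_PAYLOAD = 32  # assumed protocol maximum

    def __init__(self) -> None:
        self.frames: list[tuple[int, bytes]] = []
        self.rejected = 0
        self._reset()

    def _reset(self) -> None:
        self._state = State.SEEK_HEADER_1
        self._type = 0
        self._length = 0
        self._payload = bytearray()

    def push(self, byte: int) -> None:
        s = self._state
        if s is State.SEEK_HEADER_1:
            if byte == 0xAA:
                self._state = State.SEEK_HEADER_2
        elif s is State.SEEK_HEADER_2:
            if byte == 0x55:
                self._state = State.READ_TYPE
            elif byte != 0xAA:            # a repeated 0xAA keeps us waiting for 0x55
                self._state = State.SEEK_HEADER_1
        elif s is State.READ_TYPE:
            self._type = byte
            self._state = State.READ_LENGTH
        elif s is State.READ_LENGTH:
            if byte > self.MAX_PAYLOAD:
                self.rejected += 1        # absurd length: give up on this frame
                self._reset()
            else:
                self._length = byte
                self._state = State.READ_PAYLOAD if byte else State.READ_CHECKSUM
        elif s is State.READ_PAYLOAD:
            self._payload.append(byte)
            if len(self._payload) == self._length:
                self._state = State.READ_CHECKSUM
        else:                             # State.READ_CHECKSUM
            expected = (self._type + self._length + sum(self._payload)) & 0xFF
            if byte == expected:
                self.frames.append((self._type, bytes(self._payload)))
            else:
                self.rejected += 1
            self._reset()

    def feed(self, chunk: bytes) -> None:
        for byte in chunk:
            self.push(byte)
```

Because the state survives between `feed` calls, it makes no difference whether a frame arrives in one read or in eight.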
Defensive validation
Never trust incoming bytes blindly.
Validate things like:
- minimum frame size
- maximum frame size
- header signature
- supported message type
- length consistency
- checksum / CRC
- field ranges where applicable
Rejecting bad data early is cheaper than letting corrupted state travel upward.
Log raw data when needed
Not always at full production verbosity, but you need a way to capture:
- received raw bytes
- timestamps
- parser decisions
- reason for rejection
- resync events
Many field issues are only solvable if raw communication history can be inspected.
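A small sketch of raw-byte diagnostics using the standard `logging` module. The logger name `comm.raw` and the helper names are hypothetical. Timestamps are left to the logging formatter (for example a format string containing `%(asctime)s`) rather than built by hand.

```python
import logging

log = logging.getLogger("comm.raw")  # hypothetical logger name


def format_hex(data: bytes) -> str:
    """Render bytes as 'AA 55 02' so dumps can be compared with protocol docs."""
    return data.hex(" ").upper()


def log_rx(data: bytes) -> None:
    # Guard the formatting so disabled DEBUG logging costs almost nothing.
    if log.isEnabledFor(logging.DEBUG):
        log.debug("RX %d bytes: %s", len(data), format_hex(data))


def log_reject(data: bytes, reason: str) -> None:
    # Rejections are rarer and more important, so they go out at WARNING.
    log.warning("REJECT reason=%s raw=%s", reason, format_hex(data))
```

Keeping receive dumps at DEBUG and rejections at WARNING means a field system can run quietly by default and still record every frame the parser refused.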
ASCII parsing flow diagram
+------------------+
| Incoming bytes   |
+------------------+
        |
        v
+------------------+
| Append to buffer |
+------------------+
        |
        v
+-------------------------------+
| Is there enough for frame     |
| header/minimum structure?     |
+-------------------------------+
        | Yes                             | No
        v                                 v
+----------------------------+    +------------------+
| Detect frame boundary      |    | Wait for more    |
| / locate valid start       |    +------------------+
+----------------------------+
        |
        v
+----------------------------+
| Is complete frame present? |
+----------------------------+
        | Yes                             | No
        v                                 v
+----------------------------+    +------------------+
| Validate length/checksum   |    | Keep remainder   |
| / structure                |    | in buffer        |
+----------------------------+    +------------------+
        | Valid                           | Invalid
        v                                 v
+------------------------+        +----------------------+
| Parse fields           |        | Discard / resync     |
| Emit message object    |        | Log reason           |
+------------------------+        +----------------------+
What this diagram means
The parser is a controlled pipeline:
- buffer
- detect
- validate
- extract
- parse
- recover
Not a loose pile of Substring, Split, or if statements.
PART 8 — SOFTWARE DESIGN IMPLICATIONS
This topic has big architectural consequences.
Parsing should be isolated from business logic
Business logic should not care about byte offsets, delimiters, checksum math, or resync rules.
That belongs in a dedicated communication/parsing layer.
Good layering
+-----------------------------+
| Application / Workflow |
| "Start cycle" |
| "Update device state" |
+-----------------------------+
^
|
+-----------------------------+
| Protocol Message Layer |
| typed messages / DTOs |
| command and response models |
+-----------------------------+
^
|
+-----------------------------+
| Framing & Parsing Layer |
| buffer, extract, validate |
| resync, decode |
+-----------------------------+
^
|
+-----------------------------+
| Transport Layer |
| serial / TCP / vendor API |
| raw bytes in/out |
+-----------------------------+
What this diagram means
Each layer has a clean job:
- transport moves bytes
- framing/parsing reconstructs messages
- message layer represents protocol structures cleanly
- application uses meaningful commands and events
That separation makes systems easier to test and debug.
Bad approach
Ad hoc parsing scattered everywhere:
- socket callback slices bytes
- service class splits strings
- workflow interprets magic indexes
- UI directly depends on protocol field positions
That creates:
- duplicated assumptions
- inconsistent validation
- impossible debugging
- fragile changes when protocol evolves
Good approach
Structured parsing pipeline:
- one input buffer owner
- one framing algorithm per protocol
- one parser for message definitions
- typed parse results
- explicit invalid/incomplete states
- test cases for edge conditions
Testability matters
A good parser can be tested entirely offline with byte arrays.
You should be able to unit test scenarios like:
- frame split across three reads
- two frames in one read
- invalid checksum
- junk before valid frame
- declared length too large
- unknown message type
That is one of the strongest design advantages of isolating the parsing layer.
PART 9 — INTERVIEW / REAL-WORLD TALKING POINTS
How to explain framing vs parsing clearly
A strong simple explanation:
Framing is how you identify where a message starts and ends in a raw byte stream. Parsing is how you decode that complete message into structured fields and validate that it is actually meaningful.
That is crisp and correct.
Why stream-based communication is tricky
Because reads do not align with messages. You can receive:
- partial messages
- multiple messages together
- corrupted bytes
- misaligned data after an error
So robust software must buffer, extract, validate, and recover.
Common mistakes engineers make
- assuming one read equals one message
- discarding incomplete data
- parsing without clear framing
- trusting length fields blindly
- not handling junk or resynchronization
- mixing byte parsing with business logic
- lacking raw communication logs
- testing only happy-path packets
What strong engineers understand
Strong engineers understand that parsing is not a small helper function. It is a reliability boundary.
They know that:
- stream handling is stateful
- imperfect input is normal
- recovery after corruption matters as much as normal parsing
- deterministic parser behavior is critical
- observability is essential for field debugging
- protocol code should be isolated and heavily tested
A strong interview answer
You could say something like this:
In industrial systems, transport layers usually deliver arbitrary chunks of bytes, not complete messages. So the software needs a framing layer to reconstruct message boundaries, then a parsing layer to decode and validate structured content. The hard part is not the happy path. The hard part is handling split frames, combined frames, corrupted data, and loss of synchronization without poisoning the whole communication session. Good designs isolate parsing from business logic, use deterministic state-based parsing, validate aggressively, and provide raw-byte diagnostics for field troubleshooting.
That sounds like someone who has moved past textbook protocol knowledge into production engineering thinking.
Final mental model
Think of the whole topic like this:
Raw bytes arrive
->
bytes are buffered
->
frame boundaries are detected
->
complete frame is validated
->
fields are parsed into typed data
->
invalid input is rejected or resynchronized
->
business logic sees only meaningful messages
That is the real job of message framing and parsing.