Data Storage & Traceability in Industrial Vision Systems

In industrial vision systems, storage is not just “save images somewhere.” It is part of the inspection architecture.

A wafer inspection machine may inspect thousands or millions of regions over time. Each inspection decision may later need to be explained: which wafer, which die, which image, which recipe, which threshold, which lighting condition, which machine state, and which software version produced this result?

That is why storage and traceability are not secondary features. They are how the system preserves truth after production has moved on.

PART 1 — Why Storage & Traceability Matter

In business software, storage usually answers questions like:

What did the user order? What is the current status? What is the transaction history?

In industrial inspection software, storage must answer a more physical question:

What actually happened to this real product at this real moment on this real machine?

For a vision system, that means being able to answer:

What was inspected?
Which image was used?
What result was produced?
Why was it accepted or rejected?
Which recipe and parameter version were active?
What machine state existed at inspection time?
Was the image quality acceptable?
Was the result stable or suspicious?
Can we reproduce or review the decision later?

A strong traceability design connects the physical product, the captured image, the inspection algorithm, the recipe, the machine state, and the final decision.

For example, in wafer defect review, it is not enough to say:

text

Wafer W123 failed.

A useful system says:

text

Wafer: W123
Lot: L9001
Die: X=143, Y=82
Inspection Step: Post-alignment surface defect scan
Image ID: IMG-20260425-00088421
Recipe: RCP-Gold-ProductA-v17
Algorithm: SurfaceDefectDetector v3.2
Decision: FAIL
Defect Count: 7
Largest Defect: 14.2 µm
Image Quality: PASS
Focus Score: 0.91
Light Profile: Brightfield-A
Machine State: AutoRun / Stable / No active alarm
Timestamp: 2026-04-25T10:21:44.183Z

That is the difference between a stored result and a traceable result.

Storage supports:

quality review
production audit
customer investigation
recipe tuning
process improvement
service diagnostics
defect trend analysis
replay and offline debugging

A common mistake from software engineers entering this domain is thinking storage is only about persistence. In machine systems, storage is also about future explanation.

PART 2 — What Data Needs to Be Stored

Industrial vision systems usually produce several types of data. Not all of them must always be stored, but the architecture should consciously decide what is stored, why, for how long, and at what fidelity.

1. Raw Images

A raw image is the closest software artifact to the physical inspection moment.

It may be grayscale, color, high bit-depth, multi-channel, or stored in a camera-specific format.

Raw images are useful when engineers need to answer:

Was the defect actually visible?
Was the image blurred?
Was lighting unstable?
Was the camera saturated?
Did the algorithm fail or was the input bad?
Can we reprocess this later with a new algorithm?

Raw images have high diagnostic value, but they are expensive. They consume disk space quickly, especially in high-throughput inspection.

A 25 MB image sounds small until the machine captures 10 images per second for 20 hours.

text

25 MB/image × 10 images/sec × 3600 sec/hour × 20 hours
= 18,000,000 MB
= ~18 TB/day

So “store every raw image forever” is rarely realistic unless the production requirement explicitly demands it.
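The arithmetic above can be wrapped in a small estimator so it is easy to rerun for other cameras and shift lengths. This is a minimal sketch: the function names are hypothetical, and it uses decimal units (1 TB = 1,000,000 MB) to match the figures in the text.

```python
def daily_storage_tb(image_mb: float, images_per_sec: float, hours_per_day: float) -> float:
    """Return the raw image volume per day in decimal terabytes."""
    total_mb = image_mb * images_per_sec * 3600 * hours_per_day
    return total_mb / 1_000_000


def days_until_full(disk_tb: float, image_mb: float, images_per_sec: float,
                    hours_per_day: float) -> float:
    """Return how many production days a disk lasts if nothing is ever cleaned up."""
    return disk_tb / daily_storage_tb(image_mb, images_per_sec, hours_per_day)


# The example from the text: 25 MB images, 10 per second, 20 hours per day.
print(daily_storage_tb(25, 10, 20))
# A hypothetical 100 TB array fills in under six days at that rate.
print(days_until_full(100, 25, 10, 20))
```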

2. Processed Images

Processed images may include:

normalized images
filtered images
contrast-enhanced images
cropped regions of interest
corrected images after alignment
stitched images
defect-highlighted images

These are useful when the review team wants to see the image as the algorithm saw it.

However, processed images can be dangerous if stored without enough context. A processed image is no longer pure input. It depends on algorithm version, recipe parameters, calibration, scaling, filtering, and sometimes display transformation.

So if you store processed images, store the processing context too.

3. Thumbnails / Display Images

Thumbnails are low-cost images used for quick review.

They are useful for:

operator review screens
defect galleries
fast browsing
remote support packages
summary reports

They should not be treated as inspection evidence unless clearly defined. A thumbnail may hide fine detail or small defects, and lossy compression can add artifacts that were never in the original image.

A good system distinguishes:

text

RawImage       = inspection evidence
ReviewImage    = human-friendly visualization
Thumbnail      = fast browsing artifact

4. Overlays / Annotations

Overlays show things like:

defect bounding boxes
measurement lines
pass/fail regions
alignment points
masks
search regions
excluded areas
threshold regions

A key design decision is whether overlays are stored as images or as structured data.

Bad approach:

text

Save only screenshot with red boxes.

Better approach:

text

Store overlay geometry as structured data:
- shape type
- coordinates
- coordinate system
- label
- color/style
- source result ID
- image ID

Then the system can regenerate overlays later.

This matters because screenshots are hard to query, hard to validate, and hard to compare. Structured overlays are reviewable, searchable, versionable, and replay-friendly.
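One way to keep overlays as structured data is a small immutable record that serializes to JSON. The sketch below is illustrative only: the class name, field names, and the IDs in the example are assumptions, not a schema from any particular system.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class OverlayShape:
    shape_type: str          # e.g. "rect", "line", "polygon"
    points: tuple            # (x, y) pairs in the stated coordinate system
    coordinate_system: str   # e.g. "RawImagePixels"
    label: str
    style: str
    result_id: str           # source result ID
    image_id: str


def to_json(shape: OverlayShape) -> str:
    """Serialize the overlay so it can be stored next to the result record."""
    return json.dumps(asdict(shape), sort_keys=True)


def from_json(text: str) -> OverlayShape:
    """Rebuild the overlay; JSON arrays are converted back to tuples."""
    data = json.loads(text)
    data["points"] = tuple(tuple(p) for p in data["points"])
    return OverlayShape(**data)


box = OverlayShape("rect", ((1800, 900), (1884, 942)), "RawImagePixels",
                   "Scratch", "red", "RES-001", "IMG-88421")
restored = from_json(to_json(box))
```

Because the geometry survives a round trip unchanged, the review screen can redraw the overlay at any zoom level, and a query can find every overlay attached to a given result or image.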

5. Inspection Results

Inspection results include the actual output of the vision logic:

pass/fail
defect count
defect type
measurements
classification
confidence score
quality flags
rule triggered
threshold used
algorithm version
source image reference

Pass/fail alone is almost useless for root cause analysis.

A result should explain itself.

Instead of:

text

Result = FAIL

Prefer:

text

Decision = FAIL
Reason = DefectCountExceeded
DefectCount = 7
Limit = 3
LargestDefectSizeUm = 14.2
SizeLimitUm = 10.0
RuleId = SurfaceDefect.MaxCount
RecipeVersion = 17
ImageId = IMG-88421

6. Measurements

Measurements may include:

diameter
width
height
position
offset
angle
area
intensity
focus score
edge distance
defect size

Measurements should always include units.

A dangerous result model says:

text

Value = 12.5

A better result model says:

text

MeasurementName = DefectDiameter
Value = 12.5
Unit = µm
CoordinateSystem = WaferLocal
CalibrationVersion = CAL-20260420-02

Units, coordinate systems, and calibration versions matter because machine measurements are physical, not abstract numbers.
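A measurement type that refuses to exist without a unit makes the "better result model" hard to bypass. This is a sketch under assumptions: the class, the conversion table, and the helper names are hypothetical, and only length units are covered.

```python
from dataclasses import dataclass

# Conversion factors to micrometres; an assumed, deliberately small set of units.
_TO_UM = {"nm": 0.001, "µm": 1.0, "mm": 1000.0}


@dataclass(frozen=True)
class Measurement:
    name: str
    value: float
    unit: str
    coordinate_system: str
    calibration_version: str

    def value_in_um(self) -> float:
        """Convert to micrometres, rejecting units the table does not know."""
        if self.unit not in _TO_UM:
            raise ValueError(f"unknown length unit: {self.unit!r}")
        return self.value * _TO_UM[self.unit]


def exceeds_limit(m: Measurement, limit_um: float) -> bool:
    """Compare against a limit only after converting to a common unit."""
    return m.value_in_um() > limit_um


diameter = Measurement("DefectDiameter", 12.5, "µm", "WaferLocal", "CAL-20260420-02")
print(exceeds_limit(diameter, 10.0))
```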

7. Defect Lists

A defect list is usually a collection of detected abnormalities.

Each defect may have:

defect ID
type/classification
bounding box
centroid
size
severity
confidence
image coordinates
product coordinates
die coordinates
review status
source image ID

Example:

text

DefectId: D-00042
Type: Scratch
Severity: Major
ImageX: 1842
ImageY: 921
WaferXmm: 42.182
WaferYmm: -18.533
SizeUm: 14.2
Confidence: 0.87
SourceImageId: IMG-88421

This allows review teams to locate, filter, compare, and explain defects.

8. Metadata

Metadata is the glue of traceability.

Useful metadata includes:

lot ID
wafer ID
part ID
panel ID
job ID
recipe ID
recipe version
machine ID
camera ID
lens/optics profile
illumination profile
exposure/gain/focus
operator ID where relevant
timestamp
workflow step
software version
calibration version
machine mode
alarm state
image quality flags

Metadata is often more important than engineers expect. Without metadata, images become anonymous files.

Anonymous images are almost useless in production investigations.

9. Recipe / Version Information

The recipe defines how inspection should behave.

It may contain:

thresholds
regions of interest
alignment settings
lighting settings
exposure settings
defect rules
measurement tolerances
classification parameters

Never store only the current recipe ID.

You need the recipe version or snapshot used at inspection time.

Bad:

text

RecipeId = ProductA

Good:

text

RecipeId = ProductA
RecipeVersion = 17
RecipeHash = 7F3A91...
ActivatedAt = 2026-04-25T09:10:12Z

Best for high-traceability systems:

text

Store immutable recipe snapshot reference.

The reason is simple: after a recipe changes, old results must still be explainable.
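A recipe hash can be derived from the recipe content itself, so any silent edit produces a different value. The sketch below assumes the recipe is representable as a JSON-compatible dictionary and uses SHA-256 as the digest; both are illustrative choices, and the parameter names are hypothetical.

```python
import hashlib
import json


def recipe_hash(recipe: dict) -> str:
    """Hash a canonical JSON form so key order does not change the result."""
    canonical = json.dumps(recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def snapshot_reference(recipe_id: str, version: int, recipe: dict) -> dict:
    """Build the reference that a result record would store."""
    return {
        "RecipeId": recipe_id,
        "RecipeVersion": version,
        "RecipeHash": recipe_hash(recipe),
    }


recipe_v17 = {"size_limit_um": 10.0, "max_defects": 3}
print(snapshot_reference("ProductA", 17, recipe_v17))
```

If the stored hash no longer matches the hash of the recipe content on disk, the recipe was modified after the inspection and the result can no longer be explained from it.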

10. Machine / Workflow Context

The inspection result is affected by machine state.

Useful context includes:

machine mode: Auto, Manual, Maintenance
workflow step
current job/session/run
active alarms
stage position
motion settled status
trigger source
camera acquisition mode
focus state
environmental state if available
device health status

Example:

text

WorkflowStep = InspectDie
StagePosition = X=42.18mm, Y=-18.53mm, Z=1.20mm
MotionState = Settled
FocusState = Locked
TriggerMode = HardwarePositionTrigger
ActiveAlarmCount = 0

This helps diagnose whether the result was caused by inspection logic or machine condition.

11. Quality Metrics

Image quality metrics help determine whether an inspection was trustworthy.

Examples:

focus score
brightness mean
brightness variance
saturation percentage
contrast score
blur estimate
missing frame flag
timestamp drift
exposure stability
illumination status

A strong system separates:

text

Inspection decision: PASS / FAIL
Image quality decision: VALID / INVALID / WARNING

The two are kept apart because a part should not necessarily fail just because the image was bad. Sometimes the correct outcome is:

text

InspectionInvalidDueToPoorImageQuality

That is very different from:

text

ProductFailedDueToDefect
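The separation can be expressed as two independent steps: first judge the image, then judge the product only if the image is trustworthy. This is a minimal sketch; the threshold values are placeholders, and the enum and function names are assumptions.

```python
from enum import Enum


class ImageQuality(Enum):
    VALID = "VALID"
    WARNING = "WARNING"
    INVALID = "INVALID"


class Outcome(Enum):
    PASS = "PASS"
    FAIL = "ProductFailedDueToDefect"
    INVALID = "InspectionInvalidDueToPoorImageQuality"


def assess_quality(focus_score: float, saturation_pct: float) -> ImageQuality:
    # Placeholder limits; real limits would come from the recipe.
    if focus_score < 0.5 or saturation_pct > 5.0:
        return ImageQuality.INVALID
    if focus_score < 0.8 or saturation_pct > 1.0:
        return ImageQuality.WARNING
    return ImageQuality.VALID


def decide(quality: ImageQuality, defect_count: int, max_defects: int) -> Outcome:
    # An untrustworthy image invalidates the inspection instead of failing the part.
    if quality is ImageQuality.INVALID:
        return Outcome.INVALID
    return Outcome.FAIL if defect_count > max_defects else Outcome.PASS


print(decide(assess_quality(0.91, 0.2), defect_count=7, max_defects=3))
print(decide(assess_quality(0.30, 0.2), defect_count=7, max_defects=3))
```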

PART 3 — Traceability Model

Traceability connects inspection data to production reality.

A simplified model looks like this:

text

+----------------+
| Lot / Batch    |
+----------------+
| LotId          |
| ProductType    |
| Customer       |
| StartTime      |
+-------+--------+
        |
        | contains
        v
+----------------+
| Wafer / Part   |
+----------------+
| ProductId      |
| LotId          |
| SerialNo       |
| SlotNo         |
| Status         |
+-------+--------+
        |
        | inspected during
        v
+----------------+
| InspectionRun  |
+----------------+
| RunId          |
| ProductId      |
| JobId          |
| MachineId      |
| RecipeVersion  |
| StartTime      |
| EndTime        |
| SoftwareVer    |
+-------+--------+
        |
        | has steps
        v
+----------------+
| InspectionStep |
+----------------+
| StepId         |
| RunId          |
| StepName       |
| WorkflowState  |
| Timestamp      |
| StagePosition  |
+-------+--------+
        |
        | captures
        v
+----------------+
| ImageFrame     |
+----------------+
| ImageId        |
| StepId         |
| CameraId       |
| FrameNo        |
| Timestamp      |
| StorageUri     |
| Width/Height   |
| PixelFormat    |
| ImageHash      |
+-------+--------+
        |
        | produces
        v
+------------------+
| InspectionResult |
+------------------+
| ResultId         |
| ImageId          |
| Decision         |
| RuleId           |
| Measurements     |
| DefectCount      |
| QualityFlags     |
| CreatedAt        |
+-------+----------+
        |
        | contains
        v
+----------------+
| DefectRecord   |
+----------------+
| DefectId       |
| ResultId       |
| Type           |
| Severity       |
| X/Y            |
| Size           |
| Confidence     |
+----------------+

The important idea is not the exact table design. The important idea is the chain:

text

Lot → Product → Inspection Run → Step → Image → Result → Defects

If that chain is broken, traceability is weak.

For example:

image exists but no result link
result exists but no image link
image and result exist but no recipe version
defect exists but no coordinate system
result exists but no workflow step
timestamp exists but clock source is inconsistent

These problems may not show up during a demo. They show up during customer escalation.
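Broken links in the chain can be found with ordinary queries long before a customer finds them. The sketch below uses an in-memory SQLite database with a simplified two-table schema; the table and column names are assumptions chosen to mirror the model above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image_frame (image_id TEXT PRIMARY KEY, step_id TEXT);
CREATE TABLE inspection_result (
    result_id TEXT PRIMARY KEY, image_id TEXT, recipe_version INTEGER);
INSERT INTO image_frame VALUES ('IMG-1', 'STEP-1'), ('IMG-2', 'STEP-2');
INSERT INTO inspection_result VALUES
    ('RES-1', 'IMG-1', 17),      -- complete link
    ('RES-2', 'IMG-9', 17),      -- referenced image row is missing
    ('RES-3', 'IMG-2', NULL);    -- recipe version is missing
""")


def results_without_image(db):
    """Results whose image reference points at no stored image."""
    rows = db.execute("""
        SELECT r.result_id FROM inspection_result r
        LEFT JOIN image_frame i ON i.image_id = r.image_id
        WHERE i.image_id IS NULL ORDER BY r.result_id""")
    return [row[0] for row in rows]


def results_without_recipe_version(db):
    """Results that cannot be tied to the recipe version used at inspection time."""
    rows = db.execute("""
        SELECT result_id FROM inspection_result
        WHERE recipe_version IS NULL ORDER BY result_id""")
    return [row[0] for row in rows]


print(results_without_image(conn))
print(results_without_recipe_version(conn))
```

Checks like these fit naturally into a periodic health job, so a broken chain raises a warning on the machine instead of surfacing months later.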

PART 4 — Image Storage Strategies

There is no single correct image storage strategy. The right design depends on throughput, customer requirements, diagnostic needs, and storage cost.

Strategy 1: Store Every Raw Image

This gives maximum diagnostic value.

Useful when:

inspection is safety-critical or high-value
customer requires complete evidence
production volume is manageable
offline reprocessing is important
early-stage machine development needs full diagnostics

Benefits:

best replay support
best debugging support
strongest evidence trail
useful for algorithm improvement

Costs:

huge storage usage
slower retrieval if not indexed well
higher backup/archive cost
risk of disk-full failures
more data lifecycle complexity

This strategy is attractive during development but often too expensive for full production unless carefully designed.

Strategy 2: Store Only Failed Images

This is common in production.

Useful when:

most products pass
failures need review
storage must be controlled
customer mainly cares about rejects

Benefits:

much lower storage cost
review is focused on suspicious items
easier retention

Costs:

cannot investigate false passes later
limited ability to compare pass/fail distributions
harder to debug borderline behavior
misses context before and after failure

This strategy is efficient, but risky if the system later needs to explain why something passed.

Strategy 3: Store Thumbnails for All, Raw Images for Selected Cases

This is often a balanced strategy.

Example:

text

For every inspection:
- store result metadata
- store thumbnail/review image

For failures:
- store raw image
- store processed image
- store overlay data

For borderline cases:
- store raw image

For periodic sampling:
- store raw image every N products

Benefits:

supports fast browsing
preserves enough evidence for failures
controls storage cost
gives some data for trend analysis

Costs:

more complex policy logic
requires clear classification of “selected cases”
some cases may not have raw evidence

This is usually a practical production design.
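The policy in the example above is small enough to write as one function, which also makes "selected cases" explicit and testable. This is a sketch: the artifact names, the `borderline` flag, and the default sampling interval are assumptions.

```python
def artifacts_to_store(decision: str, borderline: bool, product_index: int,
                       sample_every: int = 100) -> set:
    """Return the set of artifacts to persist for one inspection."""
    artifacts = {"metadata", "thumbnail"}          # stored for every inspection
    if decision == "FAIL":
        artifacts |= {"raw", "processed", "overlay"}
    if borderline:
        artifacts.add("raw")                       # keep evidence for close calls
    if sample_every > 0 and product_index % sample_every == 0:
        artifacts.add("raw")                       # periodic sampling of passes
    return artifacts


print(sorted(artifacts_to_store("PASS", borderline=False, product_index=7)))
print(sorted(artifacts_to_store("FAIL", borderline=False, product_index=7)))
```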

Strategy 4: Store Compressed Images

Compression reduces storage cost.

Options include:

lossless compression
lossy compression
region-of-interest compression
format-specific compression
lower bit-depth storage
downsampled review images

The critical question is whether the compressed image is still valid evidence.

For inspection evidence, prefer lossless or clearly documented compression.

For operator review thumbnails, lossy compression is often acceptable.

Never silently compress evidence images in a way that changes measurement meaning.

Strategy 5: Store Metadata and Results Only

This minimizes storage.

Useful when:

image data is too large
product value is low
traceability requirement is minimal
inspection is simple
machine is not expected to support deep review

Benefits:

low storage cost
fast queries
simple retention

Costs:

weak diagnostics
difficult customer investigation
cannot reprocess
cannot prove visual evidence later

This strategy should be chosen intentionally, not accidentally.

Strategy 6: Rolling Diagnostic Buffer

A rolling buffer stores recent images temporarily.

Example:

text

Keep last 30 minutes of raw images.
Persist permanently only if:
- failure occurs
- operator marks product for review
- machine alarm occurs
- engineer enables diagnostic mode

Benefits:

excellent for debugging transient problems
avoids permanent storage explosion
useful during commissioning
captures context before failure

Costs:

failure must be detected before buffer expires
needs careful disk management
may create false confidence if buffer retention is misunderstood

Rolling buffers are very useful in real machines because many bugs are discovered only after something strange happens.
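A rolling buffer reduces to two operations: evict what is older than the window, and promote what must survive. The sketch below tracks only image IDs and timestamps in memory; the class name is hypothetical, and a dictionary stands in for permanent storage.

```python
from collections import deque


class RollingImageBuffer:
    """Keeps recent frames for a fixed time window; promoted frames survive eviction."""

    def __init__(self, window_sec: float):
        self.window_sec = window_sec
        self._frames = deque()      # (timestamp, image_id), oldest first
        self.promoted = {}          # image_id -> reason; stands in for permanent storage

    def add(self, timestamp: float, image_id: str) -> None:
        self._frames.append((timestamp, image_id))
        # Evict everything older than the retention window.
        while self._frames and self._frames[0][0] < timestamp - self.window_sec:
            self._frames.popleft()

    def buffered_ids(self) -> list:
        return [image_id for _, image_id in self._frames]

    def promote_since(self, since: float, reason: str) -> list:
        """Mark all buffered frames at or after `since` for permanent storage."""
        ids = [image_id for ts, image_id in self._frames if ts >= since]
        for image_id in ids:
            self.promoted[image_id] = reason
        return ids


buf = RollingImageBuffer(window_sec=1800)   # 30 minutes, as in the example above
buf.add(0, "IMG-1")
buf.add(1000, "IMG-2")
buf.add(1900, "IMG-3")                      # IMG-1 is now outside the window
print(buf.promote_since(900, "failure"))
```

Promotion has to happen at the moment the failure, alarm, or operator request is raised. Once a frame has been evicted, no later decision can bring it back.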

PART 5 — Result Storage & Querying

Inspection results must be queryable and explainable.

A weak result model stores:

text

ProductId
Timestamp
PassFail

That is not enough.

A useful result record contains:

text

+----------------------+
| InspectionResult     |
+----------------------+
| ResultId             |
| ProductId            |
| RunId                |
| StepId               |
| ImageId              |
| Decision             |
| DecisionReason       |
| RecipeId             |
| RecipeVersion        |
| AlgorithmVersion     |
| RuleId               |
| ThresholdUsed        |
| MeasurementSummary   |
| DefectCount          |
| QualityFlags         |
| CorrelationId        |
| CreatedAt            |
+----------------------+

The result should answer:

What did we decide?
Why did we decide it?
Which data supported the decision?
Which rule or threshold was used?
Which image produced the result?
Which product and workflow step does it belong to?
Was the result reliable?
Can it be reviewed later?

Pass/fail alone is insufficient because two failures may look identical in summary but completely different in meaning.

Example:

text

Part A: FAIL because defect count exceeded limit.
Part B: FAIL because alignment failed.
Part C: FAIL because image quality was invalid.
Part D: FAIL because measurement exceeded tolerance.

All are “FAIL,” but the required action is different.

The result model should separate:

text

Product disposition:
- Accepted
- Rejected
- NeedsReview
- InspectionInvalid

Inspection reason:
- DefectDetected
- MeasurementOutOfTolerance
- AlignmentFailed
- ImageQualityInvalid
- RecipeError
- MachineStateInvalid

This distinction matters in production.

A defective product and an invalid inspection are not the same thing.

PART 6 — Retention, Archival, and Cleanup

Image data grows fast. If retention is not designed, the machine eventually becomes unreliable.

This is not just an IT problem. It is a machine reliability problem.

A machine with a full disk may:

stop acquiring images
fail to write results
block the inspection pipeline
crash the application
lose traceability
corrupt partial files
force production downtime

A good storage design has explicit retention tiers.

text

+-------------------+
| Hot Storage       |
+-------------------+
| Recent data       |
| Fast retrieval    |
| Local disk / SSD  |
| Hours to days     |
+---------+---------+
          |
          v
+-------------------+
| Warm Storage      |
+-------------------+
| Review data       |
| Slower retrieval  |
| Local server/NAS  |
| Days to weeks     |
+---------+---------+
          |
          v
+-------------------+
| Cold Archive      |
+-------------------+
| Audit/history     |
| Compressed        |
| Long-term storage |
| Months/years      |
+-------------------+

Typical retention policies may say:

text

All result metadata: keep 2 years
Failed raw images: keep 90 days
Passed thumbnails: keep 30 days
Raw pass images: keep 7 days or sample only
Diagnostic rolling buffer: keep last 24 hours
Alarm-related image packages: keep 180 days

Retention is often customer-specific.

Customer A may require:

text

Store all failed images for 1 year.

Customer B may require:

text

Store every inspected image for 30 days.

Customer C may require:

text

Do not store operator-identifying data beyond 90 days.

So retention should be policy-driven, not hardcoded.
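Policy-driven retention can be as simple as a lookup table keyed by retention class. This is a sketch: the class names are hypothetical, the durations are copied from the typical policy above, and a real system would load the table from per-customer configuration.

```python
from datetime import datetime, timedelta, timezone

# Example policy table; the values mirror the typical policy in the text.
RETENTION = {
    "result_metadata": timedelta(days=730),
    "failed_raw": timedelta(days=90),
    "passed_thumbnail": timedelta(days=30),
    "passed_raw": timedelta(days=7),
}


def is_expired(retention_class: str, created_at: datetime, now: datetime,
               policy: dict = RETENTION) -> bool:
    """True when the item has outlived the retention period of its class."""
    return now - created_at > policy[retention_class]


now = datetime(2026, 4, 25, tzinfo=timezone.utc)
eight_days_ago = now - timedelta(days=8)
print(is_expired("passed_raw", eight_days_ago, now))
print(is_expired("failed_raw", eight_days_ago, now))
```

Supporting a different customer then means swapping the table, not changing the cleanup code.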

A strong system monitors:

free disk space
image write latency
storage queue depth
failed write count
archival backlog
cleanup success/failure
oldest retained file
estimated time until disk full

The system should raise alarms before storage becomes critical.

Example:

text

Warning: Image storage free space below 20%
Alarm: Image storage free space below 10%
Critical: Image storage free space below 5%, stop new job start

The exact behavior depends on the machine, but the principle is clear: do not wait until the disk is full.
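The thresholds translate directly into a level check on the free-space fraction. The sketch below uses `shutil.disk_usage` from the Python standard library; the level names and limits follow the example above and are not a standard.

```python
import shutil


def storage_level(free_fraction: float) -> str:
    """Map the free-space fraction (0.0 to 1.0) to an alarm level."""
    if free_fraction < 0.05:
        return "CRITICAL"   # stop new job start
    if free_fraction < 0.10:
        return "ALARM"
    if free_fraction < 0.20:
        return "WARNING"
    return "OK"


def check_disk(path: str) -> str:
    """Classify the volume that holds `path`."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (bytes)
    return storage_level(usage.free / usage.total)


print(check_disk("."))
```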

PART 7 — Real-World Failure Scenarios

1. Result Stored Without Image Reference

What it looks like:

text

Customer asks why wafer W123 failed.
Database says FAIL.
No linked image exists.

Why it happens:

result table designed before image storage
image filename generated separately
no shared correlation ID
image write failed silently
cleanup deleted the image but the result still references it

How experienced engineers handle it:

introduce immutable image IDs
store image reference in result record
make image write status visible
use correlation IDs across pipeline
detect missing image references during health checks

2. Image Stored Without Recipe Version

What it looks like:

text

Engineer opens old failed image.
Current recipe says threshold = 10 µm.
But at inspection time threshold may have been 8 µm.
Nobody knows.

Why it happens:

result stores recipe name only
recipe is mutable
old recipe versions overwritten
no activation snapshot

How to handle it:

store recipe version
store recipe hash
store immutable recipe snapshot reference
log recipe activation events
link result to recipe version used at inspection time

3. Disk Fills During Production

What it looks like:

text

Machine runs normally for hours.
Then inspection slows down.
Then image writes fail.
Then database writes fail.
Then operator sees confusing errors.

Why it happens:

no retention policy
no disk monitoring
debug mode left on
raw images stored for every pass
cleanup job failed silently
image queue grows faster than writer can flush

How to handle it:

storage health monitoring
retention cleanup service
bounded storage queue
backpressure policy
early alarms
prevent new job start when storage is unsafe
separate critical result writes from large image writes where appropriate

4. Failed Image Overwritten Before Review

What it looks like:

text

Operator sees a failure.
Later the engineer opens review screen.
Image is gone or replaced.

Why it happens:

rolling buffer too short
image filename reused
frame ID not unique
temporary images not promoted to permanent storage
failure event not connected to storage policy

How to handle it:

use immutable image IDs
promote failed images from buffer immediately
never reuse filenames for evidence
store retention class per image
verify promotion success

5. Operator Cannot Trace Why Product Failed

What it looks like:

text

Operator sees: FAIL.
No reason.
No defect location.
No measurement value.
No threshold.
No image.

Why it happens:

system designed around final decision only
UI and storage not designed around review
algorithm returns bool instead of structured result
defect information discarded after display

How to handle it:

result model includes decision reason
measurements are stored
defects are stored
overlays are linked
thresholds and rules are captured
review workflow uses stored evidence, not transient UI state

6. Stored Result Does Not Match Displayed Overlay

What it looks like:

text

Database says defect at X=1200,Y=800.
Overlay shows defect at X=1400,Y=950.
Operator loses trust.

Why it happens:

coordinate transform changed
display used scaled image coordinates
result used raw image coordinates
overlay generated from current recipe instead of historical recipe
image was rotated/flipped for display
calibration changed after result was stored

How to handle it:

store coordinate system explicitly
store transform version
store overlay geometry with source coordinate space
version display transformations
regenerate overlays using historical context
avoid mixing raw image, processed image, and display image coordinates without metadata

7. Timestamps Inconsistent Across Components

What it looks like:

text

Image timestamp says 10:01:02.100.
Motion log says stage position at 10:01:02.100 was different.
Inspection result says created at 10:01:05.700.
Alarm says trigger happened at 10:01:01.900.
Nobody knows the true sequence.

Why it happens:

camera, PC, controller use different clocks
local time mixed with UTC
timestamp taken after processing, not at acquisition
log timestamp confused with event timestamp
no monotonic sequence number

How to handle it:

use clear timestamp semantics
separate acquisition time, processing time, storage time
use UTC for persisted records
include frame sequence numbers
include hardware trigger counter where possible
correlate with workflow step IDs, not timestamps alone
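The handling rules above can be enforced in the record type itself. This sketch assumes three separately named timestamps plus a monotonic frame counter; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FrameTiming:
    frame_seq: int            # monotonic counter, e.g. from the hardware trigger
    acquired_at: datetime     # when the frame was exposed
    processed_at: datetime    # when inspection finished
    stored_at: datetime       # when the record was persisted

    def __post_init__(self):
        for name in ("acquired_at", "processed_at", "stored_at"):
            # utcoffset() is None for naive datetimes, so they are rejected too.
            if getattr(self, name).utcoffset() != timedelta(0):
                raise ValueError(f"{name} must be timezone-aware UTC")


def in_acquisition_order(frames):
    """Order by the monotonic sequence number, not by any wall-clock timestamp."""
    return sorted(frames, key=lambda f: f.frame_seq)


t0 = datetime(2026, 4, 25, 10, 1, 2, 100000, tzinfo=timezone.utc)
late = FrameTiming(2, t0, t0 + timedelta(seconds=3), t0 + timedelta(seconds=4))
early = FrameTiming(1, t0 + timedelta(milliseconds=5), t0 + timedelta(seconds=1),
                    t0 + timedelta(seconds=2))
print([f.frame_seq for f in in_acquisition_order([late, early])])
```

In the example, the frame with the lower sequence number carries a slightly later acquisition timestamp, which is the kind of clock disagreement the sequence number exists to resolve.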

8. Database Write Blocks Inspection Pipeline

What it looks like:

text

Inspection throughput is fine during demo.
In production, database slows down.
Image processing waits for result insert.
Camera queue backs up.
Frames are dropped.
Machine takt time degrades.

Why it happens:

synchronous writes in critical inspection path
unbounded queues
database transaction includes large image write
retry logic blocks processing thread
storage service not isolated from acquisition/inspection

How to handle it:

non-blocking storage pipeline
bounded queues
explicit backpressure behavior
separate acquisition/processing from persistence
durable local spool if needed
storage health metrics
fail-safe policy when persistence is degraded

PART 8 — Software Design Implications

Storage must be designed as part of the vision architecture from the beginning.

A bad architecture treats storage as an afterthought:

text

Inspection code
   |
   +-- save random image file
   +-- insert pass/fail row
   +-- hope filename can be matched later

A good architecture treats storage as a first-class subsystem:

text

+----------------------+
| Inspection Pipeline  |
+----------------------+
          |
          v
+-----------------------------+
| Image + Result + Metadata   |
| InspectionContext           |
+-----------------------------+
          |
          v
+----------------------+
| Storage Queue        |
| bounded, async       |
+----------------------+
          |
          v
+-----------------------------+
| Storage Service             |
| validates, writes, indexes  |
+-----------------------------+
      |                 |
      v                 v
+-------------+   +----------------+
| Image Store |   | Result Database|
+-------------+   +----------------+
      |                 |
      +--------+--------+
               v
+-----------------------------+
| Review / Debug / Reporting  |
+-----------------------------+

The key design object is often an explicit InspectionContext.

Conceptually:

text

+-------------------------+
| InspectionContext       |
+-------------------------+
| CorrelationId           |
| LotId                   |
| ProductId               |
| RunId                   |
| StepId                  |
| MachineId               |
| RecipeId                |
| RecipeVersion           |
| CameraId                |
| FrameId                 |
| Timestamp               |
| WorkflowStep            |
| StagePosition           |
| CalibrationVersion      |
| SoftwareVersion         |
+-------------------------+

Every image and result should carry or reference this context.

Without explicit context, traceability becomes a guessing game.
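In code, the context works best as an immutable value created once per inspection and attached to everything that leaves the pipeline. The sketch below carries only a subset of the fields listed above; the field selection and the `storage_message` helper are assumptions.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class InspectionContext:
    correlation_id: str
    lot_id: str
    product_id: str
    run_id: str
    step_id: str
    recipe_id: str
    recipe_version: int
    camera_id: str
    frame_id: int


def storage_message(ctx: InspectionContext, kind: str, payload: dict) -> dict:
    """Wrap an image or result payload so it always travels with its context."""
    return {"kind": kind, "context": asdict(ctx), "payload": payload}


ctx = InspectionContext("C-0001", "L9001", "W123", "RUN-1", "STEP-1",
                        "ProductA", 17, "CAM-1", 88421)
image_msg = storage_message(ctx, "image", {"pixel_format": "Mono12"})
result_msg = storage_message(ctx, "result", {"Decision": "FAIL"})
print(image_msg["context"]["correlation_id"] == result_msg["context"]["correlation_id"])
```

Freezing the dataclass means a later pipeline stage cannot quietly change the recipe version or step ID after the image was captured.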

Non-Blocking Storage Pipeline

The inspection path should not be casually blocked by slow storage.

A simplified flow:

text

+-------------+     +-------------+     +-------------+
| Acquisition | --> | Processing  | --> | Inspection  |
+-------------+     +-------------+     +-------------+
                                             |
                                             v
                                  +--------------------+
                                  | Storage Message    |
                                  | image/result/meta  |
                                  +--------------------+
                                             |
                                             v
                                  +--------------------+
                                  | Bounded Queue      |
                                  +--------------------+
                                             |
                                             v
                                  +--------------------+
                                  | Storage Worker(s)  |
                                  +--------------------+
                                             |
                                +------------+------------+
                                v                         v
                          +-----------+             +-------------+
                          | File/NAS  |             | Database    |
                          | Object    |             | Index/Query |
                          | Storage   |             | Results     |
                          +-----------+             +-------------+

Important design choices:

queue must be bounded
storage failures must be visible
writes should be idempotent where possible
result and image references must stay consistent
partial write states must be modeled
retry policies must not cause unbounded memory growth
machine behavior under storage degradation must be defined

For example:

text

If result DB is unavailable:
- Can the machine continue?
- Can it spool locally?
- For how long?
- Should it stop new lots?
- Should it allow current wafer to finish?
- What alarm should operator see?

These are architecture decisions, not implementation details.
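The bounded, non-blocking hand-off between inspection and storage can be sketched with the standard `queue` module. The class name and the `rejected` counter are assumptions; what the machine does when `offer` returns `False` is exactly the policy question raised above.

```python
import queue


class StorageQueue:
    """Bounded hand-off between the inspection thread and storage workers."""

    def __init__(self, maxsize: int):
        self._queue = queue.Queue(maxsize=maxsize)
        self.rejected = 0

    def offer(self, message) -> bool:
        """Try to enqueue without blocking the inspection thread."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.rejected += 1   # visible counter; the caller applies the backpressure policy
            return False

    def drain(self, write) -> int:
        """Worker side: write everything currently queued and return the count."""
        written = 0
        while True:
            try:
                message = self._queue.get_nowait()
            except queue.Empty:
                return written
            write(message)
            written += 1


q = StorageQueue(maxsize=2)
accepted = [q.offer(m) for m in ("result-1", "result-2", "result-3")]
print(accepted, q.rejected)
```

The queue never grows past its limit, so a slow disk shows up as a rising `rejected` count that can drive an alarm, not as unbounded memory growth.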

Schema and Versioning

Vision result schemas evolve.

Today you may store:

text

DefectCount
LargestDefectSize
Decision

Later you may need:

text

DefectClass
Confidence
ReviewStatus
FalsePositiveLabel
AlgorithmModelVersion

So result storage should tolerate versioning.

Options include:

relational core fields plus JSON details
schema version per result
immutable result payloads
migration strategy
backward-compatible readers
explicit algorithm output version

Avoid designing result storage as if inspection output will never change.

It will change.
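A backward-compatible reader is one of the simpler options from the list above. This sketch assumes results are stored as JSON payloads with a `SchemaVersion` field and that records written before versioning count as version 1; both are illustrative assumptions.

```python
import json


def read_result(payload: str) -> dict:
    """Normalize stored result payloads of any known schema version to the latest shape."""
    data = json.loads(payload)
    version = data.get("SchemaVersion", 1)   # assume unversioned records are v1
    if version == 1:
        # v1 had no classification fields; fill explicit "unknown" values.
        data.setdefault("DefectClass", None)
        data.setdefault("Confidence", None)
        version = 2
    if version != 2:
        raise ValueError(f"unsupported schema version: {version}")
    data["SchemaVersion"] = 2
    return data


old_record = '{"DefectCount": 7, "LargestDefectSize": 14.2, "Decision": "FAIL"}'
print(read_result(old_record))
```

The stored payload is never rewritten. Old records stay exactly as they were produced, and only the reader knows how to lift them to the current shape.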

Replay and Review Support

A strong traceability system supports replay.

Replay means:

text

Take historical image + historical recipe/context
Run inspection again or review previous result
Compare old result and new result

This is useful for:

debugging false rejects
validating recipe changes
comparing algorithm versions
customer investigations
regression testing

Replay requires:

image availability
recipe version availability
calibration/transform context
algorithm version awareness
deterministic or explainable processing
stored result for comparison

Without traceable storage, replay becomes impossible.

PART 9 — Interview / Real-World Talking Points

A strong interview answer should sound like this:

In an industrial vision system, storage is not just persistence. It is part of traceability and diagnostics. Every inspection result should be connected to the product, image, recipe version, workflow step, machine state, and timestamp that produced it. Otherwise, when a customer asks why a wafer or part failed, the system cannot explain its own decision.

You can also say:

I would separate image storage from result indexing. Images may live in file/object/NAS storage, while structured result data should be queryable in a database. The important part is that both are connected by stable image IDs, run IDs, step IDs, and correlation IDs.

Another strong point:

I would avoid blocking the acquisition or inspection pipeline on slow image writes. Storage should usually be asynchronous, bounded, monitored, and designed with backpressure. But I would still define what happens if storage fails, because losing traceability may be unacceptable for some production modes.

Common mistakes software engineers make:

saving files with weak names like image1.bmp
storing pass/fail without decision reason
not storing recipe version
not linking image and result
assuming current recipe explains historical result
using timestamps as the only correlation mechanism
blocking inspection on database writes
ignoring disk-full scenarios
deleting failed images too early
storing overlays only as screenshots
mixing coordinate systems without metadata
treating image storage as an IT concern instead of a machine reliability concern

What strong engineers understand:

traceability is part of product quality
diagnostic value must be designed intentionally
image retention is a trade-off, not an afterthought
pass/fail is not enough
recipe versions must be immutable or reconstructable
storage must not destabilize real-time-ish inspection flow
cleanup and disk monitoring are production safety concerns
review and replay requirements shape storage architecture

The core mental model:

text

A vision system does not only inspect.

It must remember enough evidence
to explain the inspection later.

That is the heart of data storage and traceability in industrial vision systems.
