Data Storage & Traceability in Industrial Vision Systems
In industrial vision systems, storage is not just “save images somewhere.” It is part of the inspection architecture.
A wafer inspection machine may inspect thousands or millions of regions over time. Each inspection decision may later need to be explained: which wafer, which die, which image, which recipe, which threshold, which lighting condition, which machine state, and which software version produced this result?
That is why storage and traceability are not secondary features. They are how the system preserves truth after production has moved on.
PART 1 — Why Storage & Traceability Matter
In business software, storage usually answers questions like:
What did the user order? What is the current status? What is the transaction history?
In industrial inspection software, storage must answer a more physical question:
What actually happened to this real product at this real moment on this real machine?
For a vision system, that means being able to answer:
- What was inspected?
- Which image was used?
- What result was produced?
- Why was it accepted or rejected?
- Which recipe and parameter version were active?
- What machine state existed at inspection time?
- Was the image quality acceptable?
- Was the result stable or suspicious?
- Can we reproduce or review the decision later?
A strong traceability design connects the physical product, the captured image, the inspection algorithm, the recipe, the machine state, and the final decision.
For example, in wafer defect review, it is not enough to say:
Wafer W123 failed.
A useful system says:
Wafer: W123
Lot: L9001
Die: X=143, Y=82
Inspection Step: Post-alignment surface defect scan
Image ID: IMG-20260425-00088421
Recipe: RCP-Gold-ProductA-v17
Algorithm: SurfaceDefectDetector v3.2
Decision: FAIL
Defect Count: 7
Largest Defect: 14.2 µm
Image Quality: PASS
Focus Score: 0.91
Light Profile: Brightfield-A
Machine State: AutoRun / Stable / No active alarm
Timestamp: 2026-04-25T10:21:44.183Z
That is the difference between a stored result and a traceable result.
Storage supports:
- quality review
- production audit
- customer investigation
- recipe tuning
- process improvement
- service diagnostics
- defect trend analysis
- replay and offline debugging
A common mistake from software engineers entering this domain is thinking storage is only about persistence. In machine systems, storage is also about future explanation.
PART 2 — What Data Needs to Be Stored
Industrial vision systems usually produce several types of data. Not all of them must always be stored, but the architecture should consciously decide what is stored, why, for how long, and at what fidelity.
1. Raw Images
A raw image is the closest software artifact to the physical inspection moment.
It may be grayscale, color, high bit-depth, multi-channel, or camera-specific format.
Raw images are useful when engineers need to answer:
- Was the defect actually visible?
- Was the image blurred?
- Was lighting unstable?
- Was the camera saturated?
- Did the algorithm fail or was the input bad?
- Can we reprocess this later with a new algorithm?
Raw images have high diagnostic value, but they are expensive. They consume disk space quickly, especially in high-throughput inspection.
A 25 MB image sounds small until the machine captures 10 images per second for 20 hours.
25 MB/image × 10 images/sec × 3600 sec/hour × 20 hours
= 18,000,000 MB
= ~18 TB/day
So “store every raw image forever” is rarely realistic unless the production requirement explicitly demands it.
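The arithmetic above can be checked with a small helper. This is only a sketch: the function name `daily_raw_volume_tb` is hypothetical, and it assumes decimal units (1 TB = 1,000,000 MB), as the calculation above does.

```python
def daily_raw_volume_tb(image_mb: float, images_per_sec: float, hours_per_day: float) -> float:
    """Return the raw image volume per production day in decimal terabytes."""
    total_mb = image_mb * images_per_sec * 3600 * hours_per_day
    return total_mb / 1_000_000  # assumption: 1 TB = 1,000,000 MB


# The example from the text: 25 MB images, 10 images/sec, 20 hours of production.
print(daily_raw_volume_tb(25, 10, 20))  # 18.0
```

Running the same helper with the real camera and takt-time numbers early in a project makes the storage budget visible before the retention policy is chosen.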
2. Processed Images
Processed images may include:
- normalized images
- filtered images
- contrast-enhanced images
- cropped regions of interest
- corrected images after alignment
- stitched images
- defect-highlighted images
These are useful when the review team wants to see the image as the algorithm saw it.
However, processed images can be dangerous if stored without enough context. A processed image is no longer pure input. It depends on algorithm version, recipe parameters, calibration, scaling, filtering, and sometimes display transformation.
So if you store processed images, store the processing context too.
3. Thumbnails / Display Images
Thumbnails are low-cost images used for quick review.
They are useful for:
- operator review screens
- defect galleries
- fast browsing
- remote support packages
- summary reports
They should not be treated as inspection evidence unless clearly defined. A thumbnail may lose fine detail, hide small defects, or introduce compression artifacts.
A good system distinguishes:
RawImage = inspection evidence
ReviewImage = human-friendly visualization
Thumbnail = fast browsing artifact
4. Overlays / Annotations
Overlays show things like:
- defect bounding boxes
- measurement lines
- pass/fail regions
- alignment points
- masks
- search regions
- excluded areas
- threshold regions
A key design decision is whether overlays are stored as images or as structured data.
Bad approach:
Save only screenshot with red boxes.
Better approach:
Store overlay geometry as structured data:
- shape type
- coordinates
- coordinate system
- label
- color/style
- source result ID
- image ID
Then the system can regenerate overlays later.
This matters because screenshots are hard to query, hard to validate, and hard to compare. Structured overlays are reviewable, searchable, versionable, and replay-friendly.
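A minimal sketch of overlay geometry stored as structured data might look like the following. The class name `OverlayShape`, its field names, and the JSON layout are illustrative assumptions, not a defined format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class OverlayShape:
    shape_type: str         # e.g. "rect", "line", "polygon" (hypothetical values)
    points: list            # list of [x, y] pairs in the stated coordinate system
    coordinate_system: str  # e.g. "RawImagePixels"
    label: str
    style: str
    source_result_id: str
    image_id: str


def to_json(shape: OverlayShape) -> str:
    """Serialize the overlay so it can be stored next to the result record."""
    return json.dumps(asdict(shape), sort_keys=True)


def from_json(text: str) -> OverlayShape:
    """Rebuild the overlay later, for example to redraw it on a review screen."""
    return OverlayShape(**json.loads(text))


box = OverlayShape(
    shape_type="rect",
    points=[[1800, 900], [1880, 940]],
    coordinate_system="RawImagePixels",
    label="Scratch",
    style="red",
    source_result_id="RES-0001",
    image_id="IMG-88421",
)
restored = from_json(to_json(box))
```

Because the geometry survives a round trip through storage, the review tool can redraw, filter, or compare overlays without relying on a screenshot.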
5. Inspection Results
Inspection results include the actual output of the vision logic:
- pass/fail
- defect count
- defect type
- measurements
- classification
- confidence score
- quality flags
- rule triggered
- threshold used
- algorithm version
- source image reference
Pass/fail alone is almost useless for root cause analysis.
A result should explain itself.
Instead of:
Result = FAIL
Prefer:
Decision = FAIL
Reason = DefectCountExceeded
DefectCount = 7
Limit = 3
LargestDefectSizeUm = 14.2
SizeLimitUm = 10.0
RuleId = SurfaceDefect.MaxCount
RecipeVersion = 17
ImageId = IMG-88421
6. Measurements
Measurements may include:
- diameter
- width
- height
- position
- offset
- angle
- area
- intensity
- focus score
- edge distance
- defect size
Measurements should always include units.
A dangerous result model says:
Value = 12.5
A better result model says:
MeasurementName = DefectDiameter
Value = 12.5
Unit = µm
CoordinateSystem = WaferLocal
CalibrationVersion = CAL-20260420-02
Units, coordinate systems, and calibration versions matter because machine measurements are physical, not abstract numbers.
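One way to keep units attached to values is a small immutable measurement type. The sketch below is illustrative only: the `Measurement` class, the `_TO_UM` conversion table, and the `exceeds` helper are hypothetical names, and only a few length units are covered.

```python
from dataclasses import dataclass

# Conversion factors to micrometres; only length units are handled in this sketch.
_TO_UM = {"µm": 1.0, "mm": 1000.0, "nm": 0.001}


@dataclass(frozen=True)
class Measurement:
    name: str
    value: float
    unit: str
    coordinate_system: str
    calibration_version: str

    def value_um(self) -> float:
        """Return the value in micrometres; raises KeyError for an unknown unit."""
        return self.value * _TO_UM[self.unit]


def exceeds(measurement: Measurement, limit_um: float) -> bool:
    """Compare against a limit only after converting to a common unit."""
    return measurement.value_um() > limit_um


m = Measurement("DefectDiameter", 12.5, "µm", "WaferLocal", "CAL-20260420-02")
print(exceeds(m, 10.0))  # True
```

A bare `12.5` cannot be compared safely against a limit; a value that carries its unit and calibration version can.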
7. Defect Lists
A defect list is usually a collection of detected abnormalities.
Each defect may have:
- defect ID
- type/classification
- bounding box
- centroid
- size
- severity
- confidence
- image coordinates
- product coordinates
- die coordinates
- review status
- source image ID
Example:
DefectId: D-00042
Type: Scratch
Severity: Major
ImageX: 1842
ImageY: 921
WaferXmm: 42.182
WaferYmm: -18.533
SizeUm: 14.2
Confidence: 0.87
SourceImageId: IMG-88421
This allows review teams to locate, filter, compare, and explain defects.
8. Metadata
Metadata is the glue of traceability.
Useful metadata includes:
- lot ID
- wafer ID
- part ID
- panel ID
- job ID
- recipe ID
- recipe version
- machine ID
- camera ID
- lens/optics profile
- illumination profile
- exposure/gain/focus
- operator ID where relevant
- timestamp
- workflow step
- software version
- calibration version
- machine mode
- alarm state
- image quality flags
Metadata is often more important than engineers expect. Without metadata, images become anonymous files.
Anonymous images are almost useless in production investigations.
9. Recipe / Version Information
The recipe defines how inspection should behave.
It may contain:
- thresholds
- regions of interest
- alignment settings
- lighting settings
- exposure settings
- defect rules
- measurement tolerances
- classification parameters
Never store only the current recipe ID.
You need the recipe version or snapshot used at inspection time.
Bad:
RecipeId = ProductA
Good:
RecipeId = ProductA
RecipeVersion = 17
RecipeHash = 7F3A91...
ActivatedAt = 2026-04-25T09:10:12Z
Best for high-traceability systems:
Store immutable recipe snapshot reference.
This matters because old results must still be explainable after a recipe changes.
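A recipe hash can be derived from a canonical serialization of the recipe content. The sketch below assumes the recipe is available as a JSON-compatible dictionary and uses SHA-256; both are assumptions for illustration, and the function name `recipe_hash` is hypothetical.

```python
import hashlib
import json


def recipe_hash(recipe: dict) -> str:
    """Hash a recipe so that identical content always yields the same value."""
    # Sorted keys and fixed separators make the serialization independent of key order.
    canonical = json.dumps(recipe, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest().upper()


v17 = {"thresholds": {"max_defect_count": 3, "size_limit_um": 10.0}, "light_profile": "Brightfield-A"}
v17_reordered = {"light_profile": "Brightfield-A", "thresholds": {"size_limit_um": 10.0, "max_defect_count": 3}}
v18 = {"thresholds": {"max_defect_count": 3, "size_limit_um": 8.0}, "light_profile": "Brightfield-A"}

print(recipe_hash(v17) == recipe_hash(v17_reordered))  # True: same content
print(recipe_hash(v17) == recipe_hash(v18))            # False: threshold changed
```

Storing the hash with each result makes a silent recipe edit detectable even when the recipe ID and version number were left unchanged.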
10. Machine / Workflow Context
The inspection result is affected by machine state.
Useful context includes:
- machine mode: Auto, Manual, Maintenance
- workflow step
- current job/session/run
- active alarms
- stage position
- motion settled status
- trigger source
- camera acquisition mode
- focus state
- environmental state if available
- device health status
Example:
WorkflowStep = InspectDie
StagePosition = X=42.18mm, Y=-18.53mm, Z=1.20mm
MotionState = Settled
FocusState = Locked
TriggerMode = HardwarePositionTrigger
ActiveAlarmCount = 0
This helps diagnose whether the result was caused by inspection logic or machine condition.
11. Quality Metrics
Image quality metrics help determine whether an inspection was trustworthy.
Examples:
- focus score
- brightness mean
- brightness variance
- saturation percentage
- contrast score
- blur estimate
- missing frame flag
- timestamp drift
- exposure stability
- illumination status
A strong system separates:
Inspection decision: PASS / FAIL
Image quality decision: VALID / INVALID / WARNING
A part should not necessarily fail just because the image was bad. Sometimes the correct outcome is:
InspectionInvalidDueToPoorImageQuality
That is very different from:
ProductFailedDueToDefect
PART 3 — Traceability Model
Traceability connects inspection data to production reality.
A simplified model looks like this:
+----------------+
| Lot / Batch |
+----------------+
| LotId |
| ProductType |
| Customer |
| StartTime |
+-------+--------+
|
| contains
v
+----------------+
| Wafer / Part |
+----------------+
| ProductId |
| LotId |
| SerialNo |
| SlotNo |
| Status |
+-------+--------+
|
| inspected during
v
+----------------+
| InspectionRun |
+----------------+
| RunId |
| ProductId |
| JobId |
| MachineId |
| RecipeVersion |
| StartTime |
| EndTime |
| SoftwareVer |
+-------+--------+
|
| has steps
v
+----------------+
| InspectionStep |
+----------------+
| StepId |
| RunId |
| StepName |
| WorkflowState |
| Timestamp |
| StagePosition |
+-------+--------+
|
| captures
v
+----------------+
| ImageFrame |
+----------------+
| ImageId |
| StepId |
| CameraId |
| FrameNo |
| Timestamp |
| StorageUri |
| Width/Height |
| PixelFormat |
| ImageHash |
+-------+--------+
|
| produces
v
+----------------+
| InspectionResult|
+----------------+
| ResultId |
| ImageId |
| Decision |
| RuleId |
| Measurements |
| DefectCount |
| QualityFlags |
| CreatedAt |
+-------+--------+
|
| contains
v
+----------------+
| DefectRecord |
+----------------+
| DefectId |
| ResultId |
| Type |
| Severity |
| X/Y |
| Size |
| Confidence |
+----------------+
The important idea is not the exact table design. The important idea is the chain:
Lot → Product → Inspection Run → Step → Image → Result → Defects
If that chain is broken, traceability is weak.
For example:
- image exists but no result link
- result exists but no image link
- image and result exist but no recipe version
- defect exists but no coordinate system
- result exists but no workflow step
- timestamp exists but clock source is inconsistent
These problems may not show up during a demo. They show up during customer escalation.
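Broken links in the chain can be found with a routine health-check query. The sketch below uses an in-memory SQLite database with a deliberately reduced schema; the table and column names are assumptions for illustration, not a recommended design.

```python
import sqlite3


def find_orphan_results(conn: sqlite3.Connection) -> list:
    """Return IDs of results whose image reference is missing or dangling."""
    rows = conn.execute(
        """
        SELECT r.result_id
        FROM inspection_result AS r
        LEFT JOIN image_frame AS i ON i.image_id = r.image_id
        WHERE i.image_id IS NULL
        ORDER BY r.result_id
        """
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE image_frame (
        image_id TEXT PRIMARY KEY, step_id TEXT NOT NULL, storage_uri TEXT NOT NULL);
    CREATE TABLE inspection_result (
        result_id TEXT PRIMARY KEY, image_id TEXT, decision TEXT NOT NULL,
        recipe_version INTEGER);
    """
)
conn.execute("INSERT INTO image_frame VALUES ('IMG-1', 'STEP-1', 'file:///images/IMG-1.raw')")
conn.executemany(
    "INSERT INTO inspection_result VALUES (?, ?, ?, ?)",
    [
        ("RES-1", "IMG-1", "PASS", 17),    # intact link
        ("RES-2", "IMG-404", "FAIL", 17),  # references an image that was never stored
        ("RES-3", None, "FAIL", 17),       # no image reference at all
    ],
)
print(find_orphan_results(conn))  # ['RES-2', 'RES-3']
```

Similar queries can check for results without a recipe version or defects without a coordinate system, so gaps are found before a customer escalation rather than during one.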
PART 4 — Image Storage Strategies
There is no single correct image storage strategy. The right design depends on throughput, customer requirements, diagnostic needs, and storage cost.
Strategy 1: Store Every Raw Image
This gives maximum diagnostic value.
Useful when:
- inspection is safety-critical or high-value
- customer requires complete evidence
- production volume is manageable
- offline reprocessing is important
- early-stage machine development needs full diagnostics
Benefits:
- best replay support
- best debugging support
- strongest evidence trail
- useful for algorithm improvement
Costs:
- huge storage usage
- slower retrieval if not indexed well
- higher backup/archive cost
- risk of disk-full failures
- more data lifecycle complexity
This strategy is attractive during development but often too expensive for full production unless carefully designed.
Strategy 2: Store Only Failed Images
This is common in production.
Useful when:
- most products pass
- failures need review
- storage must be controlled
- customer mainly cares about rejects
Benefits:
- much lower storage cost
- review is focused on suspicious items
- easier retention
Costs:
- cannot investigate false passes later
- limited ability to compare pass/fail distributions
- harder to debug borderline behavior
- misses context before and after failure
This strategy is efficient, but risky if the system later needs to explain why something passed.
Strategy 3: Store Thumbnails for All, Raw Images for Selected Cases
This is often a balanced strategy.
Example:
For every inspection:
- store result metadata
- store thumbnail/review image
For failures:
- store raw image
- store processed image
- store overlay data
For borderline cases:
- store raw image
For periodic sampling:
- store raw image every N products
Benefits:
- supports fast browsing
- preserves enough evidence for failures
- controls storage cost
- gives some data for trend analysis
Costs:
- more complex policy logic
- requires clear classification of “selected cases”
- some cases may not have raw evidence
This is usually a practical production design.
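The selection policy in Strategy 3 can be expressed as a small pure function. This is a sketch under stated assumptions: `margin_ratio` (how far the measured value is from its limit, relative to the limit), the 10% borderline cutoff, and the sampling interval are all hypothetical parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePlan:
    metadata: bool
    thumbnail: bool
    raw: bool
    processed: bool
    overlay: bool


def plan_storage(decision: str, margin_ratio: float, product_index: int,
                 sample_every: int = 100, borderline_ratio: float = 0.1) -> StoragePlan:
    """Decide which artifacts to keep for one inspection."""
    failed = decision == "FAIL"
    borderline = margin_ratio < borderline_ratio    # close to the limit on either side
    sampled = product_index % sample_every == 0     # periodic sampling of raw images
    return StoragePlan(
        metadata=True,                              # always store result metadata
        thumbnail=True,                             # always store a review image
        raw=failed or borderline or sampled,
        processed=failed,
        overlay=failed,
    )


print(plan_storage("PASS", margin_ratio=0.5, product_index=7).raw)   # False
print(plan_storage("PASS", margin_ratio=0.05, product_index=7).raw)  # True (borderline)
```

Keeping the policy in one testable function makes the definition of "selected cases" explicit and reviewable.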
Strategy 4: Store Compressed Images
Compression reduces storage cost.
Options include:
- lossless compression
- lossy compression
- region-of-interest compression
- format-specific compression
- lower bit-depth storage
- downsampled review images
The critical question is whether the compressed image is still valid evidence.
For inspection evidence, prefer lossless or clearly documented compression.
For operator review thumbnails, lossy compression is often acceptable.
Never silently compress evidence images in a way that changes measurement meaning.
Strategy 5: Store Metadata and Results Only
This minimizes storage.
Useful when:
- image data is too large
- product value is low
- traceability requirement is minimal
- inspection is simple
- machine is not expected to support deep review
Benefits:
- low storage cost
- fast queries
- simple retention
Costs:
- weak diagnostics
- difficult customer investigation
- cannot reprocess
- cannot prove visual evidence later
This strategy should be chosen intentionally, not accidentally.
Strategy 6: Rolling Diagnostic Buffer
A rolling buffer stores recent images temporarily.
Example:
Keep last 30 minutes of raw images.
Persist permanently only if:
- failure occurs
- operator marks product for review
- machine alarm occurs
- engineer enables diagnostic mode
Benefits:
- excellent for debugging transient problems
- avoids permanent storage explosion
- useful during commissioning
- captures context before failure
Costs:
- failure must be detected before buffer expires
- needs careful disk management
- may create false confidence if buffer retention is misunderstood
Rolling buffers are very useful in real machines because many bugs are discovered only after something strange happens.
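A rolling buffer with promotion can be sketched in memory as follows. A real machine would buffer on disk and copy files on promotion; here the class name `RollingImageBuffer` is hypothetical and a dictionary stands in for permanent storage.

```python
from collections import deque


class RollingImageBuffer:
    """Keeps recent frames for a fixed time window and promotes selected ones."""

    def __init__(self, retention_s: float):
        self.retention_s = retention_s
        self._frames = deque()  # (acquired_at_s, image_id, payload), oldest first
        self.promoted = {}      # image_id -> payload; stand-in for permanent storage

    def add(self, acquired_at_s: float, image_id: str, payload: bytes) -> None:
        self._frames.append((acquired_at_s, image_id, payload))
        # Evict frames older than the retention window, measured from the newest frame.
        while self._frames and acquired_at_s - self._frames[0][0] > self.retention_s:
            self._frames.popleft()

    def promote_window(self, start_s: float, end_s: float) -> list:
        """Copy every buffered frame acquired within [start_s, end_s] to permanent storage."""
        promoted_ids = []
        for acquired_at_s, image_id, payload in self._frames:
            if start_s <= acquired_at_s <= end_s:
                self.promoted[image_id] = payload
                promoted_ids.append(image_id)
        return promoted_ids


buf = RollingImageBuffer(retention_s=1800)  # the 30-minute example from the text
for t, image_id in [(0, "IMG-1"), (600, "IMG-2"), (1200, "IMG-3"), (1900, "IMG-4")]:
    buf.add(t, image_id, b"raw-bytes")

print(buf.promote_window(1000, 2000))  # ['IMG-3', 'IMG-4']
print(buf.promote_window(0, 100))      # []: IMG-1 already left the buffer
```

The second call shows the main risk listed above: evidence that was not promoted before it aged out of the buffer is gone.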
PART 5 — Result Storage & Querying
Inspection results must be queryable and explainable.
A weak result model stores:
ProductId
Timestamp
PassFail
That is not enough.
A useful result record contains:
+----------------------+
| InspectionResult |
+----------------------+
| ResultId |
| ProductId |
| RunId |
| StepId |
| ImageId |
| Decision |
| DecisionReason |
| RecipeId |
| RecipeVersion |
| AlgorithmVersion |
| RuleId |
| ThresholdUsed |
| MeasurementSummary |
| DefectCount |
| QualityFlags |
| CorrelationId |
| CreatedAt |
+----------------------+
The result should answer:
- What did we decide?
- Why did we decide it?
- Which data supported the decision?
- Which rule or threshold was used?
- Which image produced the result?
- Which product and workflow step does it belong to?
- Was the result reliable?
- Can it be reviewed later?
Pass/fail alone is insufficient because two failures may look identical in summary but completely different in meaning.
Example:
Part A: FAIL because defect count exceeded limit.
Part B: FAIL because alignment failed.
Part C: FAIL because image quality was invalid.
Part D: FAIL because measurement exceeded tolerance.
All are “FAIL,” but the required action is different.
The result model should separate:
Product disposition:
- Accepted
- Rejected
- NeedsReview
- InspectionInvalid
Inspection reason:
- DefectDetected
- MeasurementOutOfTolerance
- AlignmentFailed
- ImageQualityInvalid
- RecipeError
- MachineStateInvalid
This distinction matters in production.
A defective product and an invalid inspection are not the same thing.
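The separation between disposition and reason can be captured with two enumerations. The mapping from reason to disposition below is an assumed example: which reasons reject a product and which only invalidate the inspection is a per-machine decision.

```python
from enum import Enum
from typing import Optional


class Disposition(Enum):
    ACCEPTED = "Accepted"
    REJECTED = "Rejected"
    NEEDS_REVIEW = "NeedsReview"
    INSPECTION_INVALID = "InspectionInvalid"


class Reason(Enum):
    DEFECT_DETECTED = "DefectDetected"
    MEASUREMENT_OUT_OF_TOLERANCE = "MeasurementOutOfTolerance"
    ALIGNMENT_FAILED = "AlignmentFailed"
    IMAGE_QUALITY_INVALID = "ImageQualityInvalid"
    RECIPE_ERROR = "RecipeError"
    MACHINE_STATE_INVALID = "MachineStateInvalid"


# Assumed grouping: only these reasons say something about the product itself.
PRODUCT_REASONS = {Reason.DEFECT_DETECTED, Reason.MEASUREMENT_OUT_OF_TOLERANCE}


def disposition_for(reason: Optional[Reason]) -> Disposition:
    """Map an inspection reason to a product disposition (illustrative policy)."""
    if reason is None:
        return Disposition.ACCEPTED
    if reason in PRODUCT_REASONS:
        return Disposition.REJECTED
    return Disposition.INSPECTION_INVALID


print(disposition_for(Reason.DEFECT_DETECTED).value)        # Rejected
print(disposition_for(Reason.IMAGE_QUALITY_INVALID).value)  # InspectionInvalid
```

With this split, a query for rejected products no longer returns parts whose inspection simply could not be trusted.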
PART 6 — Retention, Archival, and Cleanup
Image data grows fast. If retention is not designed, the machine eventually becomes unreliable.
This is not just an IT problem. It is a machine reliability problem.
A machine with a full disk may:
- stop acquiring images
- fail to write results
- block the inspection pipeline
- crash the application
- lose traceability
- corrupt partial files
- force production downtime
A good storage design has explicit retention tiers.
+-------------------+
| Hot Storage |
+-------------------+
| Recent data |
| Fast retrieval |
| Local disk / SSD |
| Hours to days |
+---------+---------+
|
v
+-------------------+
| Warm Storage |
+-------------------+
| Review data |
| Slower retrieval |
| Local server/NAS |
| Days to weeks |
+---------+---------+
|
v
+-------------------+
| Cold Archive |
+-------------------+
| Audit/history |
| Compressed |
| Long-term storage |
| Months/years |
+-------------------+
Typical retention policies may say:
All result metadata: keep 2 years
Failed raw images: keep 90 days
Passed thumbnails: keep 30 days
Raw pass images: keep 7 days or sample only
Diagnostic rolling buffer: keep last 24 hours
Alarm-related image packages: keep 180 days
Retention is often customer-specific.
Customer A may require:
Store all failed images for 1 year.
Customer B may require:
Store every inspected image for 30 days.
Customer C may require:
Do not store operator-identifying data beyond 90 days.
So retention should be policy-driven, not hardcoded.
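A policy-driven design keeps the retention periods in data rather than in code. The sketch below uses the example periods from the list above, with "2 years" approximated as 730 days; the retention class names and the `is_expired` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Default policy in days per retention class (class names are illustrative).
RETENTION_DAYS = {
    "result_metadata": 730,
    "failed_raw": 90,
    "passed_thumbnail": 30,
    "passed_raw": 7,
    "alarm_package": 180,
}


def is_expired(retention_class: str, created_at: datetime, now: datetime,
               policy: dict = RETENTION_DAYS) -> bool:
    """Return True when an artifact has outlived its retention period."""
    return now - created_at > timedelta(days=policy[retention_class])


created = datetime(2026, 4, 25, tzinfo=timezone.utc)
later = created + timedelta(days=100)

# A customer-specific policy overrides only the classes it cares about.
customer_a_policy = {**RETENTION_DAYS, "failed_raw": 365}

print(is_expired("failed_raw", created, later))                     # True under the default
print(is_expired("failed_raw", created, later, customer_a_policy))  # False for customer A
```

The cleanup service then only needs the retention class stored with each artifact and the active policy; changing a customer requirement does not require a code change.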
A strong system monitors:
- free disk space
- image write latency
- storage queue depth
- failed write count
- archival backlog
- cleanup success/failure
- oldest retained file
- estimated time until disk full
The system should raise alarms before storage becomes critical.
Example:
Warning: Image storage free space below 20%
Alarm: Image storage free space below 10%
Critical: Image storage free space below 5%, stop new job start
The exact behavior depends on the machine, but the principle is clear: do not wait until the disk is full.
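The threshold levels above can be sketched as a small classifier, reading the percentages as free-space fractions (an interpretation, since the exact metric is machine-specific). `shutil.disk_usage` is the standard-library call for reading disk totals; the function names and level strings are hypothetical.

```python
import shutil


def classify_free_space(free_fraction: float) -> str:
    """Map the free-space fraction of the image disk to an alarm level."""
    if free_fraction < 0.05:
        return "CRITICAL"  # stop new job start
    if free_fraction < 0.10:
        return "ALARM"
    if free_fraction < 0.20:
        return "WARNING"
    return "OK"


def storage_level(path: str) -> str:
    """Read the real disk state for the volume that contains `path`."""
    usage = shutil.disk_usage(path)
    return classify_free_space(usage.free / usage.total)


print(classify_free_space(0.15))  # WARNING
print(classify_free_space(0.03))  # CRITICAL
```

Keeping the classification separate from the disk read makes the thresholds testable without filling a disk, and lets them come from configuration later.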
PART 7 — Real-World Failure Scenarios
1. Result Stored Without Image Reference
What it looks like:
Customer asks why wafer W123 failed.
Database says FAIL.
No linked image exists.
Why it happens:
- result table designed before image storage
- image filename generated separately
- no shared correlation ID
- image write failed silently
- cleanup deleted the image but the result still references it
How experienced engineers handle it:
- introduce immutable image IDs
- store image reference in result record
- make image write status visible
- use correlation IDs across pipeline
- detect missing image references during health checks
2. Image Stored Without Recipe Version
What it looks like:
Engineer opens old failed image.
Current recipe says threshold = 10 µm.
But at inspection time threshold may have been 8 µm.
Nobody knows.
Why it happens:
- result stores recipe name only
- recipe is mutable
- old recipe versions overwritten
- no activation snapshot
How to handle it:
- store recipe version
- store recipe hash
- store immutable recipe snapshot reference
- log recipe activation events
- link result to recipe version used at inspection time
3. Disk Fills During Production
What it looks like:
Machine runs normally for hours.
Then inspection slows down.
Then image writes fail.
Then database writes fail.
Then operator sees confusing errors.
Why it happens:
- no retention policy
- no disk monitoring
- debug mode left on
- raw images stored for every pass
- cleanup job failed silently
- image queue grows faster than writer can flush
How to handle it:
- storage health monitoring
- retention cleanup service
- bounded storage queue
- backpressure policy
- early alarms
- prevent new job start when storage is unsafe
- separate critical result writes from large image writes where appropriate
4. Failed Image Overwritten Before Review
What it looks like:
Operator sees a failure.
Later the engineer opens review screen.
Image is gone or replaced.
Why it happens:
- rolling buffer too short
- image filename reused
- frame ID not unique
- temporary images not promoted to permanent storage
- failure event not connected to storage policy
How to handle it:
- use immutable image IDs
- promote failed images from buffer immediately
- never reuse filenames for evidence
- store retention class per image
- verify promotion success
5. Operator Cannot Trace Why Product Failed
What it looks like:
Operator sees: FAIL.
No reason.
No defect location.
No measurement value.
No threshold.
No image.
Why it happens:
- system designed around final decision only
- UI and storage not designed around review
- algorithm returns bool instead of structured result
- defect information discarded after display
How to handle it:
- result model includes decision reason
- measurements are stored
- defects are stored
- overlays are linked
- thresholds and rules are captured
- review workflow uses stored evidence, not transient UI state
6. Stored Result Does Not Match Displayed Overlay
What it looks like:
Database says defect at X=1200,Y=800.
Overlay shows defect at X=1400,Y=950.
Operator loses trust.
Why it happens:
- coordinate transform changed
- display used scaled image coordinates
- result used raw image coordinates
- overlay generated from current recipe instead of historical recipe
- image was rotated/flipped for display
- calibration changed after result was stored
How to handle it:
- store coordinate system explicitly
- store transform version
- store overlay geometry with source coordinate space
- version display transformations
- regenerate overlays using historical context
- avoid mixing raw image, processed image, and display image coordinates without metadata
7. Timestamps Inconsistent Across Components
What it looks like:
Image timestamp says 10:01:02.100.
Motion log says stage position at 10:01:02.100 was different.
Inspection result says created at 10:01:05.700.
Alarm says trigger happened at 10:01:01.900.
Nobody knows the true sequence.
Why it happens:
- camera, PC, controller use different clocks
- local time mixed with UTC
- timestamp taken after processing, not at acquisition
- log timestamp confused with event timestamp
- no monotonic sequence number
How to handle it:
- use clear timestamp semantics
- separate acquisition time, processing time, storage time
- use UTC for persisted records
- include frame sequence numbers
- include hardware trigger counter where possible
- correlate with workflow step IDs, not timestamps alone
8. Database Write Blocks Inspection Pipeline
What it looks like:
Inspection throughput is fine during demo.
In production, database slows down.
Image processing waits for result insert.
Camera queue backs up.
Frames are dropped.
Machine takt time degrades.
Why it happens:
- synchronous writes in critical inspection path
- unbounded queues
- database transaction includes large image write
- retry logic blocks processing thread
- storage service not isolated from acquisition/inspection
How to handle it:
- non-blocking storage pipeline
- bounded queues
- explicit backpressure behavior
- separate acquisition/processing from persistence
- durable local spool if needed
- storage health metrics
- fail-safe policy when persistence is degraded
PART 8 — Software Design Implications
Storage must be designed as part of the vision architecture from the beginning.
A bad architecture treats storage as an afterthought:
Inspection code
|
+-- save random image file
+-- insert pass/fail row
+-- hope filename can be matched later
A good architecture treats storage as a first-class subsystem:
+----------------------+
| Inspection Pipeline  |
+----------------------+
           |
           v
+-----------------------------+
| Image + Result + Metadata   |
| InspectionContext           |
+-----------------------------+
           |
           v
+----------------------+
| Storage Queue        |
| bounded, async       |
+----------------------+
           |
           v
+-----------------------------+
| Storage Service             |
| validates, writes, indexes  |
+-----------------------------+
       |                 |
       v                 v
+-------------+ +----------------+
| Image Store | | Result Database|
+-------------+ +----------------+
       |                 |
       +--------+--------+
                v
+-----------------------------+
| Review / Debug / Reporting  |
+-----------------------------+
The key design object is often an explicit InspectionContext.
Conceptually:
+-------------------------+
| InspectionContext |
+-------------------------+
| CorrelationId |
| LotId |
| ProductId |
| RunId |
| StepId |
| MachineId |
| RecipeId |
| RecipeVersion |
| CameraId |
| FrameId |
| Timestamp |
| WorkflowStep |
| StagePosition |
| CalibrationVersion |
| SoftwareVersion |
+-------------------------+
Every image and result should carry or reference this context.
Without explicit context, traceability becomes a guessing game.
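As a sketch, the context can be an immutable object that is created once per acquisition and then referenced by both the image record and the result record. The field types, the `acquired_at_utc` name, the image ID format, and the two record helpers below are assumptions for illustration.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class InspectionContext:
    lot_id: str
    product_id: str
    run_id: str
    step_id: str
    machine_id: str
    recipe_id: str
    recipe_version: int
    camera_id: str
    frame_id: int
    acquired_at_utc: datetime
    workflow_step: str
    stage_position_mm: tuple
    calibration_version: str
    software_version: str
    correlation_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def image_record(ctx: InspectionContext, storage_uri: str) -> dict:
    """Build the image index entry; the image ID format is hypothetical."""
    return {"correlation_id": ctx.correlation_id, "step_id": ctx.step_id,
            "image_id": f"{ctx.camera_id}-{ctx.frame_id:08d}", "storage_uri": storage_uri}


def result_record(ctx: InspectionContext, decision: str, reason: str) -> dict:
    """Build the result entry from the same context, so both records stay linked."""
    return {"correlation_id": ctx.correlation_id, "run_id": ctx.run_id,
            "step_id": ctx.step_id, "recipe_id": ctx.recipe_id,
            "recipe_version": ctx.recipe_version, "decision": decision, "reason": reason}


ctx = InspectionContext(
    lot_id="L9001", product_id="W123", run_id="RUN-1", step_id="STEP-7",
    machine_id="M-01", recipe_id="ProductA", recipe_version=17, camera_id="CAM1",
    frame_id=88421, acquired_at_utc=datetime(2026, 4, 25, 10, 21, 44, tzinfo=timezone.utc),
    workflow_step="InspectDie", stage_position_mm=(42.18, -18.53, 1.20),
    calibration_version="CAL-20260420-02", software_version="3.2.0",
)
img = image_record(ctx, "file:///images/CAM1-00088421.raw")
res = result_record(ctx, "FAIL", "DefectCountExceeded")
```

Because the context is frozen and built in one place, later pipeline stages cannot quietly attach a result to the wrong recipe version or step.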
Non-Blocking Storage Pipeline
The inspection path should not be casually blocked by slow storage.
A simplified flow:
+-------------+     +-------------+     +-------------+
| Acquisition | --> | Processing  | --> | Inspection  |
+-------------+     +-------------+     +-------------+
                                               |
                                               v
                                    +--------------------+
                                    | Storage Message    |
                                    | image/result/meta  |
                                    +--------------------+
                                               |
                                               v
                                    +--------------------+
                                    | Bounded Queue      |
                                    +--------------------+
                                               |
                                               v
                                    +--------------------+
                                    | Storage Worker(s)  |
                                    +--------------------+
                                               |
                                  +------------+------------+
                                  v                         v
                            +-----------+            +-------------+
                            | File/NAS  |            | Database    |
                            | Object    |            | Index/Query |
                            | Storage   |            | Results     |
                            +-----------+            +-------------+
Important design choices:
- queue must be bounded
- storage failures must be visible
- writes should be idempotent where possible
- result and image references must stay consistent
- partial write states must be modeled
- retry policies must not cause unbounded memory growth
- machine behavior under storage degradation must be defined
For example:
If result DB is unavailable:
- Can the machine continue?
- Can it spool locally?
- For how long?
- Should it stop new lots?
- Should it allow current wafer to finish?
- What alarm should operator see?
These are architecture decisions, not implementation details.
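The bounded, non-blocking hand-off can be sketched with the standard-library `queue.Queue`. The wrapper class `StorageQueue` and its counters are hypothetical; what should happen when a message is rejected (spool locally, raise an alarm, stop new lots) is exactly the kind of policy decision listed above.

```python
import queue


class StorageQueue:
    """Bounded hand-off between the inspection path and the storage worker."""

    def __init__(self, max_items: int):
        self._q = queue.Queue(maxsize=max_items)
        self.rejected = 0  # exposed so storage health monitoring can see backpressure

    def offer(self, message) -> bool:
        """Enqueue without blocking the inspection thread; report failure to the caller."""
        try:
            self._q.put_nowait(message)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def drain(self, write) -> int:
        """Worker side: pass every queued message to `write` and return the count."""
        written = 0
        while True:
            try:
                message = self._q.get_nowait()
            except queue.Empty:
                return written
            write(message)
            written += 1


q = StorageQueue(max_items=2)
accepted = [q.offer({"image_id": f"IMG-{i}"}) for i in range(3)]
stored = []
count = q.drain(stored.append)
print(accepted, q.rejected, count)  # [True, True, False] 1 2
```

The important property is that a full queue produces a visible, countable event instead of unbounded memory growth or a stalled camera.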
Schema and Versioning
Vision result schemas evolve.
Today you may store:
DefectCount
LargestDefectSize
Decision
Later you may need:
DefectClass
Confidence
ReviewStatus
FalsePositiveLabel
AlgorithmModelVersion
So result storage should tolerate versioning.
Options include:
- relational core fields plus JSON details
- schema version per result
- immutable result payloads
- migration strategy
- backward-compatible readers
- explicit algorithm output version
Avoid designing result storage as if inspection output will never change.
It will change.
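A backward-compatible reader is one of the options listed above. The sketch below assumes results are stored as JSON payloads with an optional `schema_version` field (missing means version 1); the payload layout and the normalized field names are illustrative assumptions.

```python
import json


def read_result(payload: str) -> dict:
    """Normalize stored result payloads of any known schema version."""
    raw = json.loads(payload)
    version = raw.get("schema_version", 1)  # assumption: old payloads carry no version
    if version not in (1, 2):
        raise ValueError(f"unsupported result schema version: {version}")
    result = {
        "decision": raw["Decision"],
        "defect_count": raw["DefectCount"],
        "largest_defect_size": raw["LargestDefectSize"],
        # Fields introduced in version 2 default to None for older payloads.
        "confidence": None,
        "algorithm_model_version": None,
    }
    if version == 2:
        result["confidence"] = raw["Confidence"]
        result["algorithm_model_version"] = raw["AlgorithmModelVersion"]
    return result


old = '{"Decision": "FAIL", "DefectCount": 7, "LargestDefectSize": 14.2}'
new = ('{"schema_version": 2, "Decision": "FAIL", "DefectCount": 7, '
       '"LargestDefectSize": 14.2, "Confidence": 0.87, "AlgorithmModelVersion": "3.2"}')

print(read_result(old)["confidence"])  # None
print(read_result(new)["confidence"])  # 0.87
```

Old payloads stay untouched on disk, and an unknown version fails loudly instead of being read with the wrong meaning.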
Replay and Review Support
A strong traceability system supports replay.
Replay means:
Take historical image + historical recipe/context
Run inspection again or review previous result
Compare old result and new result
This is useful for:
- debugging false rejects
- validating recipe changes
- comparing algorithm versions
- customer investigations
- regression testing
Replay requires:
- image availability
- recipe version availability
- calibration/transform context
- algorithm version awareness
- deterministic or explainable processing
- stored result for comparison
Without traceable storage, replay becomes impossible.
PART 9 — Interview / Real-World Talking Points
A strong interview answer should sound like this:
In an industrial vision system, storage is not just persistence. It is part of traceability and diagnostics. Every inspection result should be connected to the product, image, recipe version, workflow step, machine state, and timestamp that produced it. Otherwise, when a customer asks why a wafer or part failed, the system cannot explain its own decision.
You can also say:
I would separate image storage from result indexing. Images may live in file/object/NAS storage, while structured result data should be queryable in a database. The important part is that both are connected by stable image IDs, run IDs, step IDs, and correlation IDs.
Another strong point:
I would avoid blocking the acquisition or inspection pipeline on slow image writes. Storage should usually be asynchronous, bounded, monitored, and designed with backpressure. But I would still define what happens if storage fails, because losing traceability may be unacceptable for some production modes.
Common mistakes software engineers make:
- saving files with weak names like image1.bmp
- storing pass/fail without decision reason
- not storing recipe version
- not linking image and result
- assuming current recipe explains historical result
- using timestamps as the only correlation mechanism
- blocking inspection on database writes
- ignoring disk-full scenarios
- deleting failed images too early
- storing overlays only as screenshots
- mixing coordinate systems without metadata
- treating image storage as an IT concern instead of machine reliability
What strong engineers understand:
- traceability is part of product quality
- diagnostic value must be designed intentionally
- image retention is a trade-off, not an afterthought
- pass/fail is not enough
- recipe versions must be immutable or reconstructable
- storage must not destabilize real-time-ish inspection flow
- cleanup and disk monitoring are production safety concerns
- review and replay requirements shape storage architecture
The core mental model:
A vision system does not only inspect.
It must remember enough evidence
to explain the inspection later.
That is the heart of data storage and traceability in industrial vision systems.