Span<T>, ReadOnlySpan<T>, Memory<T>, ReadOnlyMemory<T>, and stackalloc in modern .NET

This topic matters because a lot of real performance problems are not about CPU math. They are about data movement.

In production systems, especially real-time and long-running ones, performance often gets worse because we keep:

  • allocating short-lived arrays and strings
  • copying buffers again and again
  • creating temporary objects while parsing or transforming data
  • forcing the GC to work harder than necessary

That is exactly the problem these features were introduced to solve.

They are not “cool syntax.” They are tools for writing code that works directly over existing memory, with less copying and fewer allocations.


Part 1 — The big picture

Why modern .NET introduced Span<T> and Memory<T>

Before these types became common, a lot of data processing code followed patterns like this:

  • receive a byte[]
  • extract a subrange into another byte[]
  • convert some bytes into a string
  • split or copy again
  • pass another new array to another layer

That style is easy to write, but in high-throughput systems it creates a huge amount of temporary garbage.

In a business app, this may not matter much.

In systems like:

  • image processing
  • machine communication
  • real-time telemetry
  • binary protocol parsing
  • defect/result pipelines

it matters a lot.

These systems often perform the same operation:

  • thousands of times per second
  • on large buffers
  • for many hours
  • under latency pressure
  • while the UI must stay responsive

So the runtime and language evolved toward a more explicit model:

“Can we process existing memory directly, instead of copying it into new objects?”

That is the world of Span<T> and Memory<T>.


Why copying data is expensive

Copying has two costs.

The first cost is obvious: CPU time. If you copy 8 KB once, it may be trivial. If you copy 8 KB millions of times, it becomes real work.

The second cost is often worse: allocation pressure.

Every time you create a new array, substring, or intermediate object, you increase GC workload. In long-running desktop systems, that shows up as:

  • occasional pauses
  • unstable latency
  • increased memory footprint
  • more frequent Gen 0 / Gen 1 collections
  • eventual promotion of objects that should have died early
  • harder-to-predict throughput under load

In a wafer inspection system, that can mean:

  • delayed UI updates
  • jitter in machine status handling
  • slower result processing after hours of runtime
  • more memory fragmentation around large result sets and images

So the goal is not “avoid every allocation.” The goal is:

avoid unnecessary allocations in the parts of the system that run often enough for it to matter.


Why zero-allocation processing improves both performance and stability

A lot of engineers think this topic is just about speed. It is also about stability.

In real systems, low-allocation processing helps because it makes runtime behavior more predictable.

If a message parser:

  • reuses memory
  • slices instead of copying
  • avoids temporary strings
  • uses stack-local scratch space for tiny buffers

then the system tends to behave more consistently under sustained load.

That matters a lot in:

  • camera pipelines
  • binary device communication
  • continuous inspection loops
  • high-frequency measurement processing
  • always-on WPF desktop applications

The practical benefit is often not “2x faster.” Very often the real benefit is:

  • fewer spikes
  • lower GC noise
  • smoother throughput
  • less memory churn
  • more consistent latency

That is a very production-oriented win.


Part 2 — What Span<T> really is

Conceptually: a lightweight view over contiguous memory

The best mental model for Span<T> is this:

Span<T> is not the data itself. It is a window onto existing contiguous memory.

That memory might come from:

  • an array
  • a stack allocation
  • native memory
  • another span
  • a pooled buffer

A Span<byte> does not mean “create bytes.” It means “here is a safe view over these bytes.”

That is why it is so useful. You can pass around a view of the data without duplicating the data.


Slicing without copying

This is one of the most important ideas.

Suppose you receive a machine message in a byte[]. The header is in the first 16 bytes, and the payload starts after that.

Old-style thinking:

  • create a new byte[] for the header
  • create another new byte[] for the payload

Span-based thinking:

  • create a slice for the header
  • create a slice for the payload

Both slices refer to the same underlying memory. No copy is needed.

That matters a lot in hot paths because many protocols are basically:

  • read buffer
  • carve it into sections
  • interpret sections
  • move on

Span<T> is ideal for that.
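
As a minimal sketch of that carving step, assuming the 16-byte header from the example above (the FrameSlicer name and TrySplit signature are made up for illustration):

```csharp
using System;

public static class FrameSlicer
{
    // Assumed layout for illustration: a 16-byte header followed by the payload.
    private const int HeaderSize = 16;

    public static bool TrySplit(
        ReadOnlySpan<byte> frame,
        out ReadOnlySpan<byte> header,
        out ReadOnlySpan<byte> payload)
    {
        if (frame.Length < HeaderSize)
        {
            header = default;
            payload = default;
            return false;
        }

        // Both slices are views over the caller's buffer; nothing is copied.
        header = frame.Slice(0, HeaderSize);
        payload = frame.Slice(HeaderSize);
        return true;
    }
}
```

Because the slices are views, a later write to the original byte[] is visible through both of them. That is the price of not copying: the caller must not reuse the buffer while the slices are still in use.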


Why Span<T> is stack-only

Span<T> is intentionally restricted because it can point to memory with very short lifetime.

For example, it may refer to:

  • stack memory from stackalloc
  • temporary local data
  • memory that must not outlive the current scope

If the runtime allowed spans to be stored anywhere freely, it would become easy to create dangerous cases where a reference outlives the memory it points to.

So the language protects you by making Span<T> a stack-only type.

That means, in practical terms:

  • you cannot store it in normal class fields
  • you cannot let it escape into arbitrary heap-based state
  • you cannot freely carry it across async boundaries
  • it is meant for synchronous, scoped, immediate use

This is not a random limitation. It is the reason Span<T> can be both fast and safe.


Why it is a ref struct

Span<T> is a ref struct because the language needs stronger lifetime rules for it.

The important mental model is not compiler trivia. The important point is:

.NET wants to let you work directly over memory, but only in places where the lifetime can be proven safe.

That is why Span<T> feels powerful but constrained.

Those constraints are not annoying extras. They are the design.
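
To make the constraints concrete, here is a small sketch (SpanLifetimeDemo is a hypothetical name). The method compiles; the commented-out lines are the kinds of usage the compiler rejects.

```csharp
using System;

public static class SpanLifetimeDemo
{
    // Fine: the span is created, used, and dropped inside one synchronous method.
    public static int SumFirstFour()
    {
        Span<int> scratch = stackalloc int[4];
        for (int i = 0; i < scratch.Length; i++)
            scratch[i] = i + 1;

        int sum = 0;
        foreach (int value in scratch)
            sum += value;
        return sum; // 1 + 2 + 3 + 4 = 10
    }

    // Each of the following is a compile-time error (kept as comments on purpose):
    //
    // class Holder { Span<byte> _buffer; }   // ref struct as a field of a class
    // object boxed = scratch;                // a ref struct cannot be boxed
    // Func<int> f = () => scratch[0];        // a span cannot be captured by a lambda
    // Span<int> Leak()                       // stack memory cannot escape its method
    // {
    //     Span<int> s = stackalloc int[4];
    //     return s;
    // }
}
```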


Part 3 — ReadOnlySpan<T> in practice

When to use ReadOnlySpan<T>

Use ReadOnlySpan<T> when you want:

  • zero-copy access
  • slicing
  • high-performance reading
  • protection against accidental mutation

It is the “read-only view” version of Span<T>.

In many production systems, this is actually the more common choice, because many low-level operations only need to inspect data, not modify it.


Why it is especially useful for strings, arrays, and buffers

This type is extremely useful because many read-heavy operations are really just:

  • scanning
  • parsing
  • validating
  • matching
  • tokenizing
  • extracting fields

Examples:

  • parsing a command payload from byte[]
  • reading metadata from an image header
  • scanning telemetry frames for markers
  • checking prefixes/suffixes without creating substrings
  • parsing delimited text without Split()

A big advantage is that ReadOnlySpan<char> lets you process parts of a string without creating new strings.

That is a major improvement over old habits like:

  • Substring
  • Split
  • repeated small string allocations during parsing

Practical examples

Parsing command payloads

A machine sends a payload like:

CMD=START;MODE=AUTO;LOT=12345

Naive code may:

  • convert bytes to string
  • split by ;
  • split again by =
  • allocate lots of temporary strings

A better approach is to:

  • decode only when needed
  • parse spans directly
  • slice token by token

This reduces garbage and gives more control over parsing cost.
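
A minimal sketch of the span-based approach, assuming the payload has already been decoded to text (CommandPayloadParser and TryGetValue are hypothetical names, not an existing API):

```csharp
using System;

public static class CommandPayloadParser
{
    // Looks up one key in a "KEY=VALUE;KEY=VALUE" payload without allocating.
    public static bool TryGetValue(
        ReadOnlySpan<char> payload,
        ReadOnlySpan<char> key,
        out ReadOnlySpan<char> value)
    {
        while (!payload.IsEmpty)
        {
            // Carve off the next "KEY=VALUE" pair and advance past the ';'.
            int end = payload.IndexOf(';');
            ReadOnlySpan<char> pair = end < 0 ? payload : payload.Slice(0, end);
            payload = end < 0 ? ReadOnlySpan<char>.Empty : payload.Slice(end + 1);

            int eq = pair.IndexOf('=');
            if (eq > 0 && pair.Slice(0, eq).SequenceEqual(key))
            {
                value = pair.Slice(eq + 1);
                return true;
            }
        }

        value = default;
        return false;
    }
}
```

Usage might look like `CommandPayloadParser.TryGetValue(payload, "LOT", out var lot) && int.TryParse(lot, out int lotNumber)`. The lot number is parsed straight from the slice, and no string is created for the key or the value.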

Reading image metadata

Suppose image metadata is stored in a binary header:

  • width
  • height
  • pixel format
  • timestamp
  • exposure

You do not need to copy parts of the header into new arrays. You can take slices of a ReadOnlySpan<byte> and decode fields directly.

Scanning buffers

In telemetry or protocol parsing, sometimes you just need to:

  • find a marker byte
  • match a signature
  • skip a known prefix
  • read a fixed-width field

That is exactly the kind of operation spans were made for.
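
For instance, marker search and prefix checks map directly onto the built-in span extension methods. The sketch below assumes a made-up 2-byte frame signature (0xAA 0x55); FrameScanner is a hypothetical helper.

```csharp
using System;

public static class FrameScanner
{
    // Hypothetical 2-byte signature that marks the start of a frame.
    // The compiler stores this constant data in the assembly, so the property does not allocate.
    private static ReadOnlySpan<byte> Signature => new byte[] { 0xAA, 0x55 };

    // Returns the offset of the first frame signature, or -1 if none is found.
    public static int FindFrameStart(ReadOnlySpan<byte> buffer)
        => buffer.IndexOf(Signature);

    // Checks for the signature at position 0 and returns whatever follows it.
    public static bool TrySkipSignature(ReadOnlySpan<byte> buffer, out ReadOnlySpan<byte> rest)
    {
        if (buffer.StartsWith(Signature))
        {
            rest = buffer.Slice(Signature.Length);
            return true;
        }

        rest = default;
        return false;
    }
}
```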


Part 4 — What Memory<T> / ReadOnlyMemory<T> solve

This is where many engineers get confused.

Why Memory<T> exists in addition to Span<T>

Span<T> is great, but it is intentionally scoped and synchronous.

That means it does not work well when the memory must survive beyond the current synchronous call chain.

Real systems often need to:

  • queue buffers
  • store them in objects
  • pass them across async methods
  • hand them to background workers
  • keep references for later stages in a pipeline

That is the problem Memory<T> solves.

The mental model is:

  • Span<T> = fast, synchronous view for immediate work
  • Memory<T> = storable, heap-friendly, async-friendly memory handle

ReadOnlyMemory<T> is the read-only version.


Difference between Span<T> and Memory<T>

A simple way to think about it:

Type                 Best for
Span<T>              synchronous hot-path processing
ReadOnlySpan<T>      synchronous read-only parsing and inspection
Memory<T>            passing/storing memory across async or heap-based boundaries
ReadOnlyMemory<T>    async-friendly read-only buffer ownership/pass-through

Memory<T> hands you a Span<T> through its Span property once you are inside a safe synchronous scope.

That pattern is very common.


When you need one vs the other

Use Span<T> / ReadOnlySpan<T> when:

  • the method is synchronous
  • the work is immediate
  • you want minimal overhead
  • you are parsing, scanning, formatting, or transforming in place

Use Memory<T> / ReadOnlyMemory<T> when:

  • the buffer must be stored
  • it crosses await
  • it is queued for later processing
  • it moves between pipeline stages
  • ownership/lifetime must extend beyond one synchronous scope

Examples

Pipelines crossing async boundaries

A camera acquisition service receives image bytes and pushes them into an async processing pipeline.

The acquisition layer may produce or own:

  • IMemoryOwner<byte>
  • Memory<byte>

Later, inside a synchronous decode method, you turn that into:

  • Span<byte>
  • ReadOnlySpan<byte>

That separation is healthy.
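
A rough sketch of that hand-off, using MemoryPool<byte>.Shared as the owner source. The AcquisitionSketch class and its methods are invented for illustration; a real acquisition layer would fill the buffer from a camera SDK rather than with a constant.

```csharp
using System;
using System.Buffers;

public static class AcquisitionSketch
{
    // Rents a pooled buffer and fills it. The caller owns the result and must dispose it.
    public static IMemoryOwner<byte> Acquire(int length, byte fill)
    {
        IMemoryOwner<byte> owner = MemoryPool<byte>.Shared.Rent(length);

        // The pool may hand back more than requested, so only touch the logical length.
        owner.Memory.Span.Slice(0, length).Fill(fill);
        return owner;
    }

    // Synchronous decode step: the memory becomes a span only here.
    public static int SumBytes(ReadOnlyMemory<byte> frame)
    {
        ReadOnlySpan<byte> span = frame.Span;
        int sum = 0;
        foreach (byte b in span)
            sum += b;
        return sum;
    }
}
```

A caller would typically write `using var owner = AcquisitionSketch.Acquire(8, 2);` and then pass `owner.Memory.Slice(0, 8)` onward. Note that the owner does not remember the logical length, so it has to travel alongside the owner or be encoded by slicing.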

Handing buffers between services

Suppose one component reads data from a socket and another parses it later on a background channel.

You cannot safely carry Span<byte> through that whole path. But you can pass Memory<byte> or a pooled owner object.

Background processing stages

If one stage reads a packet and another stage processes it later, Memory<T> is usually the right abstraction at the boundary.

Then, inside the actual parser, you work with ReadOnlySpan<T>.

That is a very common real-world design.


Part 5 — stackalloc in practice

What stackalloc does

stackalloc allocates a block of memory on the current thread’s stack instead of on the managed heap.

That means:

  • allocation is extremely cheap
  • cleanup is automatic when the method returns
  • no GC tracking is needed for that buffer itself

This makes it useful for small, short-lived temporary buffers.


Relationship with Span<T>

Modern C# made stackalloc far more practical: since C# 7.2, a stackalloc expression can be assigned directly to a Span<T> with no unsafe context.

Instead of dealing with unsafe pointer-heavy code, you can do:

  • allocate a small stack buffer
  • wrap it in a Span<T>
  • use normal span operations on it

That is why stackalloc became much more approachable in production code.


When it helps

It helps when you need:

  • a tiny scratch buffer
  • only for the current method
  • in a hot path
  • where allocating a heap array would happen frequently

Examples:

  • temporary formatting buffer
  • parsing a small token
  • assembling a short protocol frame
  • small conversion workspace
  • transient char buffer for number formatting or normalization

Why it should be used carefully

The stack is limited.

Heap allocations are GC-managed and can scale much larger. Stack allocations are fast, but they consume a small bounded resource.

So stackalloc is a tool for:

  • small
  • predictable
  • local
  • short-lived buffers

Not for:

  • large buffers
  • variable huge payloads
  • data you need after the method returns
  • anything you want to store or queue

A common practical pattern is:

  • use stackalloc for small sizes
  • fall back to heap or pool for larger sizes
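
The stack-or-pool fallback is commonly written along these lines. This is a sketch: the 256-char threshold is an assumption to tune by measurement, and ScratchBufferDemo.ToHex is a made-up example workload.

```csharp
using System;
using System.Buffers;

public static class ScratchBufferDemo
{
    private const int StackLimit = 256; // assumed threshold, in chars

    public static string ToHex(ReadOnlySpan<byte> data)
    {
        int charCount = data.Length * 2;

        // Small inputs use the stack; larger ones rent from the shared pool.
        char[]? rented = null;
        Span<char> scratch = charCount <= StackLimit
            ? stackalloc char[StackLimit]
            : (rented = ArrayPool<char>.Shared.Rent(charCount));

        try
        {
            const string Digits = "0123456789ABCDEF";
            for (int i = 0; i < data.Length; i++)
            {
                scratch[2 * i] = Digits[data[i] >> 4];
                scratch[2 * i + 1] = Digits[data[i] & 0xF];
            }

            // The only heap allocation on the small path is the final string.
            return new string(scratch.Slice(0, charCount));
        }
        finally
        {
            if (rented is not null)
                ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

The stackalloc size is a constant rather than the input length, so the stack cost is the same on every call and never depends on untrusted data.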

Benefits and risks

Benefits:

  • avoids heap allocation
  • very cheap for tiny buffers
  • excellent for hot-path scratch space
  • naturally works with Span<T>

Risks:

  • stack overflow if misused
  • harder readability if overused
  • dangerous if size is not controlled
  • not appropriate for buffers with longer lifetime

Part 6 — Real problems in a wafer inspection WPF desktop app

This is where these tools become meaningful.

Where they actually help

Parsing high-frequency binary machine messages

Machines often emit compact binary frames:

  • command acknowledgements
  • sensor values
  • motion status
  • fault data
  • timestamps
  • counters

This code is usually:

  • low-level
  • repetitive
  • latency-sensitive
  • allocation-sensitive

Using ReadOnlySpan<byte> here is a strong fit.

You can:

  • slice headers and payloads
  • decode primitives without copies
  • avoid per-message temporary arrays

Processing image metadata without copying

Large image buffers are expensive to move. Often you do not need to copy the whole image, or even the header.

You may only need:

  • dimensions
  • pixel format
  • row stride
  • channel layout
  • capture timestamp
  • exposure/gain metadata

Span-based readers are a good fit for this.

Slicing large image or telemetry buffers

A single acquisition buffer may contain:

  • header
  • image body
  • trailing metadata
  • checksum
  • appended inspection info

Creating separate arrays for each part is wasteful. Slicing is cleaner and cheaper.

Reducing temporary allocations in defect/result pipelines

Suppose a defect pipeline takes raw measurements and transforms them into normalized records. If that hot path:

  • builds temporary arrays
  • converts many small fragments into strings
  • repeatedly copies data into intermediate buffers

you get unnecessary churn.

This is a classic place for:

  • span-based parsing
  • span-based formatting
  • pooled buffers
  • careful boundary design

Writing hot-path low-level infrastructure code

Examples:

  • protocol parsers
  • frame decoders
  • checksum calculators
  • custom binary serializers
  • measurement transformation loops
  • small image utility routines

These are good candidates.


Where they usually do not belong

This is just as important.

These types usually do not belong in:

  • ViewModels
  • general business logic
  • workflow orchestration
  • command handlers
  • application services
  • most UI-facing code
  • stateful domain logic that is not performance-critical

Why?

Because in those places:

  • readability matters more
  • allocation cost is usually negligible
  • data flow is more important than raw buffer efficiency
  • async/stateful design is often dominant
  • the code is maintained by a broader team

If you push span-heavy style too far upward, you make the codebase harder to understand for very little gain.

A senior engineer isolates this style to the places where it pays.


Part 7 — Zero-allocation data processing patterns

Slicing instead of copying

This is the foundation.

Instead of:

  • allocate a subarray
  • copy data into it
  • process the copy

you:

  • create a slice
  • process the slice directly

This is often the single biggest conceptual shift.


Working over spans in hot loops

In hot loops, tiny inefficiencies compound.

If you are processing:

  • measurement batches
  • telemetry frames
  • image line metadata
  • protocol records

then using spans helps you stay close to the data without repeated object creation.


Parsing directly from spans

This is a very modern and useful design style.

Instead of:

  • convert bytes to string
  • create substrings
  • parse each substring

you:

  • read directly from ReadOnlySpan<byte> or ReadOnlySpan<char>
  • parse fields in place
  • only materialize strings when you truly need them

This is much more efficient for protocol and format parsing.


Formatting into spans/buffers

Output can also be low-allocation.

Instead of:

  • concatenate strings repeatedly
  • build many intermediate strings

you can:

  • format into a span
  • write into a reusable or stack-allocated buffer
  • materialize a final string only once, if needed

This matters in:

  • logging infrastructure
  • protocol generation
  • identifier formatting
  • repeated numeric formatting in hot paths

Avoiding intermediate strings, arrays, and lists

A lot of hidden cost comes from intermediate representations.

Examples:

  • Split() produces arrays
  • Substring() produces new strings
  • LINQ chains can create iterators and extra work
  • converting bytes to strings too early creates churn

In performance-critical code, the better question is:

can I stay on the original buffer a bit longer?

That is usually where spans pay off.


Examples

Binary protocol parsing

Take a ReadOnlySpan<byte>:

  • read fixed header
  • slice payload
  • validate checksum
  • decode fields directly

No subarray copies needed.
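
A compact sketch of those four steps. The frame layout here is an assumption chosen for illustration (2-byte little-endian length, payload, then a 1-byte XOR checksum); it is not a real protocol.

```csharp
using System;
using System.Buffers.Binary;

public static class ChecksummedFrameReader
{
    // Assumed layout: [length: UInt16 LE][payload: length bytes][checksum: XOR of payload bytes]
    public static bool TryReadPayload(ReadOnlySpan<byte> frame, out ReadOnlySpan<byte> payload)
    {
        payload = default;

        // Need at least the 2-byte length field and the 1-byte checksum.
        if (frame.Length < 3)
            return false;

        int length = BinaryPrimitives.ReadUInt16LittleEndian(frame);
        if (frame.Length < 2 + length + 1)
            return false;

        ReadOnlySpan<byte> body = frame.Slice(2, length);
        byte expected = frame[2 + length];

        byte actual = 0;
        foreach (byte b in body)
            actual ^= b;

        if (actual != expected)
            return false;

        payload = body; // still a view over the caller's buffer
        return true;
    }
}
```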

Image header parsing

Take the first N bytes of an image buffer:

  • decode width
  • decode height
  • decode pixel format
  • decode stride

Again, direct reads from slices.

Tokenizing input without allocation

Given text input, instead of Split(','), scan the ReadOnlySpan<char>, find separators, and process slices.

That is very useful in log parsing, command parsing, and compact text protocols.
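
As an illustration, the following sketch sums comma-separated integers without Split or Substring (CsvIntSummer is a hypothetical helper; int.TryParse here uses the current culture's number format):

```csharp
using System;

public static class CsvIntSummer
{
    // Returns false if any token is not a valid integer (including empty tokens).
    public static bool TrySum(ReadOnlySpan<char> text, out long sum)
    {
        sum = 0;
        while (true)
        {
            // The next token runs up to the next comma, or to the end of the input.
            int comma = text.IndexOf(',');
            ReadOnlySpan<char> token = comma < 0 ? text : text.Slice(0, comma);

            if (!int.TryParse(token.Trim(), out int value))
                return false;
            sum += value;

            if (comma < 0)
                return true;
            text = text.Slice(comma + 1);
        }
    }
}
```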

Transforming batches of measurement data

If a batch arrives as a contiguous buffer, you can iterate spans over fixed-size records instead of allocating record fragments repeatedly.
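
For example, assuming a made-up 6-byte record (UInt16 sensor id followed by an Int32 value, both little-endian), a batch can be walked like this:

```csharp
using System;
using System.Buffers.Binary;

public static class MeasurementBatch
{
    // Assumed record layout: [sensorId: UInt16 LE][value: Int32 LE] = 6 bytes.
    private const int RecordSize = 6;

    public static long SumValues(ReadOnlySpan<byte> batch)
    {
        long total = 0;

        // Walk the buffer one record at a time; a trailing partial record is ignored.
        while (batch.Length >= RecordSize)
        {
            ReadOnlySpan<byte> record = batch.Slice(0, RecordSize);
            total += BinaryPrimitives.ReadInt32LittleEndian(record.Slice(2));
            batch = batch.Slice(RecordSize);
        }

        return total;
    }
}
```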


Part 8 — How these features behave with async and pipelines

Why Span<T> cannot cross async/await

Span<T> is scoped to safe synchronous use.

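An await can suspend the method and resume it later, possibly on a different thread. Any local that must survive the suspension is hoisted into the async state machine, which lives on the heap once the method suspends, and a stack-only type cannot live there. The compiler therefore rejects any Span<T> that would have to stay alive across an await.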

So Span<T> is deliberately not allowed to flow across async boundaries.

This is one of the most important design rules to internalize.
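
The usual workaround is to keep an array or Memory<T> in the async method and create the span only inside a synchronous helper. A sketch, with hypothetical names:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncFrameCounter
{
    // Async side: the buffer is a plain array, which is free to live across awaits.
    public static async Task<int> CountMarkersAsync(Stream stream, byte marker)
    {
        byte[] buffer = new byte[4096];
        int count = 0;
        int read;

        while ((read = await stream.ReadAsync(buffer.AsMemory())) > 0)
        {
            // The span is a temporary that exists only for this synchronous call.
            count += CountMarkers(buffer.AsSpan(0, read), marker);
        }

        return count;
    }

    // Sync side: all span work happens here.
    private static int CountMarkers(ReadOnlySpan<byte> data, byte marker)
    {
        int count = 0;
        foreach (byte b in data)
        {
            if (b == marker)
                count++;
        }
        return count;
    }
}
```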


Why Memory<T> is often used for async-friendly pipelines

When data must:

  • survive across awaits
  • be queued
  • be stored in objects
  • move through channels
  • be processed later

you need a representation that can live on the heap safely.

That is what Memory<T> is for.

A common pattern is:

  • transport/pipeline boundary uses Memory<T> or ReadOnlyMemory<T>
  • synchronous processing method converts to span
  • parsing/transformation happens over span
  • result moves onward in a higher-level model

Designing APIs correctly

This is a very mature design pattern:

  • sync hot-path API: ReadOnlySpan<byte> or Span<byte>
  • async/boundary API: ReadOnlyMemory<byte> or Memory<byte>

That gives you:

  • high performance where it matters
  • correct lifetime behavior across async stages
  • cleaner separation of concerns

Example pattern

  • acquisition stage reads bytes and exposes ReadOnlyMemory<byte>
  • pipeline queues that memory to another stage
  • parser method receives ReadOnlySpan<byte>
  • parser works synchronously over that span
  • output is mapped to a higher-level record

That combination is very common in well-designed systems.
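
One way to sketch that flow is with System.Threading.Channels. Everything here (PipelineSketch, the six-byte backing array, the Sum helper) is illustrative rather than a prescribed design:

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class PipelineSketch
{
    public static async Task<int> RunAsync()
    {
        Channel<ReadOnlyMemory<byte>> channel =
            Channel.CreateUnbounded<ReadOnlyMemory<byte>>();

        // Acquisition stage: publishes two slices of one backing array, with no per-frame copy.
        byte[] backing = { 1, 2, 3, 4, 5, 6 };
        await channel.Writer.WriteAsync(backing.AsMemory(0, 3));
        await channel.Writer.WriteAsync(backing.AsMemory(3, 3));
        channel.Writer.Complete();

        // Processing stage: memory crosses the await, the span is taken per frame.
        int total = 0;
        await foreach (ReadOnlyMemory<byte> frame in channel.Reader.ReadAllAsync())
        {
            total += Sum(frame.Span);
        }

        return total; // 1 + 2 + 3 + 4 + 5 + 6 = 21
    }

    private static int Sum(ReadOnlySpan<byte> data)
    {
        int sum = 0;
        foreach (byte b in data)
            sum += b;
        return sum;
    }
}
```

Since the queued frames are views, the producer must not overwrite the backing array until the consumer has finished with them. That ownership rule is usually what pooled owner types are introduced to enforce.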


Part 9 — Practical .NET usage

Here are realistic examples.

1. Method taking ReadOnlySpan<byte>

csharp
using System;
using System.Buffers.Binary;

public readonly record struct MachineMessageHeader(
    ushort MessageType,
    ushort Version,
    int PayloadLength,
    uint SequenceNumber);

public static class MachineProtocolParser
{
    // Header layout used in this example (little-endian), 12 bytes in total:
    //   offset 0: MessageType    (UInt16)
    //   offset 2: Version        (UInt16)
    //   offset 4: PayloadLength  (Int32)
    //   offset 8: SequenceNumber (UInt32)
    private const int HeaderSize = 12;

    public static bool TryParseHeader(
        ReadOnlySpan<byte> buffer,
        out MachineMessageHeader header)
    {
        header = default;

        // Reject short buffers up front so the slices below cannot throw.
        if (buffer.Length < HeaderSize)
            return false;

        // Each Slice is a view over the caller's buffer; no bytes are copied.
        ushort messageType = BinaryPrimitives.ReadUInt16LittleEndian(buffer.Slice(0, 2));
        ushort version = BinaryPrimitives.ReadUInt16LittleEndian(buffer.Slice(2, 2));
        int payloadLength = BinaryPrimitives.ReadInt32LittleEndian(buffer.Slice(4, 4));
        uint sequence = BinaryPrimitives.ReadUInt32LittleEndian(buffer.Slice(8, 4));

        header = new MachineMessageHeader(messageType, version, payloadLength, sequence);
        return true;
    }
}

Why this is good:

  • no subarray allocation
  • direct reads from the original buffer
  • easy to compose into a larger parser

2. Parsing a packet from ReadOnlyMemory<byte> without copying the payload

csharp
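using System;

public readonly record struct InspectionPacket(
    MachineMessageHeader Header,
    ReadOnlyMemory<byte> Payload);

public static class InspectionPacketReader
{
    // Must match the header size that MachineProtocolParser expects.
    private const int HeaderSize = 12;

    public static bool TryParsePacket(
        ReadOnlyMemory<byte> packetMemory,
        out InspectionPacket packet)
    {
        packet = default;

        // Take the span once, inside this synchronous method.
        ReadOnlySpan<byte> packetSpan = packetMemory.Span;

        if (!MachineProtocolParser.TryParseHeader(packetSpan, out var header))
            return false;

        // PayloadLength comes off the wire, so treat it as untrusted: reject
        // negative values and compare by subtraction so the check cannot overflow.
        // (A successful header parse guarantees packetSpan.Length >= HeaderSize.)
        if (header.PayloadLength < 0 ||
            header.PayloadLength > packetSpan.Length - HeaderSize)
            return false;

        // Slice the Memory (not the span) so the payload can be stored in the packet.
        ReadOnlyMemory<byte> payload = packetMemory.Slice(HeaderSize, header.PayloadLength);
        packet = new InspectionPacket(header, payload);
        return true;
    }
}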

This is a nice pattern:

  • boundary type is ReadOnlyMemory<byte>
  • parsing uses Span
  • payload remains slice-based, not copied

3. Using stackalloc for a small temporary formatting buffer

csharp
using System;

public static class DefectCodeFormatter
{
    // Produces e.g. "L12-C34-D7" for line 12, column 34, defect 7.
    public static string FormatDefectCode(int line, int column, int defectId)
    {
        // Three ints of up to roughly 11 chars each plus 5 literal chars fit easily in 64.
        // No Clear() is needed: every char that ends up in the result is written below.
        Span<char> buffer = stackalloc char[64];
        int written = 0;

        "L".AsSpan().CopyTo(buffer.Slice(written));
        written += 1;

        // TryFormat writes the digits straight into the remaining part of the buffer.
        if (!line.TryFormat(buffer.Slice(written), out int lineWritten))
            throw new InvalidOperationException("Failed to format line.");
        written += lineWritten;

        "-C".AsSpan().CopyTo(buffer.Slice(written));
        written += 2;

        if (!column.TryFormat(buffer.Slice(written), out int columnWritten))
            throw new InvalidOperationException("Failed to format column.");
        written += columnWritten;

        "-D".AsSpan().CopyTo(buffer.Slice(written));
        written += 2;

        if (!defectId.TryFormat(buffer.Slice(written), out int defectWritten))
            throw new InvalidOperationException("Failed to format defect ID.");
        written += defectWritten;

        // The final string is the only heap allocation in this method.
        return new string(buffer.Slice(0, written));
    }
}

This avoids several intermediate strings during formatting.

Would I use this everywhere? No. Would I use it in a hot path that formats huge volumes of identifiers? Possibly yes.


4. Converting Memory<T> to Span<T> safely in a synchronous scope

csharp
using System;

public static class TelemetryNormalizer
{
    public static void NormalizeInPlace(Memory<float> samples, float offset, float scale)
    {
        Span<float> span = samples.Span;

        for (int i = 0; i < span.Length; i++)
        {
            span[i] = (span[i] + offset) * scale;
        }
    }
}

This is fine because the span is only used inside the synchronous method body.


5. Slicing buffers

csharp
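using System;
using System.Buffers.Binary;

public static class ImageHeaderReader
{
    public static bool TryReadDimensions(ReadOnlySpan<byte> buffer, out int width, out int height)
    {
        width = 0;
        height = 0;

        const int HeaderLength = 16;
        if (buffer.Length < HeaderLength)
            return false;

        // All three slices are views over the caller's buffer.
        ReadOnlySpan<byte> metadata = buffer.Slice(0, HeaderLength);
        ReadOnlySpan<byte> widthBytes = metadata.Slice(4, 4);
        ReadOnlySpan<byte> heightBytes = metadata.Slice(8, 4);

        // Assumes the header stores little-endian values. BinaryPrimitives makes the
        // byte order explicit, whereas BitConverter follows the host CPU's endianness.
        width = BinaryPrimitives.ReadInt32LittleEndian(widthBytes);
        height = BinaryPrimitives.ReadInt32LittleEndian(heightBytes);

        return true;
    }
}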

Even this simple pattern is useful: slice the data you need, do not create a new buffer for it.


Part 10 — Common mistakes

1. Using Span<T> everywhere just because it is fast

This is a classic overreaction.

Engineers learn that spans reduce allocations, then start pushing them into every method. The result is often:

  • harder APIs
  • more confusing code
  • lifetime issues
  • little measurable benefit

Most code does not need this style.

Use it where data movement and allocation are actually part of the problem.


2. Trying to store Span<T> in class fields

This happens because people think of span as “just another buffer type.”

It is not.

It is intentionally scoped. If you try to turn it into long-lived object state, you are fighting the design.

If you need storage, use:

  • Memory<T>
  • ReadOnlyMemory<T>
  • arrays
  • pooled owners

not Span<T>.


3. Using stackalloc for large buffers

This is dangerous.

A small stackalloc is a nice optimization. A large one is a reliability risk.

In production, this can turn into intermittent failures or stack overflows, especially if:

  • the method is recursive
  • multiple stack-heavy calls are nested
  • buffer sizes vary more than expected

Use it only for small, bounded scratch space.


4. Introducing hard-to-read code for tiny gains

Very common.

For example, replacing readable parsing code with intricate span logic in a path that runs a few times per user action is usually not worth it.

The result:

  • harder maintenance
  • fewer engineers confident to change the code
  • more bug risk
  • no meaningful product impact

Performance code should pay rent.


5. Misunderstanding Memory<T> vs Span<T>

A lot of confusion comes from treating them as interchangeable.

They are related, but they play different roles:

  • Span<T> for immediate synchronous access
  • Memory<T> for storage and async-friendly boundaries

If you use the wrong one, the design becomes awkward.


6. Forcing low-allocation patterns into application-layer code

This is a maturity issue.

ViewModels, orchestration services, domain workflows, and UI commands usually benefit more from:

  • clarity
  • composability
  • testability
  • explicit business intent

Low-level buffer tricks there usually reduce quality.


7. Optimizing before profiling or benchmarking

This is maybe the most important mistake.

You may spend hours rewriting code with spans and stackalloc, then discover:

  • the bottleneck was database I/O
  • the UI thread was blocked by rendering
  • the real problem was image decode, not header parsing
  • allocations were not on the hot path at all

These features are powerful, but they still need measurement discipline.


Part 11 — Performance and memory trade-offs

Reduced allocations vs increased complexity

This is the central trade-off.

You often get:

  • fewer temporary objects
  • less copying
  • lower GC pressure

But you also get:

  • stricter lifetime rules
  • less familiar APIs
  • more low-level code
  • more mental overhead

Good engineers do not ignore either side.


Stack usage limits

stackalloc is fast because stack usage is simple and local. But the stack is much more limited than the heap.

So stack-based optimization is good when the amount is:

  • small
  • fixed or tightly bounded
  • short-lived

It is a bad choice for “maybe this payload is 32 bytes, maybe 200 KB.”


Zero-copy vs readability

Zero-copy is not automatically better.

Sometimes copying a tiny amount of data into a clearer, safer structure is absolutely the right engineering decision.

For example:

  • small config parsing done rarely
  • UI-driven actions
  • business-layer orchestration
  • code touched by many engineers

The goal is not purity. The goal is proportional optimization.


API flexibility vs low-level efficiency

A low-level API taking ReadOnlySpan<byte> can be excellent internally. But if you expose that style too broadly in public or general-purpose layers, you may reduce usability.

There is often a good balance:

  • keep hot-path helpers efficient
  • keep higher-level application APIs ergonomic

When the win is real vs negligible

The win is real when:

  • the code is on a hot path
  • data is large or frequent
  • allocations are measurable
  • GC noise is visible
  • the code processes streams, buffers, or binary data
  • copying is happening repeatedly

The win is negligible when:

  • the code runs rarely
  • data is tiny
  • overall latency is dominated elsewhere
  • readability suffers more than performance improves

That is the engineering judgment part.


Part 12 — API design with Span<T> / Memory<T>

Designing low-level APIs

These types are great in low-level APIs like:

  • binary parsers
  • protocol readers
  • image header utilities
  • checksum/hash routines
  • encoding/decoding helpers
  • numeric formatting/parsing helpers

Typical pattern:

  • input: ReadOnlySpan<T>
  • output target: Span<T>
  • async/storage boundary: Memory<T> or owner abstractions

That is usually a strong design.
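
A sketch of that shape, following the Try... convention used by the framework's own formatting APIs. FrameWriter and its length-prefixed layout are assumptions made for the example.

```csharp
using System;
using System.Buffers.Binary;

public static class FrameWriter
{
    // Writes [length: UInt16 LE][payload] into a caller-supplied buffer.
    // Returns false (and writes nothing useful) if the destination is too small.
    public static bool TryWriteFrame(
        ReadOnlySpan<byte> payload,
        Span<byte> destination,
        out int bytesWritten)
    {
        bytesWritten = 0;

        if (payload.Length > ushort.MaxValue || destination.Length < 2 + payload.Length)
            return false;

        BinaryPrimitives.WriteUInt16LittleEndian(destination, (ushort)payload.Length);
        payload.CopyTo(destination.Slice(2));
        bytesWritten = 2 + payload.Length;
        return true;
    }
}
```

The caller decides where the output lives (stack, pooled array, or part of a larger send buffer), which is what keeps the helper allocation-free.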


When public APIs should expose these types

Expose them when the API is clearly:

  • low-level
  • performance-sensitive
  • buffer-oriented
  • used by engineers comfortable with this style

Examples:

  • internal SDKs
  • infrastructure libraries
  • parsing libraries
  • image-processing utilities
  • transport/protocol components

When to keep them internal

Keep them internal when:

  • the benefit is local
  • the external API should remain simpler
  • most consumers do not need to think about memory views
  • the abstraction would leak low-level concerns upward

This is very often the right call in application codebases.


Example judgments

Parsing helpers

Great fit for ReadOnlySpan<byte> or ReadOnlySpan<char>.

Protocol readers

Excellent fit. This is one of the best use cases.

Image-processing utilities

Very good fit, especially for headers, rows, channels, and pixel buffers.

Internal infrastructure libraries

Often a good fit if the library is performance-critical.

But for top-level application services, a more domain-oriented API is often better.


Part 13 — Connection to other advanced features

ref struct

Span<T> is one of the most important real-world examples of a ref struct.

This is how C# enforces:

  • stack-only usage
  • tighter lifetime control
  • safe direct memory access patterns

So understanding spans helps you understand why ref struct exists.


ref, in, out

These features all live in the same broader world:

  • controlling copying
  • improving performance
  • handling data more directly

Examples:

  • in helps avoid copying large structs, most reliably when the struct is readonly (otherwise the compiler may insert defensive copies)
  • ref lets you operate on existing storage
  • spans are essentially a safer view-based model over memory

They are different tools, but they belong to the same mental model:

data movement matters


ArrayPool<T>

Spans often become even more useful when combined with pooled memory.

A common pattern is:

  • rent an array from ArrayPool<T> (the pool may return one larger than requested)
  • expose slices as spans
  • process without further allocation
  • return the array to the pool

This is common in:

  • protocol readers
  • serialization
  • high-throughput buffering
  • image pipelines
  • reusable transport layers
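
The rent/slice/return cycle can be sketched as follows (PooledReader is a hypothetical name, and the 4096-byte request is an arbitrary choice):

```csharp
using System;
using System.Buffers;
using System.IO;

public static class PooledReader
{
    public static int SumStream(Stream stream)
    {
        // Rent may return an array larger than requested; never assume its exact length.
        byte[] rented = ArrayPool<byte>.Shared.Rent(4096);
        try
        {
            int total = 0;
            int read;
            while ((read = stream.Read(rented, 0, rented.Length)) > 0)
            {
                // Only the first 'read' bytes are valid, so slice before processing.
                ReadOnlySpan<byte> chunk = rented.AsSpan(0, read);
                foreach (byte b in chunk)
                    total += b;
            }
            return total;
        }
        finally
        {
            // Always return the array, and do not touch it afterwards.
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```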

Low-allocation async pipelines

This is where Memory<T> becomes important.

A modern high-performance async pipeline often uses:

  • Memory<T> or ReadOnlyMemory<T> across async stages
  • Span<T> inside the synchronous inner loop
  • pooled buffers for reuse
  • owner abstractions for explicit lifetime management

That combination is very practical.


Benchmarking and profiling

These features are exactly the kind of thing you should validate with measurement.

Good questions:

  • did allocations actually drop?
  • did throughput improve?
  • did p99 latency improve?
  • did GC collections drop?
  • did code complexity increase too much?

Without profiling and benchmarking, it is easy to overuse these tools.
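
For a quick first check of the "did allocations actually drop?" question, GC.GetAllocatedBytesForCurrentThread can be sampled around a call. The sketch below is a rough probe, not a benchmark harness; AllocationProbe and the "LOT=" prefix are made up for the example. A dedicated tool such as BenchmarkDotNet with its MemoryDiagnoser gives more trustworthy numbers.

```csharp
using System;

public static class AllocationProbe
{
    // Returns the number of bytes allocated on the current thread while 'action' runs.
    public static long Measure(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Allocates a temporary string for the digits after the 4-char "LOT=" prefix.
    public static int ParseWithSubstring(string s) => int.Parse(s.Substring(4));

    // Parses the same digits through a slice of the original string.
    public static int ParseWithSpan(string s) => int.Parse(s.AsSpan(4));
}
```

Call each method once before measuring so that one-time initialization does not distort the comparison, then compare `Measure(() => AllocationProbe.ParseWithSubstring(text))` against `Measure(() => AllocationProbe.ParseWithSpan(text))`.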


Part 14 — Senior engineer mental model

How experienced engineers think about data movement

Strong engineers look at a pipeline and ask:

  • where is data being copied?
  • where are temporary allocations created?
  • how often does this path run?
  • how large is the data?
  • is GC pressure showing up in traces or profiles?
  • do we need a new object here, or just a view?

That mindset is far more important than memorizing APIs.


How to decide when these features are justified

They are justified when:

  • the code is performance-critical
  • memory churn is measurable
  • the operation is buffer-oriented
  • parsing/formatting/transformation happens at high frequency
  • zero-copy design meaningfully improves throughput or stability

They are not justified just because the code “looks more advanced.”


How to isolate performance-critical code

This is one of the most important design habits.

Keep low-level performance code:

  • small
  • focused
  • well-tested
  • close to infrastructure boundaries
  • hidden behind clean abstractions

For example:

  • parser utility uses spans internally
  • application service receives a normal parsed model
  • ViewModel never sees the low-level buffer logic

That is the right separation.


How to keep it maintainable and testable

A good pattern is:

  • isolate span-heavy code in dedicated components
  • keep methods short and purpose-specific
  • validate boundaries carefully
  • test with representative data
  • benchmark before and after
  • document why the low-level optimization exists

If a method is span-heavy and hard to read, it should earn that complexity.


How to avoid turning the whole codebase into performance code

This is the maturity point.

The best systems do not try to make every layer low-allocation.

They use the right style in the right layer:

  • low-level transport/parsing/image utilities: span/memory/pool/stackalloc can be appropriate

  • domain logic and workflows: clarity and correctness dominate

  • UI/ViewModels/application services: readability, state management, and testability dominate

That balance is what separates strong engineering judgment from cargo-cult optimization.


Final practical summary

Span<T> and friends exist because copying and allocating too much is expensive in real systems.

The simple mental model is:

  • Span<T> = fast, stack-scoped view for synchronous work
  • ReadOnlySpan<T> = read-only version for parsing/scanning/inspection
  • Memory<T> = heap-storable, async-friendly memory abstraction
  • ReadOnlyMemory<T> = read-only async/storage-friendly version
  • stackalloc = tiny temporary stack buffer for carefully bounded hot-path work

They shine in:

  • binary protocol parsing
  • image metadata/header processing
  • telemetry/result pipelines
  • low-level infrastructure code
  • high-frequency buffer transformations

They usually do not belong in:

  • ViewModels
  • workflow orchestration
  • most business logic
  • general application-layer code

The real senior-level lesson is this:

Do not optimize syntax. Optimize data movement.

When data is large, frequent, or hot-path, these features can make a real difference. When the path is not performance-critical, they mostly add complexity.

Use them deliberately, measure the result, and keep the complexity contained.

