roto

Zero-allocation Rust protobuf reader and writer.

Overview

Instead of deserializing binary protobuf data into Rust structs, roto scans a message once on construction — recording the byte offset of each field — then reads fields on demand directly from the original bytes. No heap allocation, no data copying, no full deserialization upfront.

Writing works the same way: you provide a fixed buffer and a builder writes fields directly into it, returning a slice of the bytes written.

Design

roto is a protoc plugin. protoc serializes the parsed .proto files into a CodeGeneratorRequest message; protoc-gen-roto (in src/bin/protoc-gen-roto.rs) reads that request from stdin, generates Rust source, and writes a CodeGeneratorResponse to stdout. protoc then writes the .rs files named in the response to disk. The generated files are included directly in the crate that uses the messages.

Generated code

For each protobuf message roto generates two types:

  • Reader struct MessageName<'a> — borrows the original byte slice, zero-copy.
  • Builder struct MessageNameBuilder<'b> — writes into a caller-provided &mut [u8].

Nested message types are placed in a pub mod message_name { ... } module (snake_case of the parent message name) within the same generated file.
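As a rough illustration of that layout, the sketch below shows what a generated file might contain for a message Hello with a nested message InnerWorld. The field names (data, buf) and the absence of methods are assumptions made for illustration; the real generated code is not reproduced here.

```rust
// Hypothetical layout of a generated file for a message `Hello` that
// contains a nested message `InnerWorld`. Field names are assumptions.

/// Reader: borrows the encoded message bytes.
pub struct Hello<'a> {
    pub data: &'a [u8],
}

/// Builder: writes into a caller-provided buffer.
pub struct HelloBuilder<'b> {
    pub buf: &'b mut [u8],
}

/// Nested types live in a module named after the parent message in snake_case.
pub mod hello {
    /// Reader for the nested message.
    pub struct InnerWorld<'a> {
        pub data: &'a [u8],
    }

    /// Builder for the nested message.
    pub struct InnerWorldBuilder<'b> {
        pub buf: &'b mut [u8],
    }
}
```

The nested types are then reachable as hello::InnerWorld and hello::InnerWorldBuilder.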

Sample usage

Given this proto definition:

message Hello {
    string hello_world = 1;
    message InnerWorld {
        string thought = 1;
    }
    InnerWorld inner_world = 2;
}

Reading

fn parse_proto(data: &[u8]) -> roto::Result<String> {
    // Scan the data once, recording field offsets
    let hello = Hello::new(data)?;

    // String fields return &str borrowed from the original bytes (zero-copy)
    let hello_world: &str = hello.hello_world()?;

    // Nested message fields return &[u8]; construct the nested reader from those bytes
    let inner_bytes: &[u8] = hello.inner_world()?;
    let inner_world = hello::InnerWorld::new(inner_bytes)?;
    let thought: &str = inner_world.thought()?;

    Ok(format!("{} is about {}", hello_world, thought))
}

Fields absent from the binary data return Err(roto::RotoError::FieldNotFound).
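In proto3, a scalar field left at its default value is not written to the wire, so a missing field often just means "use the default". The sketch below shows one way a caller might fold that case into a default value while still propagating other errors. SketchError is a stand-in for roto::RotoError, and or_default is a hypothetical helper that is not part of roto.

```rust
/// Stand-in for roto::RotoError; only the variant names used here are assumed.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    FieldNotFound,
    Malformed,
}

/// Replace a "field not found" error with a default value.
/// Every other error is passed through unchanged.
pub fn or_default<T>(result: Result<T, SketchError>, default: T) -> Result<T, SketchError> {
    match result {
        Err(SketchError::FieldNotFound) => Ok(default),
        other => other,
    }
}
```

With roto itself the same match would presumably be written against roto::RotoError::FieldNotFound, for example around a call such as hello.hello_world().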

Writing

Nested messages must be serialized into a scratch buffer first, then embedded as raw bytes in the outer builder.

fn build_proto(buf: &mut [u8]) -> roto::Result<&[u8]> {
    // Serialize the inner message first
    let mut inner_buf = [0u8; 256];
    let inner_bytes = hello::InnerWorldBuilder::builder(&mut inner_buf)
        .thought("some thought")?
        .finish()?;

    // Build the outer message, embedding the serialized inner bytes
    let written = HelloBuilder::builder(buf)
        .hello_world("some world")?
        .inner_world(inner_bytes)?
        .finish()?; // Result<&'b mut [u8]>: the written portion of buf

    // Result<&mut [u8]> does not coerce to Result<&[u8]>, so unwrap and re-wrap
    Ok(written)
}

Builder methods consume self and return Result<Self>, enabling ?-based chaining. finish() returns Result<&'b mut [u8]> — a slice of the portion of the buffer that was written.
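To make the consume-and-return pattern concrete, here is a minimal self-contained sketch of a fixed-buffer builder. It is not roto's ProtoBuilder: SketchBuilder, SketchError, and bytes_field are hypothetical names, and the sketch only supports length-delimited fields whose tag and length each fit in one byte.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    BufferTooSmall,
    Unsupported,
}

/// Hypothetical builder that writes into a caller-provided buffer.
pub struct SketchBuilder<'b> {
    buf: &'b mut [u8],
    len: usize,
}

impl<'b> SketchBuilder<'b> {
    pub fn builder(buf: &'b mut [u8]) -> Self {
        Self { buf, len: 0 }
    }

    /// Append a length-delimited field. Consumes self and returns it on
    /// success, so calls can be chained with `?`.
    pub fn bytes_field(mut self, field_number: u8, payload: &[u8]) -> Result<Self, SketchError> {
        // Single-byte tag and length only: field numbers 1..=15, payloads < 128 bytes.
        if field_number == 0 || field_number > 15 || payload.len() >= 128 {
            return Err(SketchError::Unsupported);
        }
        let start = self.len;
        let end = start + 2 + payload.len();
        if end > self.buf.len() {
            return Err(SketchError::BufferTooSmall);
        }
        self.buf[start] = (field_number << 3) | 2; // wire type 2 = length-delimited
        self.buf[start + 1] = payload.len() as u8;
        self.buf[start + 2..end].copy_from_slice(payload);
        self.len = end;
        Ok(self)
    }

    /// Return the written prefix of the buffer with the buffer's own lifetime.
    pub fn finish(self) -> Result<&'b mut [u8], SketchError> {
        let SketchBuilder { buf, len } = self;
        Ok(&mut buf[..len])
    }
}
```

Because each method takes self by value, a failed write drops the builder, and the caller cannot keep appending to a half-written message.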

Repeated fields

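Accessors for repeated fields return a RepeatedFieldIterator<'a>, which yields items of type Result<(&[u8], WireType)>. The example below assumes Hello also declares a repeated field named tags, which the sample definition above does not include.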

let hello = Hello::new(data)?;
for item in hello.tags() {
    let (value_bytes, _wire_type) = item?;
    // decode value_bytes according to the expected wire type
}
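The decoding step in the loop above could look like the following for a repeated string field. This is a sketch under assumptions: SketchWireType stands in for roto's WireType, ItemError and decode_string_item are hypothetical, and value_bytes is assumed to hold only the string payload without its length prefix.

```rust
/// Stand-in for roto's WireType, reduced to two variants for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SketchWireType {
    Varint,
    LengthDelimited,
}

/// Hypothetical error type for decoding a single repeated item.
#[derive(Debug, PartialEq, Eq)]
pub enum ItemError {
    UnexpectedWireType(SketchWireType),
    InvalidUtf8,
}

/// Decode one item of a repeated string field without copying.
/// Assumes `value_bytes` excludes the length prefix.
pub fn decode_string_item(
    value_bytes: &[u8],
    wire_type: SketchWireType,
) -> Result<&str, ItemError> {
    // Strings are always length-delimited on the wire.
    if wire_type != SketchWireType::LengthDelimited {
        return Err(ItemError::UnexpectedWireType(wire_type));
    }
    std::str::from_utf8(value_bytes).map_err(|_| ItemError::InvalidUtf8)
}
```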

Runtime API

The core runtime in src/lib.rs provides:

  • ProtoAccessor<'a> — scans a message's fields and reads values at recorded offsets.
  • ProtoBuilder<'a> — writes fields into a provided &mut [u8] buffer.
  • FieldIterator<'a> / RepeatedFieldIterator<'a> — iterators over fields and repeated fields.
  • Tag, WireType — protobuf encoding primitives.
  • read_varint, write_varint, skip_value — low-level wire-format helpers.
  • RotoError, Result<T> — error type and alias.
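For readers new to the wire format, the sketch below illustrates the kind of work the tag and varint primitives do. The encoding rules follow the protobuf encoding guide; the names and signatures (SketchWireType, make_tag, split_tag, write_varint_sketch) are assumptions and may not match roto's actual Tag, WireType, or write_varint.

```rust
/// Wire types from the protobuf encoding guide (deprecated group types omitted).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SketchWireType {
    Varint = 0,
    Fixed64 = 1,
    LengthDelimited = 2,
    Fixed32 = 5,
}

/// Pack a field number and wire type into a tag value.
pub fn make_tag(field_number: u32, wire_type: SketchWireType) -> u32 {
    (field_number << 3) | wire_type as u32
}

/// Split a tag value into its field number and wire type.
/// Returns None for wire types this sketch does not model.
pub fn split_tag(tag: u32) -> Option<(u32, SketchWireType)> {
    let wire_type = match tag & 0x7 {
        0 => SketchWireType::Varint,
        1 => SketchWireType::Fixed64,
        2 => SketchWireType::LengthDelimited,
        5 => SketchWireType::Fixed32,
        _ => return None,
    };
    Some((tag >> 3, wire_type))
}

/// Write `value` as a base-128 varint at the start of `buf`.
/// Returns the number of bytes written, or None if `buf` is too small.
pub fn write_varint_sketch(mut value: u64, buf: &mut [u8]) -> Option<usize> {
    let mut written = 0;
    loop {
        let slot = buf.get_mut(written)?;
        written += 1;
        if value < 0x80 {
            // Last byte: continuation bit clear.
            *slot = value as u8;
            return Some(written);
        }
        // Low 7 bits with the continuation bit set.
        *slot = (value as u8 & 0x7f) | 0x80;
        value >>= 7;
    }
}
```

For example, field 1 with the length-delimited wire type packs to the single tag byte 0x0a, and the value 150 encodes as the two bytes 0x96 0x01.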

High-level design

On construction (MessageName::new(data)), the generated reader struct iterates over the binary data once using FieldIterator and records the byte offset of each field's tag. Subsequent field accesses call ProtoAccessor::get_value_at(offset) and never re-scan the message. For repeated fields, the start and end offsets of the field range are recorded so that iteration stays within that range.
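The scan-once, read-on-demand idea can be sketched in a few dozen lines. The code below is an illustration under assumptions, not roto's implementation: SketchReader and read_varint_at are hypothetical, only varint and length-delimited wire types are handled, and offsets are kept in a small fixed array for field numbers 0 to 3.

```rust
/// Read a varint starting at `pos`; return the value and the position after it.
fn read_varint_at(buf: &[u8], mut pos: usize) -> Option<(u64, usize)> {
    let mut value = 0u64;
    let mut shift = 0u32;
    while shift < 64 {
        let byte = *buf.get(pos)?;
        pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((value, pos));
        }
        shift += 7;
    }
    None // more than 10 bytes: not a valid varint
}

/// Hypothetical reader that records tag offsets once and reads lazily.
pub struct SketchReader<'a> {
    data: &'a [u8],
    offsets: [Option<usize>; 4], // indexed by field number
}

impl<'a> SketchReader<'a> {
    /// Single pass over the message: record where each field's tag starts.
    pub fn new(data: &'a [u8]) -> Option<Self> {
        let mut offsets = [None; 4];
        let mut pos = 0;
        while pos < data.len() {
            let tag_offset = pos;
            let (tag, after_tag) = read_varint_at(data, pos)?;
            let end = match tag & 0x7 {
                0 => read_varint_at(data, after_tag)?.1,
                2 => {
                    let (len, start) = read_varint_at(data, after_tag)?;
                    if len > (data.len() - start) as u64 {
                        return None; // payload runs past the end of the buffer
                    }
                    start + len as usize
                }
                _ => return None, // other wire types are out of scope here
            };
            // Fields outside the tracked range are skipped, not recorded.
            if let Some(slot) = offsets.get_mut((tag >> 3) as usize) {
                *slot = Some(tag_offset);
            }
            pos = end;
        }
        Some(Self { data, offsets })
    }

    /// Read a length-delimited field from its recorded offset, without re-scanning.
    pub fn bytes_field(&self, field_number: usize) -> Option<&'a [u8]> {
        let tag_offset = (*self.offsets.get(field_number)?)?;
        let (tag, after_tag) = read_varint_at(self.data, tag_offset)?;
        if tag & 0x7 != 2 {
            return None;
        }
        let (len, start) = read_varint_at(self.data, after_tag)?;
        self.data.get(start..start.checked_add(len as usize)?)
    }
}
```

The returned slice borrows from the original input, so no bytes are copied and nothing is allocated on the heap.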

Literature

https://protobuf.dev/programming-guides/encoding/
