2026-05-13

Building a Coding Agent in Zig: conversation

Hypercode conversation illustration

In the third post, we got Hypercode talking. One question, one answer, no memory between them. This post adds the layer that turns single-shot into conversation: a Session that holds history, and a REPL loop that chains turns together.

The code is still at github.com/alexisbchz/hypercode.

Why multi-turn?

Every coding agent you use is turn-by-turn. You say "fix this", it fixes; you say "no, the previous version was better", it understands "previous" relative to what was just discussed. Without memory, "previous" means nothing.

Mechanically, it's trivial: keep the list of exchanged messages and send the whole list on each call. OpenAI, Anthropic, and OpenRouter all expect a messages array, not a single prompt. We already have that shape in openrouter.zig; so far we have only ever filled it with a single element.
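To make the mechanics concrete, here is a minimal sketch of what goes out on two consecutive calls. The Message struct mirrors the wire shape described above; the assistant's reply text is invented for illustration.

```zig
const std = @import("std");

// Mirrors the wire shape: a role string plus content (assumed layout).
const Message = struct {
    role: []const u8,
    content: []const u8,
};

// First call: the history is a single user message.
const turn_1 = [_]Message{
    .{ .role = "user", .content = "fix this" },
};

// Second call: the whole history goes out again, plus the new user message.
const turn_2 = [_]Message{
    .{ .role = "user", .content = "fix this" },
    .{ .role = "assistant", .content = "Done, I rewrote the loop." },
    .{ .role = "user", .content = "no, the previous version was better" },
};

test "each completed turn adds two messages to the next call" {
    try std.testing.expectEqual(@as(usize, 1), turn_1.len);
    try std.testing.expectEqual(@as(usize, 3), turn_2.len);
    // The earlier exchange is resent verbatim, which is what gives
    // "previous" something to refer to.
    try std.testing.expectEqualStrings(turn_1[0].content, turn_2[0].content);
}
```

Saved to a file, this should run with zig test. The model itself keeps no state between calls; the array is the memory.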

The shape: Session

A new file, src/session.zig:

src/session.zig
const std = @import("std");

pub const Role = enum { user, assistant };

pub const Message = struct {
    role: Role,
    content: []const u8,
};

pub const Session = struct {
    gpa: std.mem.Allocator,
    messages: std.ArrayList(Message),

    pub fn init(gpa: std.mem.Allocator) Session {
        return .{ .gpa = gpa, .messages = .empty };
    }

    pub fn deinit(self: *Session) void {
        for (self.messages.items) |m| self.gpa.free(m.content);
        self.messages.deinit(self.gpa);
    }

    pub fn append(self: *Session, role: Role, content: []const u8) !void {
        const owned = try self.gpa.dupe(u8, content);
        errdefer self.gpa.free(owned);
        try self.messages.append(self.gpa, .{ .role = role, .content = owned });
    }
};

Four decisions worth a word.

Decision | Reason
Role is an enum, not a []const u8 | A misspelled role such as .useer fails to compile, whereas a misspelled string would compile without complaint.
append copies content via gpa.dupe | The caller can free its source buffer right away. No shared lifetime.
errdefer after the dupe | If the ArrayList push fails (OOM), we free the copy. No leak.
std.ArrayList(Message) rather than a static buffer | We're prototyping. We'll migrate to static allocation once we know the maximum.
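The ownership rule is easy to check with a test. The sketch below repeats a compact copy of Session so that it compiles on its own, and assumes the same std.ArrayList API the post uses (the .empty initializer and allocator-taking append). The test name and message text are ours.

```zig
const std = @import("std");

const Role = enum { user, assistant };
const Message = struct { role: Role, content: []const u8 };

// Compact copy of the Session shown above, repeated so this block compiles alone.
const Session = struct {
    gpa: std.mem.Allocator,
    messages: std.ArrayList(Message),

    fn init(gpa: std.mem.Allocator) Session {
        return .{ .gpa = gpa, .messages = .empty };
    }

    fn deinit(self: *Session) void {
        for (self.messages.items) |m| self.gpa.free(m.content);
        self.messages.deinit(self.gpa);
    }

    fn append(self: *Session, role: Role, content: []const u8) !void {
        const owned = try self.gpa.dupe(u8, content);
        errdefer self.gpa.free(owned);
        try self.messages.append(self.gpa, .{ .role = role, .content = owned });
    }
};

test "append keeps its own copy of the content" {
    // std.testing.allocator fails the test if anything leaks.
    const gpa = std.testing.allocator;
    var session = Session.init(gpa);
    defer session.deinit();

    const source = try gpa.dupe(u8, "fix this");
    {
        // The caller's buffer is released as soon as append returns.
        defer gpa.free(source);
        try session.append(.user, source);
    }

    try std.testing.expectEqual(Role.user, session.messages.items[0].role);
    try std.testing.expectEqualStrings("fix this", session.messages.items[0].content);
}
```

If append had stored the caller's slice instead of a copy, the final assertion would read freed memory.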

Refactoring openrouter.call

Before, call took a single prompt. Now it takes a slice of messages, so the internal Message struct becomes pub.

src/openrouter.zig
pub const Message = struct {
    role: []const u8,
    content: []const u8,
};

pub fn call(
    gpa: std.mem.Allocator,
    io: std.Io,
    api_key: []const u8,
    model: []const u8,
    messages: []const Message,
) !Result {
    const req = Request{ .model = model, .messages = messages };
    // ... rest unchanged ...
}

Note that openrouter.Message.role is a []const u8 (because that's what the wire JSON expects), while session.Message.role is a Role enum. Deliberate: the network layer uses protocol strings, the business layer uses safe Zig types. Translation happens at the boundary, in main.

The -i / --interactive flag

src/cli.zig
pub const Args = struct {
    help: bool = false,
    version: bool = false,
    interactive: bool = false,
    model: ?[:0]const u8 = null,
    api_key: ?[:0]const u8 = null,
    prompt: ?[:0]const u8 = null,
};

And the branch in the parser:

} else if (std.mem.eql(u8, arg, "-i") or std.mem.eql(u8, arg, "--interactive")) {
    out.interactive = true;

In config.resolve, the prompt is no longer required when --interactive is set:

src/config.zig
if (!args.interactive and args.prompt == null) return .no_prompt;

In interactive mode, the first message normally comes from stdin. A positional prompt is still accepted; when provided, it becomes the first message.
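The rule fits in a truth table, sketched here as a test. Args is a trimmed-down stand-in with only the two fields the rule reads, and isInteractiveFlag, checkPrompt, and Check are hypothetical names, not the ones in cli.zig or config.zig.

```zig
const std = @import("std");

// Hypothetical, trimmed-down Args: only the two fields the rule reads.
const Args = struct {
    interactive: bool = false,
    prompt: ?[]const u8 = null,
};

const Check = enum { ok, no_prompt };

// True for either spelling of the flag.
fn isInteractiveFlag(arg: []const u8) bool {
    return std.mem.eql(u8, arg, "-i") or std.mem.eql(u8, arg, "--interactive");
}

// A prompt is mandatory only in single-shot mode.
fn checkPrompt(args: Args) Check {
    if (!args.interactive and args.prompt == null) return .no_prompt;
    return .ok;
}

test "prompt is required only without --interactive" {
    try std.testing.expect(isInteractiveFlag("-i"));
    try std.testing.expect(isInteractiveFlag("--interactive"));
    try std.testing.expect(!isInteractiveFlag("-I"));

    try std.testing.expectEqual(Check.no_prompt, checkPrompt(.{}));
    try std.testing.expectEqual(Check.ok, checkPrompt(.{ .prompt = "say hi" }));
    try std.testing.expectEqual(Check.ok, checkPrompt(.{ .interactive = true }));
    try std.testing.expectEqual(Check.ok, checkPrompt(.{ .interactive = true, .prompt = "say hi" }));
}
```

Only one of the four combinations is rejected: no flag and no prompt.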

The REPL loop in main

This is the piece that flips Hypercode from a one-shot binary into a conversational agent.

src/main.zig
const gpa = init.gpa;
var session = session_mod.Session.init(gpa);
defer session.deinit();

if (cfg.prompt) |p| try session.append(.user, p);

if (cfg.interactive) {
    try repl(gpa, io, cfg, &session, stdout, stderr);
} else {
    try one_turn(gpa, io, cfg, &session, stdout, stderr);
}

one_turn factors out what we did in the third post: one model call. What's new is that it appends the response to the session.

fn one_turn(...) !void {
    const wire = try to_wire(gpa, session.messages.items);
    defer gpa.free(wire);

    const result = try openrouter.call(gpa, io, cfg.api_key, cfg.model, wire);
    switch (result) {
        .ok => |text| {
            defer gpa.free(text);
            try stdout.writeAll(text);
            try stdout.writeAll("\n");
            try stdout.flush();
            try session.append(.assistant, text);
        },
        // ... errors as before ...
    }
}

to_wire translates []session_mod.Message (with Role enum) into []openrouter.Message (with strings):

fn to_wire(
    gpa: std.mem.Allocator,
    messages: []const session_mod.Message,
) ![]const openrouter.Message {
    const wire = try gpa.alloc(openrouter.Message, messages.len);
    for (messages, 0..) |m, i| {
        wire[i] = .{ .role = @tagName(m.role), .content = m.content };
    }
    return wire;
}

@tagName(m.role) gives "user" or "assistant". Three lines bridge the two worlds.
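One detail worth checking is that the translation borrows content rather than copying it, which is why freeing the wire slice alone is enough. A sketch with local stand-in types (SessionMessage, WireMessage, and toWire are our names for illustration); the last two assertions show the reverse direction with std.meta.stringToEnum, which the post does not need yet.

```zig
const std = @import("std");

const Role = enum { user, assistant };
const SessionMessage = struct { role: Role, content: []const u8 };
const WireMessage = struct { role: []const u8, content: []const u8 };

// Same body as to_wire above, with local stand-in types.
fn toWire(gpa: std.mem.Allocator, messages: []const SessionMessage) ![]const WireMessage {
    const wire = try gpa.alloc(WireMessage, messages.len);
    for (messages, 0..) |m, i| {
        wire[i] = .{ .role = @tagName(m.role), .content = m.content };
    }
    return wire;
}

test "toWire maps roles to strings and borrows content" {
    const gpa = std.testing.allocator;
    const history = [_]SessionMessage{
        .{ .role = .user, .content = "hello" },
        .{ .role = .assistant, .content = "hi" },
    };

    const wire = try toWire(gpa, &history);
    defer gpa.free(wire); // frees the slice only; content still belongs to history

    try std.testing.expectEqualStrings("user", wire[0].role);
    try std.testing.expectEqualStrings("assistant", wire[1].role);
    // Borrowed, not copied: both slices point at the same bytes.
    try std.testing.expectEqual(history[0].content.ptr, wire[0].content.ptr);

    // The reverse direction, should a wire role ever need parsing.
    try std.testing.expectEqual(@as(?Role, .assistant), std.meta.stringToEnum(Role, "assistant"));
    try std.testing.expectEqual(@as(?Role, null), std.meta.stringToEnum(Role, "useer"));
}
```

Because the wire slice only borrows, it must not outlive the session it was built from.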

The loop itself

fn repl(...) !void {
    // If a seed prompt was passed on CLI, answer it first.
    if (session.messages.items.len > 0) try one_turn(gpa, io, cfg, session, stdout, stderr);

    var stdin_buffer: [4096]u8 = undefined;
    var stdin_reader: Io.File.Reader = .init(.stdin(), io, &stdin_buffer);
    const stdin = &stdin_reader.interface;

    while (true) {
        try stdout.writeAll("> ");
        try stdout.flush();

        const raw = stdin.takeDelimiterInclusive('\n') catch |err| switch (err) {
            error.EndOfStream => {
                try stdout.writeAll("\n");
                try stdout.flush();
                return;
            },
            else => return err,
        };
        const trimmed = std.mem.trim(u8, raw, " \t\r\n");
        if (trimmed.len == 0) continue;
        if (std.mem.eql(u8, trimmed, "/quit") or std.mem.eql(u8, trimmed, "/exit")) return;

        try session.append(.user, trimmed);
        try one_turn(gpa, io, cfg, session, stdout, stderr);
    }
}

The useful part of the loop is about five lines; the rest is ergonomics:

  • > as visual prompt
  • /quit or /exit or Ctrl-D to leave
  • blank lines are skipped
  • whitespace/CR/LF are trimmed
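The line-handling rules can be pulled out of the loop into a pure function, which makes them testable without a terminal. A sketch, assuming a hypothetical Line union and classify helper that are not part of the Hypercode source:

```zig
const std = @import("std");

// Hypothetical helper type: what the loop does with one raw line from stdin.
const Line = union(enum) {
    skip, // blank input, prompt again
    quit, // /quit or /exit
    prompt: []const u8, // trimmed text to append as a user message
};

fn classify(raw: []const u8) Line {
    const trimmed = std.mem.trim(u8, raw, " \t\r\n");
    if (trimmed.len == 0) return .skip;
    if (std.mem.eql(u8, trimmed, "/quit") or std.mem.eql(u8, trimmed, "/exit")) return .quit;
    return .{ .prompt = trimmed };
}

test "classify mirrors the loop's three outcomes" {
    try std.testing.expect(classify("\r\n") == .skip);
    try std.testing.expect(classify("  /exit\n") == .quit);
    try std.testing.expectEqualStrings("fix this", classify("fix this\r\n").prompt);
}
```

Ctrl-D is not covered here: it surfaces as error.EndOfStream from the reader before any line exists to classify.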

A trap that cost me thirty minutes

I first used takeDelimiterExclusive('\n'). The binary spun into an infinite loop after the first turn, printing > at 98% CPU.

Reading the stdlib source (zig/lib/std/Io/Reader.zig), the bug became obvious: takeDelimiterExclusive doesn't consume the delimiter. It returns content up to the \n, then calls toss(result.len), but result.len doesn't include the \n. The \n stays in the stream. The next call finds that \n immediately, returns an empty slice, and consumes nothing. Since the loop skips empty lines with continue, it spins forever.

The doc says "advancing the seek position past the delimiter". The code says the opposite. The doc lies.

Fix: takeDelimiterInclusive('\n') then std.mem.trim to drop the trailing \n. That's what we do here.
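The fix can be exercised without a terminal. This sketch assumes std.Io.Reader.fixed, which wraps an in-memory buffer as a reader, and a toolchain recent enough to ship std.Io.Reader; nextLine is a hypothetical helper name, not a function from the Hypercode source.

```zig
const std = @import("std");

// Read one line the way the REPL does: take the delimiter, then trim it off.
fn nextLine(reader: *std.Io.Reader) ![]const u8 {
    const raw = try reader.takeDelimiterInclusive('\n');
    return std.mem.trim(u8, raw, " \t\r\n");
}

test "inclusive read plus trim advances one line per call" {
    // Assumption: std.Io.Reader.fixed reads from the given bytes and nothing else.
    var reader: std.Io.Reader = .fixed("first\r\n\nsecond\n");

    try std.testing.expectEqualStrings("first", try nextLine(&reader));
    // The blank line comes back empty, but its \n has been consumed...
    try std.testing.expectEqualStrings("", try nextLine(&reader));
    // ...so the next call moves on instead of seeing the same \n again.
    try std.testing.expectEqualStrings("second", try nextLine(&reader));
}
```

The middle assertion is the case that matters: an empty result is harmless as long as the read position has advanced.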

This is the kind of thing the zero-dependency policy forces us to understand, because no wrapper masks the problem. The stdlib is the authority; when it lies, we dive into the source.

Pulling in cross-module tests

main.zig adds session.zig to the test import list:

test {
    _ = @import("cli.zig");
    _ = @import("config.zig");
    _ = @import("openrouter.zig");
    _ = @import("session.zig");
}

./zig/zig build test --summary all
Build Summary: 3/3 steps succeeded; 11/11 tests passed

The demo

./zig-out/bin/hypercode -i
> My favorite color is blue.
That's a great choice! Blue is such a versatile color — it can be calm and
soothing like a clear sky, or deep and mysterious like the ocean. Do you have
a particular shade of blue you like best?

> What color did I just say?
You said your favorite color is blue! I remember because you mentioned it in
your first message, and it's a wonderful choice — blue is such a calming and
refreshing color.

> /quit

The model remembers. Not by magic: every turn, we send the full history. The token cost grows with each exchange — that's the first real limit we'll have to address later, with a compression or sliding-window strategy.
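To put a number on that growth: with the full history resent each time, call k carries 2k - 1 messages. The simplest form of a sliding window is a slice of the tail. Both helpers below (lastN and messagesSentOnTurn) are hypothetical sketches, not the strategy Hypercode will necessarily adopt, and the assistant text is shortened from the demo.

```zig
const std = @import("std");

const Role = enum { user, assistant };
const Message = struct { role: Role, content: []const u8 };

// Hypothetical helper: the last `max` messages, or everything when the
// history is shorter. Returns a view into `messages`; nothing is copied.
fn lastN(messages: []const Message, max: usize) []const Message {
    if (messages.len <= max) return messages;
    return messages[messages.len - max ..];
}

// Messages carried by call k (k >= 1) when the full history is resent:
// k user messages plus the k - 1 replies received so far.
fn messagesSentOnTurn(k: usize) usize {
    std.debug.assert(k >= 1);
    return 2 * k - 1;
}

test "window keeps the most recent messages" {
    const history = [_]Message{
        .{ .role = .user, .content = "My favorite color is blue." },
        .{ .role = .assistant, .content = "Blue is a great choice." },
        .{ .role = .user, .content = "What color did I just say?" },
    };

    const recent = lastN(&history, 2);
    try std.testing.expectEqual(@as(usize, 2), recent.len);
    try std.testing.expectEqual(Role.assistant, recent[0].role);

    // A window at least as large as the history changes nothing.
    try std.testing.expectEqual(@as(usize, 3), lastN(&history, 10).len);

    try std.testing.expectEqual(@as(usize, 1), messagesSentOnTurn(1));
    try std.testing.expectEqual(@as(usize, 3), messagesSentOnTurn(2));
}
```

The test also exposes the weakness of a naive window: with max = 2 the model no longer sees the message that mentioned blue, and the window opens on an assistant message. A real strategy would have to decide what to do about both.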

Single-shot mode still works:

./zig-out/bin/hypercode "say hi in 3 words"
Hi there! 😊

The commits

Four commits, four layers:

9097f69 feat(main): REPL loop wires session + openrouter across turns
165d9c7 feat(cli): -i/--interactive makes prompt optional
c817d53 refactor(openrouter): call takes a message slice, not a single prompt
62e6fab feat(session): conversation state with role+content messages

git show 62e6fab shows the session structure alone. git show c817d53 shows the protocol refactor. The split stays legible.

Conclusion

The agent has memory now. We can talk to it like a colleague who follows the thread of the discussion. But it still can't do anything other than talk: it can't read a file, run a test, or modify code.

In the next post, we give it its first real capability: a tool. We define OpenRouter's tool-call protocol, build a dispatcher, and wire in a Read tool that lets it read the user's files. From there, Hypercode becomes an agent.

Stuck, or want to share notes? Join the Discord server.