Protocols

Protocol declarations specify ordering constraints on channel interactions — "pause before get_memory, get_memory before replace_memory." They are the linear fragment of the plumbing type system: while data channels freely support copy, discard, and merge, protocol channels must be consumed exactly once according to the declared protocol.

See processes.md for the data-flow primitives and algebraic structure. See plumbing.5.md for the full language reference.

Two fragments

The plumbing language has two categorically distinct kinds of channel:

Data channels are non-linear. They live in the copy-discard (CD) category. Objects carry the ! modality (streams). Copy, discard, and merge all apply. The comonoid structure is universal.

Protocol channels are linear. They live in the session-type fragment. No copy, no discard. A protocol channel must be consumed exactly once according to the declared protocol.

Why linearity matters operationally: if you could copy a protocol channel, two processes would both try to be at "step 3" of the same protocol — a race condition. If you could discard a protocol channel, the other end would deadlock waiting for a message that will never arrive. Linearity is the structural enforcement of protocol safety.

Syntax

Sequential protocol

protocol Compaction =
  send Pause . recv PauseAck .
  send GetMemory . recv MemoryDump .
  send SetMemory . recv SetMemoryAck .
  send Resume . end

send T — output a message of type T, then continue
recv T — input a message of type T, then continue
. — sequencing operator (separates steps)
end — session termination

Labelled choice

protocol DocumentOps = {
  read:  send Path . recv Content . loop,
  write: send (Path, Content) . recv Ack . loop,
  close: end
}

The initiator selects a branch by label. Each branch is a session type. This is internal choice (⊕ in linear logic) — the process speaking this protocol picks which branch to take.

Recursion

loop returns to the top of the enclosing protocol declaration. It is an implicit μ — the protocol declaration itself is the binding form for the recursion variable.

For a protocol with choice:

protocol FileOps = {
  read:  send Path . recv Content . loop,
  close: end
}

After completing a read, the client returns to the choice point, not to the beginning of the read branch. The client may then choose read again or close. The recursion wraps the entire choice.

Sequential recursion (no choice needed):

protocol ReqResp = send Request . recv Response . loop

This defines an infinite exchange: send a request, receive a response, repeat.

Named recursion variables (μX. ... X) are not yet supported.

Why `send`/`recv`, not `!`/`?`

The session types literature uses !A for send and ?A for receive. The plumbing language already uses !T to mean "stream of T" (the exponential modality from linear logic). Using ! for send would create a genuine ambiguity: !Pause could mean "stream of Pause" or "send Pause." The send/recv keywords avoid this confusion. The correspondence to the Honda/Yoshida notation is: send = !, recv = ?, end = end.

Grammar

protocol_decl  ::= "protocol" IDENT "=" session_type

session_type   ::= session_seq
                 | "{" session_branch ("," session_branch)* "}"

session_seq    ::= "send" msg_type "." session_type
                 | "recv" msg_type "." session_type
                 | "end"
                 | "loop"

session_branch ::= IDENT ":" session_seq

msg_type       ::= type_atom (without dotted names)

Message types use a restricted form of type_atom that accepts plain identifiers (Request) but not dotted names (Module.Type). This avoids a grammar ambiguity: in send Request . recv Response, the . is the sequencing operator, not a module separator. The dotted_name rule would greedily consume Request.recv as a qualified name.

Duality

Every session type has a dual — the type seen by the process at the other end of the channel. Duality swaps the direction of communication:

Original	Dual
`send T . S`	`recv T . dual(S)`
`recv T . S`	`send T . dual(S)`
`select { l₁: S₁, ... }`	`offer { l₁: dual(S₁), ... }`
`offer { l₁: S₁, ... }`	`select { l₁: dual(S₁), ... }`
`end`	`end`
`loop`	`loop`

Duality is an involution: dual(dual(S)) = S for all session types S. This is the structural guarantee that protocol endpoints are symmetric. If process P speaks protocol S, the process at the other end speaks dual(S), and compatibility follows from the algebra — no runtime check needed.

end is self-dual because both endpoints agree the session is finished. loop is self-dual because both endpoints agree to return to the top — the recursion point is shared, not directional.

In Phase 1, offer is produced only by dual, not by the parser. Protocols are written from the initiator's perspective (using select syntax: { label: branch, ... }). The handler's perspective is computed by duality. Server declarations that parse offer directly from source are future work.

Worked example

The compaction protocol (the homunculus's view):

protocol Compaction =
  send Pause . recv PauseAck .
  send GetMemory . recv MemoryDump .
  send SetMemory . recv SetMemoryAck .
  send Resume . end

The agent's view (dual(Compaction)):

recv Pause . send PauseAck .
recv GetMemory . send MemoryDump .
recv SetMemory . send SetMemoryAck .
recv Resume . end

Walk-through: the homunculus sends Pause, the agent receives it and sends PauseAck. The homunculus sends GetMemory, the agent dumps its history as MemoryDump. The homunculus sends the compacted SetMemory, the agent acknowledges with SetMemoryAck. The homunculus sends Resume, and the session ends. Both sides agree on the order and direction of every message. This agreement is enforced by the type system, not by instructions.

Well-formedness rules

The parser checks four well-formedness conditions on protocol declarations:

Guardedness

Every path from the top of a protocol to a loop must pass through at least one send or recv.

This prevents unproductive protocols — the session-type analogue of a divergent term. A process "implementing" an unguarded protocol would spin without ever exchanging a message, and any synchronisation point waiting on it would deadlock.

Rejected:

protocol Spin = loop
(* error: Spin: unguarded loop *)

protocol Bad = {
  ok: send A . end,
  spin: loop
}
(* error: Bad: unguarded loop *)

This is the standard contractivity condition from the session types literature (Gay & Hole, Honda et al.).

No stream types in messages

Types in send and recv positions must be data types. Stream types (!T) are rejected.

protocol Bad = send !string . end
(* error: Bad: stream type !string not allowed in send position *)

Sending a stream over a protocol channel would be session delegation — passing a channel endpoint as a message. This is a well-studied feature in the session types literature but is future work. Restricting messages to data types preserves the clean separation between the two fragments.

Label uniqueness

Branch labels within a single choice must be unique.

protocol Bad = {
  go: send A . end,
  go: send B . end
}
(* error: Bad: duplicate branch label 'go' *)

Labels are the selection mechanism. Duplicate labels make the choice ambiguous.

Protocol name uniqueness

No two protocol declarations in the same program may share a name.

protocol P = send A . end
protocol P = send B . end
(* error: duplicate protocol name: P *)

Module system

Protocols participate in the module system alongside types and bindings. When a file is imported via use, its protocol names are qualified with the module prefix:

(* compaction.plumb *)
module Compaction

type Pause = { reason: string }
type PauseAck = { ok: bool }
protocol Compact = send Pause . recv PauseAck . end

(* main.plumb *)
use compaction

(* Compact is available as Compaction.Compact *)
(* Pause and PauseAck as Compaction.Pause and Compaction.PauseAck *)

The loader rewrites message types within protocol declarations to use qualified names, matching the existing behaviour for type and binding references.

Algebraic connections

The session type primitives correspond to linear logic connectives:

Session type	Linear logic	Intuition
`send A . S`	`A ⊗ S`	"I send A and continue with S" (multiplicative conjunction)
`recv A . S`	`A ⅋ S`	"I receive A and continue with S" (multiplicative disjunction)
`select { ... }`	`⊕`	"I choose one branch" (additive disjunction, internal choice)
`offer { ... }`	`&`	"You choose, I handle all branches" (additive conjunction, external choice)
`end`	`1`	Multiplicative unit
`!A` (on data channels)	`!A`	Exponential modality ("of course")

The duality of session types (send ↔ recv, select ↔ offer) maps to the duality between copy/filter (selection) and merge (being-selected-for) in the existing CD algebra. This connection situates the plumbing category within *-autonomous categories, where every type has a linear negation given by duality.

This correspondence is the reason session types work, not a surface annotation. The operational guarantees (no deadlock from duality, no divergence from guardedness) follow from the algebraic structure.

For the data-flow primitives and the CD category structure, see processes.md.

Compilation target

Session types are a specification layer. They compile to barrier chains using existing CD-category primitives — barrier, copy, filter, merge, and discard. Each sequential send A . recv B pair maps to a barrier that synchronises the send with its response before the next step can proceed. Choice maps to copy + filter branching. Recursion maps to feedback loops. The functor from session types to the plumbing category is total: every session type expression has a plumbing translation.

All session type constructs (send, recv, end, loop, select, offer) compile to barrier chains using CD-category primitives. Linear protocols produce sequential barrier chains; select/offer produce per-branch barrier chains routed by label filters; loop uses feedback Merge (for linear protocols) or self-triggering from label filter output (for branching protocols). The compiler is in compile_session.ml. For the design rationale — why envelope tagging is necessary, how the functor works, edge cases — see session-compilation.md.

Bidirectional wiring (`<->`)

The <-> combinator connects a port pair to a compiled protocol, automating the barrier chain, envelope tagging, and per-step channel routing that would otherwise require hand-wiring:

(ctrl_out, ctrl_in) <-> Compaction as session

Syntax

session_wire ::= "(" IDENT "," IDENT ")" "<->" protocol_name "as" IDENT

The left-hand side is a port pair: the output port (agent responses come from here) and the input port (send-step payloads go here). The right-hand side names the protocol declaration and an instance alias.

Unicode ↔ (U+2194) is accepted as an alternative to <->.

Named channels

The alias creates addressable channels within the plumb body:

Channel	Type	Description
`alias@trigger`	Unit	Write to start the protocol
`alias@done`	Unit	Fires when the final step completes
`alias@send(T)`	T	Write payload for send step with type T
`alias@recv(T)`	T	Read payload from recv step with type T
`alias@done(T)`	Unit	Fires when recv step with type T completes

Type names use the pretty-printed form: string, int, { name: string }, etc.

Example

Without <-> (112 lines of hand-wired barriers, copies, filters):

let compact : (!CtrlMsg, !string) -> !CtrlMsg =
  plumb(ctrl_out, telemetry, ctrl_in) {
    (* ... 100+ lines of barrier chains ... *)
  }

With <-> (~15 lines):

let compact : (!CtrlMsg, !string) -> !CtrlMsg =
  plumb(ctrl_out, telemetry, ctrl_in) {
    (ctrl_out, ctrl_in) <-> Compaction as session

    telemetry ; filter(kind = "usage" && prompt_tokens > 150000) ;
      map(null) ; session@trigger

    session@trigger           ; map({pause: true})     ; session@send(Pause)
    session@done(PauseAck)    ; map({get_memory: true}) ; session@send(GetMemory)
    session@recv(MemoryDump)  ; compressor ;
      map({set_memory: .messages})                      ; session@send(SetMemory)
    session@done(SetMemoryAck) ; map({resume: true})    ; session@send(Resume)
  }

How it works

The <-> desugarer generates:

Tagger on the output port — wraps agent responses with { __step: N, payload: msg } using the recv-step indices from the protocol
Per-send-step wrapping maps — each alias@send(T) gets a map that wraps with { __step: N, payload: . } using that step's index
MergeAny tree combining tagger output and send wrappers into the barrier chain's control input. Uses any-EOF semantics at the root (MergeAny) so that tagger EOF (agent exit) propagates shutdown without waiting for send wrappers. Inner merges (multiple send wrappers) use regular Merge (all-EOF). See PLMB-033
Compiled barrier chain (from compile_protocol) — filters by __step, synchronises via barriers, extracts payloads
Recv payload routing — extracted recv payloads routed to alias@recv(T) channels
Named channels for trigger, done, and per-step signals

The tagger is a new stateful primitive (Tagger of int list * string option) that lives at the linear/non-linear boundary. It wraps each message with an envelope, using configurable step indices rather than a simple incrementing counter. This is necessary because recv steps may have non-contiguous indices in the protocol (e.g., indices 1, 3, 5, 7 in a send/recv alternating protocol).

Generalisation

The port pair generalises beyond control ports. Any pair of ports can speak a protocol:

(data_out, data_in) <-> DataProtocol as dp

This supports patterns where agents have multiple protocol-governed interactions.

Operational constraints

Round serialisation

Terminating protocols may receive multiple triggers (e.g. one per API response). Rounds must not overlap — concurrent rounds sharing the same control channel deadlock. The compiler inserts a token ring to serialise rounds:

trigger ──→ Barrier(trigger, gate) ──→ Project(0) ──→ internal_trigger
                       ↑                                     │
                     gate                             [barrier chain]
                       ↑                                     │
               Copy ──→ gate                          internal_done
                    ──→ done_ch                              │
                                                       Copy ──┘

The gate channel is a seeded channel — pre-loaded with one null token before threads start (a Petri net initial marking). Round 1 fires immediately (seed + first trigger). Subsequent rounds block until the previous round's done signal replenishes the gate.

For looping protocols, repeated triggers are the correct semantics — each trigger starts a new protocol round via the feedback Merge.

Debug taps

When PIPELINE_DEBUG=1, the desugarer inserts Copy+Log taps at each session step:

Send steps: payload logged before envelope wrapping
Recv steps: payload logged after envelope extraction
Trigger: logged when trigger value arrives

Log output on stderr:

{"log":"debug","event":"protocol:session:send:Pause","value":{"pause":true}}
{"log":"debug","event":"protocol:session:recv:PauseAck","value":{"kind":"pause_ack"}}
{"log":"debug","event":"protocol:session:trigger","value":null}

When PIPELINE_DEBUG is not set, no taps are inserted — the compiled topology is identical and there is zero overhead.

What is not yet implemented

Phases 1–5 provide protocol declarations, static well-formedness checking, compilation to barrier chains, and bidirectional wiring. The following are future work:

Nested select/offer (select within a branch of another select — needs path-based addressing)
Protocol ports in agent type signatures
Parametrised port addressing in wiring chains (session@send(Pause) in wiring syntax)
server declarations (parsing offer from source)
Runtime protocol enforcement
Named recursion (μX. ... X)
Session type subtyping
Session delegation (passing channel endpoints as messages)

Transport provenance

Every message in the fabric carries a provenance trace recording the path it took through the compiled graph (PLMB-068). This replaces the interim {__src, payload} telemetry envelope with a general transport-level mechanism covering all data and telemetry ports.

Trace types

The trace is a tree with four node types:

Endpoint: agents and boundary ports. Only Endpoints mint message IDs.
Step: structural morphisms prepend themselves (name + kind).
Join: barrier synchronisation. Input traces become children.
Empty: EOF sentinels, seed messages, handshakes.

Stamping order: innermost = earliest. For f ; g, the trace is Step(g, Step(f, Endpoint(...))). This is the free-category construction.

v2 wire format

Provenance is encoded as a payload prefix by the protocol layer (lib/fabric/protocol.ml). The transport signature (Transport_intf.S) is unchanged.

v2 payload = <0x02><4-byte trace-len><trace-json><app-payload>
v1 payload = raw application data (no 0x02 prefix)

The decoder checks byte 0: if 0x02, extract trace; otherwise return Empty trace and the full payload. All fabric application payloads are JSON (starting with {, [, ", digit, t, f, n) or empty-string EOF — none start with 0x01 or 0x02, so the sentinel bytes are unambiguous.

Three sentinel byte values partition the payload namespace:

Byte	Meaning	Format
empty (length 0)	EOF — end-of-stream	no payload
`0x01`	Drain marker — feedback cycle quiescence probe	`0x01` + merge_name (UTF-8)
`0x02`	v2 traced payload	`0x02` + 4-byte trace-len + trace-json + app-payload

EOF: empty payload (length 0). No trace prefix. recv_with_trace returns (Empty, "").

Drain marker: 0x01 prefix followed by the merge name that originated the probe. Used for distributed termination detection in feedback cycles. See lib/fabric/protocol.ml for the API and lib/fabric/inline.ml for the drain state machine.

Trace JSON format

{"step":{"name":"copy0","kind":"copy",
         "child":{"endpoint":{"name":"composer","msg_id":"a1b2..."}}}}

Keys: endpoint, step, join, empty. Fields: name, kind, msg_id, child, children.

Morphism trace behaviour

Morphism	Trace
identity	`Step "id"` wrapping input
copy	`Step "copy"`, serialised once for both outputs
merge	`Step "merge"` — NOT a Join (each message has one predecessor)
barrier	`Join` with children from all N inputs
map	`Step "map"`
filter	`Step "filter"` (survivors only)
project	`Step "project"`, preserves full Join
tagger	`Step "tagger"`
format_json	`Step "format_json"`
parse_json	`Step "parse_json"`

Agent stamping

Agents stamp Endpoint { name = agent_name; msg_id = fresh_id() } on every outgoing message (data and telemetry). On receive, origin_names extracts the sender name from the trace for the runtime context From: field.

Channel separation

is_eof operates on v1 (raw) channel payloads only — handshakes, tool requests, control. Traced channels use recv_with_trace, which handles EOF internally.

Interaction with session envelopes

Session envelopes ({__step, payload}) sit inside the app-payload, below the trace frame. The two layers are independent: the trace tracks the message's path through the graph, while __step tracks its position in a protocol.

Protocols

Two fragments

Syntax

Sequential protocol

Labelled choice

Recursion

Why send/recv, not !/?

Grammar

Duality

Worked example

Well-formedness rules

Guardedness

No stream types in messages

Label uniqueness

Protocol name uniqueness

Module system

Algebraic connections

Compilation target

Bidirectional wiring (<->)

Syntax

Named channels

Example

How it works

Generalisation

Operational constraints

Round serialisation

Debug taps

What is not yet implemented

Transport provenance

Trace types

v2 wire format

Trace JSON format

Morphism trace behaviour

Agent stamping

Channel separation

Interaction with session envelopes

Why `send`/`recv`, not `!`/`?`

Bidirectional wiring (`<->`)