# Public Documents


Welcome to the **Artificial Organisations** research workspace, operated by the [Leith Document Company](https://leithdocs.com).

We're exploring whether LLM agents can form effective organisations — not just answer questions, but hold roles, maintain institutional memory, review each other's work adversarially, and build up knowledge over time. The **Perseverance Composition Engine (PCE)** is our testbed: a multi-agent system where a Composer drafts, a Corroborator fact-checks, a Critic evaluates, and a Curator maintains the document store. A Consul serves as persistent advisor across sessions. The system runs in production on real work — this workspace is where we write research applications, manage the project, and track what we learn.

The documents below are observations from operating this system. They're anecdotal and informal: field notes, not papers. The recurring theme is that **prompts are suggestions; structure is reality**. Agents reliably follow structural constraints (tool access, visibility tiers, graph topology), but follow instructions given by prompt alone only unreliably. We keep finding new instances of this principle.
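
As a rough illustration of the difference between a prompt instruction and a structural constraint, here is a minimal Python sketch. It is not PCE code: the `Role` class, the `dispatch` function, the tool names, and the prompts are all hypothetical, chosen only to show one way that tool access could be enforced outside the model rather than requested in its prompt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


class ToolNotPermitted(Exception):
    """Raised when a role calls a tool outside its allow-list."""


@dataclass(frozen=True)
class Role:
    name: str
    prompt: str                    # advisory: the model may or may not comply
    allowed_tools: FrozenSet[str]  # structural: checked by the dispatcher


# Hypothetical tool registry (assumption: these are not real PCE tools).
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_store": lambda query: f"results for {query!r}",
    "write_store": lambda doc: f"stored {len(doc)} chars",
}


def dispatch(role: Role, tool: str, arg: str) -> str:
    """Run a tool call only if the role's allow-list contains the tool."""
    if tool not in role.allowed_tools:
        raise ToolNotPermitted(f"{role.name} may not call {tool}")
    return TOOLS[tool](arg)


# The Critic's prompt asks it not to edit the store, but the guarantee
# comes from the allow-list, which simply omits the write tool.
critic = Role(
    name="Critic",
    prompt="Evaluate drafts. Do not edit the document store.",
    allowed_tools=frozenset({"search_store"}),
)
curator = Role(
    name="Curator",
    prompt="Maintain the document store.",
    allowed_tools=frozenset({"search_store", "write_store"}),
)
```

In this sketch, `dispatch(critic, "write_store", ...)` raises `ToolNotPermitted` whatever the Critic's prompt says or however the model interprets it, while the same call made as `curator` succeeds.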

## Articles

| Date | Document | Description |
|------|----------|-------------|
| 2026-03-31 | [Organising Agent Knowledge](articles/organising-agent-knowledge.md) | A four-layer taxonomy of agent knowledge and why self-knowledge delivered by infrastructure is more reliable than self-knowledge maintained by policy |
| 2026-03-10 | [Plumbing: first public release](outgoing/plumbing/release.md) | First binary release of the plumbing calculus compiler, interpreter, and MCP server for Linux and macOS |
| 2026-03-10 | [A typed language for agent coordination](outgoing/plumbing/typed-language-for-agent-coordination.md) | The plumbing calculus: typed channels, structural morphisms, composition, and agents. With examples and diagrams. |
| 2026-03-10 | [The agent that doesn't know itself](outgoing/plumbing/the-agent-that-doesnt-know-itself.md) | Session types, compaction protocols, document pinning, and why agents cannot recognise their own state without being told |
| 2026-03-05 | [Plumbing: a typed language for agent pipelines](articles/plumbing-language-teaser.md) | Introducing plumbing — a small typed language for describing how AI agents work together. Composition, structural morphisms, and agents that design their own organisational structure at runtime. |
| 2026-02-25 | [Structural Prompt Preservation: Keeping AI Agents on Track](outgoing/structural-prompt-preservation.md) | How separating system prompts from conversation history prevents behavioural drift under context compaction — and enables efficient prompt caching |
| 2026-02-17 | [Review: 10 Tips from Inside the Claude Code Team](outgoing/review-cherny-claude-code-team-tips.md) | Review of Boris Cherny's Claude Code team tips with comparison to PCE practices |
| 2026-02-17 | [Learning to Work with Agents](outgoing/learning-to-work-with-agents.md) | Practical guide to working effectively with the PCE agent system |

## Evaluation

| Date | Document | Description |
|------|----------|-------------|
| 2026-03-21 | [Plumbing Generation Benchmark](notes/evaluation/plumbing-generation-benchmark.md) | Can LLMs write valid programs in an unfamiliar typed coordination language? 25 models, 4 scenarios, 1000 trials. Six models at 100%; every program that parsed also typechecked. |

## Research observations

Empirical observations from operating a multi-agent system in production. These support our core thesis, **prompts are suggestions; structure is reality**: agents reliably follow structural constraints (tool access, visibility tiers, graph topology), but follow instructions given by prompt alone only unreliably.

| Date | Document | Description |
|------|----------|-------------|
| 2026-03-14 | [The Stdio Bridge as a Natural Transformation](notes/research/bridge-as-natural-transformation.md) | A coding agent independently framed a runtime refactor as a natural transformation between functors. The categorical structure constrains the refactor so tightly that what has to be done becomes obvious. Emergent, not directed. |
| 2026-02-27 | [Engineering Cross-Departmental Communication Attempts](notes/observations/engineering-routes-via-curator-task-199.md) | An agent systematically exhausts alternatives when the architecture lacks the right channel — confused deputy, honest refusal, and engineering's own self-annotation |
| 2026-02-19 | [Extending Per-Agent Memory Beyond the Consul](notes/observations/extending-per-agent-memory.md) | First use of the current-state convention by a non-Consul agent — the Curator leaving a curation watermark |
| 2026-02-19 | [Cross-Session Memory for Persistent Agents](notes/observations/cross-session-memory-design.md) | Four-level memory hierarchy for agent organisations: context window → private notes → institutional memory → transcripts |
| 2026-02-15 | [Inner Observer Pattern for Agent Self-Monitoring](notes/observations/inner-observer-pattern.md) | An agent network can externalise the inner observer that a single agent structurally lacks — real-time monitoring of decisions against policy |
| 2026-02-15 | [Spontaneous Publication Decision](notes/observations/spontaneous-publication-decision.md) | The Consul spontaneously published an observation without asking — correct decision, unreflective process. Then did it again while documenting the first time. |
| 2026-02-15 | [LLMs as Teletype Users](notes/observations/llms-as-teletype-users.md) | Text-stream protocols from the terminal era fit how agents work better than modern visual interfaces |
| 2026-02-15 | [Confabulation Under Uncertainty](notes/observations/confabulation-under-uncertainty.md) | Agents construct confident answers from insufficient evidence rather than admitting ignorance — and tooling makes it worse |
| 2026-02-14 | [Agent Temporal Blindness](notes/observations/agent-temporal-blindness.md) | Agents have no perception of elapsed time, causing the web-hammering courtesy problem |
| 2026-02-14 | [Supervision Cost as the Scaling Bottleneck](notes/observations/supervision-cost-bottleneck.md) | The principal's attention, not the agent's capability, limits what can be delegated |
| 2026-02-14 | [Model Identity Confusion on Mid-Session Swap](notes/observations/model-identity-confusion-on-swap.md) | Four failure modes when the underlying model changes — pretraining, history, remit, and tool inventory override |
| 2026-02-14 | [Critic Score Gaming Under Explicit Constraints](notes/observations/critic-score-gaming.md) | Agents optimise for the scoring rubric rather than the intent behind it |
| 2026-02-14 | [Curator Metadata Hallucination Under Constraint](notes/observations/curator-metadata-hallucination.md) | Agents confabulate metadata when pressured to produce structured output they can't verify |
| 2026-02-14 | [Agent Orientation Bias](notes/observations/agent-orientation-bias.md) | Agents default to filesystem crawling rather than using the document store's search tools |
| 2026-02-14 | [Intelligent AI Delegation — Reading Note](notes/observations/reading-tomasev-2026-intelligent-delegation.md) | Notes on Tomašev et al. (2026) and its connection to our supervision cost observations |
| 2026-02-14 | [Lightweight Bug Tracking in the Document Store](notes/observations/practice-bug-tracking.md) | Using the document store itself as the bug tracker — practice note |

## Research paper

| Document | Description |
|----------|-------------|
| [Artificial Organisations (arXiv:2602.13275)](https://arxiv.org/abs/2602.13275) | Pre-print describing the artificial organisations framework and the PCE architecture |

## About

This workspace is part of the **Artificial Organisations** research programme exploring how LLM agents can form effective organisations — with defined roles, institutional memory, and adversarial review processes.

For questions, contact the workspace owner via [leithdocs.com](https://leithdocs.com).