Architecture¶
yorishiro-proxy is built on the Envelope + Layered Connection Model defined in RFC-001. Every connection is decomposed into a stack of Layers; each Layer yields one or more Channels; the Pipeline processes protocol-agnostic Envelopes drawn from those Channels. This page is the reference for the data model, the canonical pipeline, and how the pieces fit together.
Design principles¶
L7-first, L4-capable¶
- The default operation interface is a structured L7 view. Communication is represented as protocol-native structured data (method, URL, headers, body, frames, events) so AI agents can reason about it without parsing bytes.
- Raw bytes recording, viewing, and modification are possible for every protocol. As a diagnostic tool, the proxy must support protocol-level anomaly detection and reproduction. Pure transport protocols (e.g. SOCKS5) apply this principle to the tunneled inner protocol.
- L7 parsing is an overlay on top of raw bytes. The wire-observed raw byte snapshot is never destroyed or modified. Mutations are always recorded as separate derived data (a modified variant) alongside the original.
MCP-first control plane¶
Every operation is exposed as an MCP tool. There is no separate CLI command set or REST API. The Web UI itself is an MCP client over Streamable HTTP, indistinguishable from any other agent.
No external proxy libraries¶
The proxy is built on Go's standard library plus a small set of approved dependencies. This eliminates version conflicts, reduces the attack surface, and gives full control over wire-level details that matter for security testing (header casing, header order, frame boundaries, smuggling-relevant byte sequences).
Core data model¶
Envelope¶
Envelope is the protocol-agnostic outer container that flows through the Pipeline. It carries identity, provenance, wire-fidelity raw bytes, and a typed Message payload.
// internal/envelope/envelope.go
type Envelope struct {
// Identity (shared across all protocols)
StreamID string
FlowID string
Sequence int
Direction Direction // Send (client->server) or Receive (server->client)
// Provenance
Protocol Protocol // http, ws, grpc, grpc-web, sse, raw, tls-handshake
// Wire fidelity (read-only view; authoritative bytes)
Raw []byte
// Protocol-specific structured view
Message Message
// Connection-scoped context (ConnID, ClientAddr, TLS snapshot, ...)
Context EnvelopeContext
// Layer-internal state; Pipeline must not type-assert on this
Opaque any
}
Any field on Envelope must be meaningful for every protocol, including raw TCP. Protocol-specific fields belong on the Message.
Message¶
Message is a small interface implemented by protocol-specific payload types. The Pipeline dispatches behavior by type-switching on env.Message.
// internal/envelope/message.go
type Message interface {
Protocol() Protocol
CloneMessage() Message
}
Implementations:
| Type | Package file | Carries |
|---|---|---|
HTTPMessage |
internal/envelope/http.go |
Method, URL, status, ordered headers/trailers, body |
WSMessage |
internal/envelope/ws.go |
Opcode, FIN/RSV bits, payload (per-message-deflate aware) |
GRPCStartMessage |
internal/envelope/grpc.go |
Service/method, request metadata |
GRPCDataMessage |
internal/envelope/grpc.go |
A single LPM-framed payload |
GRPCEndMessage |
internal/envelope/grpc.go |
Status code, status message, trailers |
SSEMessage |
internal/envelope/sse.go |
One data: event with id/event/retry |
RawMessage |
internal/envelope/raw.go |
Opaque byte chunk |
TLSHandshakeMessage |
internal/envelope/ |
SNI, ALPN, JA3/JA4, peer certificate (observation-only) |
gRPC-Web reuses HTTPMessage for the outer request/response and surfaces wire-format anomalies (AnomalyMalformedGRPCWebBase64, AnomalyMalformedGRPCWebLPM, ...) when bodies fail to parse.
Layer, Channel, ConnectionStack¶
A connection is an explicit stack of Layers, assembled per connection by proxybuild.BuildLiveStack. Each Layer wraps the bytes (or events) below it and exposes one or more Channels; the Pipeline runs on Envelopes drawn from those Channels.
The canonical stacks for live traffic are:
bytechunk → tlslayer → http1 → httpaggregator (HTTP/1 already aggregated)
bytechunk → tlslayer → http2 → httpaggregator (HTTP/2 with body buffering)
bytechunk → tlslayer → http1/http2 → grpc | grpcweb | ws | sse (post-Upgrade or content-type dispatch)
bytechunk (raw TCP fallback)
Layer packages live under internal/layer/:
internal/layer/bytechunk/— TCP byte-stream Layer (smuggling-safe pass-through).internal/layer/tlslayer/— TLS handshake termination +TLSHandshakeMessagesynthesis.internal/layer/http1/— Custom HTTP/1.x parser; preserves header casing and order.internal/layer/http2/— Event-granular HTTP/2 frame engine.internal/layer/httpaggregator/— Folds HTTP/2 events into a singleHTTPMessageper stream.internal/layer/grpc/— Native LPM reassembly producingGRPCStart/Data/Endenvelopes.internal/layer/grpcweb/— gRPC-Web (binary and base64 wire formats) over HTTP/1.x or HTTP/2.internal/layer/ws/— WebSocket framing with RFC 7692 per-message-deflate.internal/layer/sse/— Server-Sent Events (per-event envelopes, streaming-aware).
The Layer and Channel interfaces (internal/layer/layer.go, internal/layer/channel.go) are the only contract a new protocol implementation must satisfy.
Wire-level overlay envelopes¶
Several Layers also emit forensic wire-level overlay envelopes alongside the canonical semantic envelope, so the exact bytes at a lower protocol layer remain queryable even after reassembly:
| Overlay | Emitted by | Purpose |
|---|---|---|
h1-chunk |
internal/layer/http1 |
One envelope per HTTP/1.x chunked-transfer chunk boundary (USK-895) |
h2-frame |
internal/layer/http2 + internal/layer/httpaggregator |
One envelope per HTTP/2 DATA frame, including WS/SSE-over-h2 paths (USK-897 / USK-899) |
grpc-lpm-frame |
internal/layer/grpc |
One envelope per gRPC length-prefixed-message frame (USK-896) |
grpcweb-base64 |
internal/layer/grpcweb |
One envelope per gRPC-Web base64-encoded body chunk |
Overlay rows are hidden from default reads; pass filter.wire_level="h2-frame" (etc.) to query or manage.export_flows to surface them. See Flows: wire-level overlays.
L7 dispatch on TCP forwards¶
When a tcp_forwards entry sets protocol to auto (the default) the proxy peeks the initial bytes and dispatches the connection through the matching Layer stack: HTTP/1.x, HTTP/2 (h2c), gRPC, WebSocket, SSE (operator-declared via protocol: "sse"), or raw bytes for opaque traffic (USK-911 .. USK-918). Combined with the tls / upstream_tls matrix on ForwardConfig, this lets a forwarded port behave like any other inbound listener -- intercept, transform, plugin, and capture-scope rules apply to the dispatched envelopes the same way they apply to CONNECT-MITM traffic. The dispatch logic lives in internal/proxybuild/tcp_forward.go.
Canonical pipeline¶
The Pipeline is a fixed 8-step chain that runs on every Envelope. Plugin handling is split across two steps (PluginPre before Intercept, PluginPost after Macro), so the diagram shows nine boxes for one chain.
flowchart LR
HS[HostScope] --> HTS[HTTPScope]
HTS --> SF[Safety]
SF --> PRE[PluginPre]
PRE --> INT[Intercept]
INT --> TR[Transform]
TR --> MA[Macro]
MA --> POST[PluginPost]
POST --> REC[Record]
| Step | Source file | Purpose |
|---|---|---|
| HostScope | internal/pipeline/host_scope_step.go |
Drop envelopes outside the configured host scope |
| HTTPScope | internal/pipeline/http_scope_step.go |
Apply HTTP-specific scope filters (path, method, content-type) |
| Safety | internal/pipeline/safety_step.go |
Run Safety Filter Engine (input/output destructive-pattern + PII presets) |
| PluginPre | internal/pipeline/plugin_step_pre.go |
Dispatch (protocol, event, phase=pre) Starlark hooks |
| Intercept | internal/pipeline/intercept_step.go |
Hold queue + manual edit for matched envelopes |
| Transform | internal/pipeline/transform_step.go |
Per-protocol auto-rewrite rules |
| Macro | internal/pipeline/macro_step (via job runner) |
Macro engine: template / guard / extract / encoder |
| PluginPost | internal/pipeline/plugin_step_post.go |
Dispatch (protocol, event, phase=post) hooks |
| Record | internal/pipeline/record_step.go |
Persist Envelope (original + variant) to the Stream/Flow store |
Steps dispatch by type-switch on env.Message, so a step that only operates on HTTP traffic simply ignores envelopes whose Message is WSMessage, GRPCDataMessage, etc.
Resend, Macro fan-out, synthesized Send¶
Replay paths do not re-run Safety, PluginPre, or Intercept. Resend, Macro fan-out, and any synthesized Send envelope traverse a shorter chain:
This keeps replayed traffic from being re-held by interceptors and from re-firing pre-phase hooks that already ran on the original envelope. The Layer encode step at the tail is performed via the WireEncoderRegistry (internal/pipeline/wire_encoder_registry.go), which selects the per-protocol encoder for the outbound bytes.
Per-protocol rule engines¶
Intercept, Transform, and Safety each consult per-protocol rule engines under internal/rules/:
internal/rules/common/— HoldQueue, pattern compiler, shared presets.internal/rules/http/— Header / URL / body matchers and rewrites forHTTPMessage.internal/rules/ws/— Frame-level matchers forWSMessage.internal/rules/grpc/— Service/method matchers and metadata rewrites forGRPCStart/Data/End.
Pipeline Steps select the engine to consult by type-switching on env.Message. SSE and raw TCP envelopes fall through to the common matchers.
Protocol coverage¶
| Protocol | L7 view | L4 raw bytes | Layer package |
|---|---|---|---|
| HTTP/1.x | YES | YES | internal/layer/http1/ |
| HTTP/2 | YES | YES | internal/layer/http2/ + internal/layer/httpaggregator/ |
| gRPC | YES | YES (via H2) | internal/layer/grpc/ |
| gRPC-Web | YES | YES | internal/layer/grpcweb/ |
| WebSocket | YES | YES (per frame) | internal/layer/ws/ |
| SSE | YES | YES | internal/layer/sse/ |
| Raw TCP | N/A | YES (byte stream) | internal/layer/bytechunk/ |
| TLS handshake | YES (observation) | N/A | internal/layer/tlslayer/ |
| SOCKS5 | N/A (transport) | N/A | internal/connector/socks5_handler.go |
SOCKS5 is a transport: after the handshake completes, the inner connection re-enters protocol detection and is handled by the Layer stack appropriate for the tunneled protocol.
Listener and stack assembly¶
The TCP entrypoint is internal/connector/full_listener.go. Multi-listener orchestration lives in internal/connector/coordinator.go. Per-connection plumbing — peek-based detection, ALPN routing, CONNECT tunnel handling, h2 dispatch — is split across connector/detect.go, connector/alpn_routing.go, connector/connect_handler.go, and connector/h2_dispatch.go.
internal/proxybuild/ wires this all together: BuildLiveStack consumes a Deps struct and returns a runnable Stack that the Manager starts and stops via the proxy_start / proxy_stop MCP tools.
Web UI and MCP transport¶
The proxy starts an HTTP MCP transport on a loopback port at boot, serving both the MCP API and the embedded Web UI on the same address. Use -mcp-http-addr to set a fixed address. The Web UI is a React/Vite single-page application that talks to the backend exclusively through MCP over Streamable HTTP — the same protocol used by AI agents.
flowchart LR
A[AI Agent] -->|stdio MCP| B[yorishiro-proxy]
C[AI Agent] -->|Streamable HTTP MCP| B
D[Web UI] -->|Streamable HTTP MCP| B
B --> E[Connector + Layer stack]
B --> F[Pipeline]
F --> G[Stream / Flow store]
The Web UI has the same capability surface as any MCP client. Multiple agents and the Web UI can share access simultaneously, authenticated by Bearer tokens.
Related pages¶
- Flows — Stream and Flow data model, recording lifecycle, variants
- MCP-first design — Why every operation is an MCP tool
- Security model — Two-layer security architecture, Safety Filter, scope
- gRPC-Web — Binary and base64 wire format handling