Built-in codecs¶
yorishiro-proxy includes 14 built-in codecs for encoding, decoding, and hashing payloads, plus a specialized protobuf codec for gRPC message inspection. You can use the encoding chain codecs in Fuzzer, Resender, and Macro to transform payloads before sending.
Codec catalog¶
| Name | Category | Reversible | Description |
|---|---|---|---|
| base64 | Encoding | Yes | Standard Base64 (RFC 4648) |
| base64url | Encoding | Yes | URL-safe Base64 (RFC 4648 §5) |
| url_encode_query | Encoding | Yes | URL query encoding (spaces as +) |
| url_encode_path | Encoding | Yes | URL path encoding (spaces as %20) |
| url_encode_full | Encoding | Yes | Encode all non-alphanumeric characters to %XX |
| double_url_encode | Encoding | Yes | Apply URL query encoding twice |
| hex | Encoding | Yes | Hexadecimal encoding |
| html_entity | Encoding | Yes | Numeric HTML entities (`&#xNN;`) |
| html_escape | Encoding | Yes | Named HTML entities (`&amp;`, `&lt;`, etc.) |
| unicode_escape | Encoding | Yes | Unicode escape sequences (\uXXXX) |
| md5 | Hash | No | MD5 hash (hex-encoded output) |
| sha256 | Hash | No | SHA-256 hash (hex-encoded output) |
| lower | Case | No | Convert to lowercase |
| upper | Case | No | Convert to uppercase |
Encoding codecs¶
base64¶
Standard Base64 encoding per RFC 4648. Uses the standard alphabet (A-Z, a-z, 0-9, +, /) with = padding.
| Direction | Input | Output |
|---|---|---|
| Encode | Hello World | SGVsbG8gV29ybGQ= |
| Decode | SGVsbG8gV29ybGQ= | Hello World |
base64url¶
URL-safe Base64 encoding per RFC 4648 §5. Uses - and _ instead of + and /, making it safe for URLs and filenames.
| Direction | Input | Output |
|---|---|---|
| Encode | Hello World | SGVsbG8gV29ybGQ= |
| Decode | SGVsbG8gV29ybGQ= | Hello World |
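The `Hello World` example happens to produce identical output in both alphabets, because none of its 6-bit groups map to index 62 or 63. The sketch below uses Python's standard `base64` module (not yorishiro-proxy's own implementation) with two bytes chosen so that the difference becomes visible.

```python
import base64

# Bytes chosen so that the encoded form uses alphabet indices 62 and 63.
raw = bytes([0xFB, 0xFF])

standard = base64.b64encode(raw).decode("ascii")          # uses '+' and '/'
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # uses '-' and '_'

print(standard)  # +/8=
print(url_safe)  # -_8=

# Both variants decode back to the same bytes.
assert base64.b64decode(standard) == raw
assert base64.urlsafe_b64decode(url_safe) == raw
```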
url_encode_query¶
URL encoding for query string values. Spaces are encoded as +, special characters as %XX.
| Direction | Input | Output |
|---|---|---|
| Encode | key=value&foo=bar baz | key%3Dvalue%26foo%3Dbar+baz |
| Decode | key%3Dvalue%26foo%3Dbar+baz | key=value&foo=bar baz |
url_encode_path¶
URL encoding for path segments. Spaces are encoded as %20 (not +).
| Direction | Input | Output |
|---|---|---|
| Encode | /path/with spaces | %2Fpath%2Fwith%20spaces |
| Decode | %2Fpath%2Fwith%20spaces | /path/with spaces |
url_encode_full¶
Encodes all characters except unreserved characters (RFC 3986 §2.3: A-Z, a-z, 0-9, -, ., _, ~) to %XX format. Useful for WAF bypass testing.
| Direction | Input | Output |
|---|---|---|
| Encode | <script> | %3Cscript%3E |
| Decode | %3Cscript%3E | <script> |
double_url_encode¶
Applies URL query encoding twice. The % characters from the first encoding are themselves encoded, producing %25XX sequences. Useful for testing applications that decode URL encoding multiple times.
| Direction | Input | Output |
|---|---|---|
| Encode | <script> | %253Cscript%253E |
| Decode | %253Cscript%253E | <script> |
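To compare the URL-encoding variants side by side, the following sketch approximates three of them with Python's `urllib.parse`. The function names mirror the codec names only for readability; these are stand-ins, not the proxy's implementation, and the exact set of characters each real codec leaves unescaped may differ from Python's.

```python
from urllib.parse import quote, quote_plus, unquote_plus

def url_encode_query(value: str) -> str:
    # Query-style encoding: spaces become '+', other reserved characters become %XX.
    return quote_plus(value)

def url_encode_path(value: str) -> str:
    # Path-style encoding: spaces become %20; safe="" also escapes '/'.
    return quote(value, safe="")

def double_url_encode(value: str) -> str:
    # Second pass re-encodes each '%' from the first pass as %25.
    return quote_plus(quote_plus(value))

print(url_encode_query("key=value&foo=bar baz"))  # key%3Dvalue%26foo%3Dbar+baz
print(url_encode_path("/path/with spaces"))       # %2Fpath%2Fwith%20spaces
print(double_url_encode("<script>"))              # %253Cscript%253E

# Decoding twice undoes the double encoding.
assert unquote_plus(unquote_plus(double_url_encode("<script>"))) == "<script>"
```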
hex¶
Hexadecimal encoding of the raw bytes.
| Direction | Input | Output |
|---|---|---|
| Encode | ABC | 414243 |
| Decode | 414243 | ABC |
html_entity¶
Encodes every character as a numeric HTML entity (&#xNN;). Useful for XSS payload construction and WAF bypass.
| Direction | Input | Output |
|---|---|---|
| Encode | <img> | `&#x3C;&#x69;&#x6D;&#x67;&#x3E;` |
| Decode | `&#x3C;&#x69;&#x6D;&#x67;&#x3E;` | <img> |
html_escape¶
Encodes the five special HTML characters (&, <, >, ", and ') using named entities, for example & → `&amp;`, < → `&lt;`, and > → `&gt;`.
| Direction | Input | Output |
|---|---|---|
| Encode | <a href="x"> | `&lt;a href=&quot;x&quot;&gt;` |
| Decode | `&lt;a href=&quot;x&quot;&gt;` | <a href="x"> |
unicode_escape¶
Encodes characters as \uXXXX Unicode escape sequences. Supplementary characters (above U+FFFF) use surrogate pairs.
| Direction | Input | Output |
|---|---|---|
| Encode | Hello | \u0048\u0065\u006C\u006C\u006F |
| Decode | \u0048\u0065\u006C\u006C\u006F | Hello |
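The surrogate-pair behaviour for characters above U+FFFF can be illustrated with a short sketch. It goes through UTF-16 code units, which is one straightforward way to obtain surrogate pairs; the function names are hypothetical, and the decoder is simplified to handle only input consisting entirely of `\uXXXX` sequences.

```python
import re

def unicode_escape(text: str) -> str:
    # Encode to UTF-16 (big-endian) so supplementary characters
    # are split into a high and a low surrogate code unit.
    data = text.encode("utf-16-be")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(f"\\u{unit:04X}" for unit in units)

def unicode_unescape(escaped: str) -> str:
    # Simplification: assumes the whole input is made of \uXXXX sequences.
    units = [int(h, 16) for h in re.findall(r"\\u([0-9A-Fa-f]{4})", escaped)]
    data = b"".join(unit.to_bytes(2, "big") for unit in units)
    return data.decode("utf-16-be")

print(unicode_escape("Hello"))       # \u0048\u0065\u006C\u006C\u006F
print(unicode_escape("\U0001F600"))  # \uD83D\uDE00 (surrogate pair for U+1F600)
```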
Hash codecs¶
Hash codecs are one-way — Encode produces a hash, but Decode returns an error (ErrIrreversible). The output is always a hex-encoded string.
md5¶
Computes the MD5 digest (128-bit) and returns a 32-character hex string.
| Direction | Input | Output |
|---|---|---|
| Encode | password | 5f4dcc3b5aa765d61d8327deb882cf99 |
sha256¶
Computes the SHA-256 digest (256-bit) and returns a 64-character hex string.
| Direction | Input | Output |
|---|---|---|
| Encode | password | 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8 |
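The digests in the tables above can be reproduced independently, for example to pre-compute expected values when checking fuzzer output. This sketch uses Python's `hashlib` and assumes the input string is hashed as UTF-8 bytes; the helper names are illustrative only.

```python
import hashlib

def md5_hex(value: str) -> str:
    # 128-bit digest rendered as 32 lowercase hex characters.
    return hashlib.md5(value.encode("utf-8")).hexdigest()

def sha256_hex(value: str) -> str:
    # 256-bit digest rendered as 64 lowercase hex characters.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

print(md5_hex("password"))     # 5f4dcc3b5aa765d61d8327deb882cf99
print(sha256_hex("password"))  # 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
```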
Case codecs¶
Case codecs are one-way — the original casing cannot be recovered, so Decode returns an error.
lower¶
Converts all characters to lowercase.
| Direction | Input | Output |
|---|---|---|
| Encode | Hello World | hello world |
upper¶
Converts all characters to uppercase.
| Direction | Input | Output |
|---|---|---|
| Encode | Hello World | HELLO WORLD |
Protocol codecs¶
Protocol codecs handle binary protocol formats. Unlike encoding chain codecs, these are used internally by the proxy to decode and re-encode protocol-specific payloads (e.g., when intercepting gRPC traffic).
protobuf¶
Decodes Protocol Buffers wire format without requiring .proto schema files. Uses heuristic type inference to convert binary protobuf data to human-readable JSON and back, following the same approach as PacketProxy.
gRPC frame support: Handles the 5-byte gRPC length-prefixed framing format (1 byte compressed flag + 4 bytes big-endian message length + payload). When you intercept a gRPC request or response, the proxy automatically strips the gRPC framing, decodes the protobuf payload to JSON for editing, and re-encodes it with proper framing on forward.
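The 5-byte framing described above can be sketched in a few lines. This is a minimal illustration of the length-prefixed format only, not the proxy's code: the function names are hypothetical, and compression handling is left out.

```python
import struct

def split_grpc_frame(frame: bytes) -> tuple[bool, bytes]:
    """Strip the 5-byte prefix: 1-byte compressed flag + 4-byte big-endian length."""
    if len(frame) < 5:
        raise ValueError("frame shorter than the 5-byte gRPC prefix")
    compressed, length = struct.unpack(">BI", frame[:5])
    payload = frame[5:5 + length]
    if len(payload) != length:
        raise ValueError("payload shorter than the declared length")
    return bool(compressed), payload

def build_grpc_frame(payload: bytes, compressed: bool = False) -> bytes:
    """Re-add the prefix so the message can be forwarded."""
    return struct.pack(">BI", int(compressed), len(payload)) + payload

# Protobuf wire bytes for field 1 = "hello" (length-delimited) and field 2 = 42 (varint).
message = bytes.fromhex("0a0568656c6c6f102a")
frame = build_grpc_frame(message)

print(frame.hex())  # 00000000090a0568656c6c6f102a
assert split_grpc_frame(frame) == (False, message)
```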
Compression support: Supports gzip, deflate, snappy, and zstd compressed gRPC messages. When the grpc-encoding header indicates compression, the payload is decompressed before protobuf decoding and re-compressed after encoding.
JSON key format: Decoded fields use the format "field_number:ordinal:type" (e.g., "0001:0000:Varint", "0002:0001:String", "0003:0002:embedded message").
Type inference priority for length-delimited fields:
1. UTF-8 printable string
2. Embedded protobuf message (recursive)
3. Packed repeated varints
4. Raw bytes (hex)
Example: A gRPC message with field 1 = "hello" and field 2 = 42 decodes to:
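As a rough illustration of the key format described above, the sketch below builds the JSON that such a message would plausibly decode to. The `field_key` helper, the zero-padded decimal formatting, and the representation of values (strings as JSON strings, varints as JSON numbers) are all assumptions made for this example.

```python
import json

def field_key(field_number: int, ordinal: int, type_name: str) -> str:
    # "field_number:ordinal:type", each number zero-padded to four digits (assumed).
    return f"{field_number:04d}:{ordinal:04d}:{type_name}"

decoded = {
    field_key(1, 0, "String"): "hello",  # field 1, first field in the message
    field_key(2, 1, "Varint"): 42,       # field 2, second field in the message
}

print(json.dumps(decoded, indent=2))
```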
Not available in encoding chains
The protobuf codec is used internally for gRPC intercept and inspection. It is not available as a named codec in fuzzer/resender/macro encoding chains.
Encoding chains¶
You can combine multiple codecs into a chain. Codecs are applied in order during encoding and in reverse order during decoding.
How chains work¶
When you specify encoding: ["base64", "url_encode_query"]:
- Encode: value → base64 → url_encode_query → encoded payload
- Decode: encoded payload → url_encode_query (decode) → base64 (decode) → original value
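The ordering rule can be made concrete with a small sketch: encoders run left to right and decoders run right to left. The registry and function names below are hypothetical stand-ins built on Python's standard library, not yorishiro-proxy's API.

```python
import base64
from urllib.parse import quote_plus, unquote_plus

# Hypothetical registry: codec name -> (encode, decode).
CODECS = {
    "base64": (
        lambda s: base64.b64encode(s.encode("utf-8")).decode("ascii"),
        lambda s: base64.b64decode(s).decode("utf-8"),
    ),
    "url_encode_query": (quote_plus, unquote_plus),
}

def encode_chain(value: str, chain: list[str]) -> str:
    # Apply each codec's encoder in the order listed.
    for name in chain:
        value = CODECS[name][0](value)
    return value

def decode_chain(value: str, chain: list[str]) -> str:
    # Apply each codec's decoder in reverse order.
    for name in reversed(chain):
        value = CODECS[name][1](value)
    return value

chain = ["base64", "url_encode_query"]
encoded = encode_chain("<script>", chain)
print(encoded)  # PHNjcmlwdD4%3D
assert decode_chain(encoded, chain) == "<script>"
```

Reversing the chain on encode would give a different result, since URL-encoding `<script>` first changes the bytes that Base64 then sees.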
Usage in MCP tools¶
Specify encoding chains in the encoding parameter of fuzzer payload sets, resender body patches, or macro steps:
// fuzz
{
  "flow_id": "abc123",
  "positions": [
    {
      "location": "body",
      "name": "payload",
      "start": 10,
      "end": 20
    }
  ],
  "payload_sets": [
    {
      "name": "payload",
      "type": "wordlist",
      "values": ["<script>alert(1)</script>", "' OR 1=1 --"],
      "encoding": ["base64"]
    }
  ]
}
Practical chain examples¶
Base64 + URL encoding¶
Encode a payload in Base64, then URL-encode the result for safe inclusion in a query parameter:
<script> → PHNjcmlwdD4= → PHNjcmlwdD4%3D
Double URL encoding for WAF bypass¶
<script> → %253Cscript%253E
Hash generation¶
Generate SHA-256 hashes of wordlist entries (e.g., for testing hash-based authentication):
password123 → ef92b778bafe771e89245b89ecbc08a44a4e166c06659911881f383d4473e94f
Unicode escape for filter bypass¶
alert → \u0061\u006C\u0065\u0072\u0074
Custom codec plugins¶
You can extend the codec registry with custom codecs written in Starlark. Custom codecs are registered through the codec_plugins section in the config file and become available in encoding chains alongside built-in codecs.
See Codec plugins for details on writing custom codecs.
Name conflicts
If a custom codec uses the same name as a built-in codec, registration will fail with an error. Choose unique names for custom codecs.
Related pages¶
- Fuzzer — Use encoding chains with payload sets
- Resender — Apply codecs to body patches
- Macros — Use codecs in macro steps
- Codec plugins — Write custom codecs in Starlark