preserves/_includes/cheatsheet-binary.md

47 lines
2.3 KiB
Markdown
Raw Normal View History

2022-06-20 15:05:47 +00:00
For a value `v`, we write `«v»` for the binary encoding of `v`. The
length of an encoding is always available from context: either from
a containing encoded value, or from the overall container of the data,
which could be a file, an HTTP message, a UDP packet, etc.
«#f» = [0xA0]
«#t» = [0xA1]
«F» = [0xA2] ++ binary32(F) if F ∈ Float
«D» = [0xA2] ++ binary64(D) if D ∈ Double
«x» = [0xA3] ++ intbytes(x) if x ∈ SignedInteger
«S» = [0xA4] ++ utf8(S) ++ [0] if S ∈ String
[0xA5] ++ S if S ∈ ByteString
[0xA6] ++ utf8(S) if S ∈ Symbol
«<L F_1...F_m>» = [0xA7] ++ seq(«L», «F_1», ..., «F_m»)
«[X_1...X_m]» = [0xA8] ++ seq(«X_1», ..., «X_m»)
«#{E_1...E_m}» = [0xA9] ++ seq(«E_1», ..., «E_m»)
«{K_1:V_1...K_m:V_m}» = [0xAA] ++ seq(«K_1», «V_1», ..., «K_m», «V_m»)
seq(R_1, ..., R_m) = len(|R_1|) ++ R_1 ++...++ len(|R_m|) ++ R_m
len(m) = e(m, 128)
e(v, d) = [v + d] if v < 128
e(v / 128, 0) ++ [(v % 128) + d] if v ≥ 128
«#!V» = [0xAB] ++ «V»
The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
8-byte IEEE 754 binary representations of `F` and `D`, respectively.
The function `intbytes(x)` is a big-endian two's-complement signed
binary representation of `x`, taking at least as many whole bytes as
needed to unambiguously identify the value and its sign. `intbytes(0)`
may be the empty byte sequence.
When reading, the length of the input is supplied externally. This means
that, when reading a length/value pair in a `seq()`, each length should
be passed down to the decoder for the corresponding value, so that the
decoder knows when to stop.
**Annotations.** To annotate a `Repr` `r` (that *MUST NOT* itself
already be annotated) with some sequence of `Value`s `[v_1, ..., v_m]`
(that *MUST* be non-empty), surround `r` as follows:
[0xBF] ++ len(|r|) ++ r ++ len(|«v_1»|) ++ «v_1» ++...++ len(|«v_m»|) ++ «v_m»