preserves/_includes/cheatsheet-binary.md

2.3 KiB

For a value v, we write «v» for the binary encoding of v. The length of an encoding is always available from context: either from a containing encoded value, or from the overall container of the data, which could be a file, an HTTP message, a UDP packet, etc.

                      «#f» = [0xA0]
                      «#t» = [0xA1]
                       «F» = [0xA2] ++ binary32(F)     if F ∈ Float
                       «D» = [0xA2] ++ binary64(D)     if D ∈ Double
                       «x» = [0xA3] ++ intbytes(x)     if x ∈ SignedInteger
                       «S» = [0xA4] ++ utf8(S) ++ [0]  if S ∈ String
                             [0xA5] ++ S               if S ∈ ByteString
                             [0xA6] ++ utf8(S)         if S ∈ Symbol

           «<L F_1...F_m>» = [0xA7] ++ seq(«L», «F_1», ..., «F_m»)
             «[X_1...X_m]» = [0xA8] ++ seq(«X_1», ..., «X_m»)
            «#{E_1...E_m}» = [0xA9] ++ seq(«E_1», ..., «E_m»)
     «{K_1:V_1...K_m:V_m}» = [0xAA] ++ seq(«K_1», «V_1», ..., «K_m», «V_m»)

        seq(R_1, ..., R_m) = len(|R_1|) ++ R_1 ++...++ len(|R_m|) ++ R_m

                    len(m) = e(m, 128)

                   e(v, d) = [v + d]                           if v < 128
                             e(v / 128, 0) ++ [(v % 128) + d]  if v ≥ 128

                     «#!V» = [0xAB] ++ «V»

The functions binary32(F) and binary64(D) yield big-endian 4- and 8-byte IEEE 754 binary representations of F and D, respectively.

The function intbytes(x) is a big-endian two's-complement signed binary representation of x, taking at least as many whole bytes as needed to unambiguously identify the value and its sign. intbytes(0) may be the empty byte sequence.

When reading, the length of the input is supplied externally. This means that, when reading a length/value pair in a seq(), each length should be passed down to the decoder for the corresponding value, so that the decoder knows when to stop.

Annotations. To annotate a Repr r (that MUST NOT itself already be annotated) with some sequence of Values [v_1, ..., v_m] (that MUST be non-empty), surround r as follows:

[0xBF] ++ len(|r|) ++ r ++ len(|«v_1»|) ++ «v_1» ++...++ len(|«v_m»|) ++ «v_m»