preserves/cheatsheet.md at 894000a54fc35be5c6885dc6c5dc391aacd2ba61

2.5 KiB

Raw Blame History

no_site_title	title
true	Preserves Cheatsheets

Tony Garnock-Jones tonyg@leastfixedpoint.com
June 2022. Version 0.7.0.

Machine-Oriented Binary Syntax

For a value v, we write «v» for the binary encoding of v. The length of an encoding is always available from context: either from a containing encoded value, or from the overall container of the data, which could be a file, an HTTP message, a UDP packet, etc.

                      «#f» = [0xA0]
                      «#t» = [0xA1]
                       «F» = [0xA2] ++ binary32(F)     if F ∈ Float
                       «D» = [0xA2] ++ binary64(D)     if D ∈ Double
                       «x» = [0xA3] ++ intbytes(x)     if x ∈ SignedInteger
                       «S» = [0xA4] ++ utf8(S) ++ [0]  if S ∈ String
                             [0xA5] ++ S               if S ∈ ByteString
                             [0xA6] ++ utf8(S)         if S ∈ Symbol

           «<L F_1...F_m>» = [0xA7] ++ seq(«L», «F_1», ..., «F_m»)
             «[X_1...X_m]» = [0xA8] ++ seq(«X_1», ..., «X_m»)
            «#{E_1...E_m}» = [0xA9] ++ seq(«E_1», ..., «E_m»)
     «{K_1:V_1...K_m:V_m}» = [0xAA] ++ seq(«K_1», «V_1», ..., «K_m», «V_m»)

        seq(R_1, ..., R_m) = len(|R_1|) ++ R_1 ++...++ len(|R_m|) ++ R_m

                    len(m) = e(m, 128)

                   e(v, d) = [v + d]                           if v < 128
                             e(v / 128, 0) ++ [(v % 128) + d]  if v ≥ 128

                     «#!V» = [0xAB] ++ «V»

The functions binary32(F) and binary64(D) yield big-endian 4- and 8-byte IEEE 754 binary representations of F and D, respectively.

The function intbytes(x) gives the big-endian two's-complement signed binary representation of x, taking exactly as many whole bytes as needed to unambiguously identify the value and its sign. intbytes(0) is the empty byte sequence.

When reading, the length of the input is supplied externally. This means that, when reading a length/value pair in a seq(), each length should be passed down to the decoder for the corresponding value, so that the decoder knows when to stop.

Annotations. To annotate a Repr r (that MUST NOT itself already be annotated) with some sequence of Values [v_1, ..., v_m], surround r as follows:

[0xBF] ++ len(|r|) ++ r ++ len(|«v_1»|) ++ «v_1» ++...++ len(|«v_m»|) ++ «v_m»

2.5 KiB Raw Blame History

Machine-Oriented Binary Syntax

2.5 KiB

Raw Blame History