Split out preserves into its own repository
This commit is contained in:
parent
2c03883b00
commit
d497d9e6d1
|
@ -1,2 +0,0 @@
|
||||||
preserve.pdf: preserve.md preserve.css
|
|
||||||
google-chrome --headless --disable-gpu --print-to-pdf=$@ http://localhost:4000/preserve.html
|
|
|
@ -1,662 +0,0 @@
|
||||||
---
|
|
||||||
---
|
|
||||||
<style>
|
|
||||||
body { font-size: 120%; margin-left: 2rem; }
|
|
||||||
h1, h2, h3, h4, h5, h6 { margin-left: -1rem; }
|
|
||||||
h2 { border-bottom: solid black 1px; }
|
|
||||||
</style>
|
|
||||||
|
|
||||||
2018-06-04 20:31:48 TODO: cwebber's email comments
|
|
||||||
2018-06-04 20:31:51 TODO: look at https://github.com/imbal/rson and at clojure EDN
|
|
||||||
2018-06-05 10:32:02 ... and at http://json-schema.org/latest/json-schema-core.html#rfc.section.4.2
|
|
||||||
|
|
||||||
# SPKI CAT: SPKI S-Expressions with Canonical Atom Tags
|
|
||||||
|
|
||||||
Tony Garnock-Jones <tonyg@leastfixedpoint.com>
|
|
||||||
Christopher Lemmer Webber <cwebber@dustycloud.org>
|
|
||||||
May 2018
|
|
||||||
Version 0.0.1
|
|
||||||
|
|
||||||
________________
|
|
||||||
/ \
|
|
||||||
/\__/\ / boo! \
|
|
||||||
/ \ \ i'm very spki! \
|
|
||||||
\=\_^__^_/= ___/\__________________/
|
|
||||||
|/ \
|
|
||||||
\\ | | /
|
|
||||||
<_|--|_>
|
|
||||||
|
|
||||||
## Introduction
|
|
||||||
|
|
||||||
This document proposes a language-neutral JSON-like *data type*, along
|
|
||||||
with a robust equivalence relation ("semantics") and a total ordering
|
|
||||||
over inhabitants of the type.[^tjson]
|
|
||||||
|
|
||||||
[^tjson]: [TJSON](https://www.tjson.org/) has a similar aim:
|
|
||||||
“different on-the-wire representations of an object correspond to
|
|
||||||
the same typed data object”
|
|
||||||
([source](https://news.ycombinator.com/item?id=12860143)); “TJSON
|
|
||||||
is defined as a serialization format on top of a JSON-like data
|
|
||||||
model” ([source](https://news.ycombinator.com/item?id=12860401)).
|
|
||||||
|
|
||||||
It then suggests conventions for encoding common data formats in terms
|
|
||||||
of the proposed data type.
|
|
||||||
|
|
||||||
Finally, it proposes concrete *syntax* for the data type, offering a
|
|
||||||
language-neutral transfer syntax (based on
|
|
||||||
[Rivest's S-Expressions][sexp.txt] as used in [SPKI/SDSI][spki]) and
|
|
||||||
suggesting possible language-specific representations for the data
|
|
||||||
type's inhabitants.
|
|
||||||
|
|
||||||
[sexp.txt]: http://people.csail.mit.edu/rivest/Sexp.txt
|
|
||||||
[spki]: http://world.std.com/~cme/html/spki.html
|
|
||||||
|
|
||||||
### Why not Just Use JSON?
|
|
||||||
|
|
||||||
<!-- JSON lacks semantics: JSON syntax doesn't denote anything -->
|
|
||||||
|
|
||||||
JSON offers *syntax* for numbers, strings, booleans, null, arrays and
|
|
||||||
string-keyed maps. However, it offers no *semantics* for the syntax:
|
|
||||||
it is left to each implementation to determine how to treat each JSON
|
|
||||||
term. This causes
|
|
||||||
[interoperability](http://seriot.ch/parsing_json.php) and even
|
|
||||||
[security](http://seriot.ch/parsing_json.php) issues.
|
|
||||||
|
|
||||||
Specifically, JSON does not:
|
|
||||||
|
|
||||||
- assign any meaning to numbers,[^meaning-ieee-double]
|
|
||||||
- determine how strings are to be compared,[^string-key-comparison]
|
|
||||||
- determine whether object key ordering is significant, or
|
|
||||||
- determine whether duplicate object keys are permitted, what it
|
|
||||||
would mean if they were, or how to determine a duplicate in the
|
|
||||||
first place.
|
|
||||||
|
|
||||||
In short, JSON syntax doesn't *denote* anything.[^xml-infoset] [^other-formats]
|
|
||||||
|
|
||||||
[^meaning-ieee-double]:
|
|
||||||
[Section 6 of RFC 7159](https://tools.ietf.org/html/rfc7159#section-6)
|
|
||||||
does go so far as to indicate “good interoperability can be
|
|
||||||
achieved” by imagining that parsers are able reliably to
|
|
||||||
understand the syntax of numbers as denoting an IEEE 754
|
|
||||||
double-precision floating-point value.
|
|
||||||
|
|
||||||
[^string-key-comparison]:
|
|
||||||
[Section 8.3 of RFC 7159](https://tools.ietf.org/html/rfc7159#section-8.3)
|
|
||||||
suggests that *if* an implementation compares strings used as
|
|
||||||
object keys “code unit by code unit”, then it will interoperate
|
|
||||||
with *other such implementations*, but neither requires this
|
|
||||||
behaviour nor discusses comparisons of strings used in other
|
|
||||||
contexts.
|
|
||||||
|
|
||||||
[^xml-infoset]: The XML world has the concept of
|
|
||||||
[XML infoset](https://www.w3.org/TR/xml-infoset/). Loosely
|
|
||||||
speaking, XML infoset is the *denotation* of an XML document; the
|
|
||||||
*meaning* of the document.
|
|
||||||
|
|
||||||
[^other-formats]: Most other recent data languages are like JSON in
|
|
||||||
specifying only a syntax with no associated semantics. While some
|
|
||||||
do make a sketch of a semantics, the result is often
|
|
||||||
underspecified (e.g. in terms of how strings are to be compared),
|
|
||||||
overly machine-oriented (e.g. treating 32-bit integers as
|
|
||||||
fundamentally distinct from 64-bit integers and from
|
|
||||||
floating-point numbers), overly fine (e.g. giving visibility to
|
|
||||||
the order in which map entries are written), or all three.
|
|
||||||
|
|
||||||
Some examples:
|
|
||||||
|
|
||||||
- are the JSON values `1`, `1.0`, and `1e0` the same or different?
|
|
||||||
- are the JSON values `1.0` and `1.0000000000000001` the same or different?
|
|
||||||
- are the JSON strings `"päron"` (UTF-8 `70c3a4726f6e`) and `"päron"`
|
|
||||||
(UTF-8 `7061cc88726f6e`) the same or different?
|
|
||||||
- are the JSON objects `{"a":1, "b":2}` and `{"b":2, "a":1}` the same
|
|
||||||
or different?
|
|
||||||
- which, if any, of `{"a":1, "a":2}`, `{"a":1}` and `{"a":2}` are the
|
|
||||||
same? Are all three legal?
|
|
||||||
- are `{"päron":1}` and `{"päron":1}` the same or different?
|
|
||||||
|
|
||||||
Different JSON implementations give different answers to these
|
|
||||||
questions. The JSON specifications are silent on these questions.
|
|
||||||
|
|
||||||
There are other minor problems with JSON having to do with its syntax.
|
|
||||||
Examples include its relative verbosity and its lack of support for
|
|
||||||
binary data.
|
|
||||||
|
|
||||||
## Starting with Semantics
|
|
||||||
|
|
||||||
Taking inspiration from functional programming, we start with a
|
|
||||||
definition of the *values* that we want to work with and give them
|
|
||||||
meaning independent of their syntax. We will treat syntax separately,
|
|
||||||
later in this document.
|
|
||||||
|
|
||||||
We will want our data type to accommodate *atoms* (numbers and text),
|
|
||||||
*products* (both tuples and sequences), and *labelled
|
|
||||||
sums*.[^zephyr-asdl] It should also include *keyed maps*. We should
|
|
||||||
avoid unnecessary restrictions such as machine-oriented fixed-width
|
|
||||||
integer or floating-point values where possible.
|
|
||||||
|
|
||||||
[^zephyr-asdl]: This design was loosely inspired by Zephyr ASDL (h/t
|
|
||||||
[Darius Bacon](https://twitter.com/abecedarius/status/993545767884226561)),
|
|
||||||
which doesn't offer much in the way of atoms, but offers
|
|
||||||
general-purpose labelled sums and products. See D. C. Wang, A. W.
|
|
||||||
Appel, J. L. Korn, and C. S. Serra, “The Zephyr Abstract Syntax
|
|
||||||
Description Language,” in USENIX Conference on Domain-Specific
|
|
||||||
Languages, 1997, pp. 213–228.
|
|
||||||
[PDF available.](https://www.usenix.org/legacy/publications/library/proceedings/dsl97/full_papers/wang/wang.pdf)
|
|
||||||
|
|
||||||
### Values
|
|
||||||
|
|
||||||
A `Value` is one of:
|
|
||||||
|
|
||||||
- a `ByteString` for general-purpose non-numeric atomic data,[^byte-string-rationale]
|
|
||||||
- a `Number` for integers and rational numbers,
|
|
||||||
- a `List` for general-purpose variable-length sequences,
|
|
||||||
- a `Map` for general-purpose variable-size key/value maps, or
|
|
||||||
- a `Record` for tagging a tuple of values with an intended interpretation.
|
|
||||||
|
|
||||||
We define a total order over `Value`s: Every `ByteString` is less than
|
|
||||||
the other kinds of `Value`; every `Number` is less than any `List`,
|
|
||||||
`Map` or `Record`, but greater than any `ByteString`; and so on.
|
|
||||||
|
|
||||||
That is, `ByteString < Number < List < Map < Record`.
|
|
||||||
|
|
||||||
Two values of the same kind are compared using kind-specific rules,
|
|
||||||
given below.
|
|
||||||
|
|
||||||
Two `Value`s are equal if neither is less than the other according to
|
|
||||||
the total order.
|
|
||||||
|
|
||||||
[^byte-string-rationale]: Why include `ByteString`, when we could
|
|
||||||
instead use a reserved `Record` along with a `List` of `Number`s?
|
|
||||||
((TODO: Actually decide about this! Similarly, why include `Map`
|
|
||||||
rather than a restricted form of `List` with a `Record`? I think
|
|
||||||
the answer has to do with the arbitrariness of the label we'd
|
|
||||||
pick: unless *extremely* carefully chosen (i.e. number 0 (ideally
|
|
||||||
even `-Inf`!) for byte strings, number 1 for map, and have the
|
|
||||||
order go `Number < Record < List`), they would mess up the
|
|
||||||
prettiness of the ordering. Though... we could ultimately reduce
|
|
||||||
this to `Number` and `Record`, and have a family of `#"list"` and
|
|
||||||
`#"map"` `Record`s...))
|
|
||||||
|
|
||||||
### Byte strings
|
|
||||||
|
|
||||||
A `ByteString` is an ordered sequence of zero or more integers in the
|
|
||||||
inclusive range [0..255].
|
|
||||||
|
|
||||||
`ByteString`s are compared lexicographically.
|
|
||||||
|
|
||||||
We will write examples of `ByteString`s that contain only ASCII
|
|
||||||
characters using “`#"`” as an opening quote mark and “`"`” as a
|
|
||||||
closing quote mark.
|
|
||||||
|
|
||||||
**Examples.** The `ByteString` containing the three ASCII characters
|
|
||||||
`A`, `B` and `C` is written as `#"ABC"`. The empty `ByteString` is
|
|
||||||
written as `#""`. **N.B.** Despite appearances, these are *binary*
|
|
||||||
data.
|
|
||||||
|
|
||||||
### Numbers
|
|
||||||
|
|
||||||
A `Number` is a signed rational number of finite precision whose
|
|
||||||
magnitude can be exactly represented in base two with a finite number
|
|
||||||
of digits. This includes integers of arbitrary width as well as (for
|
|
||||||
example) the non-infinite non-NaN IEEE 754 floating-point values.
|
|
||||||
|
|
||||||
`Number`s are compared as mathematical numbers.
|
|
||||||
|
|
||||||
We will write examples of `Number`s using standard mathematical
|
|
||||||
notation.
|
|
||||||
|
|
||||||
**Examples.** 10, -6, 0.5, -3/2, 33/192, -1.202E4567.
|
|
||||||
|
|
||||||
**Non-examples.** NaN (the clue is in the name!), ∞ (not finite),
|
|
||||||
0.2 (cannot be exactly represented with a finite number of binary
|
|
||||||
digits), 1/7 (likewise), 2+*i*3 (not rational), √2 (likewise).
|
|
||||||
|
|
||||||
### Lists
|
|
||||||
|
|
||||||
A `List` is an ordered sequence of zero or more `Value`s.
|
|
||||||
|
|
||||||
`List`s are compared lexicographically, appealing to the ordering on
|
|
||||||
`Value`s for comparisons at each position in the `List`s.
|
|
||||||
|
|
||||||
### Maps
|
|
||||||
|
|
||||||
A `Map` is an *unordered* collection of zero or more pairs of
|
|
||||||
`Value`s. Each pair comprises a *key* and a *value*. Keys in a `Map`
|
|
||||||
must be pairwise distinct.
|
|
||||||
|
|
||||||
Instances of `Map` are compared by lexicographic comparison of the
|
|
||||||
sequences resulting from ordering each `Map`'s pairs in ascending
|
|
||||||
order by key. ((TODO: Is this a good idea? Is it clearly-enough
|
|
||||||
written? An alternative approach is to compare first by the *count* of
|
|
||||||
pairs, and only if the count is the same, start comparing the pairs
|
|
||||||
themselves.))
|
|
||||||
|
|
||||||
### Records
|
|
||||||
|
|
||||||
A `Record` is a tuple of one or more `Value`s. The first in the tuple
|
|
||||||
is called the *label* of the `Record`, and the other elements of the
|
|
||||||
tuple are called its *fields*.
|
|
||||||
|
|
||||||
`Record` labels are *usually* `ByteString`s, but can be any kind of
|
|
||||||
`Value`.[^iri-labels]
|
|
||||||
|
|
||||||
[^iri-labels]: It is occasionally (but seldom) necessary to
|
|
||||||
interpret such `ByteString` labels as UTF-8 encoded IRIs. Where a
|
|
||||||
label can be read as a relative IRI, it is notionally interpreted
|
|
||||||
with respect to the IRI `http://spki-cat.org/` ((TODO:
|
|
||||||
placeholder)); where a label can be read as an absolute IRI, it
|
|
||||||
stands for that IRI; and otherwise, it cannot be read as an IRI at
|
|
||||||
all, and so the label simply stands for itself - for its own
|
|
||||||
`Value`.
|
|
||||||
|
|
||||||
`Record`s are compared lexicographically as if they were just tuples;
|
|
||||||
that is, first by their labels, and then by the remainder of their
|
|
||||||
fields.
|
|
||||||
|
|
||||||
We will write examples of `Record`s with `ByteString` labels entirely
|
|
||||||
composed of ASCII characters as their label followed by their
|
|
||||||
parenthesised, comma-separated fields.
|
|
||||||
|
|
||||||
**Examples.** The `Record` with label `#"foo"` and fields 1, 2 and 3
|
|
||||||
is written `#"foo"(1, 2, 3)`; the `Record` with label `#"void"` and no
|
|
||||||
fields is written `#"void"()`.
|
|
||||||
|
|
||||||
## Conventions for Common Data Types
|
|
||||||
|
|
||||||
The `Value` data type is essentially an abstract S-Expression, able to
|
|
||||||
represent semi-structured data over `ByteString` and `Number` atoms.
|
|
||||||
|
|
||||||
However, users need a wide variety of data types for representing
|
|
||||||
domain-specific values such as text, calendrical values, machine
|
|
||||||
words, IEEE 754 floating-point values, booleans, and so on.
|
|
||||||
|
|
||||||
We use appropriately-labelled `Record`s to denote these
|
|
||||||
domain-specific data types.
|
|
||||||
|
|
||||||
All of these conventions are optional. They form a layer atop the core
|
|
||||||
`Value` structure. Non-domain-specific tools do not in general need to
|
|
||||||
treat them specially.
|
|
||||||
|
|
||||||
**Validity.** Many of the labels we will describe in this section come
|
|
||||||
with side-conditions on the contents of labelled `Record`s. It is
|
|
||||||
possible to construct an instance of `Value` that violates these
|
|
||||||
side-conditions without ceasing to be a `Value` or becoming
|
|
||||||
unrepresentable. However, we say that such a `Value` is *invalid*
|
|
||||||
because it fails to honour the necessary side-conditions.
|
|
||||||
Implementations *SHOULD* allow two modes of working: one which
|
|
||||||
treats all `Value`s identically, without regard for side-conditions,
|
|
||||||
and one which enforces validity (i.e. side-conditions) when reading,
|
|
||||||
writing, or constructing `Value`s.
|
|
||||||
|
|
||||||
### Text
|
|
||||||
|
|
||||||
A `Text` is a `Record` labelled with the `ByteString` `#"utf-8"` and
|
|
||||||
having a single field that is also a `ByteString`. The field *MUST* be
|
|
||||||
valid UTF-8.
|
|
||||||
|
|
||||||
We will write examples of `Text`s that contain Unicode text using
|
|
||||||
“`"`” as both an opening and closing quote mark.
|
|
||||||
|
|
||||||
**Examples.** The `Text` containing the three Unicode code points `z`
|
|
||||||
(0x7A), `水` (0x6C34) and `𝄞` (0x1D11E) is written as `"z水𝄞"`.
|
|
||||||
|
|
||||||
**Normalization forms.** Unicode defines multiple
|
|
||||||
[normalization forms](http://unicode.org/reports/tr15/) for text. The
|
|
||||||
ordering and equivalence relations defined for `Value`s mean that, for
|
|
||||||
Unicode text, the UTF-8 encoded byte-level form of a text is used in
|
|
||||||
comparisons.[^utf8-is-awesome] In order for users to unambiguously
|
|
||||||
signal or require a particular normalization form, we define a
|
|
||||||
`NormalizedText`, which is a `Record` labelled with
|
|
||||||
`#"unicode-normalization"` and having two fields, the first of which
|
|
||||||
is a `Text` specifying the normalization form used (e.g. `"nfc"`,
|
|
||||||
`"nfd"`, `"nfkc"`, `"nfkd"`), and the second of which is a `Text`
|
|
||||||
whose underlying representation *MUST* be normalized according to the
|
|
||||||
named normalization form.
|
|
||||||
|
|
||||||
[^utf8-is-awesome]: Happily, the design of UTF-8 is such that this
|
|
||||||
gives the same result as a lexicographic code-point-by-code-point
|
|
||||||
comparison!
|
|
||||||
|
|
||||||
**IRIs.** (URIs, URLs, URNs, etc.) An `IRI` is a `Record` labelled
|
|
||||||
with `#"iri"` and having one field, a `Text` which is the IRI itself
|
|
||||||
and which *MUST* be a valid absolute or relative IRI.
|
|
||||||
|
|
||||||
**Symbols.** Programming languages like Lisp and Prolog frequently use
|
|
||||||
string-like values called *symbols*. A `Symbol` is a `Record` labelled
|
|
||||||
with `#"symbol"` and having one field, a `Text`.
|
|
||||||
|
|
||||||
### Numbers
|
|
||||||
|
|
||||||
The definition of `Number` captures all integers and all
|
|
||||||
finitely-representable floating-point values. However, in certain
|
|
||||||
circumstances it can be valuable to assert that a number inhabits a
|
|
||||||
particular range, such as a fixed-width machine word or an IEEE 754
|
|
||||||
floating-point value.
|
|
||||||
|
|
||||||
**Fixed-width machine words.** (16-, 32- and 64-bit) A family of
|
|
||||||
labels `i`*n* and `u`*n* for *n* ∈ {16,32,64} denote *n*-bit-wide
|
|
||||||
signed and unsigned range restrictions, respectively. Records with
|
|
||||||
these labels *MUST* have one field, a `Number`, which *MUST* fall
|
|
||||||
within the appropriate range. That is, to be valid,
|
|
||||||
- in `#"i16"(`*x*`)`, -32768 <= *x* <= 32767, and ⌊*x*⌋ = *x*.
|
|
||||||
- in `#"u16"(`*x*`)`, 0 <= *x* <= 65535, and ⌊*x*⌋ = *x*.
|
|
||||||
- in `#"i32"(`*x*`)`, -2147483648 <= *x* <= 2147483647, and ⌊*x*⌋ = *x*.
|
|
||||||
- etc.
|
|
||||||
|
|
||||||
**IEEE 754 floating-point.** (single- and double-precision) The labels
|
|
||||||
`f32` and `f64` denote single- and double-precision IEEE 754
|
|
||||||
floating-point values, respectively. Records with these labels *MUST*
|
|
||||||
have one field. This field *MUST* either be a `Number`, which *MUST*
|
|
||||||
fall within the appropriate representable range, or one of the records
|
|
||||||
`#"nan"()`, `#"+inf"()` or `#"-inf"()`.
|
|
||||||
|
|
||||||
### Anonymous Tuples and Unit
|
|
||||||
|
|
||||||
A `Tuple` is a `Record` with label `#"tuple"` and zero or more fields,
|
|
||||||
denoting an anonymous tuple of values.
|
|
||||||
|
|
||||||
The 0-ary tuple, `#"tuple"()`, denotes the empty tuple, sometimes
|
|
||||||
called "unit" or "void" (but *not* e.g. JavaScript's "undefined"
|
|
||||||
value).
|
|
||||||
|
|
||||||
### Booleans, Null and Undefined
|
|
||||||
|
|
||||||
The two 0-ary `Record`s `#"true"()` and `#"false"()` denote the "true"
|
|
||||||
and "false" Boolean values, respectively.
|
|
||||||
|
|
||||||
Tony Hoare's
|
|
||||||
"[billion-dollar mistake](https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions)"
|
|
||||||
can be represented with the 0-ary `Record` `#"null"()`. An "undefined"
|
|
||||||
value can be represented as `#"undefined"()`.
|
|
||||||
|
|
||||||
### Dates and Times
|
|
||||||
|
|
||||||
Dates, times, moments, and timestamps can be represented with a
|
|
||||||
`Record` with label `#"rfc3339"` having a single field, a `Text`,
|
|
||||||
which *MUST* conform to one of the `full-date`, `partial-time`,
|
|
||||||
`full-time`, or `date-time` productions of
|
|
||||||
[section 5.6 of RFC 3339](https://tools.ietf.org/html/rfc3339#section-5.6).
|
|
||||||
|
|
||||||
## Syntax
|
|
||||||
|
|
||||||
Now we have discussed `Value`s and their meanings, we may turn to
|
|
||||||
techniques for *representing* `Value`s for communication or storage.
|
|
||||||
|
|
||||||
The syntax we have used for the examples so far is inadequate in many
|
|
||||||
ways, not least of which is that it cannot represent every `Value`.
|
|
||||||
|
|
||||||
Separation of the meaning of a piece of syntax from the syntax itself
|
|
||||||
opens the door to domain-specific syntaxes, all equivalent and
|
|
||||||
interconvertible.[^asn1] With a robust semantic foundation,
|
|
||||||
connections to other data languages can also be made.
|
|
||||||
|
|
||||||
[^asn1]: Those who remember
|
|
||||||
[ASN.1](https://www.itu.int/en/ITU-T/asn1/Pages/introduction.aspx)
|
|
||||||
will recall BER, DER, PER, CER, XER and so on, each appropriate to
|
|
||||||
a different setting. Similarly,
|
|
||||||
[Rivest's S-Expression design][sexp.txt] offers a human-friendly
|
|
||||||
syntax, a syntax robust to network-induced message corruption, and
|
|
||||||
an unambiguous, simple and easily-parsed machine-friendly syntax
|
|
||||||
for the same underlying values.
|
|
||||||
|
|
||||||
### Transfer syntax: S-Expressions
|
|
||||||
|
|
||||||
For now, we limit our attention to an easily-parsed, easily-produced
|
|
||||||
machine-readable syntax by mapping our `Value`s to the canonical form
|
|
||||||
of [Rivest's S-Expressions][sexp.txt].[^why-not-spki-sexps]
|
|
||||||
|
|
||||||
[^why-not-spki-sexps]: Why not just use Rivest's S-Expressions as
|
|
||||||
they are? While they include binary data and sequences, and an
|
|
||||||
obvious equivalence for them exists, they lack numbers *per se* as
|
|
||||||
well as any kind of unordered structure such as sets or maps. In
|
|
||||||
addition, while "display hints" allow labelling of binary data
|
|
||||||
with an intended interpretation, they cannot be attached to any
|
|
||||||
other kind of structure, and the "hint" itself can only be a
|
|
||||||
binary blob.
|
|
||||||
|
|
||||||
#### Byte strings
|
|
||||||
|
|
||||||
`ByteString`s map to byte-string S-Expressions.
|
|
||||||
|
|
||||||
**Examples.**
|
|
||||||
- What we have been writing above as `#"ABC"` would be represented as
|
|
||||||
the S-Expression `3:ABC`.
|
|
||||||
- The empty `ByteString` is represented by the S-Expression `0:`.
|
|
||||||
|
|
||||||
#### Numbers
|
|
||||||
|
|
||||||
Numbers are the most complicated values to represent as an
|
|
||||||
S-Expression.
|
|
||||||
|
|
||||||
((TODO: Consider cutting complexity by e.g. representing a `Number` as
|
|
||||||
a sign bit, a little-endian blob of the integer part of the number,
|
|
||||||
and a little-endian blob of the fractional part of the number. Lots of
|
|
||||||
trailing/leading zeros for very large/small numbers!))
|
|
||||||
|
|
||||||
We represent `Number`s using a sign-magnitude format, where the
|
|
||||||
magnitude is written using a little-endian, twos-complement binary
|
|
||||||
[*significand*](https://en.wikipedia.org/wiki/Significand) and a
|
|
||||||
(signed) *shift amount*.
|
|
||||||
|
|
||||||
In essence, we use a generalized, variable-width form of binary IEEE
|
|
||||||
floating-point representation.
|
|
||||||
|
|
||||||
Let `N` be the `Number` to represent as an S-Expression.
|
|
||||||
|
|
||||||
The sign bit is 0 when `N` is zero or positive, and 1 when `N` is
|
|
||||||
negative.
|
|
||||||
|
|
||||||
The magnitude of `N` can be viewed as an infinite sequence of bits
|
|
||||||
with a fraction-separator mark placed somewhere in the sequence,
|
|
||||||
|
|
||||||
```
|
|
||||||
···00.000 b_0 b_1 ··· b_{k-1} b_k ··· b_{n-1} 000000···
|
|
||||||
···000000 b_0 b_1 ··· b_{k-1} . b_k ··· b_{n-1} 000000···
|
|
||||||
···000000 b_0 b_1 ··· b_{k-1} b_k ··· b_{n-1} 000.00···
|
|
||||||
```
|
|
||||||
|
|
||||||
where `b_0` is the leftmost (most significant) and `b_{n-1}` the
|
|
||||||
rightmost (least significant) non-zero bit.
|
|
||||||
|
|
||||||
Let `k`, the position of the fraction-separator mark, be `i` when it
|
|
||||||
is immediately to the left of `b_i` for some `i`, generalizing to
|
|
||||||
negative values when it is to the left of `b_0` and values greater
|
|
||||||
than `n-1` when it is to the right of `b_{n-1}`.
|
|
||||||
|
|
||||||
For example, `k` will be:
|
|
||||||
- 0 when the fraction-separator is immediately (i.e. zero bits) to the left of `b_0`;
|
|
||||||
- -3 (as in the first example above) when it is three bits left of `b_0`;
|
|
||||||
- `n` when it is immediately (i.e. zero bits) to the right of `b_{n-1}`;
|
|
||||||
- `n`+3 when it is three bits to the right of `b_{n-1}`.
|
|
||||||
|
|
||||||
The unpadded significand is `b_0 b_1 ··· b_{n-1}`.
|
|
||||||
|
|
||||||
When `k` < `n`, the shift `z`=`k-n` and the significand is:
|
|
||||||
- the unpadded significand,
|
|
||||||
- with the sign bit appended to it on the right, and then
|
|
||||||
- padded on the left with zeroes until it is a whole number of octets wide.
|
|
||||||
|
|
||||||
When `k` ≥ `n`, the shift `z`=`8×⌊(k-n)/8⌋` and the significand is:
|
|
||||||
- the unpadded significand,
|
|
||||||
- padded on the right with `(k-n) mod 8` zeroes,
|
|
||||||
- with the sign bit then appended on the right, and then
|
|
||||||
- padded on the left with zeroes until it is a whole number of octets wide.
|
|
||||||
|
|
||||||
Now, let `s`=2`z` if `z` is zero or positive, or `s`=2|`z`|+1 if `z`
|
|
||||||
is negative.
|
|
||||||
|
|
||||||
Finally, the S-Expression form of `N` is:
|
|
||||||
- `(4:*num [SIGNIFICAND] [SHIFT])`, if `s`≠0; or
|
|
||||||
- `(4:*num [SIGNIFICAND])`, if `s`=0 but the significand contains non-zero bits; or
|
|
||||||
- `(4:*num)`, if `s`=0 and the significand contains no non-zero bits;
|
|
||||||
|
|
||||||
where
|
|
||||||
- `[SIGNIFICAND]` stands for a byte-string S-Expression containing a little-endian representation of the significand, and
|
|
||||||
- `[SHIFT]` stands for a byte-string S-Expression containing a little-endian representation of `s`.
|
|
||||||
|
|
||||||
**Examples.** (Shown using the hexadecimal representation of
|
|
||||||
byte-strings from
|
|
||||||
[section 4.4 of Rivest's S-Expression specification][sexp.txt] in
|
|
||||||
places.)
|
|
||||||
- `N`=0 → `(4:*num)`
|
|
||||||
- `N`=1 → `(4:*num#02#)`
|
|
||||||
- `N`=-1 → `(4:*num#03#)`
|
|
||||||
- `N`=10₁₀=1010.0₂ → `n`=3, `k`=4, `z`=0, `s`=0 → `(4:*num#14#)`
|
|
||||||
- `N`=2560₁₀=101000000000.0₂ → `n`=3, `k`=12, `z`=8, `s`=16 → `(4:*num#14##10#)`
|
|
||||||
- `N`=-2560₁₀=-101000000000.0₂ → `n`=3, `k`=12, `z`=8, `s`=16 → `(4:*num#15##10#)`
|
|
||||||
- `N`=-6₁₀=-110.0₂ → `n`=2, `k`=3, `z`=0, `s`=0 → `(4:*num#0D#)`
|
|
||||||
- `N`=0.5₁₀=0.1₂ → `n`=1, `k`=0, `z`=-1, `s`=3 → `(4:*num#02##03#)`
|
|
||||||
- `N`=-3/2₁₀=-1.1₂ → `n`=2, `k`=1, `z`=-1, `s`=3 → `(4:*num#07##03#)`
|
|
||||||
- `N`=33/192₁₀=0.001011₂ → `n`=4, `k`=-2, `z`=-6, `s`=7 → `(4:*num#16##07#)`
|
|
||||||
- `N`=-1.202E4567=1011011001···000₂ (15172 binary digits, the last 4565 of which are zero) → `n`=10607, `k`=15172, `z`=4560, `s`=9120 → `(4:*num#41828E···24CD16##A023#)`
|
|
||||||
|
|
||||||
((TODO: figure out what this algorithm would actually look like in,
|
|
||||||
say, C, Python and Racket.))
|
|
||||||
|
|
||||||
#### Lists
|
|
||||||
|
|
||||||
A `List` maps to an S-Expression list of representations of its
|
|
||||||
elements, with the byte-string S-Expression `5:*list` prepended.
|
|
||||||
|
|
||||||
**Examples.**
|
|
||||||
- The `List` containing the `ByteString`s `#"a"`, `#"b"`, and `#"c"`
|
|
||||||
would be represented as the S-Expression `(5:*list1:a1:b1:c)`.
|
|
||||||
- The empty `List` is represented by the S-Expression `(5:*list)`.
|
|
||||||
|
|
||||||
#### Maps
|
|
||||||
|
|
||||||
A `Map` is represented by an S-Expression list of representations of
|
|
||||||
the `Map`'s key-value pairs, with the byte-string `4:*map` prepended.
|
|
||||||
|
|
||||||
Each key-value pair is represented by a two-element S-Expression list
|
|
||||||
containing representations of the key and the value, in that order.
|
|
||||||
|
|
||||||
The key-value pairs *MUST* be ordered by `Value`-order of their keys.
|
|
||||||
|
|
||||||
**Examples.**
|
|
||||||
- The `Map` containing entries mapping `#"a"` to `#"d"` and `#"c"` to
|
|
||||||
`#"b"` is represented by `(4:*map(1:a1:d)(1:c1:b))`.
|
|
||||||
- The `Map` containing an entry mapping the empty list to a "true"
|
|
||||||
Boolean value is represented by `(4:*map((5:*list)(4:true)))`.
|
|
||||||
- The empty `Map` is represented by `(4:*map)`.
|
|
||||||
|
|
||||||
**Non-examples.**
|
|
||||||
- The S-Expression `(4:*map(1:c1:b)(1:a1:d))` is invalid, because its
|
|
||||||
key-value pairs are not in `Value`-order by key: `#"c"` > `#"a"`.
|
|
||||||
- The S-Expression `(4:*map1:a1:d1:c1:b)` is invalid, because its
|
|
||||||
key-value pairs appear "flattened" in the outer list, rather than
|
|
||||||
each appearing in a two-element list of its own.
|
|
||||||
|
|
||||||
#### Records
|
|
||||||
|
|
||||||
A `Record` is represented by an S-Expression list of its fields,
|
|
||||||
prepended by:
|
|
||||||
|
|
||||||
- the representation of its label, if its label is a `ByteString` and
|
|
||||||
does not begin with byte 42 (ASCII "`*`"); or
|
|
||||||
- the S-Expression `1:*` followed by the representation of the
|
|
||||||
`Record`'s label, otherwise.
|
|
||||||
|
|
||||||
**Examples.**
|
|
||||||
- The `Text` `"hello-world"` is represented by the S-Expression
|
|
||||||
`(5:utf-811:hello-world)`.
|
|
||||||
- The `IRI` denoting `http://www.w3.org/` is represented by the
|
|
||||||
S-Expression `(3:iri(5:utf-818:http://www.w3.org/))`.
|
|
||||||
- The `Record` `#"*"()` is represented by the S-Expression
|
|
||||||
`(1:*1:*)`.
|
|
||||||
- The `Record` `#"*foo"(#"*bar")` is represented by the S-Expression
|
|
||||||
`(1:*4:*foo4:*bar)`.
|
|
||||||
- The `Record` with the empty list as its label and no fields is
|
|
||||||
represented by the S-Expression `(1:*(5:*list))`.
|
|
||||||
- `(7:rfc3339(5:utf-83:foo))` represents a well-formed `Value` that
|
|
||||||
is a `Record` with `#"rfc3339"` as its label, and a single `Text`
|
|
||||||
field. While it is a perfectly reasonable `Value`, it does *not*
|
|
||||||
represent a valid date or time, since the `Text` `"foo"` does not
|
|
||||||
conform to any of the RFC 3339 productions enumerated above.
|
|
||||||
|
|
||||||
**Non-examples.**
|
|
||||||
- `((5:*list))` is not a representation of the `Record` with the
|
|
||||||
empty list as its label and no fields, because that `Record` has a
|
|
||||||
non-`ByteString` as its label, mandating a `1:*` prefix on its
|
|
||||||
S-Expression representation.
|
|
||||||
- `(4:*foo4:*bar)` does not represent the `Record`
|
|
||||||
`#"*foo"(#"*bar")`, because the label `#"*foo"` begins with "`*`",
|
|
||||||
mandating a `1:*` prefix on the `Record`'s S-Expression
|
|
||||||
representation.
|
|
||||||
|
|
||||||
## Examples
|
|
||||||
|
|
||||||
((TODO: Give some examples of large and small SPKI-CAT documents,
|
|
||||||
perhaps translated from various JSON blobs floating around the
|
|
||||||
internet.))
|
|
||||||
|
|
||||||
## Representing Values in Programming Languages
|
|
||||||
|
|
||||||
We have given a definition of `Value` and its semantics, and proposed
|
|
||||||
a concrete syntax for communicating and storing `Value`s. We now turn
|
|
||||||
to **suggested** representations of `Value`s as *programming-language
|
|
||||||
values* for various programming languages.
|
|
||||||
|
|
||||||
### JavaScript
|
|
||||||
|
|
||||||
- `ByteString` ↔ `Uint8Array`
|
|
||||||
- `Number` ↔ numbers, problematically; bignums, perhaps; other?? TODO
|
|
||||||
- `List` ↔ `Array`
|
|
||||||
- `Map` ↔ `Object`
|
|
||||||
- `Record` ↔ an instance of something like `Record` below, unless the label is...
|
|
||||||
- `#"utf-8"` ↔ `String`
|
|
||||||
- `#"true"` ↔ `true`
|
|
||||||
- `#"false"` ↔ `false`
|
|
||||||
- `#"null"` ↔ `null`
|
|
||||||
- `#"undefined"` ↔ the undefined value
|
|
||||||
- `#"rfc3339"` ↔ `Date`, if the `Record`'s field matches the `date-time` RFC 3339 production
|
|
||||||
|
|
||||||
```javascript
|
|
||||||
function Record(label, ...fields) {
|
|
||||||
this.label = label;
|
|
||||||
this.fields = fields;
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Scheme/Racket
|
|
||||||
|
|
||||||
- `ByteString` ↔ byte vector (Racket: "Bytes")
|
|
||||||
- `Number` ↔ numbers
|
|
||||||
- `List` ↔ (where possible, immutable) list
|
|
||||||
- `Map` ↔ hash-table
|
|
||||||
- `Record` ↔ a structure (Racket: a "prefab struct"), unless the label is...
|
|
||||||
- `#"utf-8"` ↔ a string
|
|
||||||
- `#"true"` ↔ `#t`
|
|
||||||
- `#"false"` ↔ `#f`
|
|
||||||
- `#"symbol"` ↔ a symbol
|
|
||||||
|
|
||||||
### Java
|
|
||||||
|
|
||||||
- `ByteString` ↔ `byte[]`
|
|
||||||
- `Number` ↔ numbers, problematically; bignums, perhaps; other?? TODO
|
|
||||||
- `List` ↔ `java.util.List`
|
|
||||||
- `Map` ↔ `java.util.Map`
|
|
||||||
- `Record` ↔ an instance of something like `Record` below, unless the label is...
|
|
||||||
- `#"utf-8"` ↔ `java.lang.String`
|
|
||||||
- `#"true"` ↔ `java.lang.Boolean.TRUE`
|
|
||||||
- `#"false"` ↔ `java.lang.Boolean.FALSE`
|
|
||||||
- `#"null"` ↔ a special singleton object, but *not* Java's `null`
|
|
||||||
- `#"rfc3339"` ↔ `java.util.{Date,Time,Timestamp}`, according to which RFC 3339 production the `Record`'s field matches
|
|
||||||
|
|
||||||
### Erlang
|
|
||||||
|
|
||||||
- `ByteString` ↔ a binary
|
|
||||||
- `Number` ↔ numbers, probably; TODO
|
|
||||||
- `List` ↔ a list
|
|
||||||
- `Map` ↔ a [map](http://erlang.org/doc/reference_manual/data_types.html#id77432) (new in Erlang/OTP R17)
|
|
||||||
- `Record` ↔ a tuple with the label in the first position, and the fields in subsequent positions, unless the label is...
|
|
||||||
- `#"true"` ↔ `true`
|
|
||||||
- `#"false"` ↔ `false`
|
|
||||||
- `#"null"` ↔ `null`
|
|
||||||
- `#"undefined"` ↔ `undefined`
|
|
||||||
- `#"symbol"` ↔ the `Text` field converted to an Erlang atom, if
|
|
||||||
some kind of an "unsafe" mode is set on the decoder (because
|
|
||||||
Erlang atoms are not GC'd); otherwise like any other kind of
|
|
||||||
`Record`
|
|
||||||
|
|
||||||
---
|
|
|
@ -1,61 +0,0 @@
|
||||||
body {
|
|
||||||
font-family: palatino, "Palatino Linotype", "Palatino LT STD", "URW Palladio L", "TeX Gyre Pagella", serif;
|
|
||||||
}
|
|
||||||
@media screen {
|
|
||||||
body { padding-top: 2rem; max-width: 40em; margin: auto; font-size: 120%; }
|
|
||||||
hr { display: none; }
|
|
||||||
}
|
|
||||||
@media print {
|
|
||||||
@page { size: letter; margin: 4rem 0rem 4.333rem 0rem; }
|
|
||||||
body { margin-left: 4.5rem; margin-right: 4.5rem; }
|
|
||||||
h1, h2 { page-break-before: always; margin-top: 0; }
|
|
||||||
h1:first-of-type, h2:first-of-type { page-break-before: auto; }
|
|
||||||
hr+* { page-break-before: always; margin-top: 0; }
|
|
||||||
hr { display: none; }
|
|
||||||
}
|
|
||||||
h1, h2, h3, h4, h5, h6 { color: #4f81bd; }
|
|
||||||
h2 { border-bottom: solid #4f81bd 1px; }
|
|
||||||
pre, code { background-color: #eee; font-family: "DejaVu Sans Mono", monospace; }
|
|
||||||
code { font-size: 75%; }
|
|
||||||
pre { padding: 0.33rem; }
|
|
||||||
|
|
||||||
body {
|
|
||||||
counter-reset: section 0 subsection 0 appendix 0;
|
|
||||||
}
|
|
||||||
h2:before, h3:before {
|
|
||||||
text-align: right;
|
|
||||||
display: inline-block;
|
|
||||||
position: relative;
|
|
||||||
right: 2.33em;
|
|
||||||
font-size: 75%;
|
|
||||||
text-align: right;
|
|
||||||
width: 2em;
|
|
||||||
margin-right: -2em;
|
|
||||||
height: 0;
|
|
||||||
}
|
|
||||||
h2:before {
|
|
||||||
counter-increment: section;
|
|
||||||
content: counter(section) ". ";
|
|
||||||
}
|
|
||||||
h2 {
|
|
||||||
counter-reset: subsection 0;
|
|
||||||
}
|
|
||||||
h3:before {
|
|
||||||
counter-increment: subsection;
|
|
||||||
content: counter(section) "." counter(subsection) ". ";
|
|
||||||
}
|
|
||||||
|
|
||||||
h2[id^="appendix-"]:before {
|
|
||||||
counter-increment: appendix;
|
|
||||||
content: counter(appendix,upper-latin) ". ";
|
|
||||||
}
|
|
||||||
h2[id^="appendix-"] ~ h3:before {
|
|
||||||
counter-increment: subsection;
|
|
||||||
content: counter(appendix,upper-latin) "." counter(subsection) ". ";
|
|
||||||
}
|
|
||||||
|
|
||||||
h2#notes:before {
|
|
||||||
display: none;
|
|
||||||
}
|
|
||||||
|
|
||||||
.footnotes > ol { padding: 0; font-size: 90%; }
|
|
File diff suppressed because it is too large
Load Diff
|
@ -1,385 +0,0 @@
|
||||||
import sys
|
|
||||||
import numbers
|
|
||||||
import struct
|
|
||||||
|
|
||||||
try:
|
|
||||||
basestring
|
|
||||||
except NameError:
|
|
||||||
basestring = str
|
|
||||||
|
|
||||||
if isinstance(chr(123), bytes):
|
|
||||||
_ord = ord
|
|
||||||
else:
|
|
||||||
_ord = lambda x: x
|
|
||||||
|
|
||||||
class Float(object):
|
|
||||||
def __init__(self, value):
|
|
||||||
self.value = value
|
|
||||||
|
|
||||||
def __eq__(self, other):
|
|
||||||
if other.__class__ is self.__class__:
|
|
||||||
return self.value == other.value
|
|
||||||
|
|
||||||
def __repr__(self):
|
|
||||||
return 'Float(' + repr(self.value) + ')'
|
|
||||||
|
|
||||||
def __preserve_on__(self, encoder):
|
|
||||||
encoder.leadbyte(0, 0, 2)
|
|
||||||
encoder.buffer.extend(struct.pack('>f', self.value))
|
|
||||||
|
|
||||||
class Symbol(object):
|
|
||||||
def __init__(self, name):
|
|
||||||
self.name = name
|
|
||||||
|
|
||||||
def __eq__(self, other):
|
|
||||||
return isinstance(other, Symbol) and self.name == other.name
|
|
||||||
|
|
||||||
def __hash__(self):
|
|
||||||
return hash(self.name)
|
|
||||||
|
|
||||||
def __repr__(self):
|
|
||||||
return '#' + self.name
|
|
||||||
|
|
||||||
def __preserve_on__(self, encoder):
|
|
||||||
bs = self.name.encode('utf-8')
|
|
||||||
encoder.header(1, 3, len(bs))
|
|
||||||
encoder.buffer.extend(bs)
|
|
||||||
|
|
||||||
class Record(object):
|
|
||||||
def __init__(self, key, fields):
|
|
||||||
self.key = key
|
|
||||||
self.fields = tuple(fields)
|
|
||||||
self.__hash = None
|
|
||||||
|
|
||||||
def __eq__(self, other):
|
|
||||||
return isinstance(other, Record) and (self.key, self.fields) == (other.key, other.fields)
|
|
||||||
|
|
||||||
def __hash__(self):
|
|
||||||
if self.__hash is None:
|
|
||||||
self.__hash = hash((self.key, self.fields))
|
|
||||||
return self.__hash
|
|
||||||
|
|
||||||
def __repr__(self):
|
|
||||||
return str(self.key) + '(' + ', '.join((repr(f) for f in self.fields)) + ')'
|
|
||||||
|
|
||||||
def __preserve_on__(self, encoder):
|
|
||||||
try:
|
|
||||||
index = encoder.shortForms.index(self.key)
|
|
||||||
except ValueError:
|
|
||||||
index = None
|
|
||||||
if index is None:
|
|
||||||
encoder.header(2, 3, len(self.fields) + 1)
|
|
||||||
encoder.append(self.key)
|
|
||||||
else:
|
|
||||||
encoder.header(2, index, len(self.fields))
|
|
||||||
for f in self.fields:
|
|
||||||
encoder.append(f)
|
|
||||||
|
|
||||||
# Blub blub blub
|
|
||||||
class ImmutableDict(dict):
|
|
||||||
def __init__(self, *args, **kwargs):
|
|
||||||
if hasattr(self, '__hash'): raise TypeError('Immutable')
|
|
||||||
super(ImmutableDict, self).__init__(*args, **kwargs)
|
|
||||||
self.__hash = None
|
|
||||||
|
|
||||||
def __delitem__(self, key): raise TypeError('Immutable')
|
|
||||||
def __setitem__(self, key, val): raise TypeError('Immutable')
|
|
||||||
def clear(self): raise TypeError('Immutable')
|
|
||||||
def pop(self, k, d=None): raise TypeError('Immutable')
|
|
||||||
def popitem(self): raise TypeError('Immutable')
|
|
||||||
def setdefault(self, k, d=None): raise TypeError('Immutable')
|
|
||||||
def update(self, e, **f): raise TypeError('Immutable')
|
|
||||||
|
|
||||||
def __hash__(self):
|
|
||||||
if self.__hash is None:
|
|
||||||
h = 0
|
|
||||||
for k in self:
|
|
||||||
h = ((h << 5) ^ (hash(k) << 2) ^ hash(self[k])) & sys.maxsize
|
|
||||||
self.__hash = h
|
|
||||||
return self.__hash
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def from_kvs(kvs):
|
|
||||||
i = iter(kvs)
|
|
||||||
result = ImmutableDict()
|
|
||||||
result_proxy = super(ImmutableDict, result)
|
|
||||||
try:
|
|
||||||
while True:
|
|
||||||
k = next(i)
|
|
||||||
v = next(i)
|
|
||||||
result_proxy.__setitem__(k, v)
|
|
||||||
except StopIteration:
|
|
||||||
pass
|
|
||||||
return result
|
|
||||||
|
|
||||||
def dict_kvs(d):
|
|
||||||
for k in d:
|
|
||||||
yield k
|
|
||||||
yield d[k]
|
|
||||||
|
|
||||||
class DecodeError(ValueError): pass
|
|
||||||
class EncodeError(ValueError): pass
|
|
||||||
|
|
||||||
class Codec(object):
|
|
||||||
def __init__(self):
|
|
||||||
self.shortForms = [Symbol(u'discard'), Symbol(u'capture'), Symbol(u'observe')]
|
|
||||||
|
|
||||||
def set_shortform(self, index, v):
|
|
||||||
if index >= 0 and index < 3:
|
|
||||||
self.shortForms[index] = v
|
|
||||||
else:
|
|
||||||
raise ValueError('Invalid short form index %r' % (index,))
|
|
||||||
|
|
||||||
class Stream(object):
|
|
||||||
def __init__(self, iterator):
|
|
||||||
self._iterator = iterator
|
|
||||||
|
|
||||||
def __preserve_on__(self, encoder):
|
|
||||||
arg = (self.major << 2) | self.minor
|
|
||||||
encoder.leadbyte(0, 2, arg)
|
|
||||||
self._emit(encoder)
|
|
||||||
encoder.leadbyte(0, 3, arg)
|
|
||||||
|
|
||||||
def _emit(self, encoder):
|
|
||||||
raise NotImplementedError('Should be implemented in subclasses')
|
|
||||||
|
|
||||||
class ValueStream(Stream):
|
|
||||||
major = 3
|
|
||||||
def _emit(self, encoder):
|
|
||||||
for v in self._iterator:
|
|
||||||
encoder.append(v)
|
|
||||||
|
|
||||||
class SequenceStream(ValueStream):
|
|
||||||
minor = 0
|
|
||||||
|
|
||||||
class SetStream(ValueStream):
|
|
||||||
minor = 1
|
|
||||||
|
|
||||||
class DictStream(ValueStream):
|
|
||||||
minor = 2
|
|
||||||
def _emit(self, encoder):
|
|
||||||
for (k, v) in self._iterator:
|
|
||||||
encoder.append(k)
|
|
||||||
encoder.append(v)
|
|
||||||
|
|
||||||
class BinaryStream(Stream):
|
|
||||||
major = 1
|
|
||||||
minor = 2
|
|
||||||
def _emit(self, encoder):
|
|
||||||
for chunk in self._iterator:
|
|
||||||
if not isinstance(chunk, bytes):
|
|
||||||
raise EncodeError('Illegal chunk in BinaryStream %r' % (chunk,))
|
|
||||||
encoder.append(chunk)
|
|
||||||
|
|
||||||
class StringStream(BinaryStream):
|
|
||||||
minor = 1
|
|
||||||
|
|
||||||
class SymbolStream(BinaryStream):
|
|
||||||
minor = 3
|
|
||||||
|
|
||||||
class Decoder(Codec):
|
|
||||||
def __init__(self, packet):
|
|
||||||
super(Decoder, self).__init__()
|
|
||||||
self.packet = packet
|
|
||||||
self.index = 0
|
|
||||||
|
|
||||||
def peekbyte(self):
|
|
||||||
if self.index < len(self.packet):
|
|
||||||
return _ord(self.packet[self.index])
|
|
||||||
else:
|
|
||||||
raise DecodeError('Short packet')
|
|
||||||
|
|
||||||
def advance(self, count=1):
|
|
||||||
start = self.index
|
|
||||||
self.index = self.index + count
|
|
||||||
return start
|
|
||||||
|
|
||||||
def nextbyte(self):
|
|
||||||
val = self.peekbyte()
|
|
||||||
self.advance()
|
|
||||||
return val
|
|
||||||
|
|
||||||
def wirelength(self, arg):
|
|
||||||
if arg < 15:
|
|
||||||
return arg
|
|
||||||
return self.varint()
|
|
||||||
|
|
||||||
def varint(self):
|
|
||||||
v = self.nextbyte()
|
|
||||||
if v < 128:
|
|
||||||
return v
|
|
||||||
else:
|
|
||||||
return self.varint() * 128 + (v - 128)
|
|
||||||
|
|
||||||
def nextbytes(self, n):
|
|
||||||
start = self.advance(n)
|
|
||||||
return self.packet[start : self.index]
|
|
||||||
|
|
||||||
def nextvalues(self, n):
|
|
||||||
result = []
|
|
||||||
for i in range(n):
|
|
||||||
result.append(self.next())
|
|
||||||
return result
|
|
||||||
|
|
||||||
def peekop(self):
|
|
||||||
b = self.peekbyte()
|
|
||||||
major = b >> 6
|
|
||||||
minor = (b >> 4) & 3
|
|
||||||
arg = b & 15
|
|
||||||
return (major, minor, arg)
|
|
||||||
|
|
||||||
def nextop(self):
|
|
||||||
op = self.peekop()
|
|
||||||
self.advance()
|
|
||||||
return op
|
|
||||||
|
|
||||||
def peekend(self, arg):
|
|
||||||
return self.peekop() == (0, 3, arg)
|
|
||||||
|
|
||||||
def binarystream(self, arg, minor):
|
|
||||||
result = []
|
|
||||||
while not self.peekend(arg):
|
|
||||||
chunk = self.next()
|
|
||||||
if isinstance(chunk, bytes):
|
|
||||||
result.append(chunk)
|
|
||||||
else:
|
|
||||||
raise DecodeError('Unexpected non-binary chunk')
|
|
||||||
return self.decodebinary(minor, b''.join(result))
|
|
||||||
|
|
||||||
def valuestream(self, arg, minor, decoder):
|
|
||||||
result = []
|
|
||||||
while not self.peekend(arg):
|
|
||||||
result.append(self.next())
|
|
||||||
return decoder(minor, result)
|
|
||||||
|
|
||||||
def decodeint(self, bs):
|
|
||||||
if len(bs) == 0: return 0
|
|
||||||
acc = _ord(bs[0])
|
|
||||||
if acc & 0x80: acc = acc - 256
|
|
||||||
for b in bs[1:]:
|
|
||||||
acc = (acc << 8) | _ord(b)
|
|
||||||
return acc
|
|
||||||
|
|
||||||
def decodebinary(self, minor, bs):
|
|
||||||
if minor == 0: return self.decodeint(bs)
|
|
||||||
if minor == 1: return bs.decode('utf-8')
|
|
||||||
if minor == 2: return bs
|
|
||||||
if minor == 3: return Symbol(bs.decode('utf-8'))
|
|
||||||
|
|
||||||
def decoderecord(self, minor, vs):
|
|
||||||
if minor == 3:
|
|
||||||
if not vs: raise DecodeError('Too few elements in encoded record')
|
|
||||||
return Record(vs[0], vs[1:])
|
|
||||||
else:
|
|
||||||
return Record(self.shortForms[minor], vs)
|
|
||||||
|
|
||||||
def decodecollection(self, minor, vs):
|
|
||||||
if minor == 0: return tuple(vs)
|
|
||||||
if minor == 1: return frozenset(vs)
|
|
||||||
if minor == 2: return ImmutableDict.from_kvs(vs)
|
|
||||||
if minor == 3: raise DecodeError('Invalid collection type')
|
|
||||||
|
|
||||||
def next(self):
|
|
||||||
(major, minor, arg) = self.nextop()
|
|
||||||
if major == 0:
|
|
||||||
if minor == 0:
|
|
||||||
if arg == 0: return False
|
|
||||||
if arg == 1: return True
|
|
||||||
if arg == 2: return Float(struct.unpack('>f', self.nextbytes(4))[0])
|
|
||||||
if arg == 3: return struct.unpack('>d', self.nextbytes(8))[0]
|
|
||||||
raise DecodeError('Invalid format A encoding')
|
|
||||||
elif minor == 1:
|
|
||||||
return arg - 16 if arg > 12 else arg
|
|
||||||
elif minor == 2:
|
|
||||||
t = arg >> 2
|
|
||||||
n = arg & 3
|
|
||||||
if t == 0: raise DecodeError('Invalid format C start byte')
|
|
||||||
if t == 1: return self.binarystream(arg, n)
|
|
||||||
if t == 2: return self.valuestream(arg, n, self.decoderecord)
|
|
||||||
if t == 3: return self.valuestream(arg, n, self.decodecollection)
|
|
||||||
else: # minor == 3
|
|
||||||
raise DecodeError('Unexpected format C end byte')
|
|
||||||
elif major == 1:
|
|
||||||
return self.decodebinary(minor, self.nextbytes(self.wirelength(arg)))
|
|
||||||
elif major == 2:
|
|
||||||
return self.decoderecord(minor, self.nextvalues(self.wirelength(arg)))
|
|
||||||
else: # major == 3
|
|
||||||
return self.decodecollection(minor, self.nextvalues(self.wirelength(arg)))
|
|
||||||
|
|
||||||
class Encoder(Codec):
|
|
||||||
def __init__(self):
|
|
||||||
super(Encoder, self).__init__()
|
|
||||||
self.buffer = bytearray()
|
|
||||||
|
|
||||||
def contents(self):
|
|
||||||
return bytes(self.buffer)
|
|
||||||
|
|
||||||
def varint(self, v):
|
|
||||||
if v < 128:
|
|
||||||
self.buffer.append(v)
|
|
||||||
else:
|
|
||||||
self.buffer.append((v % 128) + 128)
|
|
||||||
self.varint(v // 128)
|
|
||||||
|
|
||||||
def leadbyte(self, major, minor, arg):
|
|
||||||
self.buffer.append(((major & 3) << 6) | ((minor & 3) << 4) | (arg & 15))
|
|
||||||
|
|
||||||
def header(self, major, minor, wirelength):
|
|
||||||
if wirelength < 15:
|
|
||||||
self.leadbyte(major, minor, wirelength)
|
|
||||||
else:
|
|
||||||
self.leadbyte(major, minor, 15)
|
|
||||||
self.varint(wirelength)
|
|
||||||
|
|
||||||
def encodeint(self, v):
|
|
||||||
bitcount = (~v if v < 0 else v).bit_length() + 1
|
|
||||||
bytecount = (bitcount + 7) // 8
|
|
||||||
self.header(1, 0, bytecount)
|
|
||||||
def enc(n,x):
|
|
||||||
if n > 0:
|
|
||||||
enc(n-1, x >> 8)
|
|
||||||
self.buffer.append(x & 255)
|
|
||||||
enc(bytecount, v)
|
|
||||||
|
|
||||||
def encodecollection(self, minor, items):
|
|
||||||
self.header(3, minor, len(items))
|
|
||||||
for i in items: self.append(i)
|
|
||||||
|
|
||||||
def append(self, v):
|
|
||||||
if hasattr(v, '__preserve_on__'):
|
|
||||||
v.__preserve_on__(self)
|
|
||||||
elif v is False:
|
|
||||||
self.leadbyte(0, 0, 0)
|
|
||||||
elif v is True:
|
|
||||||
self.leadbyte(0, 0, 1)
|
|
||||||
elif isinstance(v, float):
|
|
||||||
self.leadbyte(0, 0, 3)
|
|
||||||
self.buffer.extend(struct.pack('>d', v))
|
|
||||||
elif isinstance(v, numbers.Number):
|
|
||||||
if v >= -3 and v <= 12:
|
|
||||||
self.leadbyte(0, 1, v if v >= 0 else v + 16)
|
|
||||||
else:
|
|
||||||
self.encodeint(v)
|
|
||||||
elif isinstance(v, bytes):
|
|
||||||
self.header(1, 2, len(v))
|
|
||||||
self.buffer.extend(v)
|
|
||||||
elif isinstance(v, basestring):
|
|
||||||
bs = v.encode('utf-8')
|
|
||||||
self.header(1, 1, len(bs))
|
|
||||||
self.buffer.extend(bs)
|
|
||||||
elif isinstance(v, list):
|
|
||||||
self.encodecollection(0, v)
|
|
||||||
elif isinstance(v, tuple):
|
|
||||||
self.encodecollection(0, v)
|
|
||||||
elif isinstance(v, set):
|
|
||||||
self.encodecollection(1, v)
|
|
||||||
elif isinstance(v, frozenset):
|
|
||||||
self.encodecollection(1, v)
|
|
||||||
elif isinstance(v, dict):
|
|
||||||
self.encodecollection(2, list(dict_kvs(v)))
|
|
||||||
else:
|
|
||||||
try:
|
|
||||||
i = iter(v)
|
|
||||||
except TypeError:
|
|
||||||
raise EncodeError('Cannot encode %r' % (v,))
|
|
||||||
self.encodestream(3, 0, i)
|
|
|
@ -1,856 +0,0 @@
|
||||||
#lang racket/base
|
|
||||||
;; Preserve, as in Fruit Preserve, as in a remarkably weak pun on pickling/dehydration etc
|
|
||||||
|
|
||||||
(provide (struct-out stream-of)
|
|
||||||
(struct-out record)
|
|
||||||
short-form-labels
|
|
||||||
read-preserve
|
|
||||||
string->preserve
|
|
||||||
encode
|
|
||||||
decode
|
|
||||||
wire-value)
|
|
||||||
|
|
||||||
(require racket/bytes)
|
|
||||||
(require racket/dict)
|
|
||||||
(require racket/generator)
|
|
||||||
(require racket/match)
|
|
||||||
(require racket/set)
|
|
||||||
(require bitsyntax)
|
|
||||||
(require syndicate/support/struct)
|
|
||||||
(require (only-in syntax/readerr raise-read-error))
|
|
||||||
|
|
||||||
(require imperative-syndicate/assertions)
|
|
||||||
(require imperative-syndicate/pattern)
|
|
||||||
|
|
||||||
(struct stream-of (kind generator) #:transparent)
|
|
||||||
|
|
||||||
(struct record (label fields) #:transparent)
|
|
||||||
|
|
||||||
(define short-form-labels
|
|
||||||
(make-parameter (vector 'discard 'capture 'observe)))
|
|
||||||
|
|
||||||
(define (encode v)
|
|
||||||
(bit-string->bytes (bit-string (v :: (wire-value)))))
|
|
||||||
|
|
||||||
(define (decode bs [on-fail (lambda () (error 'decode "Invalid encoding: ~v" bs))])
|
|
||||||
(bit-string-case bs
|
|
||||||
([ (v :: (wire-value)) ] v)
|
|
||||||
(else (on-fail))))
|
|
||||||
|
|
||||||
(define-syntax wire-value
|
|
||||||
(syntax-rules ()
|
|
||||||
[(_ #t input ks kf) (decode-value input ks kf)]
|
|
||||||
[(_ #f v) (encode-value v)]))
|
|
||||||
|
|
||||||
(define-syntax wire-length
|
|
||||||
(syntax-rules ()
|
|
||||||
[(_ #t input ks kf) (decode-wire-length input ks kf)]
|
|
||||||
[(_ #f v) (encode-wire-length v)]))
|
|
||||||
|
|
||||||
(define (encode-wire-length v)
|
|
||||||
(when (negative? v) (error 'encode-wire-length "Cannot encode negative wire-length ~v" v))
|
|
||||||
(if (< v #b1111)
|
|
||||||
(bit-string (v :: bits 4))
|
|
||||||
(bit-string (#b1111 :: bits 4) ((encode-varint v) :: binary))))
|
|
||||||
|
|
||||||
(define (encode-varint v)
|
|
||||||
(if (< v 128)
|
|
||||||
(bytes v)
|
|
||||||
(bit-string ((+ (modulo v 128) 128) :: bits 8)
|
|
||||||
((encode-varint (quotient v 128)) :: binary))))
|
|
||||||
|
|
||||||
(define (encode-array-like major minor fields)
|
|
||||||
(bit-string (major :: bits 2)
|
|
||||||
(minor :: bits 2)
|
|
||||||
((length fields) :: (wire-length))
|
|
||||||
((apply bit-string-append (map encode-value fields)) :: binary)))
|
|
||||||
|
|
||||||
(define (encode-binary-like major minor bs)
|
|
||||||
(bit-string (major :: bits 2)
|
|
||||||
(minor :: bits 2)
|
|
||||||
((bytes-length bs) :: (wire-length))
|
|
||||||
(bs :: binary)))
|
|
||||||
|
|
||||||
(define (encode-start-byte major minor)
|
|
||||||
(bit-string (#b0010 :: bits 4) (major :: bits 2) (minor :: bits 2)))
|
|
||||||
|
|
||||||
(define (encode-end-byte major minor)
|
|
||||||
(bit-string (#b0011 :: bits 4) (major :: bits 2) (minor :: bits 2)))
|
|
||||||
|
|
||||||
(define (encode-stream major minor chunk-ok? generator)
|
|
||||||
(bit-string-append (encode-start-byte major minor)
|
|
||||||
(let loop ()
|
|
||||||
(match (generator)
|
|
||||||
[(? void?) #""]
|
|
||||||
[(? chunk-ok? v) (bit-string-append (encode-value v) (loop))]
|
|
||||||
[bad (error 'encode-stream "Cannot encode chunk: ~v" bad)]))
|
|
||||||
(encode-end-byte major minor)))
|
|
||||||
|
|
||||||
(define (dict-keys-and-values d)
|
|
||||||
(reverse (for/fold [(acc '())] [((k v) (in-dict d))] (cons v (cons k acc)))))
|
|
||||||
|
|
||||||
(define (short-form-for-label key)
|
|
||||||
(let ((labels (short-form-labels)))
|
|
||||||
(let loop ((i 0))
|
|
||||||
(cond [(= i 3) #f]
|
|
||||||
[(equal? (vector-ref labels i) key) i]
|
|
||||||
[else (loop (+ i 1))]))))
|
|
||||||
|
|
||||||
(define (encode-record key fields)
|
|
||||||
(define short (short-form-for-label key))
|
|
||||||
(if short
|
|
||||||
(encode-array-like 2 short fields)
|
|
||||||
(encode-array-like 2 3 (cons key fields))))
|
|
||||||
|
|
||||||
(define (encode-value v)
|
|
||||||
(match v
|
|
||||||
[#f (bytes #b00000000)]
|
|
||||||
[#t (bytes #b00000001)]
|
|
||||||
[(? single-flonum?) (bit-string #b00000010 (v :: float bits 32))]
|
|
||||||
[(? double-flonum?) (bit-string #b00000011 (v :: float bits 64))]
|
|
||||||
[(? integer? x) #:when (<= -3 x 12) (bit-string (#b0001 :: bits 4) (x :: bits 4))]
|
|
||||||
[(stream-of 'string p) (encode-stream 1 1 bytes? p)]
|
|
||||||
[(stream-of 'byte-string p) (encode-stream 1 2 bytes? p)]
|
|
||||||
[(stream-of 'symbol p) (encode-stream 1 3 bytes? p)]
|
|
||||||
[(stream-of 'sequence p) (encode-stream 3 0 (lambda (x) #t) p)]
|
|
||||||
[(stream-of 'set p) (encode-stream 3 1 (lambda (x) #t) p)]
|
|
||||||
[(stream-of 'dictionary p) (encode-stream 3 2 (lambda (x) #t) p)]
|
|
||||||
|
|
||||||
;; [0 (bytes #b10000000)]
|
|
||||||
[(? integer?)
|
|
||||||
(define raw-bit-count (+ (integer-length v) 1)) ;; at least one sign bit
|
|
||||||
(define byte-count (quotient (+ raw-bit-count 7) 8))
|
|
||||||
(bit-string (#b0100 :: bits 4) (byte-count :: (wire-length)) (v :: integer bytes byte-count))]
|
|
||||||
[(? string?) (encode-binary-like 1 1 (string->bytes/utf-8 v))]
|
|
||||||
[(? bytes?) (encode-binary-like 1 2 v)]
|
|
||||||
[(? symbol?) (encode-binary-like 1 3 (string->bytes/utf-8 (symbol->string v)))]
|
|
||||||
|
|
||||||
[(record label fields) (encode-record label fields)]
|
|
||||||
[(? non-object-struct?)
|
|
||||||
(define key (prefab-struct-key v))
|
|
||||||
(when (not key) (error 'encode-value "Cannot encode non-prefab struct ~v" v))
|
|
||||||
(encode-record key (cdr (vector->list (struct->vector v))))]
|
|
||||||
|
|
||||||
[(? list?) (encode-array-like 3 0 v)]
|
|
||||||
[(? set?) (encode-array-like 3 1 (set->list v))]
|
|
||||||
[(? dict?) (encode-array-like 3 2 (dict-keys-and-values v))]
|
|
||||||
|
|
||||||
[_ (error 'encode-value "Cannot encode value ~v" v)]))
|
|
||||||
|
|
||||||
;;---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
(define (decode-wire-length bs ks kf)
|
|
||||||
(bit-string-case bs
|
|
||||||
([ (= #b1111 :: bits 4) (rest :: binary) ]
|
|
||||||
(decode-varint rest
|
|
||||||
(lambda (v tail)
|
|
||||||
(if (< v #b1111)
|
|
||||||
(kf)
|
|
||||||
(ks v tail)))
|
|
||||||
kf))
|
|
||||||
([ (v :: bits 4) (rest :: binary) ] (ks v rest))
|
|
||||||
(else (kf))))
|
|
||||||
|
|
||||||
(define (decode-varint bs ks kf)
|
|
||||||
(bit-string-case bs
|
|
||||||
([ (= 1 :: bits 1) (v :: bits 7) (rest :: binary) ]
|
|
||||||
(decode-varint rest (lambda (acc tail) (ks (+ (* acc 128) v) tail)) kf))
|
|
||||||
([ (= 0 :: bits 1) (v :: bits 7) (rest :: binary) ]
|
|
||||||
(ks v rest))
|
|
||||||
(else
|
|
||||||
(kf))))
|
|
||||||
|
|
||||||
(define (decode-values n acc-rev bs ks kf)
|
|
||||||
(if (zero? n)
|
|
||||||
(ks (reverse acc-rev) bs)
|
|
||||||
(bit-string-case bs
|
|
||||||
([ (v :: (wire-value)) (rest :: binary) ]
|
|
||||||
(decode-values (- n 1) (cons v acc-rev) rest ks kf))
|
|
||||||
(else (kf)))))
|
|
||||||
|
|
||||||
(define (decode-binary minor bs rest ks kf)
|
|
||||||
(match minor
|
|
||||||
[0 (if (positive? (bit-string-length bs))
|
|
||||||
(ks (bit-string->signed-integer bs #t) rest)
|
|
||||||
(ks 0 rest))]
|
|
||||||
[2 (ks bs rest)]
|
|
||||||
[(or 1 3)
|
|
||||||
((with-handlers [(exn:fail:contract? (lambda (e) kf))]
|
|
||||||
(define s (bytes->string/utf-8 bs))
|
|
||||||
(lambda () (ks (if (= minor 3) (string->symbol s) s) rest))))]))
|
|
||||||
|
|
||||||
(define (build-record label fields)
|
|
||||||
(with-handlers [(exn:fail:contract? (lambda (e) (record label fields)))]
|
|
||||||
(apply make-prefab-struct label fields)))
|
|
||||||
|
|
||||||
(define (decode-record minor fields rest ks kf)
|
|
||||||
(match* (minor fields)
|
|
||||||
[(3 (list* key fs)) (ks (build-record key fs) rest)]
|
|
||||||
[(3 '()) (kf)]
|
|
||||||
[(n fs) (ks (build-record (vector-ref (short-form-labels) n) fs) rest)]))
|
|
||||||
|
|
||||||
(define (decode-collection minor vs rest ks kf)
|
|
||||||
(match minor
|
|
||||||
[0 (ks vs rest)]
|
|
||||||
[1 (ks (list->set vs) rest)]
|
|
||||||
[2 (if (even? (length vs))
|
|
||||||
(ks (apply hash vs) rest)
|
|
||||||
(kf))]
|
|
||||||
[_ (kf)]))
|
|
||||||
|
|
||||||
(define (decode-stream major minor chunk-ok? join-chunks decode rest ks kf)
|
|
||||||
(let loop ((acc-rev '()) (rest rest))
|
|
||||||
(bit-string-case rest
|
|
||||||
([ (= #b0011 :: bits 4) (emajor :: bits 2) (eminor :: bits 2) (rest :: binary) ]
|
|
||||||
(if (and (= major emajor) (= minor eminor))
|
|
||||||
(decode minor (join-chunks (reverse acc-rev)) rest ks kf)
|
|
||||||
(kf)))
|
|
||||||
(else
|
|
||||||
(decode-value rest
|
|
||||||
(lambda (chunk rest)
|
|
||||||
(if (chunk-ok? chunk)
|
|
||||||
(loop (cons chunk acc-rev) rest)
|
|
||||||
(kf)))
|
|
||||||
kf)))))
|
|
||||||
|
|
||||||
(define (decode-value bs ks kf)
|
|
||||||
(bit-string-case bs
|
|
||||||
([ (= #b00000000 :: bits 8) (rest :: binary) ] (ks #f rest))
|
|
||||||
([ (= #b00000001 :: bits 8) (rest :: binary) ] (ks #t rest))
|
|
||||||
([ (= #b00000010 :: bits 8) (v :: float bits 32) (rest :: binary) ] (ks (real->single-flonum v) rest))
|
|
||||||
([ (= #b00000011 :: bits 8) (v :: float bits 64) (rest :: binary) ] (ks v rest))
|
|
||||||
([ (= #b0001 :: bits 4) (x :: bits 4) (rest :: binary) ] (ks (if (> x 12) (- x 16) x) rest))
|
|
||||||
|
|
||||||
([ (= #b001001 :: bits 6) (minor :: bits 2) (rest :: binary) ]
|
|
||||||
(decode-stream 1 minor bytes? bytes-append* decode-binary rest ks kf))
|
|
||||||
([ (= #b001010 :: bits 6) (minor :: bits 2) (rest :: binary) ]
|
|
||||||
(decode-stream 2 minor (lambda (x) #t) values decode-record rest ks kf))
|
|
||||||
([ (= #b001011 :: bits 6) (minor :: bits 2) (rest :: binary) ]
|
|
||||||
(decode-stream 3 minor (lambda (x) #t) values decode-collection rest ks kf))
|
|
||||||
|
|
||||||
([ (= #b01 :: bits 2) (minor :: bits 2) (byte-count :: (wire-length))
|
|
||||||
(bits :: binary bytes byte-count)
|
|
||||||
(rest :: binary) ]
|
|
||||||
(decode-binary minor (bit-string->bytes bits) rest ks kf))
|
|
||||||
|
|
||||||
([ (= #b10 :: bits 2) (minor :: bits 2) (field-count :: (wire-length)) (rest :: binary) ]
|
|
||||||
(decode-values field-count '() rest
|
|
||||||
(lambda (fields rest) (decode-record minor fields rest ks kf))
|
|
||||||
kf))
|
|
||||||
|
|
||||||
([ (= #b11 :: bits 2) (minor :: bits 2) (count :: (wire-length)) (rest :: binary) ]
|
|
||||||
(decode-values count '() rest
|
|
||||||
(lambda (vs rest) (decode-collection minor vs rest ks kf))
|
|
||||||
kf))
|
|
||||||
|
|
||||||
(else (kf))))
|
|
||||||
|
|
||||||
;;---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
(define (skip-whitespace* i)
|
|
||||||
(regexp-match? #px#"^(\\s|,)*" i)
|
|
||||||
(match (peek-char i)
|
|
||||||
[#\; (regexp-match? #px#"[^\r\n]*[\r\n]" i) (skip-whitespace* i)]
|
|
||||||
[_ #t]))
|
|
||||||
|
|
||||||
(define (parse-error* i fmt . args)
|
|
||||||
(define-values [line column pos] (port-next-location i))
|
|
||||||
(raise-read-error (format "read-preserve: ~a" (apply format fmt args))
|
|
||||||
(object-name i)
|
|
||||||
line
|
|
||||||
column
|
|
||||||
pos
|
|
||||||
#f))
|
|
||||||
|
|
||||||
(define (read-preserve [i (current-input-port)])
|
|
||||||
(local-require net/base64)
|
|
||||||
(local-require file/sha1)
|
|
||||||
|
|
||||||
(define-match-expander px
|
|
||||||
(syntax-rules ()
|
|
||||||
[(_ re pat) (app (lambda (v) (regexp-try-match re v)) pat)]))
|
|
||||||
|
|
||||||
(define (parse-error fmt . args)
|
|
||||||
(apply parse-error* i fmt args))
|
|
||||||
|
|
||||||
(define (eof-guard ch)
|
|
||||||
(match ch
|
|
||||||
[(? eof-object?) (parse-error "Unexpected end of input")]
|
|
||||||
[ch ch]))
|
|
||||||
|
|
||||||
(define (peek/no-eof) (eof-guard (peek-char i)))
|
|
||||||
(define (read/no-eof) (eof-guard (read-char i)))
|
|
||||||
|
|
||||||
(define (skip-whitespace) (skip-whitespace* i))
|
|
||||||
|
|
||||||
(define (read-sequence terminator)
|
|
||||||
(sequence-fold '() (lambda (acc) (cons (read-value) acc)) reverse terminator))
|
|
||||||
|
|
||||||
(define (read-dictionary-or-set)
|
|
||||||
(sequence-fold #f
|
|
||||||
(lambda (acc)
|
|
||||||
(define k (read-value))
|
|
||||||
(skip-whitespace)
|
|
||||||
(match (peek-char i)
|
|
||||||
[#\: (when (set? acc) (parse-error "Unexpected key/value separator in set"))
|
|
||||||
(read-char i)
|
|
||||||
(define v (read-value))
|
|
||||||
(hash-set (or acc (hash)) k v)]
|
|
||||||
[_ (when (hash? acc) (parse-error "Missing expected key/value separator"))
|
|
||||||
(set-add (or acc (set)) k)]))
|
|
||||||
(lambda (acc) (or acc (hash)))
|
|
||||||
#\}))
|
|
||||||
|
|
||||||
(define PIPE #\|)
|
|
||||||
|
|
||||||
(define (read-raw-symbol acc)
|
|
||||||
(match (peek-char i)
|
|
||||||
[(or (? eof-object?)
|
|
||||||
(? char? (or #\( #\) #\{ #\} #\[ #\]
|
|
||||||
#\" #\; #\, #\# #\: (== PIPE)
|
|
||||||
(? char-whitespace?))))
|
|
||||||
(string->symbol (list->string (reverse acc)))]
|
|
||||||
[_ (read-raw-symbol (cons (read-char i) acc))]))
|
|
||||||
|
|
||||||
(define (read-base64-binary acc)
|
|
||||||
(skip-whitespace)
|
|
||||||
(define ch (read/no-eof))
|
|
||||||
(cond [(eqv? ch #\})
|
|
||||||
(base64-decode (string->bytes/latin-1 (list->string (reverse acc))))]
|
|
||||||
[(or (and (char>=? ch #\A) (char<=? ch #\Z))
|
|
||||||
(and (char>=? ch #\a) (char<=? ch #\z))
|
|
||||||
(and (char>=? ch #\0) (char<=? ch #\9))
|
|
||||||
(memv ch '(#\+ #\/ #\- #\_ #\=)))
|
|
||||||
(read-base64-binary (cons ch acc))]
|
|
||||||
[else
|
|
||||||
(parse-error "Invalid base64 character")]))
|
|
||||||
|
|
||||||
(define (hexdigit? ch)
|
|
||||||
(or (and (char>=? ch #\A) (char<=? ch #\F))
|
|
||||||
(and (char>=? ch #\a) (char<=? ch #\f))
|
|
||||||
(and (char>=? ch #\0) (char<=? ch #\9))))
|
|
||||||
|
|
||||||
(define (read-hex-binary acc)
|
|
||||||
(skip-whitespace)
|
|
||||||
(define ch (read/no-eof))
|
|
||||||
(cond [(eqv? ch #\})
|
|
||||||
(hex-string->bytes (list->string (reverse acc)))]
|
|
||||||
[(hexdigit? ch)
|
|
||||||
(define ch2 (read/no-eof))
|
|
||||||
(when (not (hexdigit? ch2))
|
|
||||||
(parse-error "Hex-encoded binary digits must come in pairs"))
|
|
||||||
(read-hex-binary (cons ch2 (cons ch acc)))]
|
|
||||||
[else
|
|
||||||
(parse-error "Invalid hex character")]))
|
|
||||||
|
|
||||||
(define (read-stringlike xform-item finish terminator-char hexescape-char hexescape-proc)
|
|
||||||
(let loop ((acc '()))
|
|
||||||
(match (read/no-eof)
|
|
||||||
[(== terminator-char) (finish (reverse acc))]
|
|
||||||
[#\\ (match (read/no-eof)
|
|
||||||
[(== hexescape-char) (loop (cons (hexescape-proc) acc))]
|
|
||||||
[(and c (or (== terminator-char) #\\ #\/)) (loop (cons (xform-item c) acc))]
|
|
||||||
[#\b (loop (cons (xform-item #\u08) acc))]
|
|
||||||
[#\f (loop (cons (xform-item #\u0C) acc))]
|
|
||||||
[#\n (loop (cons (xform-item #\u0A) acc))]
|
|
||||||
[#\r (loop (cons (xform-item #\u0D) acc))]
|
|
||||||
[#\t (loop (cons (xform-item #\u09) acc))]
|
|
||||||
[c (parse-error "Invalid escape code \\~a" c)])]
|
|
||||||
[c (loop (cons (xform-item c) acc))])))
|
|
||||||
|
|
||||||
(define (read-string terminator-char)
|
|
||||||
(read-stringlike values
|
|
||||||
list->string
|
|
||||||
terminator-char
|
|
||||||
#\u
|
|
||||||
(lambda ()
|
|
||||||
(integer->char
|
|
||||||
(match i
|
|
||||||
[(px #px#"^[a-fA-F0-9]{4}" (list hexdigits))
|
|
||||||
(define n1 (string->number (bytes->string/utf-8 hexdigits) 16))
|
|
||||||
(if (<= #xd800 n1 #xdfff) ;; surrogate pair first half
|
|
||||||
(match i
|
|
||||||
[(px #px#"^\\\\u([a-fA-F0-9]{4})" (list _ hexdigits2))
|
|
||||||
(define n2 (string->number (bytes->string/utf-8 hexdigits2) 16))
|
|
||||||
(if (<= #xdc00 n2 #xdfff)
|
|
||||||
(+ (arithmetic-shift (- n1 #xd800) 10)
|
|
||||||
(- n2 #xdc00)
|
|
||||||
#x10000)
|
|
||||||
(parse-error "Bad second half of surrogate pair"))]
|
|
||||||
[_ (parse-error "Missing second half of surrogate pair")])
|
|
||||||
n1)]
|
|
||||||
[_ (parse-error "Bad string \\u escape")])))))
|
|
||||||
|
|
||||||
(define (read-literal-binary)
|
|
||||||
(read-stringlike (lambda (c)
|
|
||||||
(define b (char->integer c))
|
|
||||||
(when (>= b 256)
|
|
||||||
(parse-error "Invalid code point ~a (~v) in literal binary" b c))
|
|
||||||
b)
|
|
||||||
list->bytes
|
|
||||||
#\"
|
|
||||||
#\x
|
|
||||||
(lambda ()
|
|
||||||
(match i
|
|
||||||
[(px #px#"^[a-fA-F0-9]{2}" (list hexdigits))
|
|
||||||
(string->number (bytes->string/utf-8 hexdigits) 16)]
|
|
||||||
[_ (parse-error "Bad binary \\x escape")]))))
|
|
||||||
|
|
||||||
(define (read-intpart acc-rev)
|
|
||||||
(match (peek-char i)
|
|
||||||
[#\0 (read-fracexp (cons (read-char i) acc-rev))]
|
|
||||||
[_ (read-digit+ acc-rev read-fracexp)]))
|
|
||||||
|
|
||||||
(define (read-digit* acc-rev k)
|
|
||||||
(match (peek-char i)
|
|
||||||
[(? char? (? char-numeric?)) (read-digit* (cons (read-char i) acc-rev) k)]
|
|
||||||
[_ (k acc-rev)]))
|
|
||||||
|
|
||||||
(define (read-digit+ acc-rev k)
|
|
||||||
(match (peek-char i)
|
|
||||||
[(? char? (? char-numeric?)) (read-digit* (cons (read-char i) acc-rev) k)]
|
|
||||||
[_ (parse-error "Incomplete number")]))
|
|
||||||
|
|
||||||
(define (read-fracexp acc-rev)
|
|
||||||
(match (peek-char i)
|
|
||||||
[#\. (read-digit+ (cons (read-char i) acc-rev) read-exp)]
|
|
||||||
[_ (read-exp acc-rev)]))
|
|
||||||
|
|
||||||
(define (read-exp acc-rev)
|
|
||||||
(match (peek-char i)
|
|
||||||
[(or #\e #\E) (read-sign-and-exp (cons (read-char i) acc-rev))]
|
|
||||||
[_ (finish-number acc-rev)]))
|
|
||||||
|
|
||||||
(define (read-sign-and-exp acc-rev)
|
|
||||||
(match (peek-char i)
|
|
||||||
[(or #\+ #\-) (read-digit+ (cons (read-char i) acc-rev) finish-number)]
|
|
||||||
[_ (read-digit+ acc-rev finish-number)]))
|
|
||||||
|
|
||||||
(define (finish-number acc-rev)
|
|
||||||
(define s (list->string (reverse acc-rev)))
|
|
||||||
(define n (string->number s))
|
|
||||||
(when (not n) (parse-error "Invalid number: ~v" s))
|
|
||||||
(if (flonum? n)
|
|
||||||
(match (peek-char i)
|
|
||||||
[(or #\f #\F) (read-char i) (real->single-flonum n)]
|
|
||||||
[_ n])
|
|
||||||
n))
|
|
||||||
|
|
||||||
(define (read-number)
|
|
||||||
(match (peek/no-eof)
|
|
||||||
[#\- (read-intpart (list (read-char i)))]
|
|
||||||
[_ (read-intpart (list))]))
|
|
||||||
|
|
||||||
(define (sequence-fold acc accumulate-one finish terminator-char)
|
|
||||||
(let loop ((acc acc))
|
|
||||||
(skip-whitespace)
|
|
||||||
(match (peek/no-eof)
|
|
||||||
[(== terminator-char) (read-char i) (finish acc)]
|
|
||||||
[_ (loop (accumulate-one acc))])))
|
|
||||||
|
|
||||||
(define (collect-fields head)
|
|
||||||
(match (peek-char i)
|
|
||||||
[#\(
|
|
||||||
(read-char i)
|
|
||||||
(collect-fields (build-record head (read-sequence #\))))]
|
|
||||||
[_
|
|
||||||
head]))
|
|
||||||
|
|
||||||
(define (read-value)
|
|
||||||
(skip-whitespace)
|
|
||||||
(collect-fields
|
|
||||||
(match (peek-char i)
|
|
||||||
[(? eof-object? o) o]
|
|
||||||
[#\{ (read-char i) (read-dictionary-or-set)]
|
|
||||||
[#\[ (read-char i) (read-sequence #\])]
|
|
||||||
[(or #\- #\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9) (read-number)]
|
|
||||||
[#\" (read-char i) (read-string #\")]
|
|
||||||
[(== PIPE) (read-char i) (string->symbol (read-string PIPE))]
|
|
||||||
[#\# (match i
|
|
||||||
[(px #px#"^#set\\{" (list _))
|
|
||||||
(sequence-fold (set) (lambda (acc) (set-add acc (read-value))) values #\})]
|
|
||||||
[(px #px#"^#hexvalue\\{" (list _))
|
|
||||||
(decode (read-hex-binary '()) (lambda () (parse-error "Invalid #hexvalue encoding")))]
|
|
||||||
[(px #px#"^#true" (list _))
|
|
||||||
#t]
|
|
||||||
[(px #px#"^#false" (list _))
|
|
||||||
#f]
|
|
||||||
[(px #px#"^#\"" (list _))
|
|
||||||
(read-literal-binary)]
|
|
||||||
[(px #px#"^#hex\\{" (list _))
|
|
||||||
(read-hex-binary '())]
|
|
||||||
[(px #px#"^#base64\\{" (list _))
|
|
||||||
(read-base64-binary '())]
|
|
||||||
[_
|
|
||||||
(parse-error "Invalid preserve value")])]
|
|
||||||
[#\: (parse-error "Unexpected key/value separator between items")]
|
|
||||||
[_ (read-raw-symbol '())])))
|
|
||||||
|
|
||||||
(read-value))
|
|
||||||
|
|
||||||
(define (string->preserve s)
|
|
||||||
(define p (open-input-string s))
|
|
||||||
(define v (read-preserve p))
|
|
||||||
(skip-whitespace* p)
|
|
||||||
(when (not (eof-object? (peek-char p)))
|
|
||||||
(parse-error* p "Unexpected text following preserve"))
|
|
||||||
v)
|
|
||||||
|
|
||||||
;;---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
(module+ test
|
|
||||||
(require rackunit)
|
|
||||||
(require (for-syntax racket syntax/srcloc))
|
|
||||||
|
|
||||||
(check-equal? (bit-string->bytes (encode-varint 0)) (bytes 0))
|
|
||||||
(check-equal? (bit-string->bytes (encode-varint 1)) (bytes 1))
|
|
||||||
(check-equal? (bit-string->bytes (encode-varint 127)) (bytes 127))
|
|
||||||
(check-equal? (bit-string->bytes (encode-varint 128)) (bytes 128 1))
|
|
||||||
(check-equal? (bit-string->bytes (encode-varint 255)) (bytes 255 1))
|
|
||||||
(check-equal? (bit-string->bytes (encode-varint 256)) (bytes 128 2))
|
|
||||||
(check-equal? (bit-string->bytes (encode-varint 300)) (bytes #b10101100 #b00000010))
|
|
||||||
(check-equal? (bit-string->bytes (encode-varint 1000000000)) (bytes 128 148 235 220 3))
|
|
||||||
|
|
||||||
(define (ks* v rest) (list v (bit-string->bytes rest)))
|
|
||||||
(define (kf*) (void))
|
|
||||||
|
|
||||||
(check-equal? (decode-varint (bytes 0) ks* kf*) (list 0 (bytes)))
|
|
||||||
(check-equal? (decode-varint (bytes 0 99) ks* kf*) (list 0 (bytes 99)))
|
|
||||||
(check-equal? (decode-varint (bytes 1) ks* kf*) (list 1 (bytes)))
|
|
||||||
(check-equal? (decode-varint (bytes 127) ks* kf*) (list 127 (bytes)))
|
|
||||||
(check-equal? (decode-varint (bytes 128 1) ks* kf*) (list 128 (bytes)))
|
|
||||||
(check-equal? (decode-varint (bytes 128 1 99) ks* kf*) (list 128 (bytes 99)))
|
|
||||||
(check-equal? (decode-varint (bytes 255 1) ks* kf*) (list 255 (bytes)))
|
|
||||||
(check-equal? (decode-varint (bytes 128 2) ks* kf*) (list 256 (bytes)))
|
|
||||||
(check-equal? (decode-varint (bytes #b10101100 #b00000010) ks* kf*) (list 300 (bytes)))
|
|
||||||
(check-equal? (decode-varint (bytes 128 148 235 220 3) ks* kf*) (list 1000000000 (bytes)))
|
|
||||||
(check-equal? (decode-varint (bytes 128 148 235 220 3 99) ks* kf*) (list 1000000000 (bytes 99)))
|
|
||||||
|
|
||||||
(check-equal? (bit-string->bytes (bit-string (0 :: bits 4) (0 :: (wire-length)))) (bytes 0))
|
|
||||||
(check-equal? (bit-string->bytes (bit-string (0 :: bits 4) (3 :: (wire-length)))) (bytes 3))
|
|
||||||
(check-equal? (bit-string->bytes (bit-string (0 :: bits 4) (14 :: (wire-length)))) (bytes 14))
|
|
||||||
(check-equal? (bit-string->bytes (bit-string (0 :: bits 4) (15 :: (wire-length)))) (bytes 15 15))
|
|
||||||
(check-equal? (bit-string->bytes (bit-string (0 :: bits 4) (100 :: (wire-length))))
|
|
||||||
(bytes 15 100))
|
|
||||||
(check-equal? (bit-string->bytes (bit-string (0 :: bits 4) (300 :: (wire-length))))
|
|
||||||
(bytes 15 #b10101100 #b00000010))
|
|
||||||
|
|
||||||
(define (dwl bs)
|
|
||||||
(bit-string-case bs
|
|
||||||
([ (= 0 :: bits 4) (w :: (wire-length)) ] w)
|
|
||||||
(else (void))))
|
|
||||||
|
|
||||||
(check-equal? (dwl (bytes 0)) 0)
|
|
||||||
(check-equal? (dwl (bytes 3)) 3)
|
|
||||||
(check-equal? (dwl (bytes 14)) 14)
|
|
||||||
(check-equal? (dwl (bytes 15)) (void))
|
|
||||||
(check-equal? (dwl (bytes 15 9)) (void)) ;; not canonical
|
|
||||||
(check-equal? (dwl (bytes 15 15)) 15)
|
|
||||||
(check-equal? (dwl (bytes 15 100)) 100)
|
|
||||||
(check-equal? (dwl (bytes 15 #b10101100 #b00000010)) 300)
|
|
||||||
|
|
||||||
(struct speak (who what) #:prefab)
|
|
||||||
|
|
||||||
(define (expected . pieces)
|
|
||||||
(bit-string->bytes
|
|
||||||
(apply bit-string-append
|
|
||||||
(map (match-lambda
|
|
||||||
[(? byte? b) (bytes b)]
|
|
||||||
[(? bytes? bs) bs]
|
|
||||||
[(? string? s) (string->bytes/utf-8 s)])
|
|
||||||
pieces))))
|
|
||||||
|
|
||||||
(define (d bs) (decode bs void))
|
|
||||||
|
|
||||||
(define-syntax (cross-check stx)
|
|
||||||
(syntax-case stx ()
|
|
||||||
((_ text v (b ...))
|
|
||||||
#'(let ((val v)) (cross-check text v v (b ...))))
|
|
||||||
((_ text forward back (b ...))
|
|
||||||
#`(let ((loc #,(source-location->string #'forward)))
|
|
||||||
(check-equal? (string->preserve text) back loc)
|
|
||||||
(check-equal? (d (encode forward)) back loc)
|
|
||||||
(check-equal? (d (encode back)) back loc)
|
|
||||||
(check-equal? (d (expected b ...)) back loc)
|
|
||||||
(check-equal? (encode forward) (expected b ...) loc)
|
|
||||||
))))
|
|
||||||
|
|
||||||
(define-syntax (cross-check/nondeterministic stx)
|
|
||||||
(syntax-case stx ()
|
|
||||||
((_ text v (b ...))
|
|
||||||
#'(let ((val v)) (cross-check/nondeterministic text v v (b ...))))
|
|
||||||
((_ text forward back (b ...))
|
|
||||||
#`(let ((loc #,(source-location->string #'forward)))
|
|
||||||
(check-equal? (string->preserve text) back loc)
|
|
||||||
(check-equal? (d (encode forward)) back loc)
|
|
||||||
(check-equal? (d (encode back)) back loc)
|
|
||||||
(check-equal? (d (expected b ...)) back loc)
|
|
||||||
))))
|
|
||||||
|
|
||||||
(cross-check "capture(discard())" (capture (discard)) (#x91 #x80))
|
|
||||||
(cross-check "observe(speak(discard(), capture(discard())))"
|
|
||||||
(observe (speak (discard) (capture (discard))))
|
|
||||||
(#xA1 #xB3 #x75 "speak" #x80 #x91 #x80))
|
|
||||||
(cross-check "[1, 2, 3, 4]" '(1 2 3 4) (#xC4 #x11 #x12 #x13 #x14))
|
|
||||||
(cross-check "[1 2 3 4]"
|
|
||||||
(stream-of 'sequence (sequence->generator '(1 2 3 4)))
|
|
||||||
'(1 2 3 4)
|
|
||||||
(#x2C #x11 #x12 #x13 #x14 #x3C))
|
|
||||||
(cross-check " [ -2 -1 0 1 ] " '(-2 -1 0 1) (#xC4 #x1E #x1F #x10 #x11))
|
|
||||||
(cross-check "\"hello\"" "hello" (#x55 "hello"))
|
|
||||||
(cross-check "\"hello\""
|
|
||||||
(stream-of 'string (sequence->generator '(#"he" #"llo")))
|
|
||||||
"hello"
|
|
||||||
(#x25 #x62 "he" #x63 "llo" #x35))
|
|
||||||
(cross-check "\"hello\""
|
|
||||||
(stream-of 'string (sequence->generator '(#"he" #"ll" #"" #"" #"o")))
|
|
||||||
"hello"
|
|
||||||
(#x25 #x62 "he" #x62 "ll" #x60 #x60 #x61 "o" #x35))
|
|
||||||
(cross-check "#\"hello\""
|
|
||||||
(stream-of 'byte-string (sequence->generator '(#"he" #"ll" #"" #"" #"o")))
|
|
||||||
#"hello"
|
|
||||||
(#x26 #x62 "he" #x62 "ll" #x60 #x60 #x61 "o" #x36))
|
|
||||||
(cross-check "hello"
|
|
||||||
(stream-of 'symbol (sequence->generator '(#"he" #"ll" #"" #"" #"o")))
|
|
||||||
'hello
|
|
||||||
(#x27 #x62 "he" #x62 "ll" #x60 #x60 #x61 "o" #x37))
|
|
||||||
(cross-check "[\"hello\" there #\"world\" [] #set{} #true #false]"
|
|
||||||
`("hello" there #"world" () ,(set) #t #f)
|
|
||||||
(#xC7 #x55 "hello" #x75 "there" #x65 "world" #xC0 #xD0 #x01 #x00))
|
|
||||||
|
|
||||||
(cross-check "#\"ABC\"" #"ABC" (#x63 #x41 #x42 #x43))
|
|
||||||
(cross-check "#hex{414243}" #"ABC" (#x63 #x41 #x42 #x43))
|
|
||||||
(cross-check "#hex{ 41 4A 4e }" #"AJN" (#x63 #x41 #x4A #x4E))
|
|
||||||
(cross-check "#hex{ 41;re\n 42 43 }" #"ABC" (#x63 #x41 #x42 #x43))
|
|
||||||
(check-exn exn? (lambda () (string->preserve "#hex{414 243}"))) ;; bytes must be 2-digits entire
|
|
||||||
(cross-check "#base64{Y29yeW1i}" #"corymb" (#x66 "corymb"))
|
|
||||||
(cross-check "#base64{Y29 yeW 1i}" #"corymb" (#x66 "corymb"))
|
|
||||||
(cross-check ";; a comment\n#base64{\n;x\nY29 yeW 1i}" #"corymb" (#x66 "corymb"))
|
|
||||||
(cross-check "#base64{SGk=}" #"Hi" (#x62 "Hi"))
|
|
||||||
(cross-check "#base64{SGk}" #"Hi" (#x62 "Hi"))
|
|
||||||
(cross-check "#base64{ S G k }" #"Hi" (#x62 "Hi"))
|
|
||||||
|
|
||||||
(check-equal? (string->preserve "[]") '())
|
|
||||||
(check-equal? (string->preserve "{}") (hash))
|
|
||||||
(check-equal? (string->preserve "\"\"") "")
|
|
||||||
(check-equal? (string->preserve "||") (string->symbol ""))
|
|
||||||
(check-equal? (string->preserve "#set{}") (set))
|
|
||||||
|
|
||||||
(check-equal? (string->preserve "{1 2 3}") (set 1 2 3))
|
|
||||||
(check-equal? (string->preserve "#set{1 2 3}") (set 1 2 3))
|
|
||||||
|
|
||||||
(cross-check "\"abc\\u6c34\\u6C34\\\\\\/\\\"\\b\\f\\n\\r\\txyz\""
|
|
||||||
"abc\u6c34\u6c34\\/\"\b\f\n\r\txyz"
|
|
||||||
(#x5f #x14
|
|
||||||
#x61 #x62 #x63 #xe6 #xb0 #xb4 #xe6 #xb0
|
|
||||||
#xb4 #x5c #x2f #x22 #x08 #x0c #x0a #x0d
|
|
||||||
#x09 #x78 #x79 #x7a))
|
|
||||||
|
|
||||||
(cross-check "|abc\\u6c34\\u6C34\\\\\\/\\|\\b\\f\\n\\r\\txyz|"
|
|
||||||
(string->symbol "abc\u6c34\u6c34\\/|\b\f\n\r\txyz")
|
|
||||||
(#x7f #x14
|
|
||||||
#x61 #x62 #x63 #xe6 #xb0 #xb4 #xe6 #xb0
|
|
||||||
#xb4 #x5c #x2f #x7c #x08 #x0c #x0a #x0d
|
|
||||||
#x09 #x78 #x79 #x7a))
|
|
||||||
|
|
||||||
(check-exn #px"Invalid escape code \\\\u" (lambda () (string->preserve "#\"\\u6c34\"")))
|
|
||||||
|
|
||||||
(cross-check "#\"abc\\x6c\\x34\\xf0\\\\\\/\\\"\\b\\f\\n\\r\\txyz\""
|
|
||||||
#"abc\x6c\x34\xf0\\/\"\b\f\n\r\txyz"
|
|
||||||
(#x6f #x11
|
|
||||||
#x61 #x62 #x63 #x6c #x34 #xf0 #x5c #x2f
|
|
||||||
#x22 #x08 #x0c #x0a #x0d #x09 #x78 #x79 #x7a))
|
|
||||||
|
|
||||||
(cross-check "\"\\uD834\\uDD1E\"" "\U0001D11E" (#x54 #xF0 #x9D #x84 #x9E))
|
|
||||||
|
|
||||||
(cross-check "-257" -257 (#x42 #xFE #xFF))
|
|
||||||
(cross-check "-256" -256 (#x42 #xFF #x00))
|
|
||||||
(cross-check "-255" -255 (#x42 #xFF #x01))
|
|
||||||
(cross-check "-254" -254 (#x42 #xFF #x02))
|
|
||||||
(cross-check "-129" -129 (#x42 #xFF #x7F))
|
|
||||||
(cross-check "-128" -128 (#x41 #x80))
|
|
||||||
(cross-check "-127" -127 (#x41 #x81))
|
|
||||||
(cross-check "-4" -4 (#x41 #xFC))
|
|
||||||
(cross-check "-3" -3 (#x1D))
|
|
||||||
(cross-check "-2" -2 (#x1E))
|
|
||||||
(cross-check "-1" -1 (#x1F))
|
|
||||||
(cross-check "0" 0 (#x10))
|
|
||||||
(cross-check "1" 1 (#x11))
|
|
||||||
(cross-check "12" 12 (#x1C))
|
|
||||||
(cross-check "13" 13 (#x41 #x0D))
|
|
||||||
(cross-check "127" 127 (#x41 #x7F))
|
|
||||||
(cross-check "128" 128 (#x42 #x00 #x80))
|
|
||||||
(cross-check "255" 255 (#x42 #x00 #xFF))
|
|
||||||
(cross-check "256" 256 (#x42 #x01 #x00))
|
|
||||||
(cross-check "32767" 32767 (#x42 #x7F #xFF))
|
|
||||||
(cross-check "32768" 32768 (#x43 #x00 #x80 #x00))
|
|
||||||
(cross-check "65535" 65535 (#x43 #x00 #xFF #xFF))
|
|
||||||
(cross-check "65536" 65536 (#x43 #x01 #x00 #x00))
|
|
||||||
(cross-check "131072" 131072 (#x43 #x02 #x00 #x00))
|
|
||||||
|
|
||||||
(cross-check "1.0f" 1.0f0 (#b00000010 #b00111111 #b10000000 0 0))
|
|
||||||
(cross-check "1.0" 1.0 (#b00000011 #b00111111 #b11110000 0 0 0 0 0 0))
|
|
||||||
(cross-check "-1.202e300" -1.202e300 (#x03 #xFE #x3C #xB7 #xB7 #x59 #xBF #x04 #x26))
|
|
||||||
|
|
||||||
(check-equal? (d (expected #x25 #x51 "a" #x35)) (void)) ;; Bad chunk type: must be bytes
|
|
||||||
(check-equal? (d (expected #x25 #x71 "a" #x35)) (void)) ;; Bad chunk type: must be bytes
|
|
||||||
(check-equal? (d (expected #x26 #x51 "a" #x36)) (void)) ;; Bad chunk type: must be bytes
|
|
||||||
(check-equal? (d (expected #x26 #x71 "a" #x36)) (void)) ;; Bad chunk type: must be bytes
|
|
||||||
(check-equal? (d (expected #x27 #x51 "a" #x37)) (void)) ;; Bad chunk type: must be bytes
|
|
||||||
(check-equal? (d (expected #x27 #x71 "a" #x37)) (void)) ;; Bad chunk type: must be bytes
|
|
||||||
(check-equal? (d (expected #x25 #x61 "a" #x35)) "a")
|
|
||||||
(check-equal? (d (expected #x26 #x61 "a" #x36)) #"a")
|
|
||||||
(check-equal? (d (expected #x27 #x61 "a" #x37)) 'a)
|
|
||||||
|
|
||||||
(struct date (year month day) #:prefab)
|
|
||||||
(struct thing (id) #:prefab)
|
|
||||||
(struct person thing (name date-of-birth) #:prefab)
|
|
||||||
(struct titled person (title) #:prefab)
|
|
||||||
|
|
||||||
(cross-check
|
|
||||||
"[titled person 2 thing 1](101, \"Blackwell\", date(1821 2 3), \"Dr\")"
|
|
||||||
(titled 101 "Blackwell" (date 1821 2 3) "Dr")
|
|
||||||
(#xB5 ;; Record, generic, 4+1
|
|
||||||
#xC5 ;; Sequence, 5
|
|
||||||
#x76 #x74 #x69 #x74 #x6C #x65 #x64 ;; Symbol, "titled"
|
|
||||||
#x76 #x70 #x65 #x72 #x73 #x6F #x6E ;; Symbol, "person"
|
|
||||||
#x12 ;; SignedInteger, "2"
|
|
||||||
#x75 #x74 #x68 #x69 #x6E #x67 ;; Symbol, "thing"
|
|
||||||
#x11 ;; SignedInteger, "1"
|
|
||||||
#x41 #x65 ;; SignedInteger, "101"
|
|
||||||
#x59 #x42 #x6C #x61 #x63 #x6B #x77 #x65 #x6C #x6C ;; String, "Blackwell"
|
|
||||||
#xB4 ;; Record, generic, 3+1
|
|
||||||
#x74 #x64 #x61 #x74 #x65 ;; Symbol, "date"
|
|
||||||
#x42 #x07 #x1D ;; SignedInteger, "1821"
|
|
||||||
#x12 ;; SignedInteger, "2"
|
|
||||||
#x13 ;; SignedInteger, "3"
|
|
||||||
#x52 #x44 #x72 ;; String, "Dr"
|
|
||||||
))
|
|
||||||
|
|
||||||
(cross-check "discard()" (record 'discard '()) (discard) (#x80))
|
|
||||||
(cross-check "discard(surprise)"
|
|
||||||
(record 'discard '(surprise))
|
|
||||||
'#s(discard surprise)
|
|
||||||
(#x81 #x78 "surprise"))
|
|
||||||
(cross-check "capture(x)" (record 'capture '(x)) (capture 'x) (#x91 #x71 "x"))
|
|
||||||
(cross-check "observe(x)" (record 'observe '(x)) (observe 'x) (#xA1 #x71 "x"))
|
|
||||||
(cross-check "observe(x y)" (record 'observe '(x y)) '#s(observe x y) (#xA2 #x71 "x" #x71 "y"))
|
|
||||||
(cross-check "other(x y)"
|
|
||||||
(record 'other '(x y))
|
|
||||||
'#s(other x y)
|
|
||||||
(#xB3 #x75 "other" #x71 "x" #x71 "y"))
|
|
||||||
(cross-check "\"aString\"(3 4)"
|
|
||||||
(record "aString" '(3 4))
|
|
||||||
(#xB3 #x57 "aString" #x13 #x14))
|
|
||||||
(cross-check "discard()(3, 4)"
|
|
||||||
(record (discard) '(3 4))
|
|
||||||
(#xB3 #x80 #x13 #x14))
|
|
||||||
|
|
||||||
(check-equal? (d (expected #x2C #x00 #x00)) (void)) ;; missing end byte
|
|
||||||
(check-equal? (d (expected #xC3 #x00 #x00)) (void)) ;; missing element
|
|
||||||
|
|
||||||
(cross-check/nondeterministic
|
|
||||||
"{a: 1, \"b\": #true, [1 2 3]: #\"c\", {first-name:\"Elizabeth\"}:{surname:\"Blackwell\"}}"
|
|
||||||
(hash 'a 1
|
|
||||||
"b" #t
|
|
||||||
'(1 2 3) #"c"
|
|
||||||
(hash 'first-name "Elizabeth") (hash 'surname "Blackwell"))
|
|
||||||
(#xE8 #x71 "a" #x11
|
|
||||||
#x51 "b" #x01
|
|
||||||
#xC3 #x11 #x12 #x13 #x61 "c"
|
|
||||||
#xE2 #x7A "first-name" #x59 "Elizabeth"
|
|
||||||
#xE2 #x77 "surname" #x59 "Blackwell"
|
|
||||||
))
|
|
||||||
|
|
||||||
(let ()
|
|
||||||
(local-require json)
|
|
||||||
(define rfc8259-example1 (string->preserve #<<EOF
|
|
||||||
{
|
|
||||||
"Image": {
|
|
||||||
"Width": 800,
|
|
||||||
"Height": 600,
|
|
||||||
"Title": "View from 15th Floor",
|
|
||||||
"Thumbnail": {
|
|
||||||
"Url": "http://www.example.com/image/481989943",
|
|
||||||
"Height": 125,
|
|
||||||
"Width": 100
|
|
||||||
},
|
|
||||||
"Animated" : false,
|
|
||||||
"IDs": [116, 943, 234, 38793]
|
|
||||||
}
|
|
||||||
}
|
|
||||||
EOF
|
|
||||||
))
|
|
||||||
(define rfc8259-example2 (string->preserve #<<EOF
|
|
||||||
[
|
|
||||||
{
|
|
||||||
"precision": "zip",
|
|
||||||
"Latitude": 37.7668,
|
|
||||||
"Longitude": -122.3959,
|
|
||||||
"Address": "",
|
|
||||||
"City": "SAN FRANCISCO",
|
|
||||||
"State": "CA",
|
|
||||||
"Zip": "94107",
|
|
||||||
"Country": "US"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"precision": "zip",
|
|
||||||
"Latitude": 37.371991,
|
|
||||||
"Longitude": -122.026020,
|
|
||||||
"Address": "",
|
|
||||||
"City": "SUNNYVALE",
|
|
||||||
"State": "CA",
|
|
||||||
"Zip": "94085",
|
|
||||||
"Country": "US"
|
|
||||||
}
|
|
||||||
]
|
|
||||||
EOF
|
|
||||||
))
|
|
||||||
|
|
||||||
(cross-check/nondeterministic
|
|
||||||
"{\"Image\": {\"Width\": 800,\"Height\": 600,\"Title\": \"View from 15th Floor\",\"Thumbnail\": {\"Url\": \"http://www.example.com/image/481989943\",\"Height\": 125,\"Width\": 100},\"Animated\" : false,\"IDs\": [116, 943, 234, 38793]}}"
|
|
||||||
rfc8259-example1
|
|
||||||
(#xe2
|
|
||||||
#x55 "Image"
|
|
||||||
#xec
|
|
||||||
#x55 "Width" #x42 #x03 #x20
|
|
||||||
#x55 "Title" #x5f #x14 "View from 15th Floor"
|
|
||||||
#x58 "Animated" #x75 "false"
|
|
||||||
#x56 "Height" #x42 #x02 #x58
|
|
||||||
#x59 "Thumbnail"
|
|
||||||
#xe6
|
|
||||||
#x55 "Width" #x41 #x64
|
|
||||||
#x53 "Url" #x5f #x26 "http://www.example.com/image/481989943"
|
|
||||||
#x56 "Height" #x41 #x7d
|
|
||||||
#x53 "IDs" #xc4
|
|
||||||
#x41 #x74
|
|
||||||
#x42 #x03 #xaf
|
|
||||||
#x42 #x00 #xea
|
|
||||||
#x43 #x00 #x97 #x89
|
|
||||||
))
|
|
||||||
|
|
||||||
(cross-check/nondeterministic
|
|
||||||
"[{\"precision\": \"zip\",\"Latitude\": 37.7668,\"Longitude\": -122.3959,\"Address\": \"\",\"City\": \"SAN FRANCISCO\",\"State\": \"CA\",\"Zip\": \"94107\",\"Country\": \"US\"},{\"precision\": \"zip\",\"Latitude\": 37.371991,\"Longitude\": -122.026020,\"Address\": \"\",\"City\": \"SUNNYVALE\",\"State\": \"CA\",\"Zip\": \"94085\",\"Country\": \"US\"}]"
|
|
||||||
rfc8259-example2
|
|
||||||
(#xc2
|
|
||||||
#xef #x10
|
|
||||||
#x59 "precision" #x53 "zip"
|
|
||||||
#x58 "Latitude" #x03 #x40 #x42 #xe2 #x26 #x80 #x9d #x49 #x52
|
|
||||||
#x59 "Longitude" #x03 #xc0 #x5e #x99 #x56 #x6c #xf4 #x1f #x21
|
|
||||||
#x57 "Address" #x50
|
|
||||||
#x54 "City" #x5D "SAN FRANCISCO"
|
|
||||||
#x55 "State" #x52 "CA"
|
|
||||||
#x53 "Zip" #x55 "94107"
|
|
||||||
#x57 "Country" #x52 "US"
|
|
||||||
#xef #x10
|
|
||||||
#x59 "precision" #x53 "zip"
|
|
||||||
#x58 "Latitude" #x03 #x40 #x42 #xaf #x9d #x66 #xad #xb4 #x03
|
|
||||||
#x59 "Longitude" #x03 #xc0 #x5e #x81 #xaa #x4f #xca #x42 #xaf
|
|
||||||
#x57 "Address" #x50
|
|
||||||
#x54 "City" #x59 "SUNNYVALE"
|
|
||||||
#x55 "State" #x52 "CA"
|
|
||||||
#x53 "Zip" #x55 "94085"
|
|
||||||
#x57 "Country" #x52 "US"
|
|
||||||
))
|
|
||||||
)
|
|
||||||
)
|
|
|
@ -1,171 +0,0 @@
|
||||||
from preserve import *
|
|
||||||
import unittest
|
|
||||||
|
|
||||||
if isinstance(chr(123), bytes):
|
|
||||||
def _byte(x):
|
|
||||||
return chr(x)
|
|
||||||
def _hex(x):
|
|
||||||
return x.encode('hex')
|
|
||||||
else:
|
|
||||||
def _byte(x):
|
|
||||||
return bytes([x])
|
|
||||||
def _hex(x):
|
|
||||||
return x.hex()
|
|
||||||
|
|
||||||
def _buf(*args):
|
|
||||||
result = []
|
|
||||||
for chunk in args:
|
|
||||||
if isinstance(chunk, bytes):
|
|
||||||
result.append(chunk)
|
|
||||||
elif isinstance(chunk, basestring):
|
|
||||||
result.append(chunk.encode('utf-8'))
|
|
||||||
elif isinstance(chunk, numbers.Number):
|
|
||||||
result.append(_byte(chunk))
|
|
||||||
else:
|
|
||||||
raise Exception('Invalid chunk in _buf %r' % (chunk,))
|
|
||||||
result = b''.join(result)
|
|
||||||
return result
|
|
||||||
|
|
||||||
def _varint(v):
|
|
||||||
e = Encoder()
|
|
||||||
e.varint(v)
|
|
||||||
return e.contents()
|
|
||||||
|
|
||||||
def _d(bs):
|
|
||||||
d = Decoder(bs)
|
|
||||||
return d.next()
|
|
||||||
|
|
||||||
def _e(v):
|
|
||||||
e = Encoder()
|
|
||||||
e.append(v)
|
|
||||||
return e.contents()
|
|
||||||
|
|
||||||
def _R(k, *args):
|
|
||||||
return Record(Symbol(k), args)
|
|
||||||
|
|
||||||
class CodecTests(unittest.TestCase):
|
|
||||||
def _roundtrip(self, forward, expected, back=None, nondeterministic=False):
|
|
||||||
if back is None: back = forward
|
|
||||||
self.assertEqual(_d(_e(forward)), back)
|
|
||||||
self.assertEqual(_d(_e(back)), back)
|
|
||||||
self.assertEqual(_d(expected), back)
|
|
||||||
if not nondeterministic:
|
|
||||||
actual = _e(forward)
|
|
||||||
self.assertEqual(actual, expected, '%s != %s' % (_hex(actual), _hex(expected)))
|
|
||||||
|
|
||||||
def test_decode_varint(self):
|
|
||||||
with self.assertRaises(DecodeError):
|
|
||||||
Decoder(_buf()).varint()
|
|
||||||
self.assertEqual(Decoder(_buf(0)).varint(), 0)
|
|
||||||
self.assertEqual(Decoder(_buf(10)).varint(), 10)
|
|
||||||
self.assertEqual(Decoder(_buf(100)).varint(), 100)
|
|
||||||
self.assertEqual(Decoder(_buf(200, 1)).varint(), 200)
|
|
||||||
self.assertEqual(Decoder(_buf(0b10101100, 0b00000010)).varint(), 300)
|
|
||||||
self.assertEqual(Decoder(_buf(128, 148, 235, 220, 3)).varint(), 1000000000)
|
|
||||||
|
|
||||||
def test_encode_varint(self):
|
|
||||||
self.assertEqual(_varint(0), _buf(0))
|
|
||||||
self.assertEqual(_varint(10), _buf(10))
|
|
||||||
self.assertEqual(_varint(100), _buf(100))
|
|
||||||
self.assertEqual(_varint(200), _buf(200, 1))
|
|
||||||
self.assertEqual(_varint(300), _buf(0b10101100, 0b00000010))
|
|
||||||
self.assertEqual(_varint(1000000000), _buf(128, 148, 235, 220, 3))
|
|
||||||
|
|
||||||
def test_shorts(self):
|
|
||||||
self._roundtrip(_R('capture', _R('discard')), _buf(0x91, 0x80))
|
|
||||||
self._roundtrip(_R('observe', _R('speak', _R('discard'), _R('capture', _R('discard')))),
|
|
||||||
_buf(0xA1, 0xB3, 0x75, "speak", 0x80, 0x91, 0x80))
|
|
||||||
|
|
||||||
def test_simple_seq(self):
|
|
||||||
self._roundtrip([1,2,3,4], _buf(0xC4, 0x11, 0x12, 0x13, 0x14), back=(1,2,3,4))
|
|
||||||
self._roundtrip(SequenceStream([1,2,3,4]), _buf(0x2C, 0x11, 0x12, 0x13, 0x14, 0x3C),
|
|
||||||
back=(1,2,3,4))
|
|
||||||
self._roundtrip((-2,-1,0,1), _buf(0xC4, 0x1E, 0x1F, 0x10, 0x11))
|
|
||||||
|
|
||||||
def test_str(self):
|
|
||||||
self._roundtrip(u'hello', _buf(0x55, 'hello'))
|
|
||||||
self._roundtrip(StringStream([b'he', b'llo']), _buf(0x25, 0x62, 'he', 0x63, 'llo', 0x35),
|
|
||||||
back=u'hello')
|
|
||||||
self._roundtrip(StringStream([b'he', b'll', b'', b'', b'o']),
|
|
||||||
_buf(0x25, 0x62, 'he', 0x62, 'll', 0x60, 0x60, 0x61, 'o', 0x35),
|
|
||||||
back=u'hello')
|
|
||||||
self._roundtrip(BinaryStream([b'he', b'll', b'', b'', b'o']),
|
|
||||||
_buf(0x26, 0x62, 'he', 0x62, 'll', 0x60, 0x60, 0x61, 'o', 0x36),
|
|
||||||
back=b'hello')
|
|
||||||
self._roundtrip(SymbolStream([b'he', b'll', b'', b'', b'o']),
|
|
||||||
_buf(0x27, 0x62, 'he', 0x62, 'll', 0x60, 0x60, 0x61, 'o', 0x37),
|
|
||||||
back=Symbol(u'hello'))
|
|
||||||
|
|
||||||
def test_mixed1(self):
|
|
||||||
self._roundtrip((u'hello', Symbol(u'there'), b'world', (), set(), True, False),
|
|
||||||
_buf(0xc7, 0x55, 'hello', 0x75, 'there', 0x65, 'world', 0xc0, 0xd0, 1, 0))
|
|
||||||
|
|
||||||
def test_signedinteger(self):
|
|
||||||
self._roundtrip(-257, _buf(0x42, 0xFE, 0xFF))
|
|
||||||
self._roundtrip(-256, _buf(0x42, 0xFF, 0x00))
|
|
||||||
self._roundtrip(-255, _buf(0x42, 0xFF, 0x01))
|
|
||||||
self._roundtrip(-254, _buf(0x42, 0xFF, 0x02))
|
|
||||||
self._roundtrip(-129, _buf(0x42, 0xFF, 0x7F))
|
|
||||||
self._roundtrip(-128, _buf(0x41, 0x80))
|
|
||||||
self._roundtrip(-127, _buf(0x41, 0x81))
|
|
||||||
self._roundtrip(-4, _buf(0x41, 0xFC))
|
|
||||||
self._roundtrip(-3, _buf(0x1D))
|
|
||||||
self._roundtrip(-2, _buf(0x1E))
|
|
||||||
self._roundtrip(-1, _buf(0x1F))
|
|
||||||
self._roundtrip(0, _buf(0x10))
|
|
||||||
self._roundtrip(1, _buf(0x11))
|
|
||||||
self._roundtrip(12, _buf(0x1C))
|
|
||||||
self._roundtrip(13, _buf(0x41, 0x0D))
|
|
||||||
self._roundtrip(127, _buf(0x41, 0x7F))
|
|
||||||
self._roundtrip(128, _buf(0x42, 0x00, 0x80))
|
|
||||||
self._roundtrip(255, _buf(0x42, 0x00, 0xFF))
|
|
||||||
self._roundtrip(256, _buf(0x42, 0x01, 0x00))
|
|
||||||
self._roundtrip(32767, _buf(0x42, 0x7F, 0xFF))
|
|
||||||
self._roundtrip(32768, _buf(0x43, 0x00, 0x80, 0x00))
|
|
||||||
self._roundtrip(65535, _buf(0x43, 0x00, 0xFF, 0xFF))
|
|
||||||
self._roundtrip(65536, _buf(0x43, 0x01, 0x00, 0x00))
|
|
||||||
self._roundtrip(131072, _buf(0x43, 0x02, 0x00, 0x00))
|
|
||||||
|
|
||||||
def test_floats(self):
|
|
||||||
self._roundtrip(Float(1.0), _buf(2, 0x3f, 0x80, 0, 0))
|
|
||||||
self._roundtrip(1.0, _buf(3, 0x3f, 0xf0, 0, 0, 0, 0, 0, 0))
|
|
||||||
self._roundtrip(-1.202e300, _buf(3, 0xfe, 0x3c, 0xb7, 0xb7, 0x59, 0xbf, 0x04, 0x26))
|
|
||||||
|
|
||||||
def test_badchunks(self):
|
|
||||||
self.assertEqual(_d(_buf(0x25, 0x61, 'a', 0x35)), u'a')
|
|
||||||
self.assertEqual(_d(_buf(0x26, 0x61, 'a', 0x36)), b'a')
|
|
||||||
self.assertEqual(_d(_buf(0x27, 0x61, 'a', 0x37)), Symbol(u'a'))
|
|
||||||
for a in [0x25, 0x26, 0x27]:
|
|
||||||
for b in [0x51, 0x71]:
|
|
||||||
with self.assertRaises(DecodeError, msg='Unexpected non-binary chunk') as cm:
|
|
||||||
_d(_buf(a, b, 'a', 0x10+a))
|
|
||||||
|
|
||||||
def test_person(self):
|
|
||||||
self._roundtrip(Record((Symbol(u'titled'), Symbol(u'person'), 2, Symbol(u'thing'), 1),
|
|
||||||
[
|
|
||||||
101,
|
|
||||||
u'Blackwell',
|
|
||||||
_R(u'date', 1821, 2, 3),
|
|
||||||
u'Dr'
|
|
||||||
]),
|
|
||||||
_buf(0xB5, 0xC5, 0x76, 0x74, 0x69, 0x74, 0x6C, 0x65,
|
|
||||||
0x64, 0x76, 0x70, 0x65, 0x72, 0x73, 0x6F, 0x6E,
|
|
||||||
0x12, 0x75, 0x74, 0x68, 0x69, 0x6E, 0x67, 0x11,
|
|
||||||
0x41, 0x65, 0x59, 0x42, 0x6C, 0x61, 0x63, 0x6B,
|
|
||||||
0x77, 0x65, 0x6C, 0x6C, 0xB4, 0x74, 0x64, 0x61,
|
|
||||||
0x74, 0x65, 0x42, 0x07, 0x1D, 0x12, 0x13, 0x52,
|
|
||||||
0x44, 0x72))
|
|
||||||
|
|
||||||
def test_dict(self):
|
|
||||||
self._roundtrip({ Symbol(u'a'): 1,
|
|
||||||
u'b': True,
|
|
||||||
(1, 2, 3): b'c',
|
|
||||||
ImmutableDict({ Symbol(u'first-name'): u'Elizabeth', }):
|
|
||||||
{ Symbol(u'surname'): u'Blackwell' } },
|
|
||||||
_buf(0xE8,
|
|
||||||
0x71, "a", 0x11,
|
|
||||||
0x51, "b", 0x01,
|
|
||||||
0xC3, 0x11, 0x12, 0x13, 0x61, "c",
|
|
||||||
0xE2, 0x7A, "first-name", 0x59, "Elizabeth",
|
|
||||||
0xE2, 0x77, "surname", 0x59, "Blackwell"),
|
|
||||||
nondeterministic = True)
|
|
|
@ -8,7 +8,7 @@
|
||||||
(require racket/random file/sha1)
|
(require racket/random file/sha1)
|
||||||
(require imperative-syndicate/skeleton)
|
(require imperative-syndicate/skeleton)
|
||||||
(require imperative-syndicate/term)
|
(require imperative-syndicate/term)
|
||||||
(require "preserve.rkt")
|
(require preserves)
|
||||||
|
|
||||||
(define-logger mcds)
|
(define-logger mcds)
|
||||||
|
|
||||||
|
|
Loading…
Reference in New Issue