More TODOs in the text; initial textual reader in Racket
This commit is contained in:
parent
f9497d64c5
commit
b807b38a44
|
@ -302,12 +302,20 @@ grouping of its field-`Value`s.
|
||||||
curly-brace-enclosed colon-separated pairs of values. `Set`s are
|
curly-brace-enclosed colon-separated pairs of values. `Set`s are
|
||||||
written either as a simple curly-brace-enclosed non-empty sequence of
|
written either as a simple curly-brace-enclosed non-empty sequence of
|
||||||
values, or as a possibly-empty sequence of values enclosed by the
|
values, or as a possibly-empty sequence of values enclosed by the
|
||||||
tokens `#set{` and `}`.
|
tokens `#set{` and `}`.[^printing-collections]
|
||||||
|
|
||||||
Sequence = "[" *Value ws "]"
|
Sequence = "[" *Value ws "]"
|
||||||
Dictionary = "{" *(Value ws ":" Value) ws "}"
|
Dictionary = "{" *(Value ws ":" Value) ws "}"
|
||||||
Set = %s"#set{" *Value ws "}" / "{" 1*Value ws "}"
|
Set = %s"#set{" *Value ws "}" / "{" 1*Value ws "}"
|
||||||
|
|
||||||
|
[^printing-collections]: **Implementation note.** When implementing
|
||||||
|
printing of `Value`s using the textual syntax, consider supporting
|
||||||
|
(a) optional pretty-printing with indentation, (b) optional
|
||||||
|
JSON-compatible print mode for that subset of `Value` that is
|
||||||
|
compatible with JSON, and (c) optional submodes for no commas,
|
||||||
|
commas separating, and commas terminating elements or key/value
|
||||||
|
pairs within a collection.
|
||||||
|
|
||||||
Any `Value` may be represented using the
|
Any `Value` may be represented using the
|
||||||
[compact binary syntax](#compact-binary-syntax) by directly prefixing
|
[compact binary syntax](#compact-binary-syntax) by directly prefixing
|
||||||
the binary form of the `Value` with ASCII `SOH` (`%x01`), or by
|
the binary form of the `Value` with ASCII `SOH` (`%x01`), or by
|
||||||
|
@ -419,28 +427,26 @@ URL-safe Base64 characters are allowed.
|
||||||
ByteString =/ %s"#base64{" *(ws / base64char) ws "}" /
|
ByteString =/ %s"#base64{" *(ws / base64char) ws "}" /
|
||||||
base64char = %x41-5A / %x61-7A / %x30-39 / "+" / "/" / "-" / "_" / "="
|
base64char = %x41-5A / %x61-7A / %x30-39 / "+" / "/" / "-" / "_" / "="
|
||||||
|
|
||||||
A `Symbol` may be written in a "bare" form,[^cf-sexp-token] so long as
|
A `Symbol` may be written in a "bare" form[^cf-sexp-token] so long as
|
||||||
it conforms to certain restrictions on the characters appearing in the
|
it conforms to certain restrictions on the characters appearing in the
|
||||||
symbol, or in a quoted form. The quoted form is much the same as the
|
symbol. Alternatively, it may be written in a quoted form. The quoted
|
||||||
syntax for `String`s, including embedded escape syntax, except using a
|
form is much the same as the syntax for `String`s, including embedded
|
||||||
bar or pipe character (`|`) instead of a double quote mark.
|
escape syntax, except using a bar or pipe character (`|`) instead of a
|
||||||
|
double quote mark.
|
||||||
|
|
||||||
Symbol = symstart *symcont / "|" *symchar "|"
|
Symbol = symstart *symcont / "|" *symchar "|"
|
||||||
symstart = ALPHA / sympunct
|
symstart = ALPHA / sympunct / symunicode
|
||||||
symcont = ALPHA / sympunct / DIGIT / "-" / "."
|
symcont = ALPHA / sympunct / symunicode / DIGIT / "-" / "."
|
||||||
sympunct = "~" / "!" / "@" / "$" / "%" / "^" / "&" / "*" /
|
sympunct = "~" / "!" / "@" / "$" / "%" / "^" / "&" / "*" /
|
||||||
"?" / "_" / "=" / "+" / "<" / ">" / "/"
|
"?" / "_" / "=" / "+" / "<" / ">" / "/"
|
||||||
symchar = unescaped / %x22 / escape (escaped / %x7C / %s"u" 4HEXDIG)
|
symchar = unescaped / %x22 / escape (escaped / %x7C / %s"u" 4HEXDIG)
|
||||||
|
symunicode = <any code point greater than 127 whose Unicode
|
||||||
|
category is Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd,
|
||||||
|
Nl, No, Pd, Pc, Po, Sc, Sm, Sk, So, or Co>
|
||||||
|
|
||||||
[^cf-sexp-token]: Compare with the [SPKI S-expression][sexp.txt]
|
[^cf-sexp-token]: Compare with the [SPKI S-expression][sexp.txt]
|
||||||
definition of "token representation".
|
definition of "token representation", and with the
|
||||||
|
[R6RS definition of identifiers](http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-7.html#node_sec_4.2.4).
|
||||||
TODO: More unicode in unescaped symbols?
|
|
||||||
|
|
||||||
### Printing
|
|
||||||
|
|
||||||
TODO: Recommend a JSON-compatible print mode. Recommend a submode with
|
|
||||||
trailing commas.
|
|
||||||
|
|
||||||
## Compact Binary Syntax
|
## Compact Binary Syntax
|
||||||
|
|
||||||
|
@ -722,6 +728,8 @@ The functions `binary32(F)` and `binary64(D)` yield big-endian 4- and
|
||||||
|
|
||||||
## Examples
|
## Examples
|
||||||
|
|
||||||
|
### Simple examples
|
||||||
|
|
||||||
<!-- TODO: Give some examples of large and small Preserves, perhaps -->
|
<!-- TODO: Give some examples of large and small Preserves, perhaps -->
|
||||||
<!-- translated from various JSON blobs floating around the internet. -->
|
<!-- translated from various JSON blobs floating around the internet. -->
|
||||||
|
|
||||||
|
@ -749,7 +757,7 @@ short form label number 0 to label `discard`, 1 to `capture`, and 2 to
|
||||||
| `1.0` | 03 3F F0 00 00 00 00 00 00 |
|
| `1.0` | 03 3F F0 00 00 00 00 00 00 |
|
||||||
| `-1.202e300` | 03 FE 3C B7 B7 59 BF 04 26 |
|
| `-1.202e300` | 03 FE 3C B7 B7 59 BF 04 26 |
|
||||||
|
|
||||||
Finally, a larger example, using a non-`Symbol` label for a record.[^extensibility2] The `Record`
|
The next example uses a non-`Symbol` label for a record.[^extensibility2] The `Record`
|
||||||
|
|
||||||
[titled person 2 thing 1](101, "Blackwell", date(1821 2 3), "Dr")
|
[titled person 2 thing 1](101, "Blackwell", date(1821 2 3), "Dr")
|
||||||
|
|
||||||
|
@ -783,6 +791,97 @@ encodes to
|
||||||
For more detail on Racket's representations of record labels, see
|
For more detail on Racket's representations of record labels, see
|
||||||
[the Racket documentation for `make-prefab-struct`](http://docs.racket-lang.org/reference/structutils.html#%28def._%28%28quote._~23~25kernel%29._make-prefab-struct%29%29).
|
[the Racket documentation for `make-prefab-struct`](http://docs.racket-lang.org/reference/structutils.html#%28def._%28%28quote._~23~25kernel%29._make-prefab-struct%29%29).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### JSON examples
|
||||||
|
|
||||||
|
The examples from
|
||||||
|
[RFC 8259](https://tools.ietf.org/html/rfc8259#section-13) read as
|
||||||
|
valid Preserves, though the JSON literals `true`, `false` and `null`
|
||||||
|
read as `Symbol`s. The first example:
|
||||||
|
|
||||||
|
{
|
||||||
|
"Image": {
|
||||||
|
"Width": 800,
|
||||||
|
"Height": 600,
|
||||||
|
"Title": "View from 15th Floor",
|
||||||
|
"Thumbnail": {
|
||||||
|
"Url": "http://www.example.com/image/481989943",
|
||||||
|
"Height": 125,
|
||||||
|
"Width": 100
|
||||||
|
},
|
||||||
|
"Animated" : false,
|
||||||
|
"IDs": [116, 943, 234, 38793]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
encodes to binary as follows:
|
||||||
|
|
||||||
|
E2
|
||||||
|
55 "Image"
|
||||||
|
EC
|
||||||
|
55 "Width" 42 03 20
|
||||||
|
55 "Title" 5F 14 "View from 15th Floor"
|
||||||
|
58 "Animated" 75 "false"
|
||||||
|
56 "Height" 42 02 58
|
||||||
|
59 "Thumbnail"
|
||||||
|
E6
|
||||||
|
55 "Width" 41 64
|
||||||
|
53 "Url" 5F 26 "http://www.example.com/image/481989943"
|
||||||
|
56 "Height" 41 7D
|
||||||
|
53 "IDs" C4
|
||||||
|
41 74
|
||||||
|
42 03 AF
|
||||||
|
42 00 EA
|
||||||
|
43 00 97 89
|
||||||
|
|
||||||
|
and the second example:
|
||||||
|
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"precision": "zip",
|
||||||
|
"Latitude": 37.7668,
|
||||||
|
"Longitude": -122.3959,
|
||||||
|
"Address": "",
|
||||||
|
"City": "SAN FRANCISCO",
|
||||||
|
"State": "CA",
|
||||||
|
"Zip": "94107",
|
||||||
|
"Country": "US"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"precision": "zip",
|
||||||
|
"Latitude": 37.371991,
|
||||||
|
"Longitude": -122.026020,
|
||||||
|
"Address": "",
|
||||||
|
"City": "SUNNYVALE",
|
||||||
|
"State": "CA",
|
||||||
|
"Zip": "94085",
|
||||||
|
"Country": "US"
|
||||||
|
}
|
||||||
|
]
|
||||||
|
|
||||||
|
encodes to binary as follows:
|
||||||
|
|
||||||
|
C2
|
||||||
|
EF 10
|
||||||
|
59 "precision" 53 "zip"
|
||||||
|
58 "Latitude" 03 40 42 E2 26 80 9D 49 52
|
||||||
|
59 "Longitude" 03 C0 5E 99 56 6C F4 1F 21
|
||||||
|
57 "Address" 50
|
||||||
|
54 "City" 5D "SAN FRANCISCO"
|
||||||
|
55 "State" 52 "CA"
|
||||||
|
53 "Zip" 55 "94107"
|
||||||
|
57 "Country" 52 "US"
|
||||||
|
EF 10
|
||||||
|
59 "precision" 53 "zip"
|
||||||
|
58 "Latitude" 03 40 42 AF 9D 66 AD B4 03
|
||||||
|
59 "Longitude" 03 C0 5E 81 AA 4F CA 42 AF
|
||||||
|
57 "Address" 50
|
||||||
|
54 "City" 59 "SUNNYVALE"
|
||||||
|
55 "State" 52 "CA"
|
||||||
|
53 "Zip" 55 "94085"
|
||||||
|
57 "Country" 52 "US"
|
||||||
|
|
||||||
## Conventions for Common Data Types
|
## Conventions for Common Data Types
|
||||||
|
|
||||||
The `Value` data type is essentially an S-Expression, able to
|
The `Value` data type is essentially an S-Expression, able to
|
||||||
|
@ -918,8 +1017,6 @@ or `date-time` productions of
|
||||||
|
|
||||||
## Security Considerations
|
## Security Considerations
|
||||||
|
|
||||||
TODO: Lots of whitespace is just like lots of empty chunks
|
|
||||||
|
|
||||||
**Empty chunks.** Streamed (format C) `String`s, `ByteString`s and
|
**Empty chunks.** Streamed (format C) `String`s, `ByteString`s and
|
||||||
`Symbol`s may include chunks of zero length. This opens up a
|
`Symbol`s may include chunks of zero length. This opens up a
|
||||||
possibility for denial-of-service: an attacker may begin streaming a
|
possibility for denial-of-service: an attacker may begin streaming a
|
||||||
|
@ -929,6 +1026,12 @@ optional reasonable restrictions on the number of consecutive empty
|
||||||
chunks that may appear in a stream, and may even supply an optional
|
chunks that may appear in a stream, and may even supply an optional
|
||||||
mode that rejects empty chunks entirely.
|
mode that rejects empty chunks entirely.
|
||||||
|
|
||||||
|
**Whitespace.** Similarly, the textual format for `Value`s allows
|
||||||
|
arbitrary whitespace in many positions. In streaming transfer
|
||||||
|
situations, consider optional restrictions on the amount of
|
||||||
|
consecutive whitespace and comments that may appear in a serialized
|
||||||
|
`Value`.
|
||||||
|
|
||||||
**Canonical form for cryptographic hashing and signing.** As
|
**Canonical form for cryptographic hashing and signing.** As
|
||||||
specified, neither the textual nor the compact binary encoding rules
|
specified, neither the textual nor the compact binary encoding rules
|
||||||
for `Value`s force canonical serializations. Two serializations of the
|
for `Value`s force canonical serializations. Two serializations of the
|
||||||
|
|
|
@ -4,6 +4,8 @@
|
||||||
(provide (struct-out stream-of)
|
(provide (struct-out stream-of)
|
||||||
(struct-out record)
|
(struct-out record)
|
||||||
short-form-labels
|
short-form-labels
|
||||||
|
read-preserve
|
||||||
|
string->preserve
|
||||||
encode
|
encode
|
||||||
decode
|
decode
|
||||||
wire-value)
|
wire-value)
|
||||||
|
@ -176,15 +178,15 @@
|
||||||
(define s (bytes->string/utf-8 bs))
|
(define s (bytes->string/utf-8 bs))
|
||||||
(lambda () (ks (if (= minor 3) (string->symbol s) s) rest))))]))
|
(lambda () (ks (if (= minor 3) (string->symbol s) s) rest))))]))
|
||||||
|
|
||||||
|
(define (build-record label fields)
|
||||||
|
(with-handlers [(exn:fail:contract? (lambda (e) (record label fields)))]
|
||||||
|
(apply make-prefab-struct label fields)))
|
||||||
|
|
||||||
(define (decode-record minor fields rest ks kf)
|
(define (decode-record minor fields rest ks kf)
|
||||||
(define (build key fs)
|
|
||||||
(ks (with-handlers [(exn:fail:contract? (lambda (e) (record key fs)))]
|
|
||||||
(apply make-prefab-struct key fs))
|
|
||||||
rest))
|
|
||||||
(match* (minor fields)
|
(match* (minor fields)
|
||||||
[(3 (list* key fs)) (build key fs)]
|
[(3 (list* key fs)) (ks (build-record key fs) rest)]
|
||||||
[(3 '()) (kf)]
|
[(3 '()) (kf)]
|
||||||
[(n fs) (build (vector-ref (short-form-labels) n) fs)]))
|
[(n fs) (ks (build-record (vector-ref (short-form-labels) n) fs) rest)]))
|
||||||
|
|
||||||
(define (decode-collection minor vs rest ks kf)
|
(define (decode-collection minor vs rest ks kf)
|
||||||
(match minor
|
(match minor
|
||||||
|
@ -244,6 +246,241 @@
|
||||||
|
|
||||||
;;---------------------------------------------------------------------------
|
;;---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
(define (read-preserve [i (current-input-port)])
|
||||||
|
(local-require (only-in syntax/readerr raise-read-error))
|
||||||
|
(local-require net/base64)
|
||||||
|
(local-require file/sha1)
|
||||||
|
|
||||||
|
(define-match-expander px
|
||||||
|
(syntax-rules ()
|
||||||
|
[(_ re pat) (app (lambda (v) (regexp-try-match re v)) pat)]))
|
||||||
|
|
||||||
|
(define (skip-whitespace)
|
||||||
|
(regexp-match? #px#"^(\\s|,)*" i)
|
||||||
|
(match (peek-char i)
|
||||||
|
[#\; (regexp-match? #px#"[^\r\n]*[\r\n]" i) (skip-whitespace)]
|
||||||
|
[_ #t]))
|
||||||
|
|
||||||
|
(define (parse-error fmt . args)
|
||||||
|
(define-values [line column pos] (port-next-location i))
|
||||||
|
(raise-read-error (format "read-preserve: ~a" (apply format fmt args))
|
||||||
|
(object-name i)
|
||||||
|
line
|
||||||
|
column
|
||||||
|
pos
|
||||||
|
#f))
|
||||||
|
|
||||||
|
(define (eof-guard ch)
|
||||||
|
(match ch
|
||||||
|
[(? eof-object?) (parse-error "Unexpected end of input")]
|
||||||
|
[ch ch]))
|
||||||
|
|
||||||
|
(define (peek/no-eof) (eof-guard (peek-char i)))
|
||||||
|
(define (read/no-eof) (eof-guard (read-char i)))
|
||||||
|
|
||||||
|
(define (read-sequence terminator)
|
||||||
|
(sequence-fold '() accumulate-value reverse terminator))
|
||||||
|
|
||||||
|
(define (accumulate-value acc)
|
||||||
|
(cons (read-value) acc))
|
||||||
|
|
||||||
|
(define (read-dictionary-or-set)
|
||||||
|
(sequence-fold #f
|
||||||
|
(lambda (acc)
|
||||||
|
(define k (read-value))
|
||||||
|
(skip-whitespace)
|
||||||
|
(match (peek-char i)
|
||||||
|
[#\: (when (set? acc) (parse-error "Unexpected key/value separator in set"))
|
||||||
|
(read-char i)
|
||||||
|
(define v (read-value))
|
||||||
|
(hash-set (or acc (hash)) k v)]
|
||||||
|
[_ (when (hash? acc) (parse-error "Missing expected key/value separator"))
|
||||||
|
(set-add (or acc (set)) k)]))
|
||||||
|
values
|
||||||
|
#\}))
|
||||||
|
|
||||||
|
(define PIPE #\|)
|
||||||
|
|
||||||
|
(define (read-raw-symbol)
|
||||||
|
(let loop ((acc '()))
|
||||||
|
(match (peek-char i)
|
||||||
|
[(or #\{ #\} #\[ #\] #\" #\; #\, #\# #\: (? char-whitespace?) (? eof-object?) (== PIPE))
|
||||||
|
(string->symbol (list->string (reverse acc)))]
|
||||||
|
[_ (loop (cons (read-char i) acc))])))
|
||||||
|
|
||||||
|
(define (read-encoded-binary kind valid-char? finish)
|
||||||
|
(let loop ((acc '()))
|
||||||
|
(skip-whitespace)
|
||||||
|
(let ((ch (read/no-eof)))
|
||||||
|
(cond [(valid-char? ch)
|
||||||
|
(loop (cons ch acc))]
|
||||||
|
[(eqv? ch #\})
|
||||||
|
(finish (list->string (reverse acc)))]
|
||||||
|
[else (parse-error "Invalid ~a character" kind)]))))
|
||||||
|
|
||||||
|
(define (read-base64-binary)
|
||||||
|
(read-encoded-binary "base64"
|
||||||
|
(lambda (ch)
|
||||||
|
(or (and (char>=? ch #\A) (char<=? ch #\Z))
|
||||||
|
(and (char>=? ch #\a) (char<=? ch #\z))
|
||||||
|
(and (char>=? ch #\0) (char<=? ch #\9))
|
||||||
|
(memv ch '(#\+ #\/ #\- #\_ #\=))))
|
||||||
|
(lambda (s)
|
||||||
|
(base64-decode (string->bytes/latin-1 s)))))
|
||||||
|
|
||||||
|
(define (read-hex-binary)
|
||||||
|
(read-encoded-binary "hex"
|
||||||
|
(lambda (ch)
|
||||||
|
(or (and (char>=? ch #\A) (char<=? ch #\F))
|
||||||
|
(and (char>=? ch #\a) (char<=? ch #\f))
|
||||||
|
(and (char>=? ch #\0) (char<=? ch #\9))))
|
||||||
|
(lambda (s)
|
||||||
|
(hex-string->bytes s))))
|
||||||
|
|
||||||
|
(define (read-stringlike xform-item finish terminator-char hexescape-char hexescape-proc)
|
||||||
|
(let loop ((acc '()))
|
||||||
|
(match (read/no-eof)
|
||||||
|
[(== terminator-char) (finish (reverse acc))]
|
||||||
|
[#\\ (match (read/no-eof)
|
||||||
|
[(== hexescape-char) (loop (cons (hexescape-proc) acc))]
|
||||||
|
[(and c (or (== terminator-char) #\\ #\/)) (loop (cons (xform-item c) acc))]
|
||||||
|
[#\b (loop (cons (xform-item #\u08) acc))]
|
||||||
|
[#\f (loop (cons (xform-item #\u0C) acc))]
|
||||||
|
[#\n (loop (cons (xform-item #\u0A) acc))]
|
||||||
|
[#\r (loop (cons (xform-item #\u0D) acc))]
|
||||||
|
[#\t (loop (cons (xform-item #\u09) acc))]
|
||||||
|
[c (parse-error "Invalid escape code \\~a" c)])]
|
||||||
|
[c (loop (cons (xform-item c) acc))])))
|
||||||
|
|
||||||
|
(define (read-string terminator-char)
|
||||||
|
(read-stringlike values
|
||||||
|
list->string
|
||||||
|
terminator-char
|
||||||
|
#\u
|
||||||
|
(lambda ()
|
||||||
|
(integer->char
|
||||||
|
(match i
|
||||||
|
[(px #px#"^[a-fA-F0-9]{4}" (list hexdigits))
|
||||||
|
(define n1 (string->number (bytes->string/utf-8 hexdigits)))
|
||||||
|
(if (<= #xd800 n1 #xdfff) ;; surrogate pair first half
|
||||||
|
(match i
|
||||||
|
[(px #px#"^\\\\u([a-fA-F0-9]{4})" (list hexdigits2))
|
||||||
|
(define n2 (string->number (bytes->string/utf-8 hexdigits2)))
|
||||||
|
(if (<= #xdc00 n2 #xdfff)
|
||||||
|
(+ (arithmetic-shift (- n1 #xd800) 10)
|
||||||
|
(- n2 #xdc00)
|
||||||
|
#x10000)
|
||||||
|
(parse-error "Bad second half of surrogate pair"))]
|
||||||
|
[_ (parse-error "Missing second half of surrogate pair")])
|
||||||
|
n1)]
|
||||||
|
[_ (parse-error "Bad string \\u escape")])))))
|
||||||
|
|
||||||
|
(define (read-literal-binary)
|
||||||
|
(read-stringlike (lambda (c)
|
||||||
|
(define b (char->integer c))
|
||||||
|
(when (>= b 256)
|
||||||
|
(parse-error "Invalid code point ~a (~v) in literal binary" b c))
|
||||||
|
b)
|
||||||
|
list->bytes
|
||||||
|
#\"
|
||||||
|
#\x
|
||||||
|
(lambda ()
|
||||||
|
(match i
|
||||||
|
[(px #px#"^[a-fA-F0-9]{2}" (list hexdigits))
|
||||||
|
(string->number (bytes->string/utf-8 hexdigits))]
|
||||||
|
[_ (parse-error "Bad binary \\x escape")]))))
|
||||||
|
|
||||||
|
(define (read-intpart acc-rev)
|
||||||
|
(match (peek-char i)
|
||||||
|
[#\0 (read-fracexp (cons (read-char i) acc-rev))]
|
||||||
|
[_ (read-digit+ acc-rev read-fracexp)]))
|
||||||
|
|
||||||
|
(define (read-digit* acc-rev k)
|
||||||
|
(match (peek-char i)
|
||||||
|
[(? char-numeric?) (read-digit* (cons (read-char i) acc-rev) k)]
|
||||||
|
[_ (k acc-rev)]))
|
||||||
|
|
||||||
|
(define (read-digit+ acc-rev k)
|
||||||
|
(match (peek-char i)
|
||||||
|
[(? char-numeric?) (read-digit* (cons (read-char i) acc-rev) k)]
|
||||||
|
[_ (parse-error "Incomplete number")]))
|
||||||
|
|
||||||
|
(define (read-fracexp acc-rev)
|
||||||
|
(match (peek-char i)
|
||||||
|
[#\. (read-digit+ (cons (read-char i) acc-rev) read-exp)]
|
||||||
|
[_ (read-exp acc-rev)]))
|
||||||
|
|
||||||
|
(define (read-exp acc-rev)
|
||||||
|
(match (peek-char i)
|
||||||
|
[(or #\e #\E) (read-sign-and-exp (cons (read-char i) acc-rev))]
|
||||||
|
[_ (finish-number acc-rev)]))
|
||||||
|
|
||||||
|
(define (read-sign-and-exp acc-rev)
|
||||||
|
(match (peek-char i)
|
||||||
|
[(or #\+ #\-) (read-digit+ (cons (read-char i) acc-rev) finish-number)]
|
||||||
|
[_ (read-digit+ acc-rev finish-number)]))
|
||||||
|
|
||||||
|
(define (finish-number acc-rev)
|
||||||
|
(define s (list->string (reverse acc-rev)))
|
||||||
|
(define n (string->number s))
|
||||||
|
(when (not n) (parse-error "Invalid number: ~v" s))
|
||||||
|
n)
|
||||||
|
|
||||||
|
(define (read-number)
|
||||||
|
(match (peek/no-eof)
|
||||||
|
[#\- (read-intpart (list (read-char i)))]
|
||||||
|
[_ (read-intpart (list))]))
|
||||||
|
|
||||||
|
(define (sequence-fold acc accumulate-one finish terminator-char)
|
||||||
|
(let loop ((acc acc))
|
||||||
|
(skip-whitespace)
|
||||||
|
(match (peek/no-eof)
|
||||||
|
[(== terminator-char) (read-char i) (finish acc)]
|
||||||
|
[_ (loop (accumulate-one acc))])))
|
||||||
|
|
||||||
|
(define (collect-fields head)
|
||||||
|
(skip-whitespace)
|
||||||
|
(match (peek-char i)
|
||||||
|
[#\( (collect-fields (build-record head (read-sequence #\))))]
|
||||||
|
[_ head]))
|
||||||
|
|
||||||
|
(define (read-value)
|
||||||
|
(skip-whitespace)
|
||||||
|
(collect-fields
|
||||||
|
(match (peek-char i)
|
||||||
|
[(? eof-object? o) o]
|
||||||
|
[#\{ (read-char i) (read-dictionary-or-set)]
|
||||||
|
[#\[ (read-char i) (read-sequence #\])]
|
||||||
|
[(or #\- #\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9) (read-number)]
|
||||||
|
[#\" (read-char i) (read-string #\")]
|
||||||
|
[(== PIPE) (read-char i) (string->symbol (read-string PIPE))]
|
||||||
|
[#\# (match i
|
||||||
|
[(px #px#"^#set\\{" (list _))
|
||||||
|
(sequence-fold (set) accumulate-value values #\})]
|
||||||
|
[(px #px#"^#hexvalue\\{" (list _))
|
||||||
|
(decode (read-hex-binary) (lambda () (parse-error "Invalid #hexvalue encoding")))]
|
||||||
|
[(px #px#"^#true" (list _))
|
||||||
|
#t]
|
||||||
|
[(px #px#"^#false" (list _))
|
||||||
|
#f]
|
||||||
|
[(px #px#"^#\"" (list _))
|
||||||
|
(read-literal-binary)]
|
||||||
|
[(px #px#"^#hex\\{" (list _))
|
||||||
|
(read-hex-binary)]
|
||||||
|
[(px #px#"^#base64\\{" (list _))
|
||||||
|
(read-base64-binary)]
|
||||||
|
[_
|
||||||
|
(parse-error "Invalid preserve value")])]
|
||||||
|
[#\: (parse-error "Unexpected key/value separator between items")]
|
||||||
|
[_ (read-raw-symbol)])))
|
||||||
|
|
||||||
|
(read-value))
|
||||||
|
|
||||||
|
(define (string->preserve s)
|
||||||
|
(read-preserve (open-input-string s)))
|
||||||
|
|
||||||
|
;;---------------------------------------------------------------------------
|
||||||
|
|
||||||
(module+ test
|
(module+ test
|
||||||
(require rackunit)
|
(require rackunit)
|
||||||
(require (for-syntax racket syntax/srcloc))
|
(require (for-syntax racket syntax/srcloc))
|
||||||
|
@ -444,4 +681,94 @@
|
||||||
#xE2 #x7A "first-name" #x59 "Elizabeth"
|
#xE2 #x7A "first-name" #x59 "Elizabeth"
|
||||||
#xE2 #x77 "surname" #x59 "Blackwell"
|
#xE2 #x77 "surname" #x59 "Blackwell"
|
||||||
))
|
))
|
||||||
|
|
||||||
|
(let ()
|
||||||
|
(local-require json)
|
||||||
|
(define rfc8259-example1 (string->preserve #<<EOF
|
||||||
|
{
|
||||||
|
"Image": {
|
||||||
|
"Width": 800,
|
||||||
|
"Height": 600,
|
||||||
|
"Title": "View from 15th Floor",
|
||||||
|
"Thumbnail": {
|
||||||
|
"Url": "http://www.example.com/image/481989943",
|
||||||
|
"Height": 125,
|
||||||
|
"Width": 100
|
||||||
|
},
|
||||||
|
"Animated" : false,
|
||||||
|
"IDs": [116, 943, 234, 38793]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
EOF
|
||||||
|
))
|
||||||
|
(define rfc8259-example2 (string->preserve #<<EOF
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"precision": "zip",
|
||||||
|
"Latitude": 37.7668,
|
||||||
|
"Longitude": -122.3959,
|
||||||
|
"Address": "",
|
||||||
|
"City": "SAN FRANCISCO",
|
||||||
|
"State": "CA",
|
||||||
|
"Zip": "94107",
|
||||||
|
"Country": "US"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"precision": "zip",
|
||||||
|
"Latitude": 37.371991,
|
||||||
|
"Longitude": -122.026020,
|
||||||
|
"Address": "",
|
||||||
|
"City": "SUNNYVALE",
|
||||||
|
"State": "CA",
|
||||||
|
"Zip": "94085",
|
||||||
|
"Country": "US"
|
||||||
|
}
|
||||||
|
]
|
||||||
|
EOF
|
||||||
|
))
|
||||||
|
|
||||||
|
(check-both-directions/nondeterministic
|
||||||
|
rfc8259-example1
|
||||||
|
(#xe2
|
||||||
|
#x55 "Image"
|
||||||
|
#xec
|
||||||
|
#x55 "Width" #x42 #x03 #x20
|
||||||
|
#x55 "Title" #x5f #x14 "View from 15th Floor"
|
||||||
|
#x58 "Animated" #x75 "false"
|
||||||
|
#x56 "Height" #x42 #x02 #x58
|
||||||
|
#x59 "Thumbnail"
|
||||||
|
#xe6
|
||||||
|
#x55 "Width" #x41 #x64
|
||||||
|
#x53 "Url" #x5f #x26 "http://www.example.com/image/481989943"
|
||||||
|
#x56 "Height" #x41 #x7d
|
||||||
|
#x53 "IDs" #xc4
|
||||||
|
#x41 #x74
|
||||||
|
#x42 #x03 #xaf
|
||||||
|
#x42 #x00 #xea
|
||||||
|
#x43 #x00 #x97 #x89
|
||||||
|
))
|
||||||
|
|
||||||
|
(check-both-directions/nondeterministic
|
||||||
|
rfc8259-example2
|
||||||
|
(#xc2
|
||||||
|
#xef #x10
|
||||||
|
#x59 "precision" #x53 "zip"
|
||||||
|
#x58 "Latitude" #x03 #x40 #x42 #xe2 #x26 #x80 #x9d #x49 #x52
|
||||||
|
#x59 "Longitude" #x03 #xc0 #x5e #x99 #x56 #x6c #xf4 #x1f #x21
|
||||||
|
#x57 "Address" #x50
|
||||||
|
#x54 "City" #x5D "SAN FRANCISCO"
|
||||||
|
#x55 "State" #x52 "CA"
|
||||||
|
#x53 "Zip" #x55 "94107"
|
||||||
|
#x57 "Country" #x52 "US"
|
||||||
|
#xef #x10
|
||||||
|
#x59 "precision" #x53 "zip"
|
||||||
|
#x58 "Latitude" #x03 #x40 #x42 #xaf #x9d #x66 #xad #xb4 #x03
|
||||||
|
#x59 "Longitude" #x03 #xc0 #x5e #x81 #xaa #x4f #xca #x42 #xaf
|
||||||
|
#x57 "Address" #x50
|
||||||
|
#x54 "City" #x59 "SUNNYVALE"
|
||||||
|
#x55 "State" #x52 "CA"
|
||||||
|
#x53 "Zip" #x55 "94085"
|
||||||
|
#x57 "Country" #x52 "US"
|
||||||
|
))
|
||||||
|
)
|
||||||
)
|
)
|
||||||
|
|
Loading…
Reference in New Issue