Feed back clarifications from the cheatsheet version of the text grammar
parent 6a56dad886
commit 9cc537abf8
@@ -29,8 +29,7 @@ UTF-8 where possible.
 **Whitespace.** Whitespace is defined as any number of spaces, tabs,
 carriage returns, line feeds, or commas.

-ws = *(%x20 / %x09 / newline / ",")
-newline = CR / LF
+ws = *(%x20 / %x09 / CR / LF / ",")

 ## Grammar

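The revised `ws` rule above inlines CR and LF and keeps the comma in the whitespace set. As a rough illustration of what that means for a reader, here is a minimal Python sketch; the names `WS_CHARS` and `skip_ws` are hypothetical and not part of the specification.

```python
# Characters matched by the revised rule: ws = *(%x20 / %x09 / CR / LF / ",")
WS_CHARS = " \t\r\n,"


def skip_ws(text: str, pos: int = 0) -> int:
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos] in WS_CHARS:
        pos += 1
    return pos
```

Because the comma is ordinary whitespace, a reader built this way treats `1,2` and `1 2` as the same sequence of tokens.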
@@ -90,23 +89,15 @@ the same as for JSON,[^string-json-correspondence]
 [surrogate code points](https://unicode.org/glossary/#surrogate_code_point)
 *MUST NOT* be generated or accepted.[^unpaired-surrogates]

-String = %x22 *char %x22
-char = unescaped / %x7C / escape (escaped / %x22 / %s"u" 4HEXDIG)
-unescaped = %x20-21 / %x23-5B / %x5D-7B / %x7D-10FFFF
-escape = %x5C ; \
-escaped = ( %x5C / ; \ reverse solidus U+005C
-            %x2F / ; / solidus U+002F
-            %x62 / ; b backspace U+0008
-            %x66 / ; f form feed U+000C
-            %x6E / ; n line feed U+000A
-            %x72 / ; r carriage return U+000D
-            %x74 ) ; t tab U+0009
+String = DQUOTE *char DQUOTE
+char = <any unicode scalar value except "\" or DQUOTE> / escaped / "\" DQUOTE
+escaped = "\\" / "\/" / %s"\b" / %s"\f" / %s"\n" / %s"\r" / %s"\t"
+          / %s"\u" 4HEXDIG

 [^string-json-correspondence]: The grammar for `String` has the same
 effect as the
 [JSON](https://tools.ietf.org/html/rfc8259#section-7) grammar for
-`string`. Some auxiliary definitions (e.g. `escaped`) are lifted
-largely unmodified from the text of RFC 8259.
+`string`.

 [^escaping-surrogate-pairs]: In particular, note JSON's rules around
 the use of surrogate pairs for scalar values not in the Basic
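The simplified `String` and `escaped` rules above can be read as a small decoding loop. The following Python sketch decodes the text between the two double quotes; it assumes JSON-style surrogate pairing for `\u` escapes (as the footnotes suggest) and rejects unpaired surrogates. The names `decode_string_body` and `_hex4` are hypothetical.

```python
import string

# Single-character escapes from the `escaped` rule, plus the escaped DQUOTE.
SIMPLE_ESCAPES = {"\\": "\\", "/": "/", "b": "\b", "f": "\f",
                  "n": "\n", "r": "\r", "t": "\t", '"': '"'}


def _hex4(body: str, i: int) -> int:
    """Read exactly four hex digits starting at index i."""
    digits = body[i:i + 4]
    if len(digits) != 4 or any(c not in string.hexdigits for c in digits):
        raise ValueError("expected four hex digits after \\u")
    return int(digits, 16)


def decode_string_body(body: str) -> str:
    """Decode the characters between the quotes of a String."""
    out = []
    i = 0
    while i < len(body):
        c = body[i]
        if c == '"':
            raise ValueError("unescaped double quote")
        if c != "\\":
            out.append(c)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        e = body[i + 1]
        if e in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[e])
            i += 2
        elif e == "u":
            unit = _hex4(body, i + 2)
            i += 6
            if 0xD800 <= unit <= 0xDBFF:
                # High surrogate: a low surrogate escape must follow.
                if body[i:i + 2] != "\\u":
                    raise ValueError("unpaired surrogate")
                low = _hex4(body, i + 2)
                i += 6
                if not 0xDC00 <= low <= 0xDFFF:
                    raise ValueError("unpaired surrogate")
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
            elif 0xDC00 <= unit <= 0xDFFF:
                raise ValueError("unpaired surrogate")
            out.append(chr(unit))
        else:
            raise ValueError("unknown escape: \\" + e)
    return "".join(out)
```

For example, `decode_string_body(r'\uD83D\uDE00')` combines the two escapes into the single scalar value U+1F600, while a lone `\uD83D` is rejected.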
@@ -135,14 +126,16 @@ Many bytes map directly to printable 7-bit ASCII; the remainder must be
 escaped, either as `\x` followed by a two-digit hexadecimal number, or
 following the usual rules for double quote and backslash.

-ByteString = "#" %x22 *binchar %x22
-binchar = binunescaped / escape (escaped / %x22 / %s"x" 2HEXDIG)
-binunescaped = %x20-21 / %x23-5B / %x5D-7E
+ByteString = "#" DQUOTE *binchar DQUOTE
+binchar = <any unicode scalar value ≥32 and ≤126 except "\" or DQUOTE>
+          / "\" ("\" / "/" / %s"b" / %s"f" / %s"n" / %s"r" / %s"t")
+          / %s"\x" 2HEXDIG
+          / "\" DQUOTE

 The second is a sequence of pairs of hexadecimal digits interleaved
 with whitespace and surrounded by `#x"` and `"`.

-ByteString =/ %s"#x" %x22 *(ws / 2HEXDIG) ws %x22
+ByteString =/ %s"#x" DQUOTE *(ws 2HEXDIG) ws DQUOTE

 The third is a sequence of [Base64](https://tools.ietf.org/html/rfc4648)
 characters, interleaved with whitespace and surrounded by `#[` and `]`.
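The hexadecimal `ByteString` form above, `*(ws 2HEXDIG) ws`, allows whitespace (including commas) between digit pairs but not inside a pair. A minimal Python sketch of a reader for that form follows; `parse_hex_bytestring` is a hypothetical name, and the function expects the complete token including the `#x"` prefix and closing quote.

```python
WS_CHARS = " \t\r\n,"            # the ws set, commas included
HEX_DIGITS = "0123456789abcdefABCDEF"


def parse_hex_bytestring(text: str) -> bytes:
    """Parse a token of the form #x"..." into bytes."""
    if len(text) < 4 or not text.startswith('#x"') or not text.endswith('"'):
        raise ValueError("not a hexadecimal ByteString")
    body = text[3:-1]
    out = bytearray()
    i = 0
    while True:
        # Skip optional whitespace before the next pair (or before the end).
        while i < len(body) and body[i] in WS_CHARS:
            i += 1
        if i == len(body):
            return bytes(out)
        pair = body[i:i + 2]
        if len(pair) != 2 or any(c not in HEX_DIGITS for c in pair):
            raise ValueError("expected two adjacent hex digits at offset %d" % i)
        out.append(int(pair, 16))
        i += 2
```

With this reading, `#x"48 65,6c6C 6f"` yields the five bytes of `Hello`, whereas `#x"4 8"` is rejected because the two digits of a pair are separated.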
@@ -153,8 +146,8 @@ and [URL-safe](https://datatracker.ietf.org/doc/html/rfc4648#section-5)
 (`-`,`_`) characters *SHOULD* be generated by default. Padding characters
 (`=`) may be omitted.

-ByteString =/ "#[" *(ws / base64char) ws "]"
-base64char = %x41-5A / %x61-7A / %x30-39 / "+" / "/" / "-" / "_" / "="
+ByteString =/ "#[" *(ws base64char) ws "]"
+base64char = ALPHA / DIGIT / "+" / "/" / "-" / "_" / "="

 A `Symbol` may be written in either of two forms.

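Since `base64char` above admits both the standard (`+`, `/`) and URL-safe (`-`, `_`) alphabets and padding may be omitted, a reader has to normalise the text before decoding. One possible approach, sketched in Python with the standard `base64` module, is shown here; `parse_base64_bytestring` is a hypothetical name, and rejecting `=` anywhere except at the end is a choice of this sketch rather than something the grammar states.

```python
import base64
import string

WS_CHARS = " \t\r\n,"
BASE64_CHARS = string.ascii_letters + string.digits + "+/-_="


def parse_base64_bytestring(text: str) -> bytes:
    """Parse a token of the form #[...] into bytes."""
    if not (text.startswith("#[") and text.endswith("]")):
        raise ValueError("not a Base64 ByteString")
    chars = [c for c in text[2:-1] if c not in WS_CHARS]
    if any(c not in BASE64_CHARS for c in chars):
        raise ValueError("invalid Base64 character")
    # Drop any trailing padding, then map the URL-safe alphabet to the standard one.
    s = "".join(chars).rstrip("=").replace("-", "+").replace("_", "/")
    if "=" in s:
        raise ValueError("padding in the middle of Base64 text")
    # Restore the padding that b64decode requires.
    s += "=" * (-len(s) % 4)
    return base64.b64decode(s, validate=True)
```

Both `#[SGVs bG8=]` and `#[SGVsbG8]` decode to `Hello` under this sketch.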
@@ -163,7 +156,7 @@ including embedded escape syntax, except using a bar or pipe character
 (`|`) instead of a double quote mark.

 QuotedSymbol = "|" *symchar "|"
-symchar = unescaped / %x22 / escape (escaped / %x7C / %s"u" 4HEXDIG)
+symchar = <any unicode scalar value except "\" or "|"> / escaped / "\|"

 Alternatively, a `Symbol` may be written in a “bare” form[^cf-sexp-token].
 The grammar for numeric data is a subset of the grammar for bare `Symbol`s,
@@ -171,12 +164,11 @@ so if a `SymbolOrNumber` also matches the grammar for `Float`, `Double` or
 `SignedInteger`, then it must be interpreted as one of those, and otherwise
 it must be interpreted as a bare `Symbol`.

-SymbolOrNumber = 1*baresymchar
-baresymchar = ALPHA / DIGIT / sympunct / symuchar
+SymbolOrNumber = 1*(ALPHA / DIGIT / sympunct / symuchar)
 sympunct = "~" / "!" / "$" / "%" / "^" / "&" / "*" /
            "?" / "_" / "=" / "+" / "-" / "/" / "."
-symuchar = <any scalar value greater than 127 whose Unicode
-           category is Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd,
+symuchar = <any scalar value ≥128 whose Unicode category is
+           Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd,
            Nl, No, Pc, Pd, Po, Sc, Sm, Sk, So, or Co>

 [^cf-sexp-token]: Compare with the [SPKI S-expression][sexp.txt]
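The `SymbolOrNumber` character classes above map fairly directly onto Python's `unicodedata` module. The sketch below only checks whether a token matches `SymbolOrNumber`; deciding whether it is then a `Float`, `Double`, `SignedInteger` or bare `Symbol` is a separate step not shown. The function names are hypothetical.

```python
import string
import unicodedata

SYMPUNCT = set("~!$%^&*?_=+-/.")
SYMUCHAR_CATEGORIES = {
    "Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Me", "Nd",
    "Nl", "No", "Pc", "Pd", "Po", "Sc", "Sm", "Sk", "So", "Co",
}
ASCII_ALNUM = set(string.ascii_letters + string.digits)


def is_bare_symbol_char(c: str) -> bool:
    """True if c is ALPHA, DIGIT, sympunct, or symuchar."""
    if ord(c) < 128:
        return c in ASCII_ALNUM or c in SYMPUNCT
    # symuchar: scalar value >= 128 in one of the listed general categories.
    return unicodedata.category(c) in SYMUCHAR_CATEGORIES


def matches_symbol_or_number(token: str) -> bool:
    """True if token matches 1*(ALPHA / DIGIT / sympunct / symuchar)."""
    return len(token) >= 1 and all(is_bare_symbol_char(c) for c in token)
```

Note that a token such as `-1.5e3` matches `SymbolOrNumber` too, which is why the numeric grammars must be tried before falling back to a bare `Symbol`.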
@@ -239,9 +231,8 @@ represented as raw hexadecimal strings similar to hexadecimal
 syntax whereever convenient, even for values representable using the
 grammar above.[^rationale-no-general-machine-syntax]

-Value =/ HexFloat / HexDouble
-HexFloat = "#xf" %x22 4(ws 2HEXDIG) ws %x22
-HexDouble = "#xd" %x22 8(ws 2HEXDIG) ws %x22
+Float =/ "#xf" DQUOTE 4(ws 2HEXDIG) ws DQUOTE
+Double =/ "#xd" DQUOTE 8(ws 2HEXDIG) ws DQUOTE

 [^rationale-no-general-machine-syntax]: **Rationale.** Previous versions
 of this specification included an escape to the [machine-oriented
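The `#xf"..."` and `#xd"..."` forms above carry exactly four and eight bytes respectively. The Python sketch below turns such a token into a number using `struct`. It assumes the bytes are an IEEE 754 value in big-endian order, which this excerpt does not state, so treat the `>f` and `>d` format codes as an assumption. `parse_hex_float` and `_hex_pairs` are hypothetical names.

```python
import struct

WS_CHARS = " \t\r\n,"
HEX_DIGITS = "0123456789abcdefABCDEF"


def _hex_pairs(body: str) -> bytes:
    """Decode *(ws 2HEXDIG) ws into bytes."""
    out = bytearray()
    i = 0
    while True:
        while i < len(body) and body[i] in WS_CHARS:
            i += 1
        if i == len(body):
            return bytes(out)
        pair = body[i:i + 2]
        if len(pair) != 2 or any(c not in HEX_DIGITS for c in pair):
            raise ValueError("expected two adjacent hex digits")
        out.append(int(pair, 16))
        i += 2


def parse_hex_float(text: str) -> float:
    """Parse #xf"..." (4 bytes) or #xd"..." (8 bytes) into a Python float."""
    # ASSUMPTION: big-endian IEEE 754 encoding (">f" / ">d").
    for prefix, size, fmt in (('#xf"', 4, ">f"), ('#xd"', 8, ">d")):
        if text.startswith(prefix) and text.endswith('"') and len(text) > len(prefix):
            raw = _hex_pairs(text[len(prefix):-1])
            if len(raw) != size:
                raise ValueError("expected %d bytes, got %d" % (size, len(raw)))
            return struct.unpack(fmt, raw)[0]
    raise ValueError("not a hexadecimal Float or Double")
```

Python has a single `float` type, so this sketch loses the distinction between `Float` and `Double`; a real reader would need to keep track of which prefix was seen.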
@@ -277,12 +268,11 @@ named “`Value`” without altering the semantic class of `Value`s.
 interpreted as comments associated with that value. Comments are
 sufficiently common that special syntax exists for them.

-Value =/ ws
-         ";" *(%x00-09 / %x0B-0C / %x0E-10FFFF) newline
-         Value
+Value =/ ws ";" linecomment (CR / LF) Value
+linecomment = *<any unicode scalar value except CR or LF>

-When written this way, everything between the `;` and the newline is
-included in the string annotating the `Value`.
+When written this way, everything between the `;` and the end of the line
+is included in the string annotating the `Value`.

 **Equivalence.** Annotations appear within syntax denoting a `Value`;
 however, the annotations are not part of the denoted value. They are
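The revised comment rule above, `ws ";" linecomment (CR / LF) Value`, can be sketched as a loop that collects comment strings until something other than a comment is found. In this Python sketch the name `read_line_comments` is hypothetical, and parsing the annotated `Value` itself is left to the caller.

```python
WS_CHARS = " \t\r\n,"


def read_line_comments(text: str, pos: int = 0):
    """Collect leading ';' comments starting at pos.

    Returns (comments, next_pos), where next_pos is where the annotated
    Value (with any leading whitespace) begins.
    """
    comments = []
    while True:
        i = pos
        while i < len(text) and text[i] in WS_CHARS:
            i += 1
        if i >= len(text) or text[i] != ";":
            return comments, pos
        end = i + 1
        while end < len(text) and text[end] not in "\r\n":
            end += 1
        if end == len(text):
            # The rule requires CR or LF and then a Value after the comment.
            raise ValueError("comment is not terminated by CR or LF")
        # Everything between ';' and the end of the line annotates the Value.
        comments.append(text[i + 1:end])
        pos = end + 1
```

For the input `"; hello\n ;second\r42"` this returns the comments `" hello"` and `"second"`, with the remaining text `42` left for the value parser.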