Feed back clarifications from the cheatsheet version of the text grammar
parent 6a56dad886
commit 9cc537abf8
|
@ -29,8 +29,7 @@ UTF-8 where possible.
|
||||||
**Whitespace.** Whitespace is defined as any number of spaces, tabs,
|
**Whitespace.** Whitespace is defined as any number of spaces, tabs,
|
||||||
carriage returns, line feeds, or commas.
|
carriage returns, line feeds, or commas.
|
||||||
|
|
||||||
ws = *(%x20 / %x09 / newline / ",")
|
ws = *(%x20 / %x09 / CR / LF / ",")
|
||||||
newline = CR / LF
|
|
||||||
|
|
||||||
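As an illustration of the revised `ws` rule, here is a minimal Python sketch of a whitespace skipper. The name `skip_ws` is hypothetical and not part of the specification; the only thing taken from the grammar is the character set (space, tab, CR, LF, and comma).

```python
import re

# ws = *(%x20 / %x09 / CR / LF / ",")
# Commas count as whitespace, exactly like spaces and line breaks.
WS = re.compile(r"[ \t\r\n,]*")


def skip_ws(text: str, pos: int = 0) -> int:
    """Return the index of the first non-whitespace character at or after pos."""
    # The pattern can match the empty string, so match() never returns None.
    return WS.match(text, pos).end()
```

Because `ws` may be empty, the helper returns `pos` unchanged when the next character is not whitespace.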
## Grammar
|
## Grammar
|
||||||
|
|
||||||
|
@ -90,23 +89,15 @@ the same as for JSON,[^string-json-correspondence]
|
||||||
[surrogate code points](https://unicode.org/glossary/#surrogate_code_point)
|
[surrogate code points](https://unicode.org/glossary/#surrogate_code_point)
|
||||||
*MUST NOT* be generated or accepted.[^unpaired-surrogates]
|
*MUST NOT* be generated or accepted.[^unpaired-surrogates]
|
||||||
|
|
||||||
String = %x22 *char %x22
|
String = DQUOTE *char DQUOTE
|
||||||
char = unescaped / %x7C / escape (escaped / %x22 / %s"u" 4HEXDIG)
|
char = <any unicode scalar value except "\" or DQUOTE> / escaped / "\" DQUOTE
|
||||||
unescaped = %x20-21 / %x23-5B / %x5D-7B / %x7D-10FFFF
|
escaped = "\\" / "\/" / %s"\b" / %s"\f" / %s"\n" / %s"\r" / %s"\t"
|
||||||
escape = %x5C ; \
|
/ %s"\u" 4HEXDIG
|
||||||
escaped = ( %x5C / ; \ reverse solidus U+005C
|
|
||||||
%x2F / ; / solidus U+002F
|
|
||||||
%x62 / ; b backspace U+0008
|
|
||||||
%x66 / ; f form feed U+000C
|
|
||||||
%x6E / ; n line feed U+000A
|
|
||||||
%x72 / ; r carriage return U+000D
|
|
||||||
%x74 ) ; t tab U+0009
|
|
||||||
|
|
||||||
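The simplified `String` rules can be sketched as a small hand-written decoder. This is an illustrative reading of the grammar, not a reference implementation: the names `parse_quoted` and `_hex4` are hypothetical, and the surrogate handling assumes JSON-style pairing of `\u` escapes, with unpaired surrogates rejected as the text requires.

```python
SIMPLE_ESCAPES = {"\\": "\\", "/": "/", "b": "\b", "f": "\f",
                  "n": "\n", "r": "\r", "t": "\t"}
HEXDIGITS = set("0123456789abcdefABCDEF")


def _hex4(text: str, pos: int) -> int:
    """Read exactly four hex digits starting at pos."""
    digits = text[pos:pos + 4]
    if len(digits) != 4 or not set(digits) <= HEXDIGITS:
        raise ValueError("expected four hexadecimal digits")
    return int(digits, 16)


def parse_quoted(text: str, quote: str = '"') -> str:
    """Decode a complete quoted literal; `quote` is the delimiter character."""
    if not text.startswith(quote):
        raise ValueError("missing opening delimiter")
    out, i = [], 1
    while i < len(text):
        c = text[i]
        if c == quote:
            if i != len(text) - 1:
                raise ValueError("trailing input after closing delimiter")
            return "".join(out)
        if c != "\\":
            if 0xD800 <= ord(c) <= 0xDFFF:
                raise ValueError("surrogate code point in input")
            out.append(c)
            i += 1
            continue
        esc = text[i + 1:i + 2]
        if esc == quote:                      # "\" DQUOTE (or "\|")
            out.append(quote)
            i += 2
        elif esc in SIMPLE_ESCAPES:           # \\ \/ \b \f \n \r \t
            out.append(SIMPLE_ESCAPES[esc])
            i += 2
        elif esc == "u":                      # \u 4HEXDIG
            cp = _hex4(text, i + 2)
            i += 6
            if 0xD800 <= cp <= 0xDBFF:        # high surrogate needs a partner
                if text[i:i + 2] != "\\u":
                    raise ValueError("unpaired high surrogate")
                low = _hex4(text, i + 2)
                if not 0xDC00 <= low <= 0xDFFF:
                    raise ValueError("unpaired high surrogate")
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00)
                i += 6
            elif 0xDC00 <= cp <= 0xDFFF:
                raise ValueError("unpaired low surrogate")
            out.append(chr(cp))
        else:
            raise ValueError("unknown escape sequence")
    raise ValueError("unterminated literal")
```

Passing `quote="|"` lets the same sketch decode a bar-delimited literal, since that form differs from `String` only in its delimiter.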
[^string-json-correspondence]: The grammar for `String` has the same
|
[^string-json-correspondence]: The grammar for `String` has the same
|
||||||
effect as the
|
effect as the
|
||||||
[JSON](https://tools.ietf.org/html/rfc8259#section-7) grammar for
|
[JSON](https://tools.ietf.org/html/rfc8259#section-7) grammar for
|
||||||
`string`. Some auxiliary definitions (e.g. `escaped`) are lifted
|
`string`.
|
||||||
largely unmodified from the text of RFC 8259.
|
|
||||||
|
|
||||||
[^escaping-surrogate-pairs]: In particular, note JSON's rules around
|
[^escaping-surrogate-pairs]: In particular, note JSON's rules around
|
||||||
the use of surrogate pairs for scalar values not in the Basic
|
the use of surrogate pairs for scalar values not in the Basic
|
||||||
|
@ -135,14 +126,16 @@ Many bytes map directly to printable 7-bit ASCII; the remainder must be
|
||||||
escaped, either as `\x` followed by a two-digit hexadecimal number, or
|
escaped, either as `\x` followed by a two-digit hexadecimal number, or
|
||||||
following the usual rules for double quote and backslash.
|
following the usual rules for double quote and backslash.
|
||||||
|
|
||||||
ByteString = "#" %x22 *binchar %x22
|
ByteString = "#" DQUOTE *binchar DQUOTE
|
||||||
binchar = binunescaped / escape (escaped / %x22 / %s"x" 2HEXDIG)
|
binchar = <any unicode scalar value ≥32 and ≤126 except "\" or DQUOTE>
|
||||||
binunescaped = %x20-21 / %x23-5B / %x5D-7E
|
/ "\" ("\" / "/" / %s"b" / %s"f" / %s"n" / %s"r" / %s"t")
|
||||||
|
/ %s"\x" 2HEXDIG
|
||||||
|
/ "\" DQUOTE
|
||||||
|
|
||||||
The second is a sequence of pairs of hexadecimal digits interleaved
|
The second is a sequence of pairs of hexadecimal digits interleaved
|
||||||
with whitespace and surrounded by `#x"` and `"`.
|
with whitespace and surrounded by `#x"` and `"`.
|
||||||
|
|
||||||
ByteString =/ %s"#x" %x22 *(ws / 2HEXDIG) ws %x22
|
ByteString =/ %s"#x" DQUOTE *(ws 2HEXDIG) ws DQUOTE
|
||||||
|
|
||||||
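A minimal Python sketch of the hexadecimal form follows. The function name `parse_hex_bytes` is hypothetical; the regular expression mirrors the revised rule `*(ws 2HEXDIG) ws`, so whitespace (including commas) may separate byte pairs but may not split a pair.

```python
import re

# ByteString =/ %s"#x" DQUOTE *(ws 2HEXDIG) ws DQUOTE
HEX_BYTES = re.compile(r'#x"((?:[ \t\r\n,]*[0-9A-Fa-f]{2})*)[ \t\r\n,]*"')


def parse_hex_bytes(text: str) -> bytes:
    """Decode a complete #x"..." literal into bytes."""
    m = HEX_BYTES.fullmatch(text)
    if m is None:
        raise ValueError("not a hexadecimal ByteString")
    # Drop the interleaved whitespace, then decode the remaining digit pairs.
    digits = re.sub(r"[ \t\r\n,]", "", m.group(1))
    return bytes.fromhex(digits)
```

For example, `parse_hex_bytes('#x"68 65 6c6c, 6f"')` yields `b"hello"`, while `'#x"6 8"'` is rejected because the two digits of a byte are separated.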
The third is a sequence of [Base64](https://tools.ietf.org/html/rfc4648)
|
The third is a sequence of [Base64](https://tools.ietf.org/html/rfc4648)
|
||||||
characters, interleaved with whitespace and surrounded by `#[` and `]`.
|
characters, interleaved with whitespace and surrounded by `#[` and `]`.
|
||||||
|
@ -153,8 +146,8 @@ and [URL-safe](https://datatracker.ietf.org/doc/html/rfc4648#section-5)
|
||||||
(`-`,`_`) characters *SHOULD* be generated by default. Padding characters
|
(`-`,`_`) characters *SHOULD* be generated by default. Padding characters
|
||||||
(`=`) may be omitted.
|
(`=`) may be omitted.
|
||||||
|
|
||||||
ByteString =/ "#[" *(ws / base64char) ws "]"
|
ByteString =/ "#[" *(ws base64char) ws "]"
|
||||||
base64char = %x41-5A / %x61-7A / %x30-39 / "+" / "/" / "-" / "_" / "="
|
base64char = ALPHA / DIGIT / "+" / "/" / "-" / "_" / "="
|
||||||
|
|
||||||
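The Base64 form can be sketched in Python as follows. The name `parse_base64_bytes` is hypothetical, and the approach shown (normalising URL-safe characters to the standard alphabet and restoring omitted padding before decoding) is one possible strategy rather than anything the specification mandates.

```python
import base64
import re


def parse_base64_bytes(text: str) -> bytes:
    """Decode a complete #[...] literal into bytes."""
    if not (text.startswith("#[") and text.endswith("]")):
        raise ValueError("not a Base64 ByteString")
    # Whitespace (including commas) may appear between Base64 characters.
    body = re.sub(r"[ \t\r\n,]", "", text[2:-1])
    if re.fullmatch(r"[A-Za-z0-9+/\-_=]*", body) is None:
        raise ValueError("invalid Base64 character")
    # Accept both alphabets: map URL-safe characters onto the standard ones.
    body = body.rstrip("=").replace("-", "+").replace("_", "/")
    # Padding may be omitted in the text syntax, so restore it before decoding.
    body += "=" * (-len(body) % 4)
    return base64.b64decode(body, validate=True)
```

Under these assumptions, `#[aGVsbG8=]`, `#[aGVs bG8]`, and `#[aGVsbG8]` all decode to the same five bytes.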
A `Symbol` may be written in either of two forms.
|
A `Symbol` may be written in either of two forms.
|
||||||
|
|
||||||
|
@ -163,7 +156,7 @@ including embedded escape syntax, except using a bar or pipe character
|
||||||
(`|`) instead of a double quote mark.
|
(`|`) instead of a double quote mark.
|
||||||
|
|
||||||
QuotedSymbol = "|" *symchar "|"
|
QuotedSymbol = "|" *symchar "|"
|
||||||
symchar = unescaped / %x22 / escape (escaped / %x7C / %s"u" 4HEXDIG)
|
symchar = <any unicode scalar value except "\" or "|"> / escaped / "\|"
|
||||||
|
|
||||||
Alternatively, a `Symbol` may be written in a “bare” form[^cf-sexp-token].
|
Alternatively, a `Symbol` may be written in a “bare” form[^cf-sexp-token].
|
||||||
The grammar for numeric data is a subset of the grammar for bare `Symbol`s,
|
The grammar for numeric data is a subset of the grammar for bare `Symbol`s,
|
||||||
|
@ -171,12 +164,11 @@ so if a `SymbolOrNumber` also matches the grammar for `Float`, `Double` or
|
||||||
`SignedInteger`, then it must be interpreted as one of those, and otherwise
|
`SignedInteger`, then it must be interpreted as one of those, and otherwise
|
||||||
it must be interpreted as a bare `Symbol`.
|
it must be interpreted as a bare `Symbol`.
|
||||||
|
|
||||||
SymbolOrNumber = 1*baresymchar
|
SymbolOrNumber = 1*(ALPHA / DIGIT / sympunct / symuchar)
|
||||||
baresymchar = ALPHA / DIGIT / sympunct / symuchar
|
|
||||||
sympunct = "~" / "!" / "$" / "%" / "^" / "&" / "*" /
|
sympunct = "~" / "!" / "$" / "%" / "^" / "&" / "*" /
|
||||||
"?" / "_" / "=" / "+" / "-" / "/" / "."
|
"?" / "_" / "=" / "+" / "-" / "/" / "."
|
||||||
symuchar = <any scalar value greater than 127 whose Unicode
|
symuchar = <any scalar value ≥128 whose Unicode category is
|
||||||
category is Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd,
|
Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd,
|
||||||
Nl, No, Pc, Pd, Po, Sc, Sm, Sk, So, or Co>
|
Nl, No, Pc, Pd, Po, Sc, Sm, Sk, So, or Co>
|
||||||
|
|
||||||
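The character classes in `SymbolOrNumber` can be checked with Python's `unicodedata` module, as in the sketch below. The function names are hypothetical. The sketch only tests whether a token matches `SymbolOrNumber`; deciding whether such a token is a `Float`, `Double`, or `SignedInteger` needs the numeric grammars, which are not reproduced in this hunk and are therefore left out.

```python
import unicodedata

SYMPUNCT = set("~!$%^&*?_=+-/.")
SYMUCHAR_CATEGORIES = {
    "Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Me", "Nd",
    "Nl", "No", "Pc", "Pd", "Po", "Sc", "Sm", "Sk", "So", "Co",
}


def is_baresymchar(c: str) -> bool:
    """True if c is ALPHA, DIGIT, sympunct, or symuchar."""
    if ord(c) < 128:
        # For ASCII input, str.isalnum() is true exactly for A-Z, a-z, 0-9.
        return c.isalnum() or c in SYMPUNCT
    # symuchar: scalar value >= 128 in one of the listed general categories.
    return unicodedata.category(c) in SYMUCHAR_CATEGORIES


def is_symbol_or_number(token: str) -> bool:
    """True if token matches SymbolOrNumber = 1*(ALPHA / DIGIT / sympunct / symuchar)."""
    return len(token) > 0 and all(is_baresymchar(c) for c in token)
```

Note that a token such as `-1.5e10` passes this check too, which is why the numeric grammars must be consulted before a token is treated as a bare `Symbol`.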
[^cf-sexp-token]: Compare with the [SPKI S-expression][sexp.txt]
|
[^cf-sexp-token]: Compare with the [SPKI S-expression][sexp.txt]
|
||||||
|
@ -239,9 +231,8 @@ represented as raw hexadecimal strings similar to hexadecimal
|
||||||
syntax wherever convenient, even for values representable using the
|
syntax wherever convenient, even for values representable using the
|
||||||
grammar above.[^rationale-no-general-machine-syntax]
|
grammar above.[^rationale-no-general-machine-syntax]
|
||||||
|
|
||||||
Value =/ HexFloat / HexDouble
|
Float =/ "#xf" DQUOTE 4(ws 2HEXDIG) ws DQUOTE
|
||||||
HexFloat = "#xf" %x22 4(ws 2HEXDIG) ws %x22
|
Double =/ "#xd" DQUOTE 8(ws 2HEXDIG) ws DQUOTE
|
||||||
HexDouble = "#xd" %x22 8(ws 2HEXDIG) ws %x22
|
|
||||||
|
|
||||||
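A sketch of decoding the `#xf` and `#xd` forms in Python is shown below. Two things here are assumptions not stated in this hunk: that the hex digits carry the IEEE 754 encoding in big-endian (network) byte order, and that only the lowercase prefixes need to be recognised. The name `parse_hex_float` is hypothetical.

```python
import re
import struct

# Float  =/ "#xf" DQUOTE 4(ws 2HEXDIG) ws DQUOTE
# Double =/ "#xd" DQUOTE 8(ws 2HEXDIG) ws DQUOTE
HEX_FLOAT = re.compile(r'#x([fd])"((?:[ \t\r\n,]*[0-9A-Fa-f]{2})*)[ \t\r\n,]*"')


def parse_hex_float(text: str) -> float:
    """Decode a complete #xf"..." or #xd"..." literal."""
    m = HEX_FLOAT.fullmatch(text)
    if m is None:
        raise ValueError("not a hexadecimal Float or Double")
    kind, body = m.groups()
    raw = bytes.fromhex(re.sub(r"[ \t\r\n,]", "", body))
    expected = 4 if kind == "f" else 8
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    # Assumption: big-endian IEEE 754 single (">f") or double (">d").
    return struct.unpack(">f" if kind == "f" else ">d", raw)[0]
```

With that byte-order assumption, `#xf"3f800000"` and `#xd"3ff0 0000 0000 0000"` both decode to `1.0`.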
[^rationale-no-general-machine-syntax]: **Rationale.** Previous versions
|
[^rationale-no-general-machine-syntax]: **Rationale.** Previous versions
|
||||||
of this specification included an escape to the [machine-oriented
|
of this specification included an escape to the [machine-oriented
|
||||||
|
@ -277,12 +268,11 @@ named “`Value`” without altering the semantic class of `Value`s.
|
||||||
interpreted as comments associated with that value. Comments are
|
interpreted as comments associated with that value. Comments are
|
||||||
sufficiently common that special syntax exists for them.
|
sufficiently common that special syntax exists for them.
|
||||||
|
|
||||||
Value =/ ws
|
Value =/ ws ";" linecomment (CR / LF) Value
|
||||||
";" *(%x00-09 / %x0B-0C / %x0E-10FFFF) newline
|
linecomment = *<any unicode scalar value except CR or LF>
|
||||||
Value
|
|
||||||
|
|
||||||
When written this way, everything between the `;` and the newline is
|
When written this way, everything between the `;` and the end of the line
|
||||||
included in the string annotating the `Value`.
|
is included in the string annotating the `Value`.
|
||||||
|
|
||||||
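The line-comment rule can be illustrated with a short Python sketch that peels leading comments off a piece of text. The name `split_line_comments` is hypothetical, and the sketch covers only the `;` form described here, not any other annotation syntax.

```python
import re

# Value =/ ws ";" linecomment (CR / LF) Value
# linecomment = *<any unicode scalar value except CR or LF>
LINE_COMMENT = re.compile(r"[ \t\r\n,]*;([^\r\n]*)[\r\n]")


def split_line_comments(text: str):
    """Return (comments, rest): the leading ';' comments and the remaining text."""
    comments = []
    pos = 0
    while True:
        m = LINE_COMMENT.match(text, pos)
        if m is None:
            break
        # Everything between ';' and the end of the line annotates the Value.
        comments.append(m.group(1))
        pos = m.end()
    return comments, text[pos:]
```

A comment that is not terminated by CR or LF does not match the rule, so the sketch leaves such text in `rest` untouched.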
**Equivalence.** Annotations appear within syntax denoting a `Value`;
|
**Equivalence.** Annotations appear within syntax denoting a `Value`;
|
||||||
however, the annotations are not part of the denoted value. They are
|
however, the annotations are not part of the denoted value. They are
|
||||||
|
|