Introduce the notion of a "delimiter" to follow Boolean and SymbolOrNumber.
parent 55deeea343
commit 5b8c07cb3f
@@ -14,4 +14,4 @@ defaults:

 title: "Preserves"
 version_date: "October 2023"
-version: "0.990.0"
+version: "0.990.1"
@@ -23,14 +23,29 @@ ABNF allows easy definition of US-ASCII-based languages. However,
 Preserves is a Unicode-based language. Therefore, we reinterpret ABNF as
 a grammar for recognising sequences of Unicode scalar values.

+<a id="encoding"></a>
 **Encoding.** Textual syntax for a `Value` *SHOULD* be encoded using
 UTF-8 where possible.

+<a id="whitespace"></a>
 **Whitespace.** Whitespace is defined as any number of spaces, tabs,
 carriage returns, line feeds, or commas.

 ws = *(%x20 / %x09 / CR / LF / ",")

+<a id="delimiters"></a>
+**Delimiters.** Some tokens (`Boolean`, `SymbolOrNumber`) *MUST* be
+followed by a `delimiter` or by the end of the input.[^delimiters-lookahead]
+
+delimiter = ws
+/ "<" / ">" / "[" / "]" / "{" / "}"
+/ "#" / ":" / DQUOTE / "|" / "@" / ";"
+
+[^delimiters-lookahead]: The addition of this constraint means that
+implementations must now use some kind of lookahead to make sure a
+delimiter follows a `Boolean`; this should not be onerous, as
+something similar is required to read `SymbolOrNumber`s correctly.
+
 ## Grammar

 Standalone documents may have trailing whitespace.
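The lookahead described in the `delimiters-lookahead` footnote can be sketched as below. This is only an illustration, not code from this commit: the names `DELIMITERS` and `followed_by_delimiter` are hypothetical, and the sketch assumes a reader that holds the whole input as a string and peeks at a single character after the token.

```python
# Hypothetical sketch of the delimiter lookahead; not part of the commit.
# Characters matched by `ws`: space, tab, CR, LF, and comma.
WHITESPACE = set(" \t\r\n,")
# Characters named by the `delimiter` rule: whitespace plus listed punctuation.
DELIMITERS = WHITESPACE | set('<>[]{}#:"|@;')


def followed_by_delimiter(text: str, end: int) -> bool:
    """Return True if the token ending just before index `end` is followed
    by a delimiter character or by the end of the input."""
    return end >= len(text) or text[end] in DELIMITERS
```

A reader would call this after consuming a `Boolean` or `SymbolOrNumber` and reject the input when it returns false.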
@@ -109,7 +109,7 @@ label, then by field sequence.
 labels as specially-formatted lists.

 [^iri-labels]: It is occasionally (but seldom) necessary to
-interpret such `Symbol` labels as UTF-8 encoded IRIs. Where a
+interpret such `Symbol` labels as IRIs. Where a
 label can be read as a relative IRI, it is notionally interpreted
 with respect to the IRI
 `urn:uuid:6bf094a6-20f1-4887-ada7-46834a9b5b34`; where a label can
12
questions.md
@@ -5,9 +5,16 @@ title: "Open questions"
 Q. Should "symbols" instead be URIs? Relative, usually; relative to
 what? Some domain-specific base URI?

+> No. They may be interpreted as URIs, of course; see
+> [here](preserves.html#fn:iri-labels).
+
 Q. Literal small integers: are they pulling their weight? They're not
 absolutely necessary.

+> No. They were removed in the simplification of the syntax that was the
+> outcome of [issue
+> 41](https://gitlab.com/preserves/preserves/-/issues/41).
+
 Q. Should we go for trying to make the data ordering line up with the
 encoding ordering? We'd have to only use streaming forms, and avoid
 the small integer encoding, and not store record arities, and sort
@@ -37,3 +44,8 @@ require any whitespace at all between elements of a list, making it
 ambiguous: does `[123]` denote a single-element or a three-element
 list? Compare JSON where `[1,2,3]` is unambiguously different from
 `[123]`.
+
+> With the addition of the notion of
+> [delimiters](preserves-text.html#delimiters) to the text syntax, we at
+> least answer the question of how `[123]` parses: it must yield a
+> single-element list.
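To make the `[123]` answer concrete, here is a rough sketch of splitting bare tokens at delimiter characters. It is an assumption-laden toy rather than a Preserves reader: `bare_tokens` is a hypothetical helper that ignores strings, comments, and every other token class, and only shows where delimiters force token boundaries.

```python
# Hypothetical toy splitter; not a real Preserves text reader.
# Delimiter characters: whitespace (including comma) plus listed punctuation.
DELIMITERS = set(" \t\r\n,") | set('<>[]{}#:"|@;')


def bare_tokens(text: str) -> list[str]:
    """Split `text` into maximal runs of non-delimiter characters."""
    tokens: list[str] = []
    current = ""
    for ch in text:
        if ch in DELIMITERS:
            # A delimiter ends the token in progress, if there is one.
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens
```

Under these assumptions, `bare_tokens("[123]")` yields one token, while `bare_tokens("[1,2,3]")` and `bare_tokens("[1 2 3]")` each yield three, because nothing inside `123` counts as a delimiter.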