synit-manual/src/protocols/syndicate/dataspacePatterns.md

128 lines
4.6 KiB
Markdown
Raw Permalink Normal View History

# Patterns over assertions
- [`[syndicate-protocols]/schemas/dataspacePatterns.prs`](https://git.syndicate-lang.org/syndicate-lang/syndicate-protocols/src/branch/main/schemas/dataspacePatterns.prs)
Each [subscription record](./dataspace.md) asserted at a dataspace entity contains a pattern
over [Preserves values](../../guide/preserves.md#grammar-of-values).
The pattern language is carefully chosen to be reasonably expressive without closing the door
to efficient indexing of dataspace contents.[^efficient-indexing]
## Interpretation of patterns
A pattern is matched against a candidate input value. Matching can either fail or succeed; if
matching succeeds, a sequence of (numbered) bindings is produced. Each binding in the sequence
corresponds to a (possibly-nested) [binding pattern](#binding) in the overall pattern.
## Example
Consider the pattern:
```preserves
2024-04-09 09:07:03 +00:00
<group <arr> {
0: <lit 1>
1: <bind <group <arr> {
0: <bind <_>>
1: <_>
}>>
2: <_>
}>
```
Each of the following values yields different results when matched against it:
- `[1 2 3]` fails, because `2` is not an array.
- `[1 [2 3] 4]` succeeds, yielding a binding sequence `[[2 3] 2]`, because the outer `bind`
captures the whole `[2 3]` array, and the inner (nested) `bind` captures the `2`.
- `[1 [2] 5]` fails, because `[2]` lacks an element at index 1.
- However, `[1 [2 3 4] 5]` succeeds, yielding a binding sequence `[[2 3 4] 2]`, because
`[2 3 4]` has *at least* the expected two elements at indexes 0 and 1.
- `[1 [<x> <y>] []]` succeeds, yielding a binding sequence `[[<x> <y>] <x>]`. Each discard
pattern (`<_>`) ignores the specific input it is given.
## Abstract syntax of patterns
2024-04-09 09:07:03 +00:00
A *pattern* may be either a *discard*, a (nested) *binding*, a *literal*, or a *group* (compound).
```preserves-schema
2024-04-09 09:07:03 +00:00
Pattern =
/ @discard <_>
/ <bind @pattern Pattern>
/ <lit @value AnyAtom>
/ <group @type GroupType @entries { any: Pattern ...:... }>
.
```
### Discard
2024-04-09 09:07:03 +00:00
A *discard pattern*, `<_>`, matches any input value.
### Binding
2024-04-09 09:07:03 +00:00
A *binding pattern*, `<bind ...>`, speculatively pushes the portion of the input under
consideration onto the end of the binding sequence being built, and then recursively evaluates
its subpattern. If the subpattern succeeds, so does the overall binding pattern (keeping the
binding); otherwise, the speculative addition to the binding sequence is undone, and the
overall binding pattern fails.
### Literal
2024-04-09 09:07:03 +00:00
A *literal pattern*, `<lit ...>`, matches any **atomic** Preserves value:
```preserves-schema
AnyAtom =
/ @bool bool
/ @double double
/ @int int
/ @string string
/ @bytes bytes
/ @symbol symbol
/ @embedded #:any
.
```
2024-04-09 09:07:03 +00:00
In order to match a literal **compound** value, a combination of group and literal patterns
must be used.
2024-04-09 09:07:03 +00:00
### Group
Each *group pattern* first checks the type of its input: a `rec` pattern fails unless it is
given a Record having the specified `label`, an `arr` demands a Sequence and a `dict` only
matches a Dictionary.
```preserves-schema
2024-04-09 09:07:03 +00:00
GroupType =
/ <rec @label any>
/ <arr>
/ <dict>
.
```
2024-04-09 09:07:03 +00:00
If the type check fails, the pattern match fails. Otherwise, matching continues. Subpatterns in
the `entries` field in the group pattern are considered in increasing Preserves order of
key.[^dict-pattern-ordering] For Records, each key must be a field index; for Sequences, each
key is an element index; and for Dictionaries, keys are just keys in the dictionary to be
matched. Each subpattern is matched against the corresponding portion of the input, failing if
no such item exists.
Group patterns ignore keys in values being matched that are not mentioned in the pattern.
Matching succeeds if such values have *more* than the structure and information required of
them by a given pattern, but not if they have less. This allows for protocol extension: for
example, records with "extra" fields will continue to match.
---
[^efficient-indexing]: Most implementations of Syndicated Actor Model dataspaces use an
efficient index datastructure [described here](https://git.syndicate-lang.org/syndicate-lang/syndicate-rkt/src/commit/90c4c60699069b496491b81ee63b5a45ffd638cb/syndicate/HOWITWORKS.md).
[^dict-pattern-ordering]: The ordering of visiting of keys in a `dict` pattern is important
because bindings are *numbered* in this pattern language, not named. Recall that `<dict {a:
<bind <_>>, b: <bind <_>>}>` is an identical Preserves value to `<dict {b: <<bind <_>>, a:
<bind <_>>}>`, so to guarantee consistent binding results, we must choose some
deterministic order for visiting the subpatterns of the `dict`. (In this example, `a` will
be visited before `b`, because `a` < `b`).