Scripting language documentation

Tony Garnock-Jones 2022-02-25 14:41:17 +01:00
parent ac68d3f3d3
commit 34f7378ccf
3 changed files with 459 additions and 1 deletions

@ -18,6 +18,7 @@ handle.
## Configuration Scripting Language
## Conversational State
## Dataspace
## Dataspace Pattern
## E
## Embedded References
## Entity
@ -46,7 +47,7 @@ associated [scope](#scope).
## Record
a Preserves record
## Reference
a.k.a. Ref
a.k.a. "Ref", "Entity Reference"
## Relay
## Relay Entity
## S6

@ -28,6 +28,7 @@ bus](./system-bus.md) (NB. not at PID 1).
```
<!--
Here's an example of `ps` output from a Synit prototype running on a mobile phone:
PID TTY STAT TIME COMMAND
@ -44,6 +45,7 @@ Here's an example of `ps` output from a Synit prototype running on a mobile phon
1497 ? Sl 0:01 \_ python3 /usr/lib/synit/wifi-daemon
1516 ? S 0:00 \_ udhcpc -i wlan0 -fR -s /usr/lib/synit/udhcpc.script
1035 ? S 0:00 s6-log t s999999 n500 /var/log/synit
-->
## Boot process

@ -1 +1,456 @@
# Configuration scripting language
The `syndicate-server` program includes a mechanism that was originally intended for populating
a dataspace with assertions, for use in configuring the server, but which has since grown into
a small Syndicated Actor Model scripting language in its own right. This seems to be the
destiny of "configuration formats"—why fight it?—but the current language is inelegant and
artificially limited in many ways. I have an as-yet-unimplemented sketch of a more refined
design to replace it. Please forgive the ad-hoc nature of the actually-implemented language
described below, and be warned that this is an unstable area of the Synit design.
See near the end of this document for [a few illustrative examples](#examples).
## Evaluation model
The language consists of sequences of instructions. For example, one of the most important
instructions simply publishes (asserts) a value at a given entity (which will often be a
[dataspace](../glossary.md#dataspace)).
The language evaluation context includes an environment mapping variable names to Preserves
[`Value`](../guide/preserves.md#values-and-types)s.
Variable references are lexically scoped.
Each source file is interpreted in a top-level environment. The top-level environment is
supplied by the context invoking the script, and is generally non-empty. It frequently includes
a binding for the variable `config`, which happens to be the default [target variable
name](#the-active-target).
## Source file syntax
*Program* = *Instruction* ...
A configuration source file is a file whose name ends in `.pr` that contains zero or more
Preserves [text-syntax](https://preserves.gitlab.io/preserves/preserves.html#textual-syntax)
values, which are together interpreted as a sequence of *Instruction*s.
**Comments.** [Preserves
comments](https://preserves.gitlab.io/preserves/conventions.html#comments) are ignored. One
unfortunate wart is that because Preserves comments are really
[annotations](https://preserves.gitlab.io/preserves/preserves.html#annotations), they are
required by the Preserves data model to be attached to some other value. Syntactically, this
manifests as the need for *some non-comment following every comment*. In scripts written to
date, often an empty *SequencingInstruction* serves to anchor comments at the end of a file:
```preserves
; A comment
; Another comment
; The following empty sequence is needed to give the comments
; something to attach to
[]
```
## Patterns, variable references, and variable bindings
Symbols are treated specially throughout the language. Perl-style
[sigils](https://en.wikipedia.org/wiki/Sigil_(computer_programming)) control the interpretation
of any given symbol:
- `$`*var* is a **variable reference**. The variable *var* will be looked up in the
environment, and the corresponding value substituted.
- `?`*var* is a **variable binder**, used in pattern-matching. The value being matched at that
position will be captured into the environment under the name *var*.
- `_` is a **discard** or **wildcard**, used in pattern-matching. The value being matched at
that position will be accepted (and otherwise ignored), and pattern matching will continue.
- `=`*sym* denotes the **literal symbol** *sym*. It is used wherever syntactic ambiguity
could prevent use of a bare literal symbol. For example, `=?foo` denotes the literal symbol
`?foo`, where `?foo` on its own would denote a variable binder for the variable named `foo`.
- all other symbols are **bare literal symbols**, denoting just themselves.
The special variable `.` (referenced using `$.`) denotes "the current environment, as a
dictionary".
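As a small illustration of the sigils, the sketch below uses each of them once. The record
labels `seen` and `note` are hypothetical, chosen only for this example; `service-state` is
borrowed from the [examples](#examples) later in this document.
```preserves
; `?name` binds the first field and `_` ignores the second. In the
; body, `$name` refers to the captured value, and `seen` is a bare
; literal symbol.
? <service-state ?name _> <seen $name>
; `=?name` is the literal symbol `?name`, not a binder.
<note =?name>
```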
## The active target
During compilation (!) of a source file, the compiler maintains a compile-time register called
the *active target* (often simply the "target"), containing the *name* of a variable that will
be used at runtime to select an [entity reference](../glossary.md#reference) to act upon. At
the beginning of compilation, it is set to the name `config`, so that whatever is bound to
`config` in the initial environment at runtime is used as the default target for targeted
*Instruction*s.
This is one of the awkward parts of the current language design.
## Instructions
*Instruction* =
    *SequencingInstruction* |
    *RetargetInstruction* |
    *AssertionInstruction* |
    *SendInstruction* |
    *ReactionInstruction* |
    *LetInstruction* |
    *ConditionalInstruction*
### Sequencing
*SequencingInstruction* = `[`*Instruction*...`]`
A sequence of instructions is written as a Preserves sequence. The carried instructions are
compiled and executed in order. NB: to publish a sequence of values, use the `+=` form of
*AssertionInstruction*.
### Setting the active target
*RetargetInstruction* = `$`*var*
The target can be set with a plain variable reference. After compiling such an instruction, the
active target register will contain the variable name *var*. NB: to publish the contents of a
variable, use the `+=` form of *AssertionInstruction*.
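A sketch of retargeting, assuming the usual `config` binding in the top-level environment. The
record label `greeting` is hypothetical; `let` and `dataspace` are described further below.
```preserves
; Bind `ds` to a fresh dataspace.
let ?ds = dataspace
; The active target is still `config`, so this is asserted there.
<my-entity $ds>
; A bare variable reference changes the active target ...
$ds
; ... so this assertion goes to the dataspace bound to `ds`.
<greeting "hello">
```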
### Publishing an assertion
*AssertionInstruction* =
    `+= `*ValueExpr* |
    *AttenuationExpr* |
    `<`*ValueExpr*` `*ValueExpr*...`>` |
    `{`*ValueExpr*`:`*ValueExpr*` `...`}`
The most general form of *AssertionInstruction* is "`+= `*ValueExpr*". When evaluated, the
result of evaluating *ValueExpr* will be published (asserted) at the entity denoted by the
active target register.
As a convenient shorthand, the compiler also interprets every plain Preserves record or
dictionary in *Instruction* position as denoting a *ValueExpr* to be used to produce a value to
be asserted.
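The forms can be compared side by side. This is only a sketch: it assumes that a variable
`machine` has already been bound by the surrounding context, and the sequence `[1 2 3]` is an
arbitrary value picked for illustration.
```preserves
; Shorthand: a record in Instruction position is asserted directly.
<daemon console-getty "getty 0 /dev/console">
; Shorthand: so is a dictionary.
{ machine: $machine }
; General form, needed for values that would otherwise be read as a
; different instruction, such as a sequence or a variable reference.
+= [1 2 3]
+= $machine
```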
### Sending a message
*SendInstruction* = `! `*ValueExpr*
When evaluated, the result of evaluating *ValueExpr* will be sent as a message to the entity
denoted by the active target register.
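For instance, assuming `log` is bound in the environment as described under [programming
idioms](#programming-idioms), a message can be sent to it like this:
```preserves
; Select the entity bound to `log` as the active target ...
$log
; ... and send it a message instead of publishing an assertion.
! <log "-" { line: "Hello, world!" }>
```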
### Reacting to events
*ReactionInstruction* =
    *DuringInstruction* |
    *OnMessageInstruction* |
    *OnStopInstruction*
These instructions establish event handlers of one kind or another.
#### Subscribing to assertions and messages
*DuringInstruction* = `? `*PatternExpr*` `*Instruction*
*OnMessageInstruction* = `?? `*PatternExpr*` `*Instruction*
These instructions publish assertions of the form `<Observe `*pat*` #!`*ref*`>` at the entity
denoted by the active target register, where *pat* is the [dataspace
pattern](../glossary.md#dataspace-pattern) resulting from evaluation of *PatternExpr*, and
*ref* is a fresh [entity](../glossary.md#entity) whose behaviour is to execute *Instruction* in
response to assertions (resp. messages) carrying captured values from the binding-patterns in
*pat*.
Each time a matching assertion or message arrives at *ref*, a new [facet](../glossary.md#facet) is
created, and *Instruction* is executed in the new facet. If the instruction creating the facet
is a *DuringInstruction*, then the facet is automatically terminated when the triggering
assertion is retracted. If the instruction is an *OnMessageInstruction*, the facet is not
automatically terminated.[^automatic-termination]
Programs can react to facet termination using *OnStopInstruction*s, and can trigger early facet
termination themselves using the `facet` form of *ConvenienceExpr* (see below).
#### Reacting to facet termination
*OnStopInstruction* = `?- `*Instruction*
This instruction installs a "stop handler" on the facet active during its execution. When the
facet terminates, *Instruction* is run.
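The three kinds of handler might be combined as in the following sketch. The `ping` message and
the log line texts are hypothetical, and `config` and `log` are assumed to be bound. Because
`?-` takes a single *Instruction*, the retarget-and-send pair is wrapped in a sequence.
```preserves
; While a `service-state` assertion with state `ready` exists at
; `$config`, run the body in a fresh facet.
$config ? <service-state ?x ready> [
  <service-state $x up>
  ; Runs when that facet terminates.
  ?- [$log ! <log "-" { line: "no longer ready" }>]
]
; React to each `ping` message delivered to `$config`.
$config ?? <ping ?who> [$log ! <log "-" { line: "ping received" }>]
```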
### Destructuring-bind and convenience expressions
*LetInstruction* = `let `*PatternExpr*` = `*ConvenienceExpr*
*ConvenienceExpr* =
    `dataspace` |
    `timestamp` |
    `facet` |
    *ValueExpr*
Values can be destructured and new variables introduced into the environment with `let`, which
is a "destructuring bind" or "pattern-match definition" statement. When evaluated, the result
of evaluating *ConvenienceExpr* is matched against the result of evaluating *PatternExpr*. If
the match fails, the actor crashes. If the match succeeds, the resulting bound variables (if
any) are introduced into the environment.
The right-hand-side of a `let`, after the equals sign, is either a normal *ValueExpr* or one of
the following special "convenience" expressions:
- `dataspace`: Evaluates to a fresh, empty [dataspace](../glossary.md#dataspace) entity.
- `timestamp`: Evaluates to a string containing an
[RFC-3339](https://datatracker.ietf.org/doc/html/rfc3339)-formatted timestamp.
- `facet`: Evaluates to a fresh entity representing the current facet. Sending the message
`stop` to the entity (using e.g. the *SendInstruction* "`! stop`") triggers termination of
its associated facet. The entity does not respond to any other assertion or message.
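A minimal sketch of `let` used both for plain binding and for destructuring. The record labels
`job` and `scheduled` and the string `"backup"` are hypothetical.
```preserves
; Bind the result of a convenience expression.
let ?id = timestamp
; Build a record, then take it apart again.
let ?j = <job $id "backup">
let <job ?when ?name> = $j
; `$when` now holds the timestamp string and `$name` holds "backup".
<scheduled $name $when>
```
Had the pattern on the third line named a different label, the match would have failed and the
actor would have crashed.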
### Conditional evaluation
*ConditionalInstruction* = `$`*var*` =~ `*PatternExpr*` `*Instruction*` `*Instruction* ...
When evaluated, the value in variable *var* is matched against the result of evaluating
*PatternExpr*.
- If the match succeeds, the resulting bound variables are placed in the environment and
evaluation continues with the first *Instruction*. The subsequent *Instruction*s are not
executed in this case.
- If the match fails, then the first *Instruction* is skipped, and the subsequent
*Instruction*s are executed.
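The sketch below shows both branches, assuming `config` is bound and using a hypothetical
`status` record. Wrapping the conditional in a sequence limits how far the "subsequent
*Instruction*s" extend.
```preserves
$config ? <service-state ?svc ?state> [
  $state =~ ready [
    ; The match succeeded, so only this first instruction runs.
    <status $svc "ready">
  ]
  ; The match failed, so the instructions from here on run instead.
  <status $svc "not ready">
]
```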
## Value Expressions
*ValueExpr* =
    `#t` | `#f` | *float* | *double* | *int* | *string* | *bytes* |
    `$`*var* | `=`*symbol* | *bare-symbol* |
    *AttenuationExpr* |
    `<`*ValueExpr*` `*ValueExpr*...`>` |
    `[`*ValueExpr*...`]` |
    `#{`*ValueExpr*...`}` |
    `{`*ValueExpr*`:`*ValueExpr*` `...`}`
Value expressions are recursively evaluated and yield a Preserves
[`Value`](../guide/preserves.md#values-and-types). Syntactically, they consist of literal
non-symbol atoms, compound data structures (records, sequences, sets and dictionaries), plus
special syntax for *[attenuated](../glossary.md#attenuation) entity references*, *variable references*, and literal symbols:
- *AttenuationExpr*, described below, evaluates to an entity reference with an attached
[attenuation](../glossary.md#attenuation).
- `$`*var* evaluates to the binding for *var* in the environment, if there is one, or crashes
the actor, if there is not.
- `=`*symbol* and *bare-symbol* (i.e. any symbols except [a binding, a reference, or a
discard](#patterns-variable-references-and-variable-bindings)) denote literal symbols.
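As an illustration, the following sketch builds a record containing each kind of compound
value. Every label, key, and literal in it is hypothetical.
```preserves
let ?name = "wlan0"
; A record whose fields are a variable reference and a dictionary,
; which in turn holds a boolean, an integer, a set, and a sequence.
+= <interface $name {
  up: #t
  mtu: 1500
  tags: #{ wireless metered }
  routes: [ "10.0.0.0/8" "192.168.0.0/16" ]
}>
```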
## Attenuation Expressions
*AttenuationExpr* = `<* $`*var*` [`*Rewrite* ...`]>`
*Rewrite* =
    `<filter `*PatternExpr*`>` |
    `<rewrite `*PatternExpr*` `*TemplateExpr*`>`
An attenuation expression looks up *var* in the environment, checks that it is an entity
reference *orig*, and returns a new entity reference *ref*, like *orig* but
[attenuated](../glossary.md#attenuation) with zero or more *Rewrite*s. The result of evaluation
is *ref*, the new attenuated entity reference.
When an assertion is published or a message body arrives at *ref*, the sequence of *Rewrite*s
is executed left-to-right. If a *Rewrite* succeeds, the value it produces is forwarded on to
*orig*. If all *Rewrite*s fail, the assertion or message is silently ignored.
A `rewrite` *Rewrite* matches values with *PatternExpr*. If the match fails, the next *Rewrite*
is tried; if it succeeds, the resulting bindings are used along with the current environment to
evaluate *TemplateExpr*, and the resulting value is forwarded on to *orig*.
A `filter` *Rewrite* is the same as `<rewrite <?`*v*` `*PatternExpr*`> $`*v*`>`, for some fresh
*v*.
Supplying zero *Rewrite*s will cause the new entity to reject *all* assertions and messages
sent to it.
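A few attenuations of `$log` can serve as a sketch of these rules. It assumes that `log` is
bound and that the entity accepts two-field `log` records as in the idioms later in this
document; the `warn` record is hypothetical.
```preserves
; Accept only two-field `log` records and ignore everything else.
let ?logOnly = <* $log [ <filter <log _ _>> ]>
; Zero rewrites: a reference that rejects everything sent to it.
let ?nothing = <* $log []>
; Tried in order: `warn` records are rewritten into `log` records,
; and existing `log` records pass through unchanged.
let ?tagged = <* $log [
  <rewrite <warn ?line> <log "-" { line: $line }>>
  <filter <log _ _>>
]>
```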
## Pattern Expressions
*PatternExpr* =
    `#t` | `#f` | *float* | *double* | *int* | *string* | *bytes* |
    `$`*var* | `?`*var* | `_` | `=`*symbol* | *bare-symbol* |
    *AttenuationExpr* |
    `<?`*var*` `*PatternExpr*`>` |
    `<`*PatternExpr*` `*PatternExpr*...`>` |
    `[`*PatternExpr*...`]` |
    `{`*literal*`:`*PatternExpr*` `...`}`
Pattern expressions are recursively evaluated to yield a [dataspace
pattern](../glossary.md#dataspace-pattern). Evaluation of a *PatternExpr* is like evaluation of
a *ValueExpr*, except that binders and wildcards are allowed, set syntax is not allowed, and
dictionary keys are constrained to being literal values rather than *PatternExpr*s.
Two kinds of binder are supplied. The more general is `<?`*var*` `*PatternExpr*`>`, which
evaluates to a pattern that succeeds, capturing the matched value in a variable named *var*,
only if *PatternExpr* succeeds. For the special case of `<?`*var*` _>`, the shorthand form
`?`*var* is supported.
The pattern `_` ([*discard*, *wildcard*](#patterns-variable-references-and-variable-bindings))
always succeeds, matching any value.
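The binder forms can be sketched as follows, assuming `config` is bound. The labels `seen`,
`seen-daemon`, and `seen-argv` are hypothetical.
```preserves
; Shorthand binder: capture the first field, whatever it is.
$config ? <service-state ?svc ready> <seen $svc>
; General binder: capture the first field only if it is a one-field
; `daemon` record.
? <service-state <?svc <daemon _>> ready> <seen-daemon $svc>
; Dictionary pattern: the key is a literal, its value is a pattern.
? <daemon _ { argv: ?argv }> <seen-argv $argv>
```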
## Template Expressions
*TemplateExpr* =
    `#t` | `#f` | *float* | *double* | *int* | *string* | *bytes* |
    `$`*var* | `=`*symbol* | *bare-symbol* |
    *AttenuationExpr* |
    `<`*TemplateExpr*` `*TemplateExpr*...`>` |
    `[`*TemplateExpr*...`]` |
    `{`*literal*`:`*TemplateExpr*` `...`}`
Template expressions are used in [attenuation expressions](#attenuation-expressions) as part of
value-rewriting instructions. Evaluation of a *TemplateExpr* is like evaluation of a
*ValueExpr*, except that set syntax is not allowed and dictionary keys are constrained to being
literal values rather than *TemplateExpr*s.
Additionally, record template labels (just after a "`<`") must be "literal-enough". If any
sub-part of the label *TemplateExpr* refers to a variable's value, the variable must have been
bound in the environment surrounding the *AttenuationExpr* that the *TemplateExpr* is part of,
and must not be any of the capture variables from the *PatternExpr* corresponding to the
template. This is a constraint stemming from the definition of the syntax used for expressing
capability attenuation in the underlying Syndicated Actor Model. (TODO: link to sturdy.prs
documentation)
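The label constraint can be sketched as follows, based only on the rule stated above. The
variable `tag` and the symbol `wrapped` are hypothetical.
```preserves
; `tag` is bound outside the attenuation expression, so it may be
; used as a record label in the template.
let ?tag = wrapped
let ?sys = <* $config [
  <rewrite <require-service ?s> <$tag $s>>
]>
```
By contrast, a rewrite such as `<rewrite <pair ?l ?v> <$l $v>>` would not satisfy the
constraint, because its label refers to `l`, one of the capture variables of its own pattern.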
## Examples
<span id="example-1">**Example 1.**</span> The simplest example uses no variables, publishing
constant assertions to the implicit default target, `$config`:
```preserves
<require-service <daemon console-getty>>
<daemon console-getty "getty 0 /dev/console">
```
<span id="example-2">**Example 2.**</span> A more complex example subscribes to two kinds of
`service-state` assertion at the dataspace named by the default target, `$config`, and in
response to their existence asserts a rewritten variation on them:
```preserves
? <service-state ?x ready> <service-state $x up>
? <service-state ?x complete> <service-state $x up>
```
In prose, it reads as "during any assertion at `$config` of a `service-state` record with state
`ready` for any service name `x`, assert (also at `$config`) that `x`'s `service-state` is `up`
in addition to `ready`," and similar for state `complete`.
<span id="example-3">**Example 3.**</span> The following example first attenuates `$config`,
binding the resulting capability to `$sys`. Any `require-service` record published to `$sys` is
rewritten into a `require-core-service` record; other assertions are forwarded unchanged.
```preserves
let ?sys = <* $config [
<rewrite <require-service ?s> <require-core-service $s>>
<filter _>
]>
```
Then, `$sys` is used to build the initial environment for a [configuration
tracker](./builtin/config-watcher.md), which executes script files in the `/etc/syndicate/core`
directory using the environment given.
```preserves
<require-service <config-watcher "/etc/syndicate/core" {
config: $sys
gatekeeper: $gatekeeper
log: $log
}>>
```
<span id="example-4">**Example 4.**</span> The final example executes a script in response to
an `exec` record being sent as a message to `$config`. The use of `??` indicates a
message-event-handler, rather than `?`, which would indicate an assertion-event-handler.
```preserves
?? <exec ?argv ?restartPolicy> [
let ?id = timestamp
let ?facet = facet
let ?d = <temporary-exec $id $argv>
<run-service <daemon $d>>
<daemon $d {
argv: $argv,
readyOnStart: #f,
restart: $restartPolicy,
}>
? <service-state <daemon $d> complete> [$facet ! stop]
? <service-state <daemon $d> failed> [$facet ! stop]
]
```
First, the current timestamp is bound to `$id`, and a fresh entity representing the facet
established in response to the `exec` message is created and bound to `$facet`. The variable
`$d` is then initialized to a value uniquely identifying this particular `exec` request. Next,
`run-service` and `daemon` assertions are placed in `$config`. These assertions communicate
with the [built-in program execution and supervision service](./builtin/daemon.md), causing a
Unix subprocess to be created to execute the command in `$argv`. Finally, the script responds
to `service-state` assertions from the execution service by terminating the facet by sending
its representative entity, `$facet`, a `stop` message.
## Programming idioms
**Conventional top-level variable bindings.** Besides `config`, many scripts are executed in a
context where `gatekeeper` names a server-wide [gatekeeper](./builtin/gatekeeper.md) entity,
and `log` names an entity that logs messages of a certain shape that are delivered to it.
**Setting the active target register.** The following *pairs* of *Instruction*s first set and
then use the [active target register](#the-active-target):
```preserves
$log ! <log "-" { line: "Hello, world!" }>
```
```preserves
$config ? <configure-interface ?ifname <dhcp>> [
<require-service <daemon <udhcpc $ifname>>>
]
```
```preserves
$config ? <service-object <daemon interface-monitor> ?cap> [
$cap {
machine: $machine
}
]
```
In the last one, `$cap` is captured from `service-object` records at `$config` and is then used
as a target for publication of a dictionary (containing key `machine`).
**Using conditionals.** The syntax of *ConditionalInstruction* is such that it can be easily
chained:
```preserves
$val =~ pat1 [ ... if pat1 matches ...]
$val =~ pat2 [ ... if pat2 matches ...]
... if neither pat1 nor pat2 matches ...
```
**Using dataspaces as ad-hoc entities.** Constructing a dataspace, attaching subscriptions to
it, and then passing it to somewhere else is a useful trick for creating scripted entities able
to respond to a few different kinds of assertion or message:
```preserves
let ?ds = dataspace ; create the dataspace
$config += <my-entity $ds> ; send it to peers for them to use
$ds [ ; select $ds as the active target for `DuringInstruction`s inside the [...]
? pat1 [ ... ] ; respond to assertions of the form `pat1`
? pat2 [ ... ] ; respond to assertions of the form `pat2`
?? pat3 [ ... ] ; respond to messages of the form `pat3`
?? pat4 [ ... ] ; respond to messages of the form `pat4`
]
```
---
#### Notes
[^automatic-termination]: This isn't quite true. If, after execution of *Instruction*, the new
facet is "inert"—roughly speaking, has published no assertions and has no subfacets—then it
is terminated. However, since inert facets are unreachable and cannot interact with
anything or affect the future of a program in any way, this is operationally
indistinguishable from being left in existence, and so serves only to release memory for
later reuse.