Preserves Path

Tony Garnock-Jones tonyg@leastfixedpoint.com
August 2021. Version 0.1.0.

Paths over XML documents can move into attributes, into text, or into children.

Preserves documents don't have attributes, but they do have children in general and keyed children in particular. You might want to move into the child with a particular key (a number for sequences, or a general value for dictionaries); into all keys; or into all mapped-to values, i.e. children (n.b. not just for sequences and dictionaries, but also for sets).
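The three kinds of movement described above can be sketched in Python. This is only an illustration under stated assumptions: Preserves sequences are modelled as Python lists or tuples, dictionaries as `dict`, and sets as `set` or `frozenset`. The function names `keys`, `children`, and `child` are hypothetical and are not part of any Preserves library.

```python
# Hypothetical model: sequence -> list/tuple, dictionary -> dict, set -> set/frozenset.

def keys(value):
    """Move into all keys: indices for sequences, keys for dictionaries."""
    if isinstance(value, (list, tuple)):
        return list(range(len(value)))
    if isinstance(value, dict):
        return list(value.keys())
    return []  # sets and atoms have no keys


def children(value):
    """Move into all children; sets have children even though they have no keys."""
    if isinstance(value, (list, tuple)):
        return list(value)
    if isinstance(value, dict):
        return list(value.values())
    if isinstance(value, (set, frozenset)):
        return list(value)  # iteration order of a Python set is arbitrary
    return []


def child(value, key):
    """Move into the child with a particular key; yields zero or one node."""
    if isinstance(value, (list, tuple)):
        is_index = isinstance(key, int) and not isinstance(key, bool)
        return [value[key]] if is_index and 0 <= key < len(value) else []
    if isinstance(value, dict):
        return [value[key]] if key in value else []
    return []
```

Each function returns a list of nodes, so "no such child" is simply an empty result rather than an error.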

Expressions

Expressions: compute a sequence or set (or dictionary?) of results from a stream of input values.

The precedence groupings below are listed from highest to lowest. Within a grouping, no mixed precedence is permitted.

    step ...          ;; Applies steps one after the other, flatmap-style

    ! expr            ;; If no nodes, yields a dummy #t node; if some, yields none

    expr ~ expr ~ ... ;; "interleave" of expressions (sequence-valued, duplicates allowed)
    expr + expr + ... ;; "union" of expressions (set-valued)
    expr & expr & ... ;; "intersection" of expressions (set-valued)

A step is an axis, a filter, or [expr], which acts as a parenthesis for overriding precedence.
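One possible reading of these combinators, sketched in Python. Every expression is modelled as a function from a single node to a list of result nodes. The names `steps`, `negate`, `interleave`, `union`, and `intersection` are hypothetical. Treating "interleave" as per-node concatenation in operand order is an assumption, and Python's `==` only approximates Preserves equality (for instance, it treats `1` and `True` as equal).

```python
def dedup(nodes):
    """Drop duplicates by equality, keeping first occurrences in order."""
    out = []
    for node in nodes:
        if node not in out:
            out.append(node)
    return out


def steps(*fs):
    """step ... : apply each step to every node produced so far (flatmap)."""
    def run(node):
        nodes = [node]
        for f in fs:
            nodes = [r for n in nodes for r in f(n)]
        return nodes
    return run


def negate(expr):
    """! expr : a dummy true node if expr yields nothing, otherwise nothing."""
    return lambda node: [] if expr(node) else [True]


def interleave(*exprs):
    """expr ~ expr : sequence-valued, duplicates kept (assumed: concatenation)."""
    return lambda node: [r for e in exprs for r in e(node)]


def union(*exprs):
    """expr + expr : set-valued, so duplicates are removed."""
    return lambda node: dedup(r for e in exprs for r in e(node))


def intersection(first, *rest):
    """expr & expr : keep results that every operand produces."""
    def run(node):
        others = [e(node) for e in rest]
        return dedup(r for r in first(node) if all(r in o for o in others))
    return run
```

Because every combinator returns another node-to-list function, they nest freely, which is the role the bracketed [expr] step plays in the grammar.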

Axes

Axes: move around, applying filters after moving

    .=           ;; Doesn't move anywhere
    /            ;; Moves into immediate children (values / fields)
    //           ;; Flattens children recursively
    . key        ;; Moves into named child
    .^           ;; Moves into record label
    .keys        ;; Moves into *keys* rather than values
    .length      ;; Moves into the number of keys
    .annotations ;; Moves into any annotations that might be present
    .embedded    ;; Moves into the representation of an embedded value
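A few of these axes can be sketched in Python to make the movement concrete. The `Record` class is a hypothetical stand-in for a Preserves record, not a library type. Two points are assumptions rather than statements from the text: that `//` yields descendants in pre-order without the starting node, and that `.length` counts record fields by position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a Preserves record: a label plus fields."""
    label: object
    fields: tuple


def children(value):
    """/ : immediate children (values of collections, fields of records)."""
    if isinstance(value, Record):
        return list(value.fields)
    if isinstance(value, dict):
        return list(value.values())
    if isinstance(value, (list, tuple, set, frozenset)):
        return list(value)
    return []


def descendants(value):
    """// : children, flattened recursively (assumed: pre-order, excluding self)."""
    out = []
    for c in children(value):
        out.append(c)
        out.extend(descendants(c))
    return out


def label(value):
    """.^ : the record label, if the node is a record."""
    return [value.label] if isinstance(value, Record) else []


def length(value):
    """.length : the number of keys (assumed: field count for records)."""
    if isinstance(value, Record):
        return [len(value.fields)]
    if isinstance(value, (list, tuple, dict)):
        return [len(value)]
    return []
```

For example, with `doc = Record("point", (1, [2, 3]))`, `label(doc)` gives `["point"]` and `descendants(doc)` gives `[1, [2, 3], 2, 3]`.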

Filters

Filters: narrow down a selection without moving

    =*                ;; Accepts all
    =!                ;; Rejects all

    = literal         ;; Matches values equal to the literal
    =r regex          ;; Matches strings and symbols by regular expression

    ?[expr]           ;; Applies the expression to each node; keeps nodes that yield nonempty

    ^ literal         ;; Matches a record having the literal as its label -- equivalent to ?[.^ = literal]

    bool              ;; Type filters
    float
    double
    int
    string
    bytes
    symbol
    rec
    seq
    set
    dict
    embedded
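A rough Python sketch of how such filters could behave: each filter keeps or drops a node and never moves. The names `eq`, `regex`, `having`, `is_int`, `is_bool`, and `select` are hypothetical. Symbols are left out because Python has no native symbol type, and using `re.search` (match anywhere in the string) is an assumption about the intended regex semantics.

```python
import re


def eq(literal):
    """= literal : keep nodes equal to the literal (types must also agree)."""
    return lambda node: [node] if type(node) is type(literal) and node == literal else []


def regex(pattern):
    """=r regex : keep strings matching the pattern (assumed: search anywhere)."""
    compiled = re.compile(pattern)
    return lambda node: [node] if isinstance(node, str) and compiled.search(node) else []


def having(expr):
    """?[expr] : keep nodes for which the expression yields a nonempty result."""
    return lambda node: [node] if expr(node) else []


def is_bool(node):
    """bool type filter."""
    return [node] if isinstance(node, bool) else []


def is_int(node):
    """int type filter; Python bools are ints, so exclude them explicitly."""
    return [node] if isinstance(node, int) and not isinstance(node, bool) else []


def select(flt, nodes):
    """Apply a filter to a stream of nodes, preserving order."""
    return [r for n in nodes for r in flt(n)]
```

For instance, `select(regex("^a"), ["abc", "xyz", 1])` keeps only `"abc"`, and `select(is_int, [1, True])` keeps only `1`.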

Transformers

For example: stringify results; sequenceify results (see the "~" operator); setify results (see the "+" and "&" operators); join stringified results with a separator.

Tool design

When processing multiple input documents sequentially, you will sometimes want a list of results for each document, sometimes a set of results for each document, and sometimes a single sequence of outputs flattened across all input documents. (A flattened set doesn't make sense for streaming, since the input documents arrive in a sequence; if the inputs were treated as a set represented as a sequence, and the outputs were buffered in a single large set, that could work out...)
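The three streaming output shapes could look like this in Python. This is a sketch only: `run_query`, its `mode` parameter, and the mode names `"list"`, `"set"`, and `"flat"` are hypothetical and do not describe any existing tool's interface. The per-document set is represented as a deduplicated list so that unhashable values still work.

```python
def dedup(nodes):
    """Drop duplicates by equality, keeping first occurrences in order."""
    out = []
    for node in nodes:
        if node not in out:
            out.append(node)
    return out


def run_query(query, documents, mode="list"):
    """Lazily apply `query` (a node -> list function) to each input document.

    mode="list": one list of results per document
    mode="set":  one deduplicated list of results per document
    mode="flat": a single stream of results across all documents
    """
    if mode not in ("list", "set", "flat"):
        raise ValueError(f"unknown mode: {mode!r}")

    def generate():
        for doc in documents:
            results = query(doc)
            if mode == "list":
                yield list(results)
            elif mode == "set":
                yield dedup(results)
            else:
                yield from results

    return generate()
```

All three modes emit output as each document is consumed, so none of them needs to buffer the whole input; a flattened set would, which is the difficulty noted above.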