From 441a2c20a35b72bad572ed60c930914ac8a1eb0b Mon Sep 17 00:00:00 2001 From: Tony Garnock-Jones Date: Sat, 11 May 2013 05:11:13 -0400 Subject: [PATCH] Remove placeholder --- marketplace/scribblings/MISC.scrbl | 1041 ---------------------------- 1 file changed, 1041 deletions(-) delete mode 100644 marketplace/scribblings/MISC.scrbl diff --git a/marketplace/scribblings/MISC.scrbl b/marketplace/scribblings/MISC.scrbl deleted file mode 100644 index 8a80303..0000000 --- a/marketplace/scribblings/MISC.scrbl +++ /dev/null @@ -1,1041 +0,0 @@ -#lang scribble/manual -@require[racket/include] -@include{prelude.inc} - -@title{REMAINDER} - -Figure~\ref{vm-interface-types} specifies the framework and its -underlying library via stylized type signatures.@note{The actual -implementation supports secondary features not essential to the -system, such as debug-names for processes and user-accessible process -identifiers. Also, in Typed Racket, we must encode the -existentially-quantified types of Process and Spawn using second-order -polymorphism.} - - - -Additionally, our framework allows the recursive nesting of -marketplaces, thus realizing Dijkstra's vision of a layered, -virtualized operating system. Processes within a layer can themselves -be the substrate for a further layer of sub-processes. Each layer -communicates internally using protocols appropriate to -just that layer. Relay processes translate messages between protocols as they -cross layer boundaries. - -Within a marketplace, the appearance or disappearance of a service -becomes an event that affects interested parties. Our architecture -comes with a notion of presence and absence notification -integrated with each nested layer. Using presence, our -architectural framework naturally delimits conversational contexts and -manages associated resources. 
- -While many existing environments use a "mailbox" metaphor, where programs -exchange messages with peers, -real distributed systems do not behave like orderly postal services. -In practice, messages frequently get lost, through corruption and -congestion. Programs engage in multiple simultaneous conversations. -The services a program depends on may be crashed, down for -maintenance, or still going through their startup procedures. An -orderly startup sequence is an impossibility. The system as a whole -frequently cannot be rebooted, existing instead in a state of constant -recovery. Addresses become stale. Demand for services often outstrips -their supply. - -The marketplace metaphor implies that such complications are not -problems to be solved anew by each application, but issues that the -programming environment should solve, once and for all. -In this paper we report on initial progress toward this vision. - -We take a three-pronged approach to scaling Worlds and Universes to -systems programming. We make Worlds nestable, transform their event -system into a pub/sub network, and integrate -presence and absence notifications. In addition to satisfying -the criteria of Hudak and Sundaresh, the combination of nesting and -presence gives a principled approach to resource management and to -subsystem isolation and composition. Presence gives a flexible -communications topology to each layer in the layered architecture and -provides a clean account of error signalling. - -Our design is at heart a @emph{distributed operating system}. This -idea, together with the recent virtualization trend, suggests the -introduction of a @emph{virtual machine (VM)} in which user programs -run. To each VM, we add pub/sub messaging. We escape the constraints -of a hub-and-spoke routing topology by automatically deducing routing tables from -the set of active pub/sub subscriptions. - -Basing message routing on active subscriptions in this way has a -pleasant side effect. 
Our VM notifies processes when routes relevant -to their interests appear or disappear, yielding a generalized form of -@emph{presence}, a concept -originating in more restricted form in instant messaging networks such -as XMPP. Presence notifications are a common, though -often disguised, feature of communications media, but to date have not -received wide attention. - - Our approach separates discrete actions such as spawning new processes - and sending and receiving messages from more continuous reactions to - changes in a process' environment, such as arrival of a new service or - the crashing of a peer. - - To illustrate the idea of presence, consider a widely-used - Internet-scale pub/sub network: Twitter. Each Twitter user is - analogous to a process in our system. Following a user is equivalent - to subscribing to their message stream. The analog of presence - notification is the email Twitter sends to inform a user of a new - follower. In some sense, users tailor their message stream to match - the perceived interests of their followers; similarly, processes in - our system base their decisions about what to send on @emph{who is - listening}. Our system goes further in that presence is - bidirectional, informing processes not only of subscribers matching - their advertised intent to publish, but also of publishers matching - their declared interest in receiving messages. - - To illustrate the idea of presence, consider the essence of the - BitTorrent file-sharing protocol, as it might be implemented in our - system. A group of processes share a communication space and - collaborate to ensure all members have a copy of the file being - shared. Each process advertises the chunks of a file it holds. Peers - subscribe to chunks they wish to receive. The VM infrastructure - computes the intersections between advertisements and subscriptions, - and conveys that routing information to the processes. 
As processes - arrive and depart, the subscription set changes, and the routes - computed from subscriptions indicate changing demand and supply - levels for blocks within the network. Presence, then, indicates what - it is profitable to send to whom. - -In order to properly encapsulate and isolate groups of processes -collaborating on subtasks within a larger system, we take care to -ensure that the type of our VM kernel program is a subtype of the type -of its processes, which makes our system @emph{recursively -nestable}. A VM instance can be run as a process within another VM. A -layered structure of nested VMs arises, with each VM encapsulating a -group of related processes. A @emph{ground VM} maps events and actions -to real communication with the outside world. Each subsequent layer -translates between its clients above and its substrate below, in a way -similar both to layers in network architectures such as the OSI -model and to the architecture envisaged -in Hoare's quote above. - - Because presence operates both inside and outside a nestable VM, it - can be used to automatically propagate demand for services across - layers. Consider a cloud scenario where a single physical machine - hosts $n$ Linux virtual machines, each of which hosts $m$ socket-based - services. Using an approach such as @tt{systemd}'s - socket-activated OS containers, incoming - connections not only cause processes to be spawned, but cause whole - virtual machines to be started. Our system achieves the same - responsiveness to changing demand, while avoiding the manual - configuration step necessary with @tt{systemd}: presence expressed - by the innermost processes flows across successive levels of - containment to the ground VM, where it can be turned into actual - TCP activity by a TCP driver. 
- -\begin{figure}[t] - \centering - \begin{tabular}{|l|c|c|} - \hline - Challenge & Traditional model & Marketplace model \\ - \hline - Application logic & App & App \\ - User interface & App & App \\ - Service discovery & App & Language \\ - Session lifetime & App & Language \\ - Demand tracking & App & Language \\ - Fault isolation & App & Language \\ - Routing & App & Language \\ - Messaging & Language & Language \\ - Concurrency & Language & Language \\ - \hline - \end{tabular} - \ruledcaption{Challenges faced, and division of responsibility} - \label{asynchronous-challenges} -\end{figure} - -In this way, we have moved from a "mailbox" model based strictly -around producing and consuming messages toward a "marketplace" -model. Figure~\ref{asynchronous-challenges} summarizes the burdens -that our marketplace architecture lifts from applications. Each VM makes a -"bazaar" of interacting vendors and buyers. Groups of collaborating -programs are placed within task-specific VMs to scope their -interactions. Conversations between programs are multi-party, and -programs naturally participate in many such conversations at once. Not -only are messages sent and received, but programs react to presence -notifications that report the comings and goings of their peers. -Presence also serves to communicate changes in demand for and supply -of services, both within a VM and across layers. Programs are no longer -responsible for maintaining presence information or for -scoping group communications; their containing virtual machine takes -on those tasks for them. - - -@section{Interface} - -Our @emph{processes} generalize World programs by replacing the -latter's special-purpose input handlers with @emph{endpoints}, a -single, general construct for handling (possibly message-carrying) -@emph{events}. Existentially-quantified types hide process -states (\State) from the kernel, and we hide kernel state from -processes by never passing it into user code. 
Given an event and a -current process state, event handlers respond with a -@emph{transition}, which bundles a new process state with a list of -@emph{actions}. The containing VM interprets these action data -structures. Actions can be -communication-related (@racket[add-endpoint] and -@racket[delete-endpoint], @racket[send-message]), -process-related (@racket[spawn], @racket[quit]), or cross-layer -(@racket[at-meta-level]). - -A virtual machine groups and isolates a collection of processes; in -turn, it presents itself as a process to another group of processes. -That is, a system consists of nested layers of processes that interact -via messages. The bottom-most (ground) layer is the runtime library of -our language, and interacts with the real world. - -\paragraph*{Starting an Application.} - -Applications differ from normal Racket modules only in their selection -of language. A Racket module written -with @tt{#lang marketplace}, such as the echo server in -figure~\ref{echo-paper3}, specifies a sequence of definitions and -startup actions for an application. Typically, initial actions spawn -application processes and nested VMs, which in turn subscribe to -sources of events from the outside world. - -\paragraph*{Endpoints, Conversations, Messaging and Feedback.} -Processes engage in multiple simultaneous conversations. Each -process therefore has a set of active subscription @emph{endpoints}, -each of which selects a subset of the messages on the network. Roughly -speaking, each endpoint plays a @emph{role} within an -ongoing conversation. Publishers and subscribers declare their -interests to their containing VM via @emph{advertisements} and @emph{ -subscriptions}, respectively, created with @racket[add-endpoint] -actions: -@#reader scribble/comment-reader (racketblock -(add-endpoint @emph{endpoint-id} - (role @emph{orientation} @emph{topic} @emph{interest-type}) - (\LAMBDA (event) - (\LAMBDA (state) - @emph{... 
computation resulting in:} - (transition @emph{new-state} @emph{action0} @emph{action1} ...)))) -) -Endpoints are the most complex structure in our system's interface, -and so deserve careful explanation. They are named, for later -reference in @racket[delete-endpoint] actions: -@#reader scribble/comment-reader (racketblock -(delete-endpoint @emph{endpoint-id}) -) - - TGJ: This sentence is probably not required? - Endpoint IDs must be unique within the scope of a process. - -\noindent -Endpoints contain a @emph{role}, which generalizes traditional notions -of advertisement and subscription by combining a topic of conversation -with an orientation: @emph{publisher} or @emph{subscriber}. The -topic filter is a pattern over S-expression-shaped -messages@note{In pub/sub terminology, -this is a @emph{content-based filter}.} -expressed as a general datum with embedded wildcards. Choosing this -representation gives both an intuitive pattern language -and, with unification, a conventional operation for computing topic -intersections. - -Borrowing an example from the chat server implementation of section~\ref{sec:example}, -the following constructs an endpoint advertising intent to -publish@note{This endpoint exists solely to indicate presence to -others, and its event handler therefore ignores incoming events.} on the -"$X$ says $Y$" topic, where $X$ is bound to a user's name -(@racket[me]) and $Y$ is wild (@racket[?]): -@#reader scribble/comment-reader (racketblock -(add-endpoint 'speaker - (role 'publisher `(,me says ,?) 'participant) - (\LAMBDA (event) (\LAMBDA (state) (transition state)))) -) - -Event handlers dispatch on the type of event and current process -state, returning a transition structure for the VM to process. An -endpoint matching @racket['speaker] might be: -@#reader scribble/comment-reader (racketblock -(add-endpoint 'listener - (role 'subscriber `(,? says ,?) 
'participant) - (\LAMBDA (event) - (\LAMBDA (state) - (match event - [(presence-event arriving-role) - ...] @emph{;; describe the arrival of a user} - [(absence-event departing-role reason) - ...] @emph{;; describe the departure of a user} - [(message-event sender-role `(,who says ,what)) - ...])))) @emph{;; inform the user that }who@emph{ said }what -) -Since the @emph{presence} of processes is as important as exchanging -messages, we include (dis)appearances of processes as essential -events of a conversation alongside regular message deliveries. -Concretely, presence and absence events carry a -VM-computed @racket[role] structure describing the @emph{intersection} -between the advertised interests of the recipient and the appearing or -disappearing peer. - -For example, if endpoint "A" takes on the role of subscriber to -topic @racket[(? says ?)], and a peer process creates an endpoint -"B" taking on the role of publisher within the topic @racket[(Bob ? -?)], then the VM sends a presence event to "A" noting that a -publisher on topic @racket[(Bob says ?)] has appeared. Likewise, the -VM informs "B" of a new subscriber on the same topic. Shared topics -of conversation are just the intersections of the topics of the -endpoints viewed as sets of messages. - -@defstruct*[send-message ([body any/c] - [orientation orientation?])]{ -Processes send messages to peers with @racket[send-message] actions. - -The optional orientation defaults to @racket['publisher], in which case -@racket[body] is intended for matching @racket['subscriber]s. -Because our system enjoys publisher/subscriber symmetry in its -presence notifications and routing tables, @emph{subscribers} may offer -feedback to @emph{publishers}: with @racket[send-message] -orientation @racket['subscriber], messages can flow @emph{upstream} to -processes playing the conversational role of publisher. Feedback -can express flow-control, mode-selection and message -acknowledgement. 
To illustrate, endpoint "B" from above might take a -transition -@#reader scribble/comment-reader (racketblock -(transition (compute-bob-state) - (send-message '(Bob says hello) 'publisher) - (send-message '(Bob goes-to the-shop) 'publisher)) -) -Endpoint "A" would receive just the first message, and might give -feedback with -@#reader scribble/comment-reader (racketblock -(transition (compute-alice-state) - (send-message '(Alice hears (Bob says hello)) - 'subscriber)) -) - -As another example, the chat program in section~\ref{sec:example} -uses such feedback to manage flow-control between the chat process -and the TCP driver. -} - -\paragraph*{Participants and Observers.} -The @emph{interest type} given in an endpoint's @racket[role] structure -allows endpoints to monitor interest in some topic of conversation without -offering to participate in such conversations, or equivalently, to monitor -demand for some service without offering to supply or consume that service. - -Endpoints with an interest type of @racket['participant] are regular -subscribers, both receiving and causing presence notifications for -matching participant endpoints in the system. Those with -type @racket['observer], however, @emph{receive} presence notifications -about participants but do not @emph{cause} any. Finally, endpoints -using interest type @racket['everything] receive notifications about -all three types of endpoint in the system. - -The ability to passively observe other participants in a conversation -naturally supports supervisor processes. -Such supervisors can create and destroy services in response to changes in demand. - - \begin{figure} - \centering - \begin{tabular}{|r|c|c|} - \hline - & Participant & Observer \\ - \hline - Subscriber & Informed of pubs. & Informed of pubs. \\ - & Acts as listener & \\ - \hline - Publisher & Informed of subs. & Informed of subs. 
\\ - & Acts as speaker & \\ - \hline - \end{tabular} - \ruledcaption{Interest types, roles, and presence events.} - \label{interest-types-in-our-architecture} - \end{figure} - -\paragraph*{Linguistic Simplifications.} -Often, only a subset of the flexibility of @racket[add-endpoint] is -needed. Hence, definitions like that of the @racket['listener] endpoint look -long-winded. For such cases, a small, optional -endpoint creation domain-specific language provides sensible -defaults. The endpoints in figure~\ref{echo-paper3}, for example, are -created using the DSL instead of building @racket[add-endpoint] structures directly. - -\begin{figure} -\begin{tabular}{rcl} -$endpoint$ & := & @tt{(endpoint }$orientation$ $topic$ \\ - & & $\quad\quad\{interest\}$ \\ -\\ - & & $\quad\quad$\{@tt{#:state }$pattern$\} \\ - & & $\quad\quad$\{@tt{#:conversation }$pattern$\} \\ - & & $\quad\quad$\{@tt{#:reason }$identifier$\} \\ -\\ - & & $\quad\quad$\{@tt{#:let-name }$identifier$\} \\ - & & $\quad\quad$\{@tt{#:name }$expr$\} \\ -\\ - & & $\quad\quad$\{@tt{#:on-presence }$handler$\} \\ - & & $\quad\quad$\{@tt{#:on-absence }$handler$\} \\ -\\ - & & $\quad\quad message\mhyphen handler^*$@tt{)}\\ -\\ -$orientation$ & := & @tt{#:publisher} $|$ @tt{#:subscriber} \\ -\\ -$topic$ & := & $expr$ \\ -\\ -$interest$ & := & @tt{#:participant} $|$ \\ - & & @tt{#:observer} $|$ \\ - & & @tt{#:everything} \\ -\\ -$message\mhyphen handler$ & := & @tt{(}$pattern$ $handler$@tt{)} \\ -\\ -$handler$ & := & $expr$ -\end{tabular} -\ruledcaption{Syntax of the @racket[endpoint] DSL. 
Braces indicate optional elements; Kleene star indicates repetition.} -\label{endpoint-dsl-syntax} -\end{figure} - - \begin{figure} - \centering - \begin{tabular}{|r|c|c|c|c|} - \hline - Handler & - \begin{sideways}@tt{#:state}\end{sideways} & - \begin{sideways}@tt{#:conversation}\end{sideways} & - \begin{sideways}@tt{#:reason}\end{sideways} & - \begin{sideways}@tt{#:let-name}$\quad$\end{sideways} \\ - \hline - message & \checkmark & \checkmark & & \checkmark \\ - @tt{#:on-presence} & \checkmark & \checkmark & & \checkmark \\ - @tt{#:on-absence} & \checkmark & \checkmark & \checkmark & \checkmark \\ - \hline - \end{tabular} - \caption{Scope of bindings in @racket[endpoint] handlers} - \label{endpoint-dsl-scope} - \end{figure} - -Figure~\ref{endpoint-dsl-syntax} specifies the syntax of the -@racket[endpoint] language. The only mandatory parts of an -@racket[endpoint] are its @emph{orientation}, that is whether it is -a subscription or a publication advertisement, and its @emph{topic}. -Many of the optional clauses introduce new bindings into the scope of -the endpoint's handlers. - Figure~\ref{endpoint-dsl-scope} summarizes - the visibility of new bindings in each kind of handler. - -With a @racket[#:state] clause, handler expressions can refer to and -update the current process state. -Variables introduced in the associated pattern are scoped over all three types of handler. -If @racket[#:state] is present, handler -expressions are expected to return a full transition structure -including a new process state. If it is absent, however, handler -expressions are expected to return only a list of actions. -This permits concision in the -common case of a stateless process or endpoint. For example, consider -the "no-op" event handler in the @racket['speaker] endpoint example -above. 
Using @racket[endpoint], it becomes -@#reader scribble/comment-reader (racketblock -(endpoint #:publisher `(,me says ,?)) -) - -The @racket[#:conversation] clause, again scoped over all handlers, -gives access to the topic of conversation -carried in each notification. The @racket[#:reason] clause, scoped solely over @racket[#:on-absence] handlers, conveys the exit reason code -carried in absence notifications. Endpoint names are introduced -with @racket[#:name], if the program wishes to supply an -explicitly-computed name, or @racket[#:let-name], if programs wish to -delegate name construction to the VM. When @racket[#:let-name] is -used, a guaranteed-fresh endpoint name is supplied to handlers. This permits -an idiom for declaring a temporary endpoint: -@#reader scribble/comment-reader (racketblock -(endpoint #:subscriber some-topic - #:let-name e - ;; message handler: - [request - (let ([reply (compute-reply request)]) - (list (delete-endpoint e) - (send-message reply)))]) -) -Message handling clauses at the end of an @racket[endpoint] expression -are run against delivered messages in the usual left-to-right order. -If no clauses match, the delivered message is silently discarded. - -\paragraph*{Cross-layer communication.} -Each VM has access to @emph{two} inter-process communication (IPC) -facilities: the external network connecting it to its siblings and -the internal network connecting its contained processes to each other. -When a process hands normal -@racket[add-endpoint], @racket[delete-endpoint] and -@racket[send-message] actions to its VM, they apply to the internal -network of the VM. Actions must be wrapped in an -@racket[at-meta-level] structure to signal to the VM that they are to -apply to the VM's external network. - -\begin{figure}[tb] -@#reader scribble/comment-reader (racketblock -(define relay-down - (endpoint #:subscriber ? 
- ;; message handler: - [message (at-meta-level - (send-message message))])) - -(define relay-up - (at-meta-level - (endpoint #:subscriber ? - ;; (meta-level) message handler: - [message (send-message message)]))) -) -\ruledcaption{Examples of the use of @racket[at-meta-level]} -\label{at-meta-level-examples} -\end{figure} - -Figure~\ref{at-meta-level-examples} demonstrates the use of -@racket[at-meta-level]. Both examples evaluate to -@racket[add-endpoint] action structures. The @racket[relay-down] -endpoint subscribes to the wildcard pattern on the internal network, -and upon receipt of a message, transmits it on the external network. -The @racket[relay-up] endpoint subscribes to the external network and -transmits on the internal network. - -Relaying messages between layers is straightforward, but relaying -presence across layers requires the passive @racket['observer] interest-type. An -observer subscription can be used to measure demand for some service -at an upper layer and project it as demand for analogous service at a -lower layer, without appearing to satisfy the upper-layer demand until -matching supply is detected at the lower layer. - -\paragraph*{Creating Processes.} -A @racket[spawn] action requests the launch of a new process. -Each @racket[spawn] contains a function producing an initial -transition for the new process: -@#reader scribble/comment-reader (racketblock -(make-spawn - (\LAMBDA () (transition @emph{state0} @emph{action0} @emph{action1} ...))) -) -The function delays computation of the initial state and initial -actions until the VM installs an appropriate exception handler, -so that blame for any exceptions is correctly apportioned. 
Because -this is syntactically awkward, a simple shorthand is provided: -@#reader scribble/comment-reader (racketblock -(spawn #:child (transition @emph{state0} @emph{action0} @emph{action1} ...)) -) -The VM interpreting the @racket[spawn] datum creates a new process -record with the initial state and queues up the associated actions for -execution. At the type level, a @racket[spawn] action involves a fresh, -existentially-quantified state type variable. - -\paragraph*{Exceptions and Process Termination.} - -@defstruct*[quit ([pid pid?] - [reason any/c])]{ -A @racket[quit] action terminates the invoking process, cancelling all -its subscriptions. - -The optional @emph{reason code} is passed along to other -processes in any absence notifications arising from the process's -termination. This is analogous to the "exit reason" carried by -Erlang's process exit signals~\cite[\S3.5.6]{Armstrong2003}. - -Any exception thrown in an event handler (or during the computation of -an initial transition from a @racket[spawn] action) is caught by the -VM and translated into a @racket[quit] action. This isolates processes, but -not endpoints within processes, from each other's failures. -} - -\paragraph*{Scheduling, Management and Monitoring.} -Our current VM implementations cooperatively schedule their processes, -and so support an additional @racket[yield] action, which cedes control -of the CPU to other processes: -@#reader scribble/comment-reader (racketblock -(make-yield (\LAMBDA (state) (transition ...))) -) - - TODO: Check that this is mentioned elsewhere: - - VMs treat processes under their care as linear resources, leaving them - free to use either a pure-functional approach to managing their state - or to use side-effecting actions as they see fit. 
- -Finally, many real operating and networking systems -provide reflective facilities which permit listing of running -processes, listing of active network endpoints, killing of processes -by ID, attachment of debuggers to running processes, and so on. -Programmers working with systems that do not provide such facilities -often find themselves implementing makeshift substitutes. Our current -implementation has limited support for such features; we conjecture -that our design will naturally extend to this kind of reflection, but -properly integrating these ideas remains future work. - -@section{Implementation} - -We have two interworking implementations of our VM -abstraction: one nestable VM used to organize applications, and one -ground VM mapping abstract events to actions in the outside world. - -\paragraph*{The Nestable VM.} -The workhorses of our system, nested VM instances are created by a -new linguistic construct, @racket[nested-vm]. Given a list of actions for a primordial -process to run in the new VM, @racket[nested-vm] returns a @racket[spawn] -action that requests the launch of the new VM: -@#reader scribble/comment-reader (racketblock - (transition @emph{spawner-state} - (nested-vm @emph{primordial-action} ...)) -) -Figure~\ref{spawning-nested-vm} illustrates the creation of a new VM. - -\begin{figure}[tb] -@#reader scribble/comment-reader (racketblock -) - \centering - \includegraphics[height=3cm]{spawning-nested-vm.eps} - \ruledcaption{Spawning a nested VM} - \label{spawning-nested-vm} -\end{figure} - -Nested VM instances are implemented as ordinary processes, and so have -state, a state type, and a collection of active subscriptions. 
Their private -state is nothing more than the table of contained processes: -$$ \State_{vm} = \textrm{PID} \mapsto \textrm{Process} $$ -Recall from figure~\ref{vm-interface-types} that the Process type -involves "EPs" and "MetaEPs", which are sets of endpoints -interacting with the VM's internal and external networks, -respectively. - -Nested VMs interpret actions from contained processes as they respond -to VM events. Ordinary actions, such as @racket[add-endpoint] -and @racket[spawn], operate on the VM's resources. Meta-level actions, -wrapped in an @racket[at-meta-level] action structure, are translated -into actions that the VM hands back to its container. -Where @racket[spawn] creates a process that is a sibling of the acting -process, an @racket[at-meta-level] @racket[spawn] creates a -process that is a sibling of the VM itself. Similarly, @racket[quit] -can be used with @racket[at-meta-level] to terminate the entire VM, -and @racket[send-message] with @racket[at-meta-level] transmits a -message on the VM's external network, not its internal one. - -A meta-level @racket[add-endpoint] action requests the creation of an -endpoint in the @emph{external} network. The VM translates the request -into an action at the VM-as-process level that creates a relaying -endpoint in the @emph{internal} network of the VM's own container. A -record of the relaying "meta-endpoint" is placed in the "MetaEPs" -set of the requesting process, so that when the relaying event handler -fires, the event can be passed to the correct handler in the contained -process. The relaying event handler level-shifts events to compensate -for the level-shifting that took place when the meta-endpoint was -established. - -\paragraph*{The Ground VM.} -Virtual machines can only be stacked so far. At some point, they must -connect to the outside world. Our "ground" VM implementation does -just that. 
Its processes produce real-world output by judicious use of -side-effecting Racket procedures, and await input by using ordinary -subscription endpoints with topics describing Racket's core events. - -The ground VM is automatically started for applications written in the -language. Programs written in other languages built on Racket can also -make use of our system by explicitly invoking the @racket[ground-vm] -procedure. - -The ground VM monitors subscriptions involving CML-style -event descriptors, interpreting their presence -as demand for the corresponding events and translating them into -concrete I/O requests. When underlying Racket events fire, the -resulting values are sent as messages on the ground VM's internal -network. There, they match subscription topics that caused the event -to be activated in the first place and are delivered to corresponding -endpoints. - -Concretely, I/O subscription topic patterns are structured as a pair of a Racket event -descriptor and a pattern matching the values the event yields upon firing. -For example, the timer driver process asks for events when the system -clock advances past a certain point as follows: -@#reader scribble/comment-reader (racketblock -(endpoint - #:subscriber (cons (timer-evt deadline) ?) - ;; message handler: - [(cons \_ current-system-clock-value) - (begin (display "Time's up!\textbackslashn") - '())]) -) -where @racket[deadline] is the time of the next pending event and -the @racket[timer-evt] function maps such a deadline to an I/O event descriptor. -In the subscription topic, the @racket[car] of the pair is the -event descriptor of interest, and the @racket[cdr] is a wildcard. -In the message-handling pattern, however, the @racket[car] is -ignored since it is simply the event descriptor subscribed to, and -the @racket[cdr] is expected to be the current value of -the system clock. Drivers for other devices construct analogous -subscriptions. 
- -The ground VM is in some ways similar to an @emph{ -exokernel} in that it exposes the underlying -"hardware" I/O mechanisms in terms of its own communication -interface. In other words, it multiplexes access to the underlying -system without abstracting away from it. - -\paragraph*{Other VM Implementations.} -We have chosen to build our VM implementations in a completely functional -style. Our VM API is deliberately formulated to permit -side-effect-free implementations. Nothing in the interface forces this -choice, however. It is both possible and useful to consider -implementations that internally use imperative features to manage -their process tables, that use Racket's concurrency and parallelism to -improve scalability on multicore machines, that transparently -distribute their contained processes across different machines in a -LAN, and so on. - -Because the observable behaviour of a VM is independent of its -implementation, changing the way in which an -application scales may be as simple as switching one -VM implementation for another. We hope to -explore this territory in the future. - - - - -\begin{figure}[tb] -@verbatim{ - $ telnet localhost 5999 - Trying 127.0.0.1... - Connected to localhost. - Escape character is '^]'. - You are user63. - user81 arrived. - hello - user63: hello - user81: hi user63 - user81 departed. -} -\ruledcaption{Transcript of a session with the chat service.} -\label{example-transcript} -\end{figure} - balance emacs syntax highlighting: $ - -To illustrate how the pieces of our system fit together, we analyze the source code for a -hub-and-spoke style, TCP-based chat server. The code in this -section is the entirety of the program. Clients connect to the server -with @tt{telnet}. The server assigns a unique name, such as -@tt{user63}, to each connecting client. The arrivals and -departures of peers in the chatroom are announced to connected -clients. 
Each line of text sent by a client is relayed to every -connected client; figure~\ref{example-transcript} shows a transcript. - -Our chat service has two layers, shown in -figure~\ref{chat-service-layering}: a ground layer for the TCP driver -and a nested VM for chats. The latter hosts one process for accepting -incoming connections plus one process per accepted chat connection. -Three types of conversation take place: \circled{1} between the network socket -and its socket manager process; \circled{2} between the socket manager and -its associated chat process; and \circled{3} the multi-party conversation between these -chat processes. Note how each process engages in two distinct conversations -simultaneously. - -The server's entry point is a module written in the -@racket[marketplace] language, which automatically starts the ground -VM with the actions given in the module's body: -@#reader scribble/comment-reader (racketblock -;; \ensuremath{\forall\State . \Action{\State}} -(nested-vm - (at-meta-level - (endpoint - #:subscriber (tcp-channel ? (tcp-listener 5999) ?) - #:observer - #:conversation (tcp-channel them us _) - #:on-presence (spawn #:child (chat-session them us)))))) -) -This initial action spawns a @racket[nested-vm] to contain -processes specific to our chat -service. Initially, its only process is the primordial process, which -takes on the role of listening for incoming connections. - -Recall that each VM has access to two IPC facilities: the external -network of its container and the internal network for its -own processes. The primordial @racket[endpoint] is wrapped in an -@racket[at-meta-level] structure to indicate that it relates to -activity in the VM's external network. Specifically, it is interested -in observing, but not participating in, TCP conversations on local -port number {\tt 5999}. It is this advertisement of interest that @emph{ - implicitly} coordinates with the TCP driver through the presence -mechanism. 
- -\begin{figure}[tb] - \centering - \includegraphics[width=6cm]{chat-revised.eps} - \ruledcaption{Layering and levels of discourse within the chat - service. Processes started automatically by the system are - shaded.} - \label{chat-service-layering} -\end{figure} - -The system's TCP driver responds to the appearance of this observer -subscription by creating a listening TCP server socket. When a new TCP -connection arrives, the TCP driver spawns a "socket manager" process -(see figure~\ref{chat-service-layering}) to manage the new socket and -that process creates a subscription for discussing activity on the -socket. The new subscription matches the one shown above in the -listening endpoint. The VM detects the match and sends an -@racket[#:on-presence] notification to the listening endpoint, which -then spawns a process within the App VM whose initial state and -actions are given by @racket[chat-session]: -@#reader scribble/comment-reader (racketblock -;; \TcpAddress\Times\TcpAddress \RArr \Transition{\Stateless} -(define (chat-session them us) - (define user (gensym 'user)) - (transition stateless - (listen-to-user user them us) - (speak-to-user user them us))) -) -The arguments @racket[them] and @racket[us], representing the new -connection's remote and local TCP/IP endpoint addresses, are extracted -from the topic of the conversation that the new peer, the -TCP socket manager process, is willing to have with the chat session: -a conversation about management of a specific TCP -connection. - No longer true for our simplified case: - @note{The associated protocol has a lot in common with - Erlang's I/O protocol, - \url{http://www.erlang.org/doc/apps/stdlib/io_protocol.html}.} - -The initial actions requested by a newly-spawned @racket[chat-session] -are produced by the routines @racket[listen-to-user] and -@racket[speak-to-user]. 
The @racket[listen-to-user] function -subscribes to incoming TCP data and converts it to messages describing -speech acts, which it then publishes on the internal (nested) network: -@#reader scribble/comment-reader (racketblock -;; \ensuremath{\forall\State. \Symbol\Times\TcpAddress\Times\TcpAddress\rightarrow\Actions{\State}} -(define (listen-to-user user them us) - (list - (endpoint #:publisher `(,user says ,?)) - (at-meta-level - (endpoint #:subscriber (tcp-channel them us ?) - #:on-absence (quit) - [(tcp-channel _ _ (? bytes? text)) - (send-message `(,user says ,text))])))) -) -It is the @racket[#:subscriber] endpoint that starts the ongoing -conversation with the TCP socket manager (marked \circled{2} in -figure~\ref{chat-service-layering}). The use of @racket[at-meta-level] -attaches the endpoint to the VM's @emph{external} network, where the domain -of discourse is TCP. The @racket[#:publisher] endpoint, by contrast, -attaches to the @emph{internal} network, where a higher-level chat-specific -protocol is used, and advertises an intent to send chat messages of -the form "$X$ says $Y$." - -The presence mechanism appears, for the second time, in -@racket[listen-to-user]. Its @racket[#:on-absence] notification -handler responds to a drop in presence on the topic for the socket's -inbound data stream. This happens when the TCP connection is closed by -the remote @tt{telnet} process; the TCP socket manager process -responds to termination of the TCP connection by @racket[quit]ting. All its -subscriptions are thus deleted, causing matching absence -notifications. In particular, the handler in @racket[listen-to-user] -terminates the chat session process, which causes @emph{its} -subscriptions to be deleted in turn. Thus, changes in presence cascade -through the system along lines determined by the subscriptions of -processes. 
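
The cascade described above can be modelled with a small, self-contained simulation. It is a sketch under simplifying assumptions, not the marketplace implementation: each process is reduced to a name plus the list of peers whose absence makes it quit, and the table @racket[watches] and the function @racket[cascade] are hypothetical names introduced for illustration.

```racket
#lang racket/base

;; Each entry maps a process name to the peers whose absence makes it quit,
;; standing in for endpoints declared with `#:on-absence (quit)`.
(define watches
  (hash 'socket-manager '()
        'chat-session   '(socket-manager)
        'other-session  '()))

;; cascade : hash symbol -> (listof symbol)
;; Returns the processes that terminate, in order, once `start` quits.
(define (cascade watches start)
  (let loop ([pending (list start)] [gone '()])
    (cond
      [(null? pending) (reverse gone)]
      [(memq (car pending) gone) (loop (cdr pending) gone)]
      [else
       (define dead (car pending))
       ;; Every process watching `dead` sees an absence notification and quits.
       (define next
         (for/list ([(name deps) (in-hash watches)]
                    #:when (memq dead deps))
           name))
       (loop (append (cdr pending) next) (cons dead gone))])))

(cascade watches 'socket-manager)
```

In this model, quitting the socket manager takes the chat session with it, while @racket['other-session] survives: it would only be told about the departure, as the @racket[#:on-absence] handler in @racket[speak-to-user] does.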
- -The @racket[speak-to-user] function sends a greeting to the user and -then relays events from the internal network to the user via the -connected TCP socket: -@#reader scribble/comment-reader (racketblock -;; \ensuremath{\forall\State. \Symbol\Times\TcpAddress\Times\TcpAddress\rightarrow\Actions{\State}} -(define (speak-to-user user them us) - \ensuremath{...\textrm{definitions of} say \textrm{and} announce...} - (list - (say "You are ~s.~n" user) - (at-meta-level - (endpoint #:publisher (tcp-channel us them ?))) - (endpoint #:subscriber `(,? says ,?) - #:conversation `(,who says ,_) - #:on-presence (announce who 'arrived) - #:on-absence (announce who 'departed) - [`(,who says ,what) (say "~a: ~a" who what)]))) -) - -\ \\ -@#reader scribble/comment-reader (racketblock -;; \ensuremath{\forall\State. \String\Times\Any\Times...\rightarrow\Action{\State}} -(define (say fmt . args) - (at-meta-level - (send-message - (tcp-channel us them (apply format fmt args))))) - -;; \ensuremath{\forall\State. \Symbol\Times\Symbol\rightarrow\Action{\State}} -(define (announce who did-what) - (unless (equal? who user) - (say "~s ~s.~n" who did-what))) -) -Here we see presence used a third time. In @racket[listen-to-user], -sessions advertise presence as a @emph{publisher} on the "$X$ says -$Y$" topic. This ensures that @emph{subscribers} matching this topic -are informed of the presence of each such publisher. Concretely, when -the publisher endpoint is created, the @racket[#:on-presence] -handlers in @racket[speak-to-user]'s subscriber endpoints in existing -sessions are run. The subscriber endpoint in @racket[speak-to-user] -responds to presence or absence by describing the change to the user. - -In sum, a single connection is represented in the system by a -three-party relationship: the remote peer, the TCP socket manager -process, and the chat session process. 
The remote peer communicates -with the system over TCP as usual (marked \circled{1} in -figure~\ref{chat-service-layering}). The bytes it sends manifest -themselves as Racket-level events on the ground VM's pub/sub network. -The TCP socket manager translates between these low-level events and -the high-level conversational representation of the connection used -with the chat session process (\circled{2} in -figure~\ref{chat-service-layering}). - -Each chat session process manages its half of the conversation with -its corresponding TCP socket manager as part of its other -responsibilities. In this case, it relays input from the remote peer -as speech acts on the nested VM's pub/sub network. The other chat -sessions within the nested VM, each one representing the application -side of another TCP connection, subscribe to these relayed speech acts -(\circled{3} in figure~\ref{chat-service-layering}) and format and deliver -them to their remote peers. - - -According to Hudak and Sundaresh, a - -functional I/O system should provide support for -(1) equational reasoning, (2) efficiency, (3) interactivity, (4) -extensibility, and (5) handling of "anomalous situations," or -errors. Broadening our focus to systems programming, we add (6) -resource management and (7) subsystem encapsulation to this list of criteria. - - We have found that the individual elements of our - approach work well together to address this complex of issues as a whole. Pub/sub - subscriptions not only permit flexible communications topologies, but - also give rise to presence information. Presence, in turn, allows resource - management and crash notification and interacts with our - nestable VMs to provide encapsulation, isolation, and layering. - -\paragraph*{1: Equational Reasoning.} -Like Worlds and Universes, our system allows for equational -reasoning because event handlers are functional state -transducers. 
When side-effects are absolutely required, they can be -encapsulated in a process, limiting their scope as in our SSH server. The state of the -system as a whole can be partitioned into independent processes, -allowing programmers to avoid global reasoning when designing and unit-testing -their code. - -\paragraph*{2: Efficiency.} -Our VM implementations manage both their own state and the state of -their contained processes in a linear way. Hudak and Sundaresh, -discussing their "stream" model of I/O, remark that the state of -their kernel "is a single-threaded object, and so can be implemented -efficiently". Our system shares this advantage with streams. - -There are no theoretical obstacles to providing more efficient and -scalable implementations of our core abstractions. -Siena and Hermes both use -subscription and advertisement information to construct efficient -routing @emph{trees}. Using a similar technique for implementing a -virtual machine would permit scale-out of the corresponding -layer without changing any code in the application processes. - -\paragraph*{3: Interactivity.} -The term "interactivity" in this context relates to the ability of -the system to interleave communication and computation with other -actors in the system, in particular, to permit user actions to affect -the evolution of the system. Our system -naturally satisfies this requirement because all processes are -concurrently-evolving, communicating entities. - -\paragraph*{4: Extensibility.} -Our system is extensible in that our ground VM multiplexes raw Racket -events without abstracting away from them. Hence, driver -processes can be written for our system to adapt it to any I/O -facilities that Racket offers in the future. The collection of -request and response types for the "stream" model given by Hudak and -Sundaresh~\cite[\S 4.1]{Hudak1988} is static and non-extensible -because their operating system is monolithic, with -device drivers baked in to the kernel. 
On the one hand, monolithicity -means that the possible communication failures are obvious from the -set of device drivers available; on the other hand, its simplistic -treatment of user-to-driver communication means that the system cannot -express the kinds of failures that arise in microkernel or distributed -systems. Put differently, a monolithic stream system is not suitable -for a functional approach to systems programming. - -Our action type (figure~\ref{vm-interface-types}) appears to block -future extensions because it consists of a finite set of variants. -This appearance is deceiving. Actions -are merely the interface between a program and its VM. -Extensibility is due to the messages exchanged between a program and -its peers. In other words, the Action type is similar to the limited set of core forms -in the lambda calculus, the limited set of methods in HTTP and the -handful of core system calls in Unix: a finite kernel generating an -infinite spectrum of possibilities. - - a fixed core that can express many other things when combined. - - Protocols such as HTTP and the @tt{9p} - file-system of Plan 9 take similar approaches: they provide a simple - protocol with a small number of general-purpose actions which can - express a wide variety of effects in combination. - -\paragraph*{5: Errors.} -In distributed systems, a request can fail in two distinct ways. Some -"failures" are successful communications with a -service, which just happens to fail at some requested -task; but some failures are caused by the unreachability of the -service requested. Our system represents the former kind of failure -via protocols capable of expressing error responses -to requests. For the latter kind of failure, it uses absence -notifications. - -\paragraph*{6: Resource Management.} -Presence and absence notifications are also the basis for -resource management in our system. 
Through the presence mechanism, -programs can measure demand for some resource and allocate or -release it in response. - There's a really interesting connection to garbage collection here, - which this comment is too narrow to explain. -Presence arises from considering the intersection of pub/sub topic -filters, but using pub/sub has another benefit. It generalizes -point-to-point, multicast, broadcast and even anycast -communication; the same few primitive actions are able to express any -point along this spectrum. The VM network is responsible -for routing based on interest, decoupling the -language for declaring interest from the semantics of routing. - -\paragraph*{7: Subsystem Encapsulation and Isolation.} -Finally, our use of layered, nested VMs encapsulates and -isolates subsystems in a complete program. Our use of a -fixed API between a VM and its processes decouples the implementation -of each layer's virtual machine from its content. We can therefore swap -out one VM implementation for another without altering its processes. - -Isolation of process groups is required in a pub/sub system to avoid -potential crosstalk between logically separate groups of processes. In -our system, VMs provide the necessary isolation. If we had chosen -point-to-point communication instead, nesting would not be absolutely -required; however, the use of -pub/sub is a key advantage of our system, since it gives rise to -presence. Presence can be combined with nesting to build -supervision hierarchies that restart entire -nested VM instances in response to failures. - - -We present a novel approach to functional systems programming, -building on previous work on functional approaches to managing state -and I/O. By incorporating multi-party communications -and explicitly considering concurrency, our model factors out numerous -cross-cutting concerns including discovery, synchronization, failure -detection, and state lifetime. 
The connection between a process and -its container is declarative. Our model encourages the programmer to -think declaratively, yet in concurrent rather than sequential terms, -writing programs that react smoothly to changes in their environment. - -Placing the combination of presence, nested virtualization, -and event-based publish/subscribe communication at the heart of a -system design eliminates a large amount of scattered -application code that recurs across many different kinds of projects. -As a result, programs become smaller and more robust, and programmers -are freed to concentrate on the functionality of their applications. - - Integrating treatment of lost messages, congestion and queue - management into our approach remains as future work. - -\paragraph*{Code.} -The source-code for our system, examples, and case studies is -available at \url{https://github.com/tonyg/marketplace}.