#lang racket/base
(require "../os2.rkt")
(require "../os2-tcp.rkt")
(require racket/match) ; provides `match` and the `==` pattern used below

#|

Desperately needs a name - "os-big-bang", "os2"

Outline ideas:
 - the core of the idea is pubsub and presence. conversational
   context - not raw messaging and actors.
 - (frequently first-order-)functional programming of networked systems.
    - similar to a generalisation of big-bang
    - similar advantages obtain
 - show code. Stuff that looks nice and illustrates the features of os2.
    - transitions
    - then the actions (spawn, roles, presence, messaging)
    - then ground-vm
    - then nested-vm
    - then at-meta-level

Prepare for whiteboard + showing code. Light on the slides, if any.

Recent work in Network Application Architecture.

---------------------------------------------------------------------------

Lots of systems, both formal models and practical implementations,
work with messaging between actors or actor-like entities. Notably
Erlang and the π-calculus.

In these systems, actors/processes have conversations with one
another, but the programming languages themselves don't talk about
conversations or patterns of interaction larger than a single exchange
of a message from a single sender to a single receiver.

  Whiteboard
  First point: receivers = 1 i.e. point-to-point/unicast
  Second point: granularity = single message

By contrast, lots of the communicating systems we want to model have
more interesting conversational patterns.

  HTTP: request/response; streaming; authentication; sessions
  TCP: bidirectional; ongoing stream; connected
  Bittorrent: multiparty; cooperative; unordered
  Mailing lists: pub/sub; subscription management

So generally we can say that communication is instead

  Whiteboard
  receivers = many i.e. multicast/broadcast/M-of-N
  granularity = session/conversation/connection

The research I've been doing has been examining the idea of a
conversational context: the envelope within which a group of
communicating actors/processes work together to achieve some shared
task.

There's a rich area of work dealing with describing the patterns of
message exchange themselves, for example session types and behavioural
contracts, but I'm looking at it from a different angle:

 - how do actors interested in communicating find each other?
     Whiteboard
       Discovery
       Synchronisation
 - how does a conversation start?
       Resource allocation
 - how are resources managed during a conversation?
       Responsibility
       Error control
       Flow control
 - how do participants know when a conversation is over?
       Error/crash/exit signalling
       Resource release
 - how do conversations fit into the underlying network system?
       Routing
 - what are the roles involved in various kinds of conversation?
       Sender/Receiver
       Scribe? Archivist? Statistician? Auditor?

To experiment with this I've come up with a model operating system,
and used it to implement a DNS caching proxy and an SSH server.

  Whiteboard
  Functional pub/sub + presence

The model operating system is based on multicast/pubsub, and also
includes a construct for working with conversational context. I've
been calling it "presence", since it's a generalisation of an idea
from the XMPP/Jabber world, but you could equally well call it
"interest" or "subscription".

It's purely functional and event-based, which makes it very
reminiscent of big-bang, with similar characteristics such as
processes usually having first-order state. It's also recursive:
instances of the kernel can run as user processes within another
kernel instance.

 - Gives a "thread" equivalent that composes
 - Hide implementation detail: hide threads away in a nested-vm
   with a private language

There's a kind of hypervisor - which I'm calling a ground-vm - which
connects to the real world using Racket's CML-inspired event
mechanism.

Each process within an instance of the system has private state, and a
collection of active endpoints. Again the terminology here is a bit
loose, and while I usually think of endpoints as subscriptions or
event sources, they also make sense as TCP-like half-connections or
more loosely as representing participation in a particular (set of)
conversation(s).

  Whiteboard

  Process = ∃State . State × [Endpoint]
    where
      Endpoint = Topic × Role × InterestType
                   × PresenceHandler
                   × AbsenceHandler
                   × MessageHandler

      PresenceHandler = Topic           → State → Transition
      AbsenceHandler  = Topic × Reason  → State → Transition
      MessageHandler  = Message         → State → Transition

  Transition = State × [Action]

  Topic = notional set of Messages
  Role = 'Subscriber + 'Publisher + ...
  InterestType = 'Participant + 'Observer + 'SuperObserver
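
A minimal sketch of the types above as Racket structs, to make the
shapes concrete. Everything prefixed sketch- (and the deliver helper)
is a hypothetical name invented for this illustration; it is not the
representation used in os2.rkt, and routing is left out entirely.

```racket
#lang racket/base
;; Hypothetical sketch of the whiteboard types; NOT the os2.rkt definitions.
(require racket/match)

;; Transition = State × [Action]
(struct sketch-transition (state actions) #:transparent)

;; Endpoint = Topic × Role × InterestType × the three handlers.
(struct sketch-endpoint
  (topic role interest-type on-presence on-absence on-message)
  #:transparent)

;; Process = ∃State . State × [Endpoint]
(struct sketch-process (state endpoints) #:transparent)

;; Deliver one message to one endpoint of a process: run the endpoint's
;; MessageHandler against the current state, then return the updated
;; process together with the actions the handler requested.
(define (deliver proc ep message)
  (match-define (sketch-process state endpoints) proc)
  (match-define (sketch-transition new-state actions)
    (((sketch-endpoint-on-message ep) message) state))
  (values (sketch-process new-state endpoints) actions))

;; Example endpoint: count messages, announcing the previous count each time.
(define counter-endpoint
  (sketch-endpoint
   'ticks 'subscriber 'participant
   (lambda (topic) (lambda (state) (sketch-transition state '())))
   (lambda (topic reason) (lambda (state) (sketch-transition state '())))
   (lambda (message)
     (lambda (state)
       (sketch-transition (+ state 1)
                          (list (list 'send-message state)))))))

(define-values (next-proc actions)
  (deliver (sketch-process 0 (list counter-endpoint)) counter-endpoint 'tick))

(sketch-process-state next-proc) ; => 1
actions                          ; => '((send-message 0))
```

The handlers never mutate anything: the only way a process changes is
by returning a new state inside a transition.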

Motivate endpoints vs Erlang-style single mailbox
 - The system should make it possible for programs that would
   otherwise block on some external operation to remain responsive to
   their normal inputs while waiting for the external operation to
   complete.

Topics are /just/ sets of messages. Hence, each message carries its
topic with it. This is like a packet being wrapped in a header to
become the payload of a packet at the next layer down.
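
One way to picture "a topic is a set of messages" is as a pattern with
wildcards, where the set is everything the pattern matches. The sketch
below assumes that reading; WILD and in-topic? are hypothetical names
for this illustration and are not the (wild) or routing code from
os2.rkt.

```racket
#lang racket/base
;; Hypothetical sketch: a topic is a pattern, and it denotes the set of
;; all messages the pattern matches. Not the os2.rkt implementation.

;; A unique wildcard marker that cannot collide with message contents.
(define WILD (string->uninterned-symbol "wild"))

;; Is `message` a member of the set denoted by `pattern`?
(define (in-topic? pattern message)
  (cond
    [(eq? pattern WILD) #t]
    [(and (pair? pattern) (pair? message))
     (and (in-topic? (car pattern) (car message))
          (in-topic? (cdr pattern) (cdr message)))]
    [else (equal? pattern message)]))

;; "Anything sent by any peer to local port 5999."
(define to-port-5999 (list 'tcp-channel WILD 5999 WILD))

;; The message carries its own header, so membership needs no side table.
(in-topic? to-port-5999 '(tcp-channel "10.0.0.1" 5999 #"hello")) ; => #t
(in-topic? to-port-5999 '(tcp-channel "10.0.0.1" 6000 #"hello")) ; => #f
```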

An endpoint includes not just a role and a set of messages, but also
an /interest type/.
 - participant
 - observer/monitor
 - "super"-observer/monitor

Observers are useful for resource management.

Example: Bank teller

The kernel runs a process in response to some external event, and it
expects back a new process state and a list of actions that the
process wishes the kernel to perform on its behalf.

  Process = ∃State . State × [Endpoint]
    where (...)
      Action = 'Yield (State → Transition)
             + 'AtMetaLevel Preaction
             + Preaction
      Preaction = 'AddRole Endpoint
                + 'DeleteRole Endpoint
                + 'SendMessage Message
                + 'Spawn (∃State . State × [Action])

  VMState = Process × [Action]

  runVM :: VMState → VMState × [Action]

  nestedVM :: (∃State . State × [Action]) → (VMState × [Action])
  nestedVM bootProcess = ...

  groundVM :: (∃State . State × [Action]) → 0
  groundVM bootProcess = ...
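
To make the "kernel interprets actions" idea concrete, here is a
deliberately tiny sketch of a runVM-like loop. It is a simplification
under stated assumptions, not the os2 kernel: processes have a single
message handler instead of endpoints, every message is broadcast to
every process instead of being routed by topic, and only 'spawn and
'send-message actions exist (no 'Yield, roles, or meta-level). All
sketch- names, step and run are hypothetical.

```racket
#lang racket/base
;; Hypothetical sketch of a kernel loop; NOT the os2.rkt implementation.
(require racket/match)

(struct sketch-transition (state actions) #:transparent)
;; handler : Message -> State -> Transition
(struct sketch-process (state handler) #:transparent)
;; The VM holds its processes plus a queue of actions still to perform.
(struct sketch-vm (processes pending) #:transparent)

;; Perform the first pending action, if any, and return the new VM state.
(define (step v)
  (match v
    [(sketch-vm _ '()) v]
    [(sketch-vm procs (cons action more))
     (match action
       [(list 'spawn proc)
        (sketch-vm (append procs (list proc)) more)]
       [(list 'send-message message)
        ;; Broadcast: run every handler, gathering new states and actions.
        (define-values (new-procs new-actions)
          (for/fold ([ps '()] [as '()]) ([p (in-list procs)])
            (match-define (sketch-transition s acts)
              (((sketch-process-handler p) message) (sketch-process-state p)))
            (values (cons (sketch-process s (sketch-process-handler p)) ps)
                    (append as acts))))
        (sketch-vm (reverse new-procs) (append more new-actions))])]))

;; Keep stepping until no actions remain.
(define (run v)
  (if (null? (sketch-vm-pending v)) v (run (step v))))

;; A process that remembers every message it has seen, newest first.
(define collector
  (sketch-process
   '()
   (lambda (message)
     (lambda (state) (sketch-transition (cons message state) '())))))

(define final
  (run (sketch-vm '() (list (list 'spawn collector)
                            (list 'send-message 'hello)
                            (list 'send-message 'world)))))

(map sketch-process-state (sketch-vm-processes final)) ; => '((world hello))
```

A nested VM in this style would itself be a process whose state is a
sketch-vm, with its own leftover actions handed up to the enclosing
kernel; that part is not shown here.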

Example: TCP. Talk about the roles of observers.
 - listener-factory: #:monitor 'everything
 - connection-factory: #:monitor #t
 - listener: #:monitor 'everything (to detect when counterpart goes away)
 - connections: #:monitor #f

---------------------------------------------------------------------------
|#


(define (main)
  (ground-vm
   (transition 'none
     (spawn tcp-driver)
     (spawn listener))))


(define listener
  (let ((local-addr (tcp-listener 5999)))
    (transition 'no-state
      (role 'inbound-handler
	  (topic-subscriber (tcp-channel (wild) local-addr (wild))
			    #:monitor? #t)
	#:state state
	#:topic t
	#:on-presence
	  (match t
	    [(topic 'publisher (tcp-channel remote-addr (== local-addr) _) #f)
	     (transition state
	       (spawn (connection-handler local-addr remote-addr)))])))))


(define (connection-handler local-addr remote-addr)
  (transition 'no-state
    (send-feedback (tcp-channel remote-addr local-addr (tcp-mode 'lines)))
    (send-feedback (tcp-channel remote-addr local-addr (tcp-credit 1)))
    (role 'echoer
	(topic-subscriber (tcp-channel remote-addr local-addr (wild)))
      #:state state
      #:on-absence (transition state
		     (kill))
      [(tcp-channel _ _ (? bytes? line))
       (define reply (bytes-append #"You said: " line #"\n"))
       (transition state
	 (send-feedback (tcp-channel remote-addr local-addr (tcp-credit 1)))
	 (send-message (tcp-channel local-addr remote-addr reply)))]
      [(tcp-channel _ _ (? eof-object?))
       (transition state
	 (kill))])))


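;; Alternative to the echoing connection-handler above, renamed so that
;; both definitions can coexist in one module. It publishes one line
;; containing the current time in milliseconds and exits when it
;; receives a tcp-credit message.
(define (date-connection-handler local-addr remote-addr)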
  (transition 'no-state
    (role 'date-sender (topic-publisher (tcp-channel local-addr remote-addr (wild)))
      #:state state
      [(tcp-channel _ _ (tcp-credit _))
       (transition state
	 (kill))])
    (send-message (tcp-channel local-addr remote-addr
			       (string->bytes/utf-8
				(format "~a\n" (current-inexact-milliseconds)))))))


#|
---------------------------------------------------------------------------

Talk about DNS and SSH

DNS structure:
  - timer
  - udp
  - server
     - timer-relay
     - query-id-allocator
     - reader (server)
     - writer (server)

 (server)
     - error-logger
     - respondent

 (proxy)
     - reader (client)
     - writer (client)
     - packet-dispatcher
     + packet-relay
     - question-dispatcher
     + question-handler
     + glueless-question-handler
     + network-query

SSH structure:
  - timer
  - tcp
  - listener
  + session
     - exception-handler
     - event-relay
     - timer-relay
     - reader
     - writer
     - session
     - application
        - boot process
        - ... any others

Compare the "natural" structure of SSH with the structure using nested VMs
  Whiteboard
  Diagram of 14 Oct 2011 from my research journal

Contracts
 - for process state (implemented)
 - for messages across the bus at each level
 - between processes / within conversations

Weaknesses
 - The Wart
 - Glitching
    - There’s maybe a kind of continuousness (in the calculus sense)
      at play here: well-behaved presence apps don’t glitch in their
      presence. Sets of interests smoothly evolve without transient
      drops of interest potentially leading to confusion on the part
      of the peer.
 - Handoff of responsibility
    - the manager can take responsibility, spawn, and then drop when
      it sees the child is up or crashed; or
    - the manager can spawn without taking, and blip on crash; or
    - the manager can spawn and then treat like an established service
      instance
 - Querying the routing table
    - e.g. for sending SSH channel opens: is the channel type
      supported? querying the routing table would let us find out
      without explicitly sending a message
 - Careful protocol design at all levels
    - Ground events are (cons/c evt/c any/c)
    - This is the set of all possible responses to sync'ing on that event!
    - I've scraped by with this so far, but because sync'ing has
      side-effects, I'll eventually need better names for event
      requests and responsibility transfer.

This got me thinking: the presence/messaging system I’ve been building
looks like a hybrid between synchronous and asynchronous messaging. Is
it fair to say you can use the presence mechanism to synchronise, and
then the messaging mechanism to communicate?

Rendezvous via presence: this makes temporal dependencies between
processes during system startup resolve themselves automatically. No
more races at boot time!

Exceptions and errors are propagated via loss of presence.
 - Erlang's process links and monitors can be seen as a special case
   of this mechanism.
 - Erlang/OTP-like supervisors fit naturally into this picture

|#