* --- Probabilistic Programming Reading Group, 4th Dec 2014 ------------
Notes by Jeremy Gibbons
on "Embedded Probabilistic Programming", Oleg Kiselyov and Ken Shan
from IFIP TC2 Working Conference on Domain-Specific Languages, 2009
(I'll actually be talking more about embedded domain-specific
languages, with probabilistic programming as an application.)
* --- imports ----------------------------------------------------------
We'll need some language extensions, and to import a number of libraries.
> {-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances, FlexibleInstances, MultiParamTypeClasses #-}
> import Prelude hiding (fail, flip)
> import Data.List (genericLength, groupBy, sort)
> import System.Random (StdGen, getStdRandom, randomR)
> import Control.Monad.State hiding (fail)
* --- probabilistic choice ---------------------------------------------
We'll use rationals to represent probabilities. These will (should) be
in the range 0 to 1, but there is no straightforward way in Haskell to
enforce that.
> type Prob = Rational
The main construct we need to implement probabilistic programs is to
represent a probabilistic choice between two values. This we do via
the "Choice" datatype below. This declares a constructor "C", which
takes a probability and two values of the same type, and constructs a
data structure representing the weighted choice between those two
values. For example,
C (1%2) False True
represents a 50/50 choice between two booleans, ie a fair coin. (The
"%" is the constructor for rationals. The "deriving Show" clause is
just so we can display "Choice" values as outputs.)
> data Choice a = C Prob a a
> deriving Show
This is merely a representation of the *syntax* of a choice operation;
we have later to specify the semantics. The most important function on
representations of the syntax of operations, such as our "Choice"
datatype, is to provide access to the "arguments" to those
operations. That is done by the "Functor" typeclass, with a single
method "fmap" that applies a given function to each of those
arguments. For "Choice", there are two arguments.
> instance Functor Choice where
> fmap f (C w x y) = C w (f x) (f y)
* --- choice trees -----------------------------------------------------
The "Choice" datatype lets us represent a single probabilistic choice,
but not multiple such choices. For the latter, we need to define a
recursive datatype, here called "Tree". This takes two parameters. The
second parameter is the type of elements. The "Leaf" constructor
builds a singleton tree from an element. The first parameter is an
*operation on types*, like "Choice", and specifies the shape of the
tree; the "Node" constructor builds a composite tree, with shape
specified by the shape parameter - in the case of "Choice", each
"Node" has a weight and two children.
> data Tree f a = Leaf a
> | Node (f (Tree f a))
Here again is a "Show" instance:
> deriving instance (Show a, Show (f (Tree f a))) => Show (Tree f a)
The "Tree" datatype is generic in the shape, ie the syntax of the
DSL. We will focus on the DSL of probabilistic choices, so we provide
a specialisation to that shape:
> type ChoiceTree a = Tree Choice a
* --- example programs -------------------------------------------------
For a first example, here is a (this time biased) coin, represented as
a tree:
> biasedcoin :: Prob -> ChoiceTree Bool
> biasedcoin w = Node (C w (Leaf False) (Leaf True))
All well and good, but we could have done that without the
trees. Here's a function "uniform" that can build trees bigger than
size two. And smaller, for that matter; but the argument should be a
non-empty list. The function constructs a uniform choice from the
given list of possible values.
> uniform :: [a] -> ChoiceTree a
> uniform [] = error "List of choices should be non-empty"
> uniform [x] = Leaf x
> uniform xs = Node (C (1/genericLength xs) (Leaf (head xs)) (uniform (tail xs)))
For example,
uniform [1..6]
builds the choice tree representing the roll of a fair die:
*Main> uniform [1..6]
Node (C (1 % 6) (Leaf 1) (Node (C (1 % 5) (Leaf 2) (Node (C (1 % 4) (Leaf 3) (Node (C (1 % 3) (Leaf 4) (Node (C (1 % 2) (Leaf 5) (Leaf 6))))))))))
But now suppose that we want to compose bigger probabilistic programs
out of smaller ones. For example, on the basis of the outcome of a
biased coin, we might want to choose between rolling a fair die and a
biased die. For this, we need to be able to glue trees together; then
we can make a second choice on the basis of the outcome of a
first. This is *substitution*. It takes a tree with elements of some
type "a", and a function that turns any such "a" into a tree of "b"s,
and then substitutes each "a" with the corresponding tree of "b"s.
> subst :: ChoiceTree a -> (a -> ChoiceTree b) -> ChoiceTree b
> subst (Leaf a) k = k a
> subst (Node x) k = Node (fmap (`subst` k) x)
Now our example is as follows:
> eg :: ChoiceTree Int
> eg = subst (biasedcoin 0.3) (\ b -> if b then uniform [1..6] else uniform [1,1,2,3,3,4,5,5,6])
If the outcome of the biased coin is "True", it rolls a fair die; if
the coin yields "False", it rolls a die that is twice as likely to
yield an odd number as an even one.
* --- free monad --------------------------------------------------------
You might think of certain operations on types, such as ChoiceTree, as
representing computations. The important ingredients are a way to lift
values into computations, and a way to sequence two computations
together. The first is given by the "Leaf" constructor; the second by
substitution. An operation on types that supports those two operations
(in such a way as to satisfy three simple laws, omitted here) is
called a monad. What we have defined above, in fact, is known as "the
free monad on the Choice functor". But we didn't need anything in
particular about "Choice"; it works for any functor:
> instance Functor f => Monad (Tree f) where
> return a = Leaf a
> Leaf a >>= k = k a
> Node x >>= k = Node (fmap (>>= k) x)
The two operations are conventionally called "return" and "(>>=)"
(pronounced "bind").
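Those three omitted laws (left identity, right identity, and associativity) can at least be spot-checked. Here is a small standalone sketch, separate from the literate script, that checks them on sample values in the familiar "Maybe" monad (a proof for "Tree f" needs induction; the names below are ours):

```haskell
-- Spot checks of the three monad laws, in the Maybe monad.
-- These are checks on sample values, not proofs.
leftIdentity, rightIdentity, assoc :: Bool
leftIdentity = (return 3 >>= k) == k 3                  -- return a >>= k  ==  k a
  where k x = Just (x + 1)
rightIdentity = (m >>= return) == m                     -- m >>= return  ==  m
  where m = Just 3
assoc = ((m >>= k) >>= h) == (m >>= (\ x -> k x >>= h)) -- associativity
  where m = Just 3
        k x = Just (x + 1)
        h x = Just (x * 2)

main :: IO ()
main = print (leftIdentity && rightIdentity && assoc)
```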
Using these new names, we might write Daphne Koller's "letter of
support" example as follows:
> ref :: ChoiceTree Bool
> ref = biasedcoin 0.6 >>= \ difficult ->
> biasedcoin 0.7 >>= \ intelligent ->
> biasedcoin (case (intelligent,difficult) of
> (False,False) -> 0.3
> (False,True) -> 0.05
> (True,False) -> 0.9
> (True,True) -> 0.5) >>= \ grade ->
> biasedcoin (if intelligent then 0.95 else 0.2) >>= \ sat ->
> biasedcoin (if grade then 0.4 else 0.1) >>= \ letter ->
> return sat
Haskell provides a special "do" notation for writing such monadic
expressions, where the "bind"s turn into actual binding constructs (or
rather, the binding construct "<-" is translated into a ">>=" and a
lambda binding). This makes a monadic computation read like an
imperative program.
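As a standalone illustration of the translation (in the familiar "Maybe" monad rather than our choice trees), the two forms below are the same computation; "sugared" and "desugared" are just names we have made up:

```haskell
sugared, desugared :: Maybe Int

-- the "do" form ...
sugared = do
  x <- Just 3
  y <- Just 4
  return (x + y)

-- ... and its translation: each "<-" becomes a ">>=" and a lambda binding
desugared = Just 3 >>= \ x ->
            Just 4 >>= \ y ->
            return (x + y)

main :: IO ()
main = print (sugared == desugared, sugared)
```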
> ref' :: ChoiceTree Bool
> ref' = do
> difficult <- biasedcoin 0.6
> intelligent <- biasedcoin 0.7
> grade <- biasedcoin (case (intelligent,difficult) of
> (False,False) -> 0.3
> (False,True) -> 0.05
> (True,False) -> 0.9
> (True,True) -> 0.5)
> sat <- biasedcoin (if intelligent then 0.95 else 0.2)
> letter <- biasedcoin (if grade then 0.4 else 0.1)
> return sat
* --- distributions -----------------------------------------------------
So far, we have only discussed the syntax for probabilistic
computations. What about semantics? It's merely a matter of writing a
function to interpret the syntax into a semantic domain. For example,
we might want a semantics in terms of distributions: finite sets of
weighted values, such that the weights sum to one. For simplicity, we
represent those sets as lists, and we don't worry about order or
duplicates.
> newtype Distr a = D { unD :: [(a,Prob)] } deriving Show
Here's the semantic interpretation, from choice trees to distributions
(the name "reflect" is the one used by Kiselyov and Shan):
> reflect :: ChoiceTree a -> Distr a
> reflect (Leaf a) = D [(a,1)]
> reflect (Node (C w x y)) = wsum w (reflect x) (reflect y)
This uses the weighted sum of two distributions
> wsum :: Prob -> Distr a -> Distr a -> Distr a
> wsum w xs ys = D (scale w (unD xs) ++ scale (1-w) (unD ys))
which in turn involves scaling a distribution (which generally makes
the weights sum to something other than one, so is declared here to
work on the underlying lists of pairs rather than on distributions per
se).
> scale :: Prob -> [(a,Prob)] -> [(a,Prob)]
> scale w xs = [(x,w*w') | (x,w') <- xs]
For example,
*Main> reflect (uniform [1..6])
D {unD = [(1,1 % 6),(2,1 % 6),(3,1 % 6),(4,1 % 6),(5,1 % 6),(6,1 % 6)]}
For completeness, but not used later, here is an inverse to "reflect"
(note that the tail of the list must be renormalised by 1/(1-w), so
that "reflect" restores the original weights):
> reify :: Distr a -> ChoiceTree a
> reify (D [(a,_)]) = Leaf a
> reify (D ((a,w):xs)) = Node (C w (Leaf a) (reify (D (scale (recip (1-w)) xs))))
and a function to simplify a distribution, grouping together matching
elements:
> simplify :: Ord a => Distr a -> Distr a
> simplify = D . map combine . groupBy (\ u v -> fst u == fst v) . sort . unD
> where combine xs = (fst (head xs), sum (map snd xs))
For example, here is the semantics of our "biased or fair die" example as a distribution with duplicates:
*Main> reflect eg
D {unD = [(1,1 % 30),(1,1 % 30),(2,1 % 30),(3,1 % 30),(3,1 % 30),(4,1 % 30),(5,1 % 30),(5,1 % 30),(6,1 % 30),(1,7 % 60),(2,7 % 60),(3,7 % 60),(4,7 % 60),(5,7 % 60),(6,7 % 60)]}
and here it is simplified to combine duplicates:
*Main> simplify (reflect eg)
D {unD = [(1,11 % 60),(2,3 % 20),(3,11 % 60),(4,3 % 20),(5,11 % 60),(6,3 % 20)]}
* --- sampling --------------------------------------------------------
But distributions aren't the only semantics we might consider. Another
is to sample from the probabilistic computation. In Haskell, this
consists of a function that operates on a random number generator; and
one of those can be obtained from the run-time environment in the same
way that I/O is done.
> newtype Rand a = R { unR :: StdGen -> (a, StdGen) }
> randUnitFloat :: Rand Float
> randUnitFloat = R (randomR (0,1))
> runRand :: Rand a -> IO a
> runRand = getStdRandom . unR
Those five lines are a bit cryptic. But all I need to do to let you
use them is explain how the three aspects of choice trees can be
represented using the "Rand" type operation. Leaves and substitution
are dealt with by making "Rand" an instance of "Monad" and defining
analogous "return" and "(>>=)" functions:
> instance Monad Rand where
> return a = R (\ s -> (a,s))
> m >>= k = R (\ s -> let (a,s') = unR m s in unR (k a) s')
You might see in there that "return" gives back the value "a" without
touching the random number generator, and ">>=" gets an updated
generator from the left-hand argument and feeds it to the right-hand
argument.
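The same generator-threading pattern can be seen with a toy deterministic "generator" in place of "StdGen". Everything below (the names, the congruence constants) is invented purely for illustration:

```haskell
-- A toy "generator": an Int seed, stepped by a linear congruence.
type Seed = Int

next :: Seed -> (Int, Seed)
next s = (s `mod` 10, (s * 1103515245 + 12345) `mod` 2147483648)

-- Threading two draws by hand, exactly the plumbing that (>>=) hides:
twoDraws :: Seed -> ((Int, Int), Seed)
twoDraws s0 = let (a, s1) = next s0  -- first draw updates the seed ...
                  (b, s2) = next s1  -- ... which the second draw consumes
              in ((a, b), s2)

main :: IO ()
main = print (fst (twoDraws 42))
```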
The only other thing we need is to interpret choice itself. We do this
by drawing a random float from the random number generator and
comparing it to the (rational) weight:
> randChoice :: Prob -> Rand a -> Rand a -> Rand a
> randChoice w x y = do
> u <- randUnitFloat
> if u <= fromRational w then x else y
Using these, we can write the interpretation as a sampling function:
> sample :: ChoiceTree a -> Rand a
> sample (Leaf a) = return a
> sample (Node (C w x y)) = randChoice w (sample x) (sample y)
For example,
*Main> runRand (sample (uniform [1..6]))
1
*Main> runRand (sample (uniform [1..6]))
3
(We haven't used the "(>>=)" yet, but we will.)
* --- interface for different implementations --------------------------
As a general principle, once you have two implementations of the same
interface (if not before), you should explicitly define that interface
and start programming to it rather than to any of the
implementations. For us, the interface is a subclass of "Monad",
because as well as "return" and "(>>=)" it has an additional operation
representing probabilistic choice:
> class Monad m => MonadChoice m where
> choice :: Prob -> m a -> m a -> m a
Of course, choice trees implement this interface. We've already seen
that they implement the "Monad" interface, so all we have to add is an
explanation of how they implement "choice":
> instance MonadChoice (Tree Choice) where
> choice w x y = Node (C w x y)
Now we can write programs to this interface rather than to a specific
implementation. Basically the same definition works as before, but
using the new names and with a more generic type:
> biasedcoinGeneric :: MonadChoice m => Prob -> m Bool
> biasedcoinGeneric w = choice w (return False) (return True)
> refGeneric :: MonadChoice m => m Bool
> refGeneric = do
> difficult <- biasedcoinGeneric 0.6
> intelligent <- biasedcoinGeneric 0.7
> grade <- biasedcoinGeneric (case (intelligent,difficult) of
> (False,False) -> 0.3
> (False,True) -> 0.05
> (True,False) -> 0.9
> (True,True) -> 0.5)
> sat <- biasedcoinGeneric (if intelligent then 0.95 else 0.2)
> letter <- biasedcoinGeneric (if grade then 0.4 else 0.1)
> return sat
That is, "biasedcoinGeneric" and "refGeneric" work for *any* monad
"m", as long as "m" is an instance of "MonadChoice" (thereby
supporting "return", "(>>=)" and "choice"). Just "refGeneric" by
itself can't be evaluated, because of its generic type; but we can
specialise to a concrete type, and then it can be evaluated:
*Main> refGeneric :: ChoiceTree Bool
Node (C (3 % 5) (Node (C (7 % 10) (Node (C (3 % 10) (Node (C (1 % 5) (Node (C (1 % 10) (Leaf False) (Leaf False))) (Node (C (1 % 10) (Leaf True) (Leaf True))))) (Node (C (1 % 5) (Node (C (2 % 5) (Leaf False) (Leaf False))) (Node (C (2 % 5) (Leaf True) (Leaf True))))))) (Node (C (9 % 10) (Node (C (19 % 20) (Node (C (1 % 10) (Leaf False) (Leaf False))) (Node (C (1 % 10) (Leaf True) (Leaf True))))) (Node (C (19 % 20) (Node (C (2 % 5) (Leaf False) (Leaf False))) (Node (C (2 % 5) (Leaf True) (Leaf True))))))))) (Node (C (7 % 10) (Node (C (1 % 20) (Node (C (1 % 5) (Node (C (1 % 10) (Leaf False) (Leaf False))) (Node (C (1 % 10) (Leaf True) (Leaf True))))) (Node (C (1 % 5) (Node (C (2 % 5) (Leaf False) (Leaf False))) (Node (C (2 % 5) (Leaf True) (Leaf True))))))) (Node (C (1 % 2) (Node (C (19 % 20) (Node (C (1 % 10) (Leaf False) (Leaf False))) (Node (C (1 % 10) (Leaf True) (Leaf True))))) (Node (C (19 % 20) (Node (C (2 % 5) (Leaf False) (Leaf False))) (Node (C (2 % 5) (Leaf True) (Leaf True))))))))))
* --- shallow embeddings -----------------------------------------------
The choice tree type is a so-called *deep embedding* of a DSL for
probabilistic computation. Probabilistic programs such as "biasedcoin"
and "ref" merely generate syntactic representations of a computation
(trees), and a subsequent interpretation function such as "reflect" or
"sample" is needed in order to provide semantics.
An alternative is to give a *shallow embedding*, in which a
probabilistic program is directly implemented in terms of its
semantics. The crucial observation is that we can do this precisely
when the interpretation function defining the semantics is
*compositional*. One can give a formal definition of this (see my
paper "Folding DSLs: Deep and Shallow Embeddings"); but in a nutshell,
we have to be able to make the semantic domain an instance of the
"MonadChoice" interface.
For example, we can make a shallow embedding directly as
distributions, because the "reflect" function is compositional. In
other words, the "Distr" type is an instance of the "Monad" class:
> instance Monad Distr where
> return a = D [(a,1)]
> xs >>= k = D (concat [ scale w (unD (k x)) | (x,w) <- unD xs ])
(where "return" constructs a singleton distribution, and "(>>=)" does
a kind of weighted substitution of distributions into distributions);
and also an instance of the "MonadChoice" subclass:
> instance MonadChoice Distr where
> choice = wsum
(where "choice" is weighted sum of distributions).
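To see the weighted substitution at work in isolation, here is a standalone sketch that re-implements "return" and "(>>=)" for distributions as plain functions (the names "distReturn", "distBind" and "coin" are ours), and uses them to compute the distribution of the sum of two fair coins:

```haskell
import Data.Ratio ((%))

type Prob = Rational
newtype Distr a = D { unD :: [(a, Prob)] }

-- return: a singleton distribution
distReturn :: a -> Distr a
distReturn a = D [(a, 1)]

-- (>>=): weighted substitution - each outcome x with weight w is
-- replaced by the distribution k x, scaled by w
distBind :: Distr a -> (a -> Distr b) -> Distr b
distBind xs k = D (concat [ [ (y, w * w') | (y, w') <- unD (k x) ] | (x, w) <- unD xs ])

coin :: Distr Int
coin = D [(0, 1 % 2), (1, 1 % 2)]

-- Sum of two fair coins: four equally weighted paths (with duplicates)
twoCoins :: Distr Int
twoCoins = coin `distBind` \ a -> coin `distBind` \ b -> distReturn (a + b)

main :: IO ()
main = print (unD twoCoins)
```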
Now our generic probabilistic programs can be specialised to the
"Distr" semantics:
*Main> refGeneric :: Distr Bool
D {unD = [(False,63 % 25000),(False,567 % 25000),(True,63 % 6250),(True,567 % 6250),(False,147 % 6250),(False,441 % 12500),(True,294 % 3125),(True,441 % 3125),(False,1539 % 100000),(False,13851 % 100000),(True,81 % 100000),(True,729 % 100000),(False,171 % 25000),(False,513 % 50000),(True,9 % 25000),(True,27 % 50000),(False,7 % 25000),(False,63 % 25000),(True,7 % 6250),(True,63 % 6250),(False,133 % 6250),(False,399 % 12500),(True,266 % 3125),(True,399 % 3125),(False,57 % 10000),(False,513 % 10000),(True,3 % 10000),(True,27 % 10000),(False,57 % 2500),(False,171 % 5000),(True,3 % 2500),(True,9 % 5000)]}
Similarly, the "sample" interpretation is compositional, and so we
can use samplers as a shallow embedding. We have already seen above
that "Rand" is an instance of the class "Monad"; it is also an
instance of the subclass "MonadChoice":
> instance MonadChoice Rand where
> choice = randChoice
Now our generic probabilistic programs can be specialised to the
"Rand" type too. We can't directly evaluate a value of type "Rand
Bool", but we can extract an "IO" computation from one and run that:
*Main> runRand (refGeneric :: Rand Bool)
True
*Main> runRand (refGeneric :: Rand Bool)
False
* --- observations -----------------------------------------------------
Now let's extend our DSL. As well as providing probabilistic choice,
we'll allow ourselves to reject executions. This manifests itself as a
piece of syntax for rejection, which is just a constant that we'll
write as "F":
> data Fail a = F
> deriving Show
In order to use the free monad approach as above, we need to make
"Fail" an instance of "Functor". That is, we need to be able to apply
a given function "f" to every argument of "F". But that's easy,
because "F" has no arguments:
> instance Functor Fail where
> fmap f F = F
One could make a little DSL using the syntax of failure alone (in
fact, we do so, at the end of this script); but it's not a very
exciting language. Instead, we will make a slightly bigger language,
providing both choice and failure. A general technique for doing this
would be to combine the two partial syntaxes "Choice" and "Fail" (for
details on this approach, see the paper "Datatypes a la Carte" by
Wouter Swierstra). We'll take a shortcut, and simply define a new
syntax from scratch that supports both operations:
> data ChoiceFail a = Ch Prob a a
> | Fl
> deriving Show
Again, we have to make it an instance of "Functor":
> instance Functor ChoiceFail where
> fmap f (Ch w x y) = Ch w (f x) (f y)
> fmap f Fl = Fl
Because we now have an instance of "Functor", we can make a free monad
using the "Tree" type. We introduce a synonym for trees with this more
general shape - each internal node is *either* a choice node (with a
probability and two children) or a failure node (with no further
information):
> type ChoiceFailTree a = Tree ChoiceFail a
We can generalise the "reflect" function from before to handle failure
too. But now we get subdistributions rather than distributions as the
result - the weights generally sum to less than one.
> reflectF :: ChoiceFailTree a -> Distr a
> reflectF (Leaf a) = D [(a,1)]
> reflectF (Node (Ch w x y)) = wsum w (reflectF x) (reflectF y)
> reflectF (Node Fl) = D []
Since we have expanded the language, we need to expand the interface
that representations of it must support: as well as plain values,
substitution, and choice, it must support failure.
> class Monad m => MonadFail m where
> fail :: m a
Now, "ChoiceFailTree" is a free monad type, so it is automatically an
instance of "Monad". It also supports choice and failure, using the
two variants of the "ChoiceFail" syntax:
> instance MonadChoice (Tree ChoiceFail) where
> choice w x y = Node (Ch w x y)
> instance MonadFail (Tree ChoiceFail) where
> fail = Node Fl
Having gone to all that effort, we can now write probabilistic
programs that may fail; this lets us interpose *observations* into the
computation. We introduce a little shorthand: observing some boolean
property is a no-op if the property holds, but a failure when it
doesn't.
> observe :: MonadFail m => Bool -> m ()
> observe b = if b then return () else fail
Now we can extend Daphne Koller's "letter of support" example to
reject executions in which no letter is sent; this gives us the
conditional probability that the student's SAT score is good, *given*
that a letter is sent.
> refObserve :: (MonadChoice m, MonadFail m) => m Bool
> refObserve = do
> difficult <- biasedcoinGeneric 0.6
> intelligent <- biasedcoinGeneric 0.7
> grade <- biasedcoinGeneric (case (intelligent,difficult) of
> (False,False) -> 0.3
> (False,True) -> 0.05
> (True,False) -> 0.9
> (True,True) -> 0.5)
> sat <- biasedcoinGeneric (if intelligent then 0.95 else 0.2)
> letter <- biasedcoinGeneric (if grade then 0.4 else 0.1)
> observe letter
> return sat
The result is rather a complicated tree, with some failure nodes in:
*Main> refObserve :: ChoiceFailTree Bool
Node (Ch (3 % 5) (Node (Ch (7 % 10) (Node (Ch (3 % 10) (Node (Ch (1 % 5) (Node (Ch (1 % 10) (Node Fl) (Leaf False))) (Node (Ch (1 % 10) (Node Fl) (Leaf True))))) (Node (Ch (1 % 5) (Node (Ch (2 % 5) (Node Fl) (Leaf False))) (Node (Ch (2 % 5) (Node Fl) (Leaf True))))))) (Node (Ch (9 % 10) (Node (Ch (19 % 20) (Node (Ch (1 % 10) (Node Fl) (Leaf False))) (Node (Ch (1 % 10) (Node Fl) (Leaf True))))) (Node (Ch (19 % 20) (Node (Ch (2 % 5) (Node Fl) (Leaf False))) (Node (Ch (2 % 5) (Node Fl) (Leaf True))))))))) (Node (Ch (7 % 10) (Node (Ch (1 % 20) (Node (Ch (1 % 5) (Node (Ch (1 % 10) (Node Fl) (Leaf False))) (Node (Ch (1 % 10) (Node Fl) (Leaf True))))) (Node (Ch (1 % 5) (Node (Ch (2 % 5) (Node Fl) (Leaf False))) (Node (Ch (2 % 5) (Node Fl) (Leaf True))))))) (Node (Ch (1 % 2) (Node (Ch (19 % 20) (Node (Ch (1 % 10) (Node Fl) (Leaf False))) (Node (Ch (1 % 10) (Node Fl) (Leaf True))))) (Node (Ch (19 % 20) (Node (Ch (2 % 5) (Node Fl) (Leaf False))) (Node (Ch (2 % 5) (Node Fl) (Leaf True))))))))))
However, we can compute and simplify the subdistribution represented
by that tree:
*Main> simplify (reflectF (refObserve :: ChoiceFailTree Bool))
D {unD = [(False,32667 % 100000),(True,38193 % 100000)]}
(note that the weights sum to less than one). Better still, the
program is generic, and we can interpret directly in any shallow
embedding that implements the two interfaces "MonadChoice" and
"MonadFail". We have already seen that "Distr" implements the first
interface; and it's easy to make an implementation of the second
interface too, ie a failing subdistribution:
> instance MonadFail Distr where
> fail = D []
Then we have:
*Main> simplify (refObserve :: Distr Bool)
D {unD = [(False,32667 % 100000),(True,38193 % 100000)]}
This time, sampling isn't obviously another implementation of the
interface: it's not clear how to implement failure, without actually
raising an exception (which would really halt evaluation, rather than
letting us properly handle rejection).
How can we fix this? The idea is that if we reach a failure, we should
re-run the execution and try again. We achieve this in two
steps. First, we allow the body of the computation to return an
optional rather than a mandatory result:
> newtype RandFail a = RF { unRF :: StdGen -> (Maybe a, StdGen) }
Here, the standard datatype "Maybe a" represents an optional value of
type "a": either "Just x" for some "x :: a", or "Nothing".
data Maybe a = Nothing | Just a
Second, we wrap the possibly-failing stateful computation in a loop,
catching the failure and trying again on the resulting state:
> loop :: (s -> (Maybe a, s)) -> (s -> (a, s))
> loop f s = case f s of
> (Just a, s') -> (a,s')
> (Nothing, s') -> loop f s'
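To see "loop" in action on its own, here is a standalone sketch (repeating the definition so it runs by itself) with a deterministic step function, invented for illustration, that rejects and advances a counter state until the counter reaches 3:

```haskell
loop :: (s -> (Maybe a, s)) -> (s -> (a, s))
loop f s = case f s of
  (Just a,  s') -> (a, s')
  (Nothing, s') -> loop f s'

-- Rejects (Nothing) three times, then succeeds with the current state:
step :: Int -> (Maybe Int, Int)
step n = if n < 3 then (Nothing, n + 1) else (Just n, n + 1)

main :: IO ()
main = print (loop step 0)  -- three rejections, then success at state 3
```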
Now we can turn a possibly-failing random-number-consuming process
into an IO computation to be run at the REPL command line:
> runRandF :: RandFail a -> IO a
> runRandF = getStdRandom . loop . unRF
And finally, we can now explain how "RandFail" is suitable for a
shallow representation. It's a bit clunky, but not difficult. (Note
that "RandFail a" is equivalent to "Rand (Maybe a)"; so we could make
the code shorter but perhaps less clear by exploiting this observation
and reusing previous code.)
First, we make an instance of "Monad":
> instance Monad RandFail where
> return a = RF (\ s -> (Just a,s))
> m >>= k = RF (\ s -> case unRF m s of
> (Just a, s') -> unRF (k a) s'
> (Nothing, s') -> (Nothing, s'))
This is very like the instance for plain "Rand", except that "return"
generates an extra "Just", and "(>>=)" has to handle the left-hand
argument returning either "Just" or "Nothing". (Note that if the
left-hand argument "m" fails, the right-hand argument "k" is never
executed.)
Second, we make an instance of "MonadChoice", which boils down to a
function "randFailChoice" analogous to "randChoice" on plain "Rand"s.
> instance MonadChoice RandFail where
> choice = randFailChoice
> randFailChoice :: Prob -> RandFail a -> RandFail a -> RandFail a
> randFailChoice w x y = do
> u <- randFailUnitFloat
> if u <= fromRational w then x else y
> randFailUnitFloat :: RandFail Float
> randFailUnitFloat = RF (\ s -> let (a,s') = randomR (0,1) s in (Just a, s'))
Third, we make an instance of "MonadFail", which is a computation that
always fails:
> instance MonadFail RandFail where
> fail = RF (\ s -> (Nothing, s))
For example, a single run again returns only a boolean; if the sample
is rejected by the observation clause, another one is drawn:
*Main> runRandF refObserve
False
*Main> runRandF refObserve
True
We can draw many samples and measure the relative outcomes:
*Main> do { results <- sequence (replicate 10000 (runRandF refObserve)) ; return (length (filter not results),length (filter id results)) }
(4549,5451)
*Main> 5451/4549
1.1982853374367992
Compare this with the exact distribution that we computed earlier:
*Main> simplify (refObserve :: Distr Bool)
D {unD = [(False,32667 % 100000),(True,38193 % 100000)]}
*Main> 38193/32667
1.1691615391679677
* --- two more free monads ----------------------------------------------
The language of failure is represented by the free monad for the
functor "Fail" defined above:
data Fail a = F
Of course, the free monad supports failure:
> instance MonadFail (Tree Fail) where
> fail = Node F
I leave it as an exercise to convince yourself that "Tree Fail a" is
isomorphic to "Maybe a", so that a computation in the language of
failure returns either a result or nothing.
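As a nudge towards that exercise, the two halves of the isomorphism can be written down directly; here is a standalone sketch (the names "toMaybe" and "fromMaybe'" are ours):

```haskell
data Fail a = F
data Tree f a = Leaf a | Node (f (Tree f a))

-- A Node carries a value of type Fail (Tree Fail a), which can only be F;
-- so a Tree Fail a is either a Leaf or a bare failure node.
toMaybe :: Tree Fail a -> Maybe a
toMaybe (Leaf a) = Just a
toMaybe (Node F) = Nothing

fromMaybe' :: Maybe a -> Tree Fail a
fromMaybe' (Just a) = Leaf a
fromMaybe' Nothing  = Node F

-- The round trips are identities:
main :: IO ()
main = print (toMaybe (fromMaybe' (Just 5)), toMaybe (fromMaybe' (Nothing :: Maybe Int)))
```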
> instance MonadFail Maybe where
> fail = Nothing
Now programs can be written that accommodate failure, generically:
> average :: MonadFail m => [Float] -> m Float
> average [] = fail
> average xs = return (sum xs / genericLength xs)
*Main> average [1,3,4] :: Tree Fail Float
Leaf 2.6666667
*Main> average [] :: Tree Fail Float
Node F
*Main> average [] :: Maybe Float
Nothing
Here's a more interesting DSL, of stateful computations (on a state of
type "s"). The two operations are to "Put" a new value into the state
and then continue, and to "Get" a value from the state and continue
with a computation that may depend on the value read:
> data PutGet s a = Put s a | Get (s->a)
It's a functor, in a straightforward way:
> instance Functor (PutGet s) where
> fmap f (Put s a) = Put s (f a)
> fmap f (Get k) = Get (f . k)
so of course "Tree (PutGet s)" is a free monad. Haskell already
provides an interface for monads with these features:
class Monad m => MonadState s m | m -> s where
get :: m s
put :: s -> m ()
which in our language is implemented as follows:
> instance MonadState s (Tree (PutGet s)) where
> put s = Node (Put s (Leaf ()))
> get = Node (Get Leaf)
Here's a little generic stateful program, which operates on an integer
state and increments it:
> incr :: MonadState Int m => m ()
> incr = do { n <- get ; put (n+1) }
We can't simply show a "Tree (PutGet s)" value, because it contains
functions which we can't convert to strings. However, we can interpret
such a tree as a stateful expression, ie one that depends on an input
state, and returns an updated output state along with its result:
> runPutGet :: Tree (PutGet s) a -> (s -> (a,s))
> runPutGet (Leaf a) = \ s -> (a,s)
> runPutGet (Node (Put s m)) = \ _ -> runPutGet m s
> runPutGet (Node (Get k)) = \ s -> runPutGet (k s) s
For example, executing the increment program when the initial state is
3 yields a final state of 4 (and the unit value as the actual result):
*Main> runPutGet incr 3
((),4)
* --- footer, just for emacs --------------------------------------------
%Local Variables: ***
%mode: literate-haskell ***
%eval: (orgstruct-mode) ***
%mmm-classes: (literate-haskell-latex-spec literate-haskell-latex-code) ***
%End: ***