Wednesday January 24, 2001
Function definitions in pure functional languages like
Haskell can be evaluated in a clean "mathematical way":
Consider the function
double :: Integer -> Integer
double n = n + n -- (***) see below
which, given an integer, returns the doubled integer.
How does a functional system like Haskell apply a function to the given arguments, i.e., how are expressions evaluated?
The expression "double (double 4)" is a reducible expression (redex, for short) because we can apply the function definition of double to the corresponding argument. If an expression can still be reduced by applying some function definition, it is a redex; otherwise, i.e., if no more function definitions are applicable, we say the expression is in normal form. There are two main strategies to reduce an expression to its normal form.
LEFTMOST INNERMOST (LI): the next redex to try is the innermost one found by scanning the expression from left to right. This is also known as applicative order, strict, or call by value.
Example:
double (double 4) -- the second (i.e., inner) "(double 4)" is reduced
-- via the (***) function definition above; we get...
=> double (4 + 4) -- now we reduce 4 + 4 to 8...
=> double 8 -- next we can reduce with (***) again...
=> (8 + 8) -- the final reduction yields...
=> 16 -- voila! The normal form
LEFTMOST OUTERMOST (LO): the next redex to work on is the leftmost outermost. This is also known as normal order, call by name, and corresponds roughly to non-strict (lazy) evaluation or call by need.
Example:
double (double 4) -- the LO (leftmost outermost) "double" is reduced
-- via the (***) function definition above; we get...
=> (double 4) + (double 4) -- again the LO "double" is reduced
=> (4 + 4) + (double 4) -- we can't reduce the middle (=outermost) "+" ..
-- ... so we reduce the first "+"
=> 8 + (double 4) -- the only redex is "double"
=> 8 + (4 + 4) -- here only the 2nd "+" can be reduced
=> 8 + 8 -- only one "+" remains
=> 16
The LO reduction has a disadvantage vs. the LI reduction if a function parameter, like "n" above, occurs multiple times in the function body (here: "n + n"): the unevaluated argument expression is duplicated, so it may be reduced more than once. By working on graph structures, where all occurrences share a single node, this extra work can be avoided.
An example. Let us say the function h is defined as h x y = x + x
We compute h (9-3) (h 34 3)
Step 1: (9-3) + (9-3) -- the second argument is ignored
Step 2: 6+6 -- 6 is computed only once, because (9-3) is a single shared node of the operation graph over which the + operates
Step 3: 12
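This sharing claim is easy to check; a minimal runnable sketch (the names h and example are just for illustration):

```haskell
-- h uses its first argument twice and ignores its second entirely
h :: Integer -> Integer -> Integer
h x y = x + x

-- LO reduction gives (9-3) + (9-3); with graph reduction the shared
-- node (9-3) is evaluated only once, and (h 34 3) not at all.
example :: Integer
example = h (9 - 3) (h 34 3)   -- evaluates to 12
```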
The LO reduction has the clear advantage over LI that if arguments are not really needed, no attempts are made to evaluate the (unnecessary) arguments:
Example: Let's define
first (a, b) = a
Then with LO we immediately get
first (5, double 4)
=> 5
whereas LI first evaluates (in two additional reduction steps!) the argument "double 4", only to throw the result away, since first simply ignores the second component of the pair!
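We can take this one step further: under LO (lazy) evaluation the ignored component may even be an expression that would crash if it were ever evaluated. A small sketch using the Prelude's undefined as the dangerous argument:

```haskell
first :: (a, b) -> a
first (a, b) = a

double :: Integer -> Integer
double n = n + n

ok1 :: Integer
ok1 = first (5, double 4)    -- 5; the reductions of (double 4) never happen

ok2 :: Integer
ok2 = first (5, undefined)   -- also 5; the "bomb" is never demanded
```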
Another way of explaining non-strict (lazy) function evaluation is that Haskell computes using definitions rather than assignments. For example, a declaration such as
v = 1/0
should be read as "define v as 1/0" and not "compute 1/0 and store the result in v". Only if (and when) the actual value (definition) of v is needed will the division by zero error occur.
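A sketch of this reading, using integer division (div), which raises a divide-by-zero error only when its result is actually demanded:

```haskell
v :: Integer
v = 1 `div` 0        -- a definition: no division is performed here

-- Any computation that never demands v runs fine:
answer :: Integer
answer = const 42 v  -- const ignores its second argument, so v is never forced
```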
Haskell is a strongly typed language: the Haskell compiler checks whether or not all expressions that the program wishes to evaluate obey the typing rules of the language, so many errors are caught before the program is actually run.
How is this done? First, let’s look at a couple of function definitions.
A simple function DECLARATION:
add :: Integer -> Integer -> Integer
And now the DEFINITION:
add x y = x + y
And now a little more complex function declaration: merge :: Ord a => [a] -> [a] -> [a] (the Ord a constraint is needed because the definition compares elements with < and ==)
The definition:
merge [] [] = []
merge [] (y:ys) = (y:ys)
merge (x:xs) [] = (x:xs)
merge (x:xs) (y:ys)
  | x < y     = x : merge xs (y:ys)
  | x == y    = x : merge xs ys
  | otherwise = y : merge (x:xs) ys
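For example, merging two ordered lists (the definition is repeated so the snippet is self-contained; note how the x == y guard keeps only one copy of an element common to both lists):

```haskell
merge :: Ord a => [a] -> [a] -> [a]
merge [] [] = []
merge [] (y:ys) = y : ys
merge (x:xs) [] = x : xs
merge (x:xs) (y:ys)
  | x < y     = x : merge xs (y:ys)
  | x == y    = x : merge xs ys
  | otherwise = y : merge (x:xs) ys

-- merge [1, 3, 5] [2, 3, 4] evaluates to [1, 2, 3, 4, 5]
-- (the element 3, present in both lists, appears only once)
```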
The last segment of the definition shows how a function is defined when there are multiple cases (or conditions) to consider.
And now, we can relate this to type checking for monomorphic types.
Let’s say we have the functions:
plus :: Int -> Int -> Int -- adds two integers to produce another
plus x y = x+y
Let’s assume a built-in function
ord :: Char -> Int -- takes a character and returns its numeric (character-code) value
Now take the expression plus (ord 'c') 3 -- is this a valid expression?
Informally, the steps to verify the validity go something like this: 'c' is a Char, and ord :: Char -> Int, so ord 'c' has type Int; plus :: Int -> Int -> Int expects two Int arguments, and both ord 'c' and the literal 3 are Ints; therefore plus (ord 'c') 3 is well typed, with type Int.
Note that we never really evaluated the value of the expression; we just checked it by treating the signatures as rules.
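The compiler performs the same check mechanically; a self-contained sketch (ord lives in Data.Char in standard Haskell):

```haskell
import Data.Char (ord)

plus :: Int -> Int -> Int
plus x y = x + y

-- Well typed: ord 'c' :: Int and 3 :: Int match the signature of plus.
checked :: Int
checked = plus (ord 'c') 3   -- ord 'c' is 99, so this evaluates to 102

-- The type checker would reject the invalid variant at compile time:
-- bad = plus (ord 'c') [1, 3, 5]   -- type error: [Int] is not Int
```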
On the other hand, the expression plus (ord 'c') [1, 3, 5] is clearly not: the second argument has type [Int] where plus requires an Int.
So for a function f :: t1 -> t2 -> … -> tk -> t
the definition is: f p1 p2 … pk
| g1 = e1
| g2 = e2
…
| gn = en
where g1, g2 etc. are conditions called guards,
e1, e2 etc. are the values returned and are all of type t,
p1, p2 etc. are patterns that are consistent with the types of the arguments t1, t2 etc. A pattern is consistent with a type if it matches (some) elements of the type. So, a variable is consistent with any type. A literal is consistent with its own type. The pattern (p:q) is consistent with the type [t] if p is consistent with t and q is consistent with [t]. Similarly we can reason about other types.
Let us say we have a function f
f :: [Int] -> [Int] -> Int
f [] ys = 0
f (x:xs) [] = 0
f (x:xs) (y:ys) = x+y
Now evaluate
f [1 .. 3] [1 ..3]
Step 1. f (1:[2 .. 3]) [1 .. 3] -- expand the first argument: it is not [], so the first equation cannot match
Step 2. f (1:[2 .. 3]) (1:[2 .. 3]) -- expand the second argument: it is not [] either, so the third equation matches with x = 1, y = 1
Step 3. 1 + 1
Step 4. 2
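The whole evaluation, as a runnable sketch:

```haskell
f :: [Int] -> [Int] -> Int
f [] ys = 0
f (x:xs) [] = 0
f (x:xs) (y:ys) = x + y

-- Pattern matching expands just enough of each argument to pick an equation:
-- f [1 .. 3] [1 .. 3] evaluates to 1 + 1 = 2
-- f [] [1 .. 3] evaluates to 0 (first equation)
```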
We have seen that in Haskell one can always define a list by enumerating its elements.
We can say [1, 2, 4, 9, 4] :: [Integer] or ['p', 'o', 's', 't'] :: [Char]. We can even use the string shortcut "post" :: [Char].
There are other shortcuts allowed by Haskell.
[2 .. 7] = [2, 3, 4, 5, 6, 7]
[0, 2 .. 10] = [0, 2, 4, 6, 8, 10]
[1, 3 ..] = [1, 3, 5, 7, …], an infinite sequence!!
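These shortcuts are ordinary expressions, so they can be checked directly; for the infinite sequence we can only inspect a finite prefix (using the Prelude's take, discussed later):

```haskell
demo1, demo2, demo3 :: [Integer]
demo1 = [2 .. 7]          -- [2, 3, 4, 5, 6, 7]
demo2 = [0, 2 .. 10]      -- [0, 2, 4, 6, 8, 10]
demo3 = take 4 [1, 3 ..]  -- [1, 3, 5, 7]: a finite prefix of the infinite sequence
```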
But an important way to define a list is through a syntax called list comprehension. With this syntax one can create a list by describing how to choose elements from one or more lists and put the chosen elements into the constructed list. First, a simple example.
The expression [2*n | n <- [2, 4, 6]] evaluates to the list [4, 8, 12]
The symbol <- is read as member of and the symbol | is read as such that. So the expression reads as: “Make a list of 2 times n such that n comes from the list [2, 4, 6].” Here the list [2, 4, 6] on the right side of the | symbol is called a generator of the list comprehension.
We can have several generators in a list comprehension.
[(x,y) | x <- [1, 3, 5, 7] , y <- [2, 4, 6, 8]] produces all combinations of pairs (the Cartesian product) of the two lists
In addition to generators, we can also add tests: Boolean-valued expressions that determine what goes into the result list.
[(x,y) | x <- [1, 3, 5, 7] , y <- [2, 4, 6, 8], x+y>5] will prevent (1,2), (1,4), and (3,2) from being in the result list (these are the only pairs whose sum is not greater than 5)
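The comprehensions above, as testable definitions:

```haskell
doubled :: [Integer]
doubled = [2 * n | n <- [2, 4, 6]]   -- [4, 8, 12]

allPairs :: [(Integer, Integer)]
allPairs = [(x, y) | x <- [1, 3, 5, 7], y <- [2, 4, 6, 8]]
-- all 16 combinations, the Cartesian product

bigPairs :: [(Integer, Integer)]
bigPairs = [(x, y) | x <- [1, 3, 5, 7], y <- [2, 4, 6, 8], x + y > 5]
-- 13 pairs: every combination except (1,2), (1,4) and (3,2)
```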
We can even make one generator dependent upon another. Consider Pythagorean triples: three numbers x, y, z such that z^2 = x^2 + y^2. Let's say we want to produce a list of Pythagorean triples with numbers going up to n. We can write this as:
pythagoreanTriples n =
  [ (x, y, z) | x <- [2 .. n], y <- [x+1 .. n], z <- [y+1 .. n], x*x + y*y == z*z]
Now, pythagoreanTriples 100 will evaluate to [(3, 4, 5), (5, 12, 13), (6, 8, 10), …, (65, 72, 97)].
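A self-contained, testable version of this definition (named pythagoreanTriples, since hyphens are not legal in Haskell identifiers):

```haskell
pythagoreanTriples :: Int -> [(Int, Int, Int)]
pythagoreanTriples n =
  [ (x, y, z) | x <- [2 .. n], y <- [x+1 .. n], z <- [y+1 .. n]
              , x*x + y*y == z*z ]

-- pythagoreanTriples 15 evaluates to [(3,4,5), (5,12,13), (6,8,10), (9,12,15)]
```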
qsort [] = []                                               -- (1)
qsort (x:xs) = qsort elts_lt_x ++ [x] ++ qsort elts_greq_x  -- (2)
  where                                                     -- (3)
    elts_lt_x   = [y | y <- xs, y < x]                      -- (4)
    elts_greq_x = [y | y <- xs, y >= x]                     -- (5)
Note this style where a function is defined in terms of "simpler" functions with "where" (3-5).
The first line (1) reads: "The result of sorting an empty list (written []) is an empty list".
(2) reads: "To sort a list whose first element is x and the rest of which is called xs, just sort all the elements of xs which are less than x (call them elts_lt_x), sort all the elements of xs which are greater than or equal to x (call them elts_greq_x), and concatenate (++) the results, with x sandwiched in the middle."
The definition of elts_lt_x, which is given immediately below, is read like this: "elts_lt_x is the list of all y's such that y is drawn from the list xs, and y is less than x". The definition of elts_greq_x is similar. The syntax is deliberately reminiscent of standard mathematical set notation, again, "|" as "such that" and "<-" as "member (element) of".
When asked to sort a non-empty list, qsort calls itself to sort elts_lt_x and elts_greq_x. That's OK because both these lists are smaller than the one originally given to qsort, so the splitting-and-sorting process will eventually reduce to sorting an empty list, which is done rather trivially by the first line of qsort.
Here is another way to write the same program in Haskell without using "where":
qsort [] = []
qsort (x:xs) = qsort [y | y <- xs, y < x] ++ [x] ++ qsort [y | y <- xs, y >= x]
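A quick sanity check (the definition is repeated so the snippet stands alone):

```haskell
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort (x:xs) = qsort [y | y <- xs, y < x] ++ [x] ++ qsort [y | y <- xs, y >= x]

-- qsort [3, 1, 4, 1, 5, 9, 2, 6] evaluates to [1, 1, 2, 3, 4, 5, 6, 9];
-- duplicates survive because the second comprehension keeps y >= x
```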
Now let's try to write this in C
void qsort(int a[], int lo, int hi)         /* a is the array to sort, lo and hi specify the range over which to sort */
{
    int h, l, p, t;

    if (lo < hi) {
        l = lo;
        h = hi;
        p = a[hi];                          /* choose the last element in the range as the pivot */
        do {
            while ((l < h) && (a[l] <= p))  /* move up through the array until you find a number greater than p */
                l = l + 1;
            while ((h > l) && (a[h] >= p))  /* move down through the array until you find a number less than p */
                h = h - 1;
            if (l < h) {
                t = a[l];                   /* swap in place */
                a[l] = a[h];
                a[h] = t;
            }
        } while (l < h);
        t = a[l];                           /* swap in place again, making two arrays "around" p */
        a[l] = a[hi];
        a[hi] = t;
        qsort(a, lo, l - 1);                /* recurse for each of the two arrays */
        qsort(a, l + 1, hi);
    }
}
The C quicksort uses an extremely ingenious technique, invented by Hoare, whereby it sorts the array in place; that is, without using any extra storage. As a result, it runs quickly, and in a small amount of memory. In contrast, the Haskell program allocates quite a lot of extra memory behind the scenes, and runs rather slower than the C program.
In effect, the C quicksort does some very ingenious storage management, trading this algorithmic complexity for a reduction in run-time storage management costs.
In applications where performance is required at any cost, or when the goal is detailed tuning of a low-level algorithm, an imperative language like C would probably be a better choice than Haskell, exactly because it provides more intimate control over the exact way in which the computation is carried out.
The non-strict nature of Haskell allows us to define data structures that are also non-strict. Here is an example that defines an infinite list of integers beginning with a user-specified number n
numsFrom n = n : numsFrom (n+1)
What can we do with this? Let us first create a function that "generates" an infinite sequence of squares:
squares = map (^2) (numsFrom 0) -- remember that the signature of map is map :: (a -> b) -> [a] -> [b]
Haskell has a bunch of utility functions, called "Standard Prelude" functions. Here is one, called take, that returns the list of the first k elements of a list (the original list is left untouched; Haskell lists are immutable)
take 5 squares would evaluate to [0, 1, 4, 9, 16]
An easy utility function is tail, which takes a list, removes the first element, and returns the rest of the list (yes, there is also a function called head).
Another standard utility function is "zip". It works like a "zipper" for two lists and is defined as
zip (x:xs) (y:ys) = (x, y) : zip xs ys  -- (1)
zip xs     ys     = []                  -- (2)
(1) says: make a pair (x,y) from the heads of each list; repeat for the rest of the lists. (2) says: if at least one list is empty: done!
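Exercising the definition (renamed zip' here so it does not clash with the Prelude's own zip):

```haskell
zip' :: [a] -> [b] -> [(a, b)]
zip' (x:xs) (y:ys) = (x, y) : zip' xs ys
zip' xs     ys     = []

-- The result is as long as the shorter input:
-- zip' [1, 2, 3] "ab" evaluates to [(1,'a'), (2,'b')]
```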
With these we can define the Fibonacci sequence (it is the sequence 1, 1, 2, 3, 5, 8, 13, 21, …)
fib = 1 : 1 : [a+b | (a,b) <- zip fib (tail fib) ]
Do you see how it works in a “bootstrap” (or is it tailstrap?) fashion?
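To watch the bootstrap happen, take a finite prefix of the infinite list:

```haskell
fib :: [Integer]
fib = 1 : 1 : [a + b | (a, b) <- zip fib (tail fib)]

-- take 8 fib evaluates to [1, 1, 2, 3, 5, 8, 13, 21]:
-- each new element is the sum of the two predecessors already produced.
```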