% $Id$ \documentclass[a4paper,12pt,twoside,titlepage]{book} \usepackage{scaladoc} \usepackage{fleqn} \usepackage{modefs} \usepackage{math} \usepackage{scaladefs} \renewcommand{\todo}[1]{} \ifpdf \pdfinfo { /Author (Martin Odersky) /Title (Scala by Example) /Keywords (Scala) /Subject () /Creator (TeX) /Producer (PDFLaTeX) } \fi \renewcommand{\doctitle}{Scala By Example\\[33mm]\ } \renewcommand{\docauthor}{Martin Odersky\\[53mm]\ } \begin{document} \frontmatter \makedoctitle \clearemptydoublepage \tableofcontents \mainmatter \sloppy \chapter{\label{chap:intro}Introduction} Scala smoothly integrates object-oriented and functional programming. It is designed to express common programming patterns in a consise, elegant, and type-safe way. Scala introduces several innovative language constructs. For instance: \begin{itemize} \item Abstract types and mixin composition unify concepts from object and module systems. \item Pattern matching over class hierarchies unifies functional and object-oriented data access. It greatly simplifies the processing of XML trees. \item A flexible syntax and type system enables the construction of advanced libraries and new domain specific languages. \end{itemize} At the same time, Scala is compatible with Java. Java libraries and frameworks can be used without glue code or additional declarations. This document introduces Scala in an informal way, through a sequence of examples. Chapters~\ref{chap:example-one} and \ref{chap:example-auction} highlight some of the features that make Scala interesting. The following chapters introduce the language constructs of Scala in a more thorough way, starting with simple expressions and functions, and working up through objects and classes, lists and streams, mutable state, pattern matching to more complete examples that show interesting programming techniques. The present informal exposition is meant to be complemented by the Scala Language Reference Manual which specifies Scala in a more detailed and precise way. \paragraph{Acknowledgment} We owe a great debt to Abelson's and Sussman's wonderful book ``Structure and Interpretation of Computer Programs''\cite{abelson-sussman:structure}. Many of their examples and exercises are also present here. Of course, the working language has in each case been changed from Scheme to Scala. Furthermore, the examples make use of Scala's object-oriented constructs where appropriate. \input{ExamplesPart} \bibliographystyle{alpha} \bibliography{Scala} \end{document} \chapter{Implementing Abstract Types: Search Trees} This chapter presents unbalanced binary search trees, implemented in three different styles: algebraic, object-oriented, and imperative. In each case, a search tree package is seen as an implementation of a class {\em MapStruct}. \begin{lstlisting} trait MapStruct[kt, vt] { trait Map with Function1[kt, vt] { def extend(key: kt, value: vt): Map; def remove(key: kt): Map; def domain: Stream[kt]; def range: Stream[vt]; } type map <: Map; val empty: map; } \end{lstlisting} The \code{MapStruct} class is parameterized with a type of keys \code{kt} and a type of values \code{vt}. It specifies an abstract type \code{Map} and an abstract value \code{empty}, which represents empty maps. Every implementation \code{Map} needs to conform to that abstract type, which extends the function type \code{kt => vt} with four new methods. The method \code{domain} yields a stream that enumerates the map's domain, i.e. the set of keys that are mapped to non-null values. The method \code{range} yields a stream that enumerates the function's range, i.e.\ the values obtained by applying the function to arguments in its domain. The method \code{extend} extends the map with a given key/value binding, whereas \code{remove} removes a given key from the map's domain. Both methods yield a new map value as result, which has the same representation as the receiver object. \begin{figure}[t] \begin{lstlisting} class AlgBinTree[kt extends Ord, vt] extends MapStruct[kt, vt] { private case Empty extends Map, Node(key: kt, value: vt, l: Map, r: Map) extends Map final class Map extends kt => vt { def apply(key: kt): vt = this match { case Empty => null case Node(k, v, l, r) => if (key < k) l.apply(key) else if (key > k) r.apply(key) else v } def extend(key: kt, value: vt): Map = this match { case Empty => Node(k, v, Empty, Empty) case Node(k, v, l, r) => if (key < k) Node(k, v, l.extend(key, value), r) else if (key > k) Node(k, v, l, r.extend(key, value)) else Node(k, value, l, r) } def remove(key: kt): Map = this match { case Empty => Empty case Node(k, v, l, r) => if (key < k) Node(k, v, l.remove(key), r) else if (key > k) Node(k, v, l, r.remove(key)) else if (l == Empty) r else if (r == Empty) l else { val midKey = r.domain.head Node(midKey, r.apply(midKey), l, r.remove(midKey)) } } def domain: Stream[kt] = this match { case Empty => [] case Node(k, v, l, r) => Stream.concat([l.domain, [k], r.domain]) } def range: Stream[vt] = this match { case Empty => [] case Node(k, v, l, r) => Stream.concat([l.range, [v], r.range]) } } def empty: Map = Empty } \end{lstlisting} \caption{\label{fig:algbintree}Algebraic implementation of binary search trees} \end{figure} We now present three implementations of \code{Map}, which are all based on binary search trees. The \code{apply} method of a map is implemented in each case by the usual search function over binary trees, which compares a given key with the key stored in the topmost tree node, and depending on the result of the comparison, searches the left or the right hand sub-tree. The type of keys must implement the \code{Ord} class, which contains comparison methods (see Chapter~\ref{chap:classes} for a definition of class \code{Ord}). The first implementation, \code{AlgBinTree}, is given in Figure~\ref{fig:algbintree}. It represents a map with a data type \code{Map} with two cases, \code{Empty} and \code{Node}. Every method of \code{AlgBinTree[kt, vt].Map} performs a pattern match on the value of \code{this} using the \code{match} method which is defined as postfix function application in class \code{Object} (\sref{sec:class-object}). The functions \code{domain} and \code{range} return their results as lazily constructed lists. The \code{Stream} class is an alias of \code{List} which should be used to indicate the fact that its values are constructed lazily. \begin{figure}[thb] \begin{lstlisting} class OOBinTree[kt extends Ord, vt] extends MapStruct[kt, vt] { abstract class Map extends kt => vt { def apply(key: kt): v def extend(key: kt, value: vt): Map def remove(key: kt): Map def domain: Stream[kt] def range: Stream[vt] } module empty extends Map { def apply(key: kt) = null def extend(key: kt, value: vt) = Node(key, value, empty, empty) def remove(key: kt) = empty def domain = [] def range = [] } private class Node(k: kt, v: vt, l: Map, r: Map) extends Map { def apply(key: kt): vt = if (key < k) l.apply(key) else if (key > k) r.apply(key) else v def extend(key: kt, value: vt): Map = if (key < k) Node(k, v, l.extend(key, value), r) else if (key > k) Node(k, v, l, r.extend(key, value)) else Node(k, value, l, r) def remove(key: kt): Map = if (key < k) Node(k, v, l.remove(key), r) else if (key > k) Node(k, v, l, r.remove(key)) else if (l == empty) r else if (r == empty) l else { val midKey = r.domain.head Node(midKey, r(midKey), l, r.remove(midKey)) } def domain: Stream[kt] = Stream.concat([l.domain, [k], r.domain]) def range: Stream[vt] = Stream.concat([l.range, [v], r.range]) } } \end{lstlisting} \caption{\label{fig:oobintree}Object-oriented implementation of binary search trees} \end{figure} The second implementation of maps is given in Figure~\ref{fig:oobintree}. Class \code{OOBinTree} implements the type \code{Map} with a module \code{empty} and a class \code{Node}, which define the behavior of empty and non-empty trees, respectively. Note the different nesting structure of \code{AlgBinTree} and \code{OOBinTree}. In the former, all methods form part of the base class \code{Map}. The different behavior of empty and non-empty trees is expressed using a pattern match on the tree itself. In the latter, each subclass of \code{Map} defines its own set of methods, which override the methods in the base class. The pattern matches of the algebraic implementation have been replaced by the dynamic binding that comes with method overriding. Which of the two schemes is preferable depends to a large degree on which extensions of the type are anticipated. If the type is later extended with a new alternative, it is best to keep methods in each alternative, the way it was done in \code{OOBinTree}. On the other hand, if the type is extended with additional methods, then it is preferable to keep only one implementation of methods and to rely on pattern matching, since this way existing subclasses need not be modified. \begin{figure} \begin{lstlisting} class MutBinTree[kt extends Ord, vt] extends MapStruct[kt, vt] { class Map(key: kt, value: vt) extends kt => vt { val k = key var v = value var l = empty, r = empty def apply(key: kt): vt = if (this eq empty) null else if (key < k) l.apply(key) else if (key > k) r.apply(key) else v def extend(key: kt, value: vt): Map = if (this eq empty) Map(key, value) else { if (key < k) l = l.extend(key, value) else if (key > k) r = r.extend(key, value) else v = value this } def remove(key: kt): Map = if (this eq empty) this else if (key < k) { l = l.remove(key) ; this } else if (key > k) { r = r.remove(key) ; this } else if (l eq empty) r else if (r eq empty) l else { var mid = r while (!(mid.l eq empty)) { mid = mid.l } mid.r = r.remove(mid.k) mid.l = l mid } def domain: Stream[kt] = Stream.concat([l.domain, [k], r.domain]) def range: Stream[vt] = Stream.concat([l.range, [v], r.range]) } let empty = new Map(null, null) } \end{lstlisting} \caption{\label{fig:impbintree}Side-effecting implementation of binary search trees} \end{figure} The two versions of binary trees presented so far are {\em persistent}, in the sense that maps are values that cannot be changed by side effects. By contrast, in the next implementation of binary trees, the implementations of \code{extend} and \code{remove} do have an effect on the state of their receiver object. This corresponds to the way binary trees are usually implemented in imperative languages. The new implementation can lead to some savings in computing time and memory allocation, but care is required not to use the original tree after it has been modified by a side-effecting operation. In this implementation, \code{value}, \code{l} and \code{r} are variables that can be affected by method calls. The class \code{MutBinTree[kt, vt].Map} takes two instance parameters which define the \code{key} value and the initial value of the \code{value} variable. Empty trees are represented by a value \code{empty}, which has \code{null} (signifying undefined) in both its key and value fields. Note that this value needs to be defined lazily using \code{let} since its definition involves the creation of a \code{Map} object, which accesses \code{empty} recursively as part of its initialization. All methods test first whether the current tree is empty using the reference equality operator \code{eq} (\sref{sec:class-object}). As a program using the \code{MapStruct} abstraction, consider a function which creates a map from strings to integers and then applies it to a key string: \begin{lstlisting} def mapTest(def mapImpl: MapStruct[String, int]): int = { val map: mapImpl.Map = mapImpl.empty.extend("ab", 1).extend("bx", 3) val x = map("ab") // returns 1 } \end{lstlisting} The function is parameterized with the particular implementation of \code{MapStruct}. It can be applied to any one of the three implementations described above. E.g.: \begin{lstlisting} mapTest(AlgBinTree[String, int]) mapTest(OOBinTree[String, int]) mapTest(MutBinTree[String, int]) \end{lstlisting} } \paragraph{Mixin Composition} We can now define a class of \code{Rational} numbers that support comparison operators. \begin{lstlisting} final class OrderedRational(n: int, d: int) extends Rational(n, d) with Ord { override def ==(that: OrderedRational) = numer == that.numer && denom == that.denom; def <(that: OrderedRational): boolean = numer * that.denom < that.numer * denom; } \end{lstlisting} Class \code{OrderedRational} redefines method \code{==}, which is defined as reference equality in class \code{Object}. It also implements the abstract method \code{<} from class \code{Ord}. In addition, it inherits all members of class \code{Rational} and all non-abstract members of class \code{Ord}. The implementations of \code{==} and \code{<} replace the definition of \code{==} in class \code{Object} and the abstract declaration of \code{<} in class \code{Ord}. The other inherited comparison methods then refer to this implementation in their body. The clause ``\code{Rational(d, d) with Ord}'' is an instance of {\em mixin composition}. It describes a template for an object that is compatible with both \code{Rational} and \code{Ord} and that contains all members of either class. \code{Rational} is called the {\em superclass} of \code{OrderedRational} while \code{Ord} is called a {\em mixin class}. The type of this template is the {\em compound type} ``\code{Rational with Ord}''. On the surface, mixin composition looks much like multiple inheritance. The difference between the two becomes apparent if we look at superclasses of inherited classes. With multiple inheritance, both \code{Rational} and \code{Ord} would contribute a superclass \code{Object} to the template. We therefore have to answer some tricky questions, such as whether members of \code{Object} are present once or twice and whether the initializer of \code{Object} is called once or twice. Mixin composition avoids these complications. In the mixin composition \code{Rational with Ord}, class \code{Rational} is treated as actual superclass of class \code{Ord}. A mixin composition \code{C with M} is well-formed as long as the first operand \code{C} conforms to the declared superclass of the second operand \code{M}. This holds in our example, because \code{Rational} conforms to \code{Object}. In a sense, mixin composition amounts to overriding the superclass of a class. \paragraph{Final Classes} Note that class \code{OrderedRational} was defined \code{final}. This means that no classes extending \code{OrderedRational} may be defined in other parts of the program. %Within final classes the %type \code{this} is an alias of the defined class itself. Therefore, %we could define our \code{<} method with an argument of type %\code{OrderedRational} as a well-formed implementation of the abstract class %\code{less(that: this)} in class \code{Ord}. \chapter{\label{sec:simple-examples}Pattern Matching} \todo{Complete} Consider binary trees whose leafs contain integer arguments. This can be described by a class for trees, with subclasses for leafs and branch nodes: \begin{lstlisting} abstract class Tree; case class Branch(left: Tree, right: Tree) extends Tree; case class Leaf(x: int) extends Tree; \end{lstlisting} Note that the class \code{Tree} is not followed by an extends clause or a body. This defines \code{Tree} to be an empty subclass of \code{Object}, as if we had written \begin{lstlisting} class Tree extends Object {} \end{lstlisting} Note also that the two subclasses of \code{Tree} have a \code{case} modifier. That modifier has two effects. First, it lets us construct values of a case class by simply calling the constructor, without needing a preceding \code{new}. Example: \begin{lstlisting} val tree1 = Branch(Branch(Leaf(1), Leaf(2)), Branch(Leaf(3), Leaf(4))) \end{lstlisting} Second, it lets us use constructors for these classes in patterns, as is illustrated in the following example. \begin{lstlisting} def sumLeaves(t: Tree): int = t match { case Branch(l, r) => sumLeaves(l) + sumLeaves(r) case Leaf(x) => x } \end{lstlisting} The function \code{sumLeaves} sums up all the integer values in the leaves of a given tree \code{t}. It is is implemented by calling the \code{match} method of \code{t} with a {\em choice expression} as argument (\code{match} is a predefined method in class \code{Object}). The choice expression consists of two cases which botrelate a pattern with an expression. The pattern of the first case, \code{Branch(l, r)} matches all instances of class \code{Branch} and binds the {\em pattern variables} \code{l} and \code{r} to the constructor arguments, i.e.\ the left and right subtrees of the branch. Pattern variables always start with a lower case letter; to avoid ambiguities, constructors in patterns should start with an upper case letter. The effect of the choice expression is to select the first alternative whose pattern matches the given select value, and to evaluate the body of this alternative in a context where pattern variables are bound to corresponding parts of the selector. For instance, the application \code{sumLeaves(tree1)} would select the first alternative with the \code{Branch(l,r)} pattern, and would evaluate the expression \code{sumLeaves(l) + sumLeaves(r)} with bindings \begin{lstlisting} l = Branch(Leaf(1), Leaf(2)), r = Branch(Leaf(3), Leaf(4)). \end{lstlisting} As another example, consider the following class \begin{lstlisting} abstract final class Option[+a]; case object None extends Option[All]; case class Some[a](item: a) extends Option[a]; \end{lstlisting} ... %\todo{Several simple and intermediate examples needed}. \begin{lstlisting} def find[a,b](it: Iterator[(a, b)], x: a): Option[b] = { var result: Option[b] = None; while (it.hasNext && result == None) { val (x1, y) = it.next; if (x == x1) result = Some(y) } result } find(xs, x) match { case Some(y) => System.out.println(y) case None => System.out.println("no match") } \end{lstlisting} \comment{ class MaxCounter { var maxVal: Option[int] = None; def set(x: int) = maxVal match { case None => maxVal = Some(x) case Some(y) => maxVal = Some(Math.max(x, y)) } } \end{lstlisting} } \comment{ \begin{lstlisting} class Stream[a] = List[a] module Stream { def concat(xss: Stream[Stream[a]]): Stream[a] = { let result: Stream[a] = xss match { case [] => [] case [] :: xss1 => concat(xss1) case (x :: xs) :: xss1 => x :: concat(xs :: xss1) } result } } \end{lstlisting} } \comment{ } %\todo{We also need some XML examples.} \end{document} case ([], _) => ys case (_, []) => xs case (x :: xs1, y :: ys1) => if (x < y) x :: merge(xs1, ys) else y :: merge(xs, ys1) } def split (xs: List[a]): (List[a], List[a]) = xs match { case [] => ([], []) case [x] => (x, []) case y :: z :: xs1 => val (ys, zs) = split(xs1) ; (y :: ys, z :: zs) } def sort(xs: List[a]): List[a] = { val (ys, zs) = split(xs) merge(sort(ys), sort(zs)) } def sort(a:Array[String]): Array[String] = { val pivot = a(a.length / 2) sort(a.filter(x => x < pivot)) ++ a.filter(x => x == pivot) ++ sort(a.filter(x => x > pivot)) } def sort(a:Array[String]): Array[String] = { def swap (i: int, j: int): unit = { val t = a(i) ; a(i) = a(j) ; a(j) = t } def sort1(l: int, r: int): unit = { val pivot = a((l + r) / 2) var i = l, j = r while (i <= r) { while (i < r && a(i) < pivot) { i = i + 1 } while (j > l && a(j) > pivot) { j = j - 1 } if (i <= j) { swap(i, j) i = i + 1 j = j - 1 } } if (l < j) sort1(l, j) if (j < r) sort1(i, r) } sort1(0, a.length - 1) } class Array[a] { def copy(to: Array[a], src: int, dst: int, len: int): unit val length: int val apply(i: int): a val update(i: int, x: a): unit def filter (p: a => boolean): Array[a] = { val temp = new Array[a](a.length) var i = 0, j = 0 for (i < a.length, i = i + 1) { val x = a(i) if (p(x)) { temp(j) = x; j = j + 1 } } val res = new Array[a](j) temp.copy(res, 0, 0, j) } def ++ (that: Array[a]): Array[a] = { val a = new Array[a](this.length + that.length) this.copy(a, 0, 0, this.length) that.copy(a, 0, this.length, that.length) } static def concat [a] (as: List[Array[a]]) = { val l = (as map (a => a.length)).sum val dst = new Array[a](l) var j = 0 as forall {a => { a.copy(dst, j, a.length) ; j = j + a.length }} dst } } module ABT extends AlgBinTree[kt, vt] ABT.Map