%% This file really contains -*- LaTeX -*- code, to be processed by
%% the scalatex script.

%% $Id$

\documentclass[a4paper,12pt,twoside,titlepage]{article}

\usepackage{scaladoc}
\usepackage{xspace}
\usepackage{url}

\ifpdf
    \pdfinfo {
        /Author   (Michel Schinz)
        /Title    (Scala Tutorial)
        /Keywords (Scala)
        /Subject  ()
        /Creator  (TeX)
        /Producer (PDFLaTeX)
    }
\fi

\renewcommand{\doctitle}{Scala Tutorial}
\renewcommand{\docsubtitle}{for Java programmers}
\renewcommand{\docauthor}{Michel Schinz}

\newcommand{\langname}[1]{#1\xspace}

\newcommand{\Scala}{\langname{Scala}}
\newcommand{\Java}{\langname{Java}}

\newcommand{\toolname}[1]{\texttt{#1}\xspace}

\newcommand{\scalac}{\toolname{scalac}}
\newcommand{\java}{\toolname{java}}

\newcommand{\ident}[1]{\code{#1}\xspace}

\begin{document}
\makedoctitle

\section{Introduction}
\label{sec:introduction}

This document gives a quick introduction to the \Scala language and
compiler. It is intended for people who already have some programming
experience and want an overview of what they can do with \Scala. A
basic knowledge of object-oriented programming, especially in \Java,
is assumed.

\section{A first example}
\label{sec:first-example}

As a first example, we will use the standard \emph{Hello world}
program, which is not very fascinating but makes it easy to
demonstrate the use of the \Scala tools without knowing too much about
the language. Here is how it looks:
\begin{scalaprogram}{HelloWorld}
object HelloWorld {
  def main(args: Array[String]): unit = {
    System.out.println("Hello, world!");
  }
}
\end{scalaprogram}

The structure of this program should be familiar to \Java programmers:
it consists of one method called \ident{main} which takes the command
line arguments, an array of strings, as its parameter; the body of this
method consists of a single call to the \ident{println} method of the
object representing the standard output, with the friendly greeting as
argument. The \ident{main} method is declared as returning a value of
type \ident{unit}, which for now can be seen as similar to \Java's
\ident{void} type.

What should be less familiar to \Java programmers is the \ident{object}
declaration containing the \ident{main} method. Such a declaration
introduces what is commonly known as a \emph{singleton object}, that
is, a class with a single instance. The declaration above thus declares
both a class called \ident{HelloWorld} and an instance of that class,
also called \ident{HelloWorld}. This instance is created lazily, the
first time it is used.

The astute reader might also have noticed that the \ident{main} method
is not declared as \ident{static} here. This is because static members
(methods or fields) do not exist in \Scala. Rather than define static
members, the \Scala programmer declares these members in singleton
objects.
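
As a hedged illustration of this point, the following sketch shows
where a counter that a \Java programmer would declare as static ends
up in \Scala. The names \ident{IdGenerator}, \ident{last} and
\ident{next} are hypothetical and introduced only for this example.
The capitalised type names (\code{Int}, \code{Unit}) are an
assumption: they are the ones accepted by current \Scala compilers, in
place of the lower-case names used in the rest of this document.
\begin{lstlisting}
// Hypothetical example: what would be static members in Java
// are ordinary members of a singleton object in Scala.
object IdGenerator {
  private var last: Int = 0;   // plays the role of a static field

  def next(): Int = {          // plays the role of a static method
    last = last + 1;
    last
  }
}

object IdGeneratorDemo {
  def main(args: Array[String]): Unit = {
    // Members are reached through the name of the object.
    System.out.println(IdGenerator.next());
    System.out.println(IdGenerator.next());
  }
}
\end{lstlisting}
The members are accessed through the name of the object, much as
static members are accessed through the name of their class in \Java.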

\subsection{Compiling the example}
\label{sec:compiling-example}

To compile the example, we need to use \scalac, the \Scala compiler.
\scalac works like most compilers: it takes a source file as argument,
possibly with some options, and produces one or several object files. The
object files it produces are standard \Java class files.

If we save the above program in a file called
\ident{HelloWorld.scala}, we can compile it by issuing the following
command (the greater-than sign `\verb|>|' represents the shell prompt
and should not be typed):
\begin{verbatim}
> scalac HelloWorld.scala
\end{verbatim}
This will generate a few class files in the current directory, one of
which is called \ident{HelloWorld.class}. This file contains a class
which can be directly executed using the \java command, as will be
seen in the following section.

\subsection{Running the example}
\label{sec:running-example}

Once compiled, a \Scala program can be run like a \Java program, using
the \java command. However, a compiled \Scala program needs to access
some support classes at run-time, which should be available through
\Java's class path. These support classes are distributed in a JAR
file called \path{scala.jar}, which lives in the directory
\path{SCALA_HOME/lib}. Here \path{SCALA_HOME} is a place-holder for the
name of the directory where the \Scala distribution was installed.

The example above can therefore be executed using the command below,
if we assume that the \Scala distribution was installed in
\path{/usr/local}:
\begin{verbatim}
> java -classpath /usr/local/lib/scala.jar:. HelloWorld
\end{verbatim}
\scalaprogramoutput{HelloWorld}

\section{Interaction with Java}
\label{sec:inter-with-java}

One of the strengths of \Scala is that it makes it very easy to
interact with \Java code. Actually, the example of the previous
section showed this: to print the message on screen, we simply used a
call to \Java's \ident{println} method on the (\Java) object
\ident{System.out}.

All \Java code is just as easily accessible from \Scala. Of course, it
is sometimes necessary to import classes, as one does in \Java, in order
to be able to use them. All classes in the \ident{java.lang} package
are imported by default; others need to be imported explicitly.

Let's look at another example to see this. The aim of this example is
to compute and print the factorial of 100 using \Java big integers
(i.e. the class \ident{java.math.BigInteger}), since the result does
not fit in a \Java integer. This program looks like this:
\begin{scalaprogram}{BigFactorial}
object BigFactorial {
  import java.math.BigInteger, BigInteger._;

  def fact(x: BigInteger): BigInteger =
    if (x == ZERO) ONE
    else x multiply fact(x subtract ONE);

  def main(args: Array[String]): unit =
    System.out.println("fact(100) = "
                       + fact(new BigInteger("100")));
}
\end{scalaprogram}

\Scala's \ident{import} statement looks very similar to \Java's
equivalent, but an important difference appears here: to import all
the names of a package or class, one uses the underscore (\verb|_|)
character instead of the asterisk (\verb|*|). This is due to the fact
that, as we will see later, the asterisk is actually a valid \Scala
identifier.

The \ident{import} statement above therefore starts by importing the
class \ident{java.math.BigInteger}, and then all the names it
contains. This makes the static fields \ident{ZERO} and \ident{ONE}
directly visible.

While we're talking about \ident{ZERO}, something must be said about
the condition of the \ident{if} expression in method \ident{fact}. Is
it really correct to check that \ident{x} is zero by using the
\ident{==} operator? A \Java programmer would say no, because in that
language the \ident{==} operator compares objects by physical
equality, and this is not what we want here. What we want to know is
whether \ident{x} is \emph{some} big integer object representing zero,
and there might be several of them. So a \Java programmer would use
the \ident{equals} method to perform the comparison.

The \Scala programmer, on the other hand, can use \ident{==} here
because that operator compares objects according to the \ident{equals}
method. Is \ident{==} just an alias for \ident{equals} then? Well,
almost, but \ident{==} has one advantage over \ident{equals} in that
it also works when the selector is the \ident{null} constant.

The \ident{fact} method also shows some characteristics of \Scala's
syntax. The first one is that the method body does not have to be
surrounded by curly braces if it consists of a single expression.
The second one is that methods taking one argument can be used with an
infix syntax. That is, the expression
\begin{lstlisting}
x subtract ONE
\end{lstlisting}
is just another, slightly less verbose way of writing the expression
\begin{lstlisting}
x.subtract(ONE)
\end{lstlisting}
This might seem like a minor syntactic detail, but it has important
consequences, one of which will be explored in the next section.

To conclude this section about integration with \Java, it should be
noted that it is also possible to inherit from \Java classes and
implement \Java interfaces directly in \Scala.

\section{Everything is an object}
\label{sec:everything-an-object}

\Scala is a pure object-oriented language in the sense that
\emph{everything} is an object, including numbers and functions. It
differs from \Java in that respect, since \Java distinguishes numeric
types from objects, and does not enable one to manipulate functions as
values.

\subsection{Numbers are objects}
\label{sec:numbers-are-objects}

Since numbers are objects, they also have methods. And in fact, an
arithmetic expression like the following:
\begin{lstlisting}
1 + 2 * 3 / x
\end{lstlisting}
consists exclusively of method calls, because it is equivalent to the
following expression, as we saw in the previous section:
\begin{lstlisting}
(1).+(((2).*(3))./(x))
\end{lstlisting}
This also means that \ident{+}, \ident{*}, etc. are valid identifiers
in \Scala.
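
Since operator symbols are ordinary identifiers, a class can define
methods with such names too. The sketch below is only an illustration:
the class \ident{Point} and its members are hypothetical, it relies on
class parameters, which are presented later in this document, and it
assumes the capitalised type names of current \Scala compilers.
\begin{lstlisting}
// Hypothetical class: a point with an addition method named +.
class Point(px: Int, py: Int) {
  def x = px;
  def y = py;
  def +(that: Point): Point = new Point(x + that.x, y + that.y);
}

object PointDemo {
  // Infix call of the method +.
  val p: Point = new Point(1, 2) + new Point(3, 4);
  // The same call written with the usual dot notation.
  val q: Point = new Point(1, 2).+(new Point(3, 4));
}
\end{lstlisting}
Both \ident{p} and \ident{q} denote a point with coordinates 4 and 6.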

\subsection{Functions are objects}
\label{sec:funct-are-objects}

Perhaps more surprising for the \Java programmer, functions are also
objects in \Scala. It is therefore possible to pass functions as
arguments, to store them in variables, and to return them from other
functions. This ability to manipulate functions as values is one of
the cornerstones of a very interesting programming paradigm called
\emph{functional programming}.

As a very simple example of why it can be useful to use functions as
values, let's consider a timer function whose aim is to perform some
action every second. How do we pass it the action to perform? Quite
logically, as a function. This very simple kind of function passing
should be familiar to many programmers: it is often used in
user-interface code, to register call-back functions which get called
when some event occurs.

In the following program, the timer function is called
\ident{oncePerSecond}, and it gets a call-back function as argument.
The type of this function is written \verb|() => unit| and is the type
of all functions which have no arguments and return a value of type
\ident{unit}. The main function of this program simply calls this
timer function with a call-back which prints a sentence on the
terminal. In other words, this program endlessly prints the sentence
\emph{time flies like an arrow} every second.

\begin{scalaprogram}{Timer}
object Timer {
  def oncePerSecond(callback: () => unit): unit =
    while (true) { callback(); Thread sleep 1000 };

  def timeFlies(): unit =
    System.out.println("time flies like an arrow...");

  def main(args: Array[String]): unit =
    oncePerSecond(timeFlies);
}
\end{scalaprogram}

\subsubsection{Anonymous functions}
\label{sec:anonymous-functions}

While this program is easy to understand, it can be refined a bit.
First of all, notice that the function \ident{timeFlies} is only
defined in order to be passed later to the \ident{oncePerSecond}
function. Having to name that function, which is only used once, might
seem unnecessary, and it would in fact be nice to be able to construct
this function just as it is passed to \ident{oncePerSecond}. This is
possible in \Scala using \emph{anonymous functions}, which are exactly
that: functions without a name. The revised version of our timer
program using an anonymous function instead of \ident{timeFlies} looks
like this:
\begin{scalaprogram}{TimerAnonymous}
object TimerAnonymous {
  def oncePerSecond(callback: () => unit): unit =
    while (true) { callback(); Thread sleep 1000 };

  def main(args: Array[String]): unit =
    oncePerSecond(() =>
      System.out.println("time flies like an arrow..."));
}
\end{scalaprogram}
The presence of an anonymous function in this example is revealed by
the right arrow `\verb|=>|' which separates the function's argument
list from its body. In this example, the argument list is empty, as
witnessed by the empty pair of parentheses on the left of the arrow.
The body of the function is the same as that of \ident{timeFlies}
above.
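
Anonymous functions can also take arguments, and like any other value
they can be stored, passed to methods and returned from them. The
following sketch illustrates these three uses. All names in it are
hypothetical, the type \verb|Int => Int| denotes functions from
integers to integers, and the capitalised type names assume a current
\Scala compiler.
\begin{lstlisting}
object FunctionValuesDemo {
  // An anonymous function with one argument, stored in a value.
  val increment: Int => Int = (x: Int) => x + 1;

  // A method taking a function as argument.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x));

  // A method returning a function.
  def adder(n: Int): Int => Int = (x: Int) => x + n;

  val a: Int = applyTwice(increment, 10);  // increment applied twice
  val b: Int = applyTwice(adder(3), 10);   // 3 added twice
}
\end{lstlisting}
Here \ident{a} is 12 and \ident{b} is 16.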

% TODO functions with an environment

\section{Classes}
\label{sec:classes}

As we have seen above, \Scala is an object-oriented language, and as
such it has a concept of class.\footnote{For the sake of completeness,
  it should be noted that some object-oriented languages do not have
  the concept of class, but \Scala is not one of them.}
Classes in \Scala are declared using a syntax which is close to
\Java's syntax. One important difference is that classes in \Scala can
have parameters. This is illustrated in the following definition of
complex numbers.
\begin{scalaprogram}{Complex}
class Complex(real: double, imaginary: double) {
  def re() = real;
  def im() = imaginary;
}
\end{scalaprogram}
This complex class takes two arguments, which are the real and
imaginary parts of the complex number. It then defines two methods, called
\ident{re} and \ident{im}, which give access to these two parts.

It should be noted that the return type of these two methods is not
given explicitly. It will be inferred automatically by the compiler,
which looks at the right-hand side of these methods and deduces that
both return a value of type \ident{double}.

The compiler is not always able to infer types as it does here, and
there is unfortunately no simple rule to know exactly when it will be,
and when not. In practice, this is usually not a problem since the
compiler complains when it is not able to infer a type which was not
given explicitly. As a simple rule, beginner \Scala programmers
should try to omit type declarations which seem superfluous to them,
because they are easily deduced from the context, and see whether the
compiler agrees with them. After some time, they should develop a good
feel for when to omit types, and when to specify them
explicitly.
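
One case in which current \Scala compilers do ask for an explicit
type is a recursive method, whose result type cannot be deduced from a
body that refers to the method itself. The sketch below contrasts the
two situations. The names are hypothetical, and both the behaviour
described and the capitalised type names are those of current
compilers, which may differ from the version presented here.
\begin{lstlisting}
object Inference {
  // The result type is inferred from the right-hand side.
  def square(x: Int) = x * x;

  // Recursive method: the result type is written explicitly,
  // omitting it makes the compiler report an error.
  def fact(n: Int): Int =
    if (n <= 0) 1 else n * fact(n - 1);
}
\end{lstlisting}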

\subsection{Methods without arguments}
\label{sec:meth-wo-args}

A small problem with the methods \ident{re} and \ident{im} is that, in
order to call them, one has to put an empty pair of parentheses after
their name, as the following example shows:
\begin{lstlisting}[escapechar=\#]
val c = new Complex(1.2, 3.4);
System.out.println("imaginary part: " + c.im());
\end{lstlisting}
It would be nicer to be able to access the real and imaginary parts
as if they were fields, without putting the empty pair of
parentheses. This is perfectly doable in \Scala, simply by defining
the methods as methods \emph{without arguments}. These differ from
methods with zero arguments in that they don't have parentheses after
their name, neither in their definition nor in their use. Our
\ident{Complex} class can be rewritten as follows:
\begin{scalaprogram}{Complex2}
class Complex(real: double, imaginary: double) {
  def re = real;
  def im = imaginary;
}
\end{scalaprogram}
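
With this definition, the two parts are read like fields. The
following sketch repeats the class so that it is self-contained, and
adds a hypothetical \ident{ComplexDemo} object which uses it. The
capitalised \code{Double} is an assumption: it is the spelling
accepted by current \Scala compilers.
\begin{lstlisting}
class Complex(real: Double, imaginary: Double) {
  def re = real;
  def im = imaginary;
}

object ComplexDemo {
  val c = new Complex(1.2, 3.4);
  // Methods without arguments are called without parentheses.
  val realPart: Double = c.re;
  val imaginaryPart: Double = c.im;
}
\end{lstlisting}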

\subsection{Inheritance and overriding}
\label{sec:inheritance}

All classes in \Scala inherit from a super-class. When no super-class
is specified, as in the \ident{Complex} example of the previous section,
\ident{scala.Object} is implicitly used.

It is possible to override methods inherited from a super-class in
\Scala. It is however mandatory to explicitly specify that a method
overrides another one using the \ident{override} modifier, in order to
avoid accidental overriding. As an example, our \ident{Complex} class
can be augmented with a redefinition of the \ident{toString} method
inherited from \ident{Object}.
\begin{scalaprogram}{Complex3}
class Complex(real: double, imaginary: double) {
  def re = real;
  def im = imaginary;
  override def toString() =
    "" + re + (if (im < 0) "" else "+") + im + "i";
}
\end{scalaprogram}

\section{Case classes and pattern matching}
\label{sec:case-classes-pattern}

\section{Mixins}
\label{sec:mixins}

Apart from inheriting code from a super-class, a \Scala class can also
import code from one or several \emph{mixins}.

Maybe the easiest way for a \Java programmer to understand what mixins
are is to view them as interfaces which can also contain code. In
\Scala, when a class inherits from a mixin, it implements that mixin's
interface, and inherits all the code contained in the mixin.

To see the usefulness of mixins, let's look at a classical example:
ordered objects. It is often useful to be able to compare objects of a
given class among themselves, for example to sort them. In \Java,
objects which are comparable implement the \ident{Comparable}
interface. In \Scala, we can do a bit better than in \Java by defining
our equivalent of \ident{Comparable} as a mixin, which we will call
\ident{Ord}.

When comparing objects, six different predicates can be useful:
smaller, smaller or equal, equal, not equal, greater or equal, and
greater. However, defining all of them is tedious, especially since
four out of these six can be expressed using the remaining two. That
is, given the equal and smaller predicates (for example), one can
express the other ones. In \Scala, all these observations can be
nicely captured by the following mixin declaration:
\beginscalaprogram{Ord}
\begin{scalacode}
abstract class Ord {
  def < (that: Any): boolean;
  def <=(that: Any): boolean = (this < that) || (this == that);
  def > (that: Any): boolean = !(this <= that);
  def >=(that: Any): boolean = !(this < that);
}
\end{scalacode}
This definition both creates a new type called \ident{Ord}, which
plays the same role as \Java's \ident{Comparable} interface, and
provides default implementations of three predicates in terms of a
fourth, abstract one. The predicates for equality and inequality do not
appear here since they are by default present in all objects.

To make objects of a class comparable, it is therefore sufficient to
define the equality and smaller-than predicates, and mix in
the \ident{Ord} class above. As an example, let's define a
\ident{Date} class representing dates in the Gregorian calendar. Such
dates are composed of a day, a month and a year, all of which we will
represent as integers. We therefore start the definition of the
\ident{Date} class as follows:
\begin{scalacode}
class Date(y: int, m: int, d: int) with Ord {
  def year = y;
  def month = m;
  def day = d;

  override def toString(): String = year + "-" + month + "-" + day;
\end{scalacode}
The important part here is the \code{with Ord} declaration which
follows the class name and parameters. It declares that the
\ident{Date} class inherits from the \ident{Ord} class as a mixin.

Then, we redefine the \ident{equals} method, inherited from
\ident{Object}, so that it correctly compares dates by comparing their
individual fields. The default implementation of \ident{equals} is not
usable, because as in \Java it compares objects physically. We arrive
at the following definition:
\begin{scalacode}
  override def equals(that: Any): boolean = {
    that.isInstanceOf[Date] && {
      val o = that.asInstanceOf[Date];
      o.day == day && o.month == month && o.year == year
    }
  }
\end{scalacode}
This method makes use of the predefined methods \ident{isInstanceOf}
and \ident{asInstanceOf}. The first one, \ident{isInstanceOf},
corresponds to \Java's \ident{instanceof} operator, and returns true
if and only if the object on which it is applied is an instance of the
given type. The second one, \ident{asInstanceOf}, corresponds to
\Java's cast operator: if the object is an instance of the given type,
it is viewed as such, and otherwise a \ident{ClassCastException} is
thrown.

Finally, the last method to define is the smaller-than predicate,
as follows. It makes use of another predefined method,
\ident{error}, which throws an exception with the given error message.
\begin{scalacode}
  def <(that: Any): boolean = {
    if (!that.isInstanceOf[Date])
      error("cannot compare " + that + " with a Date");

    val o = that.asInstanceOf[Date];
    (year < o.year)
      || (year == o.year && (month < o.month
                               || (month == o.month && day < o.day)))
  }
}
\end{scalacode}
This completes the definition of the \ident{Date} class. Instances of
this class can be seen either as dates or as comparable objects.
Moreover, they all define the six comparison predicates mentioned
above: \ident{equals} and \ident{<} because they appear directly in
the definition of the \ident{Date} class, and the others because they
are inherited from the \ident{Ord} mixin.
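
To see the inherited predicates at work, here is a hedged,
self-contained sketch of the same classes followed by a few
comparisons. It is written for a current \Scala compiler, which is an
assumption and changes the notation: the mixin is declared as a
\code{trait} and mixed in with \code{extends}, type names are
capitalised, \code{sys.error} replaces \ident{error}, and the
\code{||} operators are placed at the end of lines. The object
\ident{DateDemo} and its members are hypothetical.
\begin{lstlisting}
trait Ord {
  def < (that: Any): Boolean;
  def <=(that: Any): Boolean = (this < that) || (this == that);
  def > (that: Any): Boolean = !(this <= that);
  def >=(that: Any): Boolean = !(this < that);
}

class Date(y: Int, m: Int, d: Int) extends Ord {
  def year = y;
  def month = m;
  def day = d;

  override def equals(that: Any): Boolean =
    that.isInstanceOf[Date] && {
      val o = that.asInstanceOf[Date];
      o.day == day && o.month == month && o.year == year
    };

  def <(that: Any): Boolean = {
    if (!that.isInstanceOf[Date])
      sys.error("cannot compare " + that + " with a Date");
    val o = that.asInstanceOf[Date];
    (year < o.year) ||
      (year == o.year && (month < o.month ||
                            (month == o.month && day < o.day)))
  }
}

object DateDemo {
  val earlier = new Date(2004, 1, 15);
  val later = new Date(2004, 3, 1);

  val lt: Boolean = earlier < later;   // defined in Date
  val ge: Boolean = later >= earlier;  // inherited from Ord
  // <= combines < with ==, and therefore uses equals.
  val le: Boolean = earlier <= new Date(2004, 1, 15);
}
\end{lstlisting}
All three values are true. Only \ident{equals} and \ident{<} were
written for \ident{Date}; the other comparisons come from \ident{Ord}.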

Mixins are useful in other situations than the one shown here, of
course, but discussing their applications at length is outside the
scope of this document.


\section{Genericity}
\label{sec:genericity}



\section{Conclusion}
\label{sec:conclusion}

This document gave a quick overview of the \Scala language and
presented some basic examples. The interested reader can go on by
reading the companion document \textit{Scala By Example\/} and consult
the \textit{Scala Language Specification\/} when needed.

\end{document}

% LocalWords:  mixins mixin mixin's