path: root/doc/reference/RationalePart.tex



%% $Id$

There are hundreds of programming languages in active use, and many
more are being designed each year. It is therefore hard to justify the
development of yet another language. Nevertheless, this is what we
attempt to do here.  Our argument is based on two claims:
\begin{itemize}
\item[] {\em Claim 1:} The raise in importance of web services and
other distributed software is a fundamental paradigm
shift in programming. It is comparable in scale to the shift 20 years ago
from character-oriented to graphical user interfaces.
\item[] {\em Claim 2:} That paradigm shift will provide demand
for new programming languages, just as graphical user interfaces
promoted the adoption of object-oriented languages.
\end{itemize}
For the last 20 years, the most common programming model was
object-oriented: System components are objects, and computation is
done by method calls.  Methods themselves take object references as
parameters. Remote method calls let one extend this programming model
to distributed systems. The problem of this model is that it does not
scale up very well to wide-scale networks where messages can be
delayed and components may fail. Web services address the message
delay problem by increasing granularity, using method calls with
larger, structured arguments, such as XML trees.  They address the
failure problem by using transparent replication and avoiding server
state.  Conceptually, they are {\em tree transformers} that consume
incoming message documents and produce outgoing ones.  \comment{ To
back up the first claim, one observes that web services and other
distributed software increasingly tend to communicate using structured
or semi-structured data. A typical example is the use of XML to
describe data managed by applications as well as the messages between
applications. This tends to affect the role of a program in a
fundamental way. Previously, programs could be seen as objects that
reacted to method calls and in turn called methods of other
objects. Some of these method calls might originate from users while
others might originate from other computers via remote invocations.
These method calls have simple unstructured parameters or object
references as arguments.  Web services, on the other hand, communicate
with each other by transmitting asynchronous messages that carry
structured documents, usually in XML format. Programs then
conceptually become {\em tree transformers} that consume incoming
message documents and produce outgoing ones.  }

Why should this have an effect on programming languages? There are at
least two reasons: First, today's object-oriented languages are not
very good at analyzing and transforming XML trees. Because such trees
usually contain data but no methods, they have to be decomposed and
constructed from the ``outside'', that is from code which is external
to the tree definition itself. In an object-oriented language, the
ways of doing so are limited. The most common solution \cite{w3c:dom} is
to represent trees in a generic way, where all tree nodes are values
of a common type.  This makes it easy to write generic traversal
functions, but forces applications to operate on a very low conceptual
level, which often loses important semantic distinctions present in
the XML data.  More semantic precision is obtained if different
internal types model different kinds of nodes.  But then tree
decompositions require the use of run-time type tests and type casts
to adapt the treatment to the kind of node encountered. Such type
tests and type casts are generally not considered good object-oriented
style. They are rarely efficient, nor easy to use.


Conceivably, the glue problem could be addressed by a ``multi-paradigm''
language that would express object-oriented, concurrent, as well
as functional aspects of an application.  But one needs to be careful
not to simply replace cross-language glue by awkward interfaces
between different paradigms within the language itself.  Ideally, one
would hope for a fusion which unifies concepts found in different
paradigms instead of an agglutination, which merely includes them side
by side.  This fusion is what we try to achieve with Scala
\footnote{Scala stands for ``Scalable Language''. The term means
  ``Stairway'' in Italian}.

Scala is both an object-oriented and functional language.  It is a
pure object-oriented language in the sense that every value is an
object. Types and behavior of objects are described by
classes. Classes can be composed using mixin composition.  Scala is
designed work seamlessly with mainstream object-oriented languages,
in particular Java and C\#.

Scala is also a functional language in the sense that every function
is a value. Nesting of function definitions and higher-order functions
are naturally supported. Scala also supports a general notion of
pattern matching which can model the algebraic types used in many
functional languages. Furthermore, this notion of pattern matching
naturally extends to the processing of XML data.

The design of Scala is driven by the desire to unify object-oriented
and functional elements. Here are three examples how this is achieved:
\begin{itemize}
\item
Since every function is a value and every value is an object, it
follows that every function in Scala is an object. Indeed, there is a
root class for functions which is specialized in the Scala standard
library to data structures such as arrays and hash tables.
\item
Data structures in many functional languages are defined using
algebraic data types. They are decomposed using pattern matching.
Object-oriented languages, on the other hand, describe data with class
hierarchies. Algebraic data types are usually closed, in that the
range of alternatives of a type is fixed when the type is defined.  By
contrast, class hierarchies can be extended by adding new leaf
classes.  Scala adopts the object-oriented class hierarchy scheme for
data definitions, but allows pattern matching against values coming
from a whole class hierarchy, not just values of a single type.
This can express both closed and extensible data types, and also
provides a convenient way to exploit run-time type information in
cases where static typing is too restrictive.
\item
Module systems of functional languages such as SML or Caml excel in
abstraction; they allow very precise control over visibility of names
and types, including the ability to partially abstract over types.  By
contrast, object-oriented languages excel in composition; they offer
several composition mechanisms lacking in module systems, including
inheritance and unlimited recursion between objects and classes.
Scala unifies the notions of object and module, of module signature
and interface, as well as of functor and class. This combines the
abstraction facilities of functional module systems with the
composition constructs of object-oriented languages. The unification
is made possible by means of a new type system based on path-dependent
types \cite{odersky-et-al:fool10}.
\end{itemize}
There are several other languages that try to bridge the gap between
the functional and object oriented
paradigms. Smalltalk\cite{goldberg-robson:smalltalk-language},
Python\cite{rossum:python}, or Ruby\cite{matsumtoto:ruby} come to
mind. Unlike these languages, Scala has an advanced static type
system, which contains several innovative constructs.  This aspect
makes the Scala definition a bit more complicated than those of the
languages above. On the other hand, Scala enjoys the robustness,
safety and scalability benefits of strong static typing. Furthermore,
Scala incorporates recent advances in type inference, so that
excessive type annotations in user programs can usually be avoided.

% rest of this report is structured as follows. Chapters
%\ref{sec:simple-examples} to \ref{sec:concurrency} give an informal overview of
%Scala by means of a sequence of program examples.  The remaining
%chapters contain the language definition. The definition is formulated
%in prose but tries to be precise.
%% $Id$

There are hundreds of programming languages in active use, and many
more are being designed each year. It is therefore hard to justify the
development of yet another language. Nevertheless, this is what we
attempt to do here.  Our argument is based on two claims:
\begin{itemize}
\item[] {\em Claim 1:} The raise in importance of web services and
other distributed software is a fundamental paradigm
shift in programming. It is comparable in scale to the shift 20 years ago
from character-oriented to graphical user interfaces.
\item[] {\em Claim 2:} That paradigm shift will provide demand
for new programming languages, just as graphical user interfaces
promoted the adoption of object-oriented languages.
\end{itemize}
For the last 20 years, the most common programming model was
object-oriented: System components are objects, and computation is
done by method calls.  Methods themselves take object references as
parameters. Remote method calls let one extend this programming model
to distributed systems. The problem of this model is that it does not
scale up very well to wide-scale networks where messages can be
delayed and components may fail. Web services address the message
delay problem by increasing granularity, using method calls with
larger, structured arguments, such as XML trees.  They address the
failure problem by using transparent replication and avoiding server
state.  Conceptually, they are {\em tree transformers} that consume
incoming message documents and produce outgoing ones.  \comment{ To
back up the first claim, one observes that web services and other
distributed software increasingly tend to communicate using structured
or semi-structured data. A typical example is the use of XML to
describe data managed by applications as well as the messages between
applications. This tends to affect the role of a program in a
fundamental way. Previously, programs could be seen as objects that
reacted to method calls and in turn called methods of other
objects. Some of these method calls might originate from users while
others might originate from other computers via remote invocations.
These method calls have simple unstructured parameters or object
references as arguments.  Web services, on the other hand, communicate
with each other by transmitting asynchronous messages that carry
structured documents, usually in XML format. Programs then
conceptually become {\em tree transformers} that consume incoming
message documents and produce outgoing ones.  }

Why should this have an effect on programming languages? There are at
least two reasons: First, today's object-oriented languages are not
very good at analyzing and transforming XML trees. Because such trees
usually contain data but no methods, they have to be decomposed and
constructed from the ``outside'', that is from code which is external
to the tree definition itself. In an object-oriented language, the
ways of doing so are limited. The most common solution \cite{w3c:dom} is
to represent trees in a generic way, where all tree nodes are values
of a common type.  This makes it easy to write generic traversal
functions, but forces applications to operate on a very low conceptual
level, which often loses important semantic distinctions present in
the XML data.  More semantic precision is obtained if different
internal types model different kinds of nodes.  But then tree
decompositions require the use of run-time type tests and type casts
to adapt the treatment to the kind of node encountered. Such type
tests and type casts are generally not considered good object-oriented
style. They are rarely efficient, nor easy to use.


Conceivably, the glue problem could be addressed by a ``multi-paradigm''
language that would express object-oriented, concurrent, as well
as functional aspects of an application.  But one needs to be careful
not to simply replace cross-language glue by awkward interfaces
between different paradigms within the language itself.  Ideally, one
would hope for a fusion which unifies concepts found in different
paradigms instead of an agglutination, which merely includes them side
by side.  This fusion is what we try to achieve with Scala
\footnote{Scala stands for ``Scalable Language''. The term means
  ``Stairway'' in Italian}.

Scala is both an object-oriented and functional language.  It is a
pure object-oriented language in the sense that every value is an
object. Types and behavior of objects are described by
classes. Classes can be composed using mixin composition.  Scala is
designed work seamlessly with mainstream object-oriented languages,
in particular Java and C\#.

Scala is also a functional language in the sense that every function
is a value. Nesting of function definitions and higher-order functions
are naturally supported. Scala also supports a general notion of
pattern matching which can model the algebraic types used in many
functional languages. Furthermore, this notion of pattern matching
naturally extends to the processing of XML data.

The design of Scala is driven by the desire to unify object-oriented
and functional elements. Here are three examples how this is achieved:
\begin{itemize}
\item
Since every function is a value and every value is an object, it
follows that every function in Scala is an object. Indeed, there is a
root class for functions which is specialized in the Scala standard
library to data structures such as arrays and hash tables.
\item
Data structures in many functional languages are defined using
algebraic data types. They are decomposed using pattern matching.
Object-oriented languages, on the other hand, describe data with class
hierarchies. Algebraic data types are usually closed, in that the
range of alternatives of a type is fixed when the type is defined.  By
contrast, class hierarchies can be extended by adding new leaf
classes.  Scala adopts the object-oriented class hierarchy scheme for
data definitions, but allows pattern matching against values coming
from a whole class hierarchy, not just values of a single type.
This can express both closed and extensible data types, and also
provides a convenient way to exploit run-time type information in
cases where static typing is too restrictive.
\item
Module systems of functional languages such as SML or Caml excel in
abstraction; they allow very precise control over visibility of names
and types, including the ability to partially abstract over types.  By
contrast, object-oriented languages excel in composition; they offer
several composition mechanisms lacking in module systems, including
inheritance and unlimited recursion between objects and classes.
Scala unifies the notions of object and module, of module signature
and interface, as well as of functor and class. This combines the
abstraction facilities of functional module systems with the
composition constructs of object-oriented languages. The unification
is made possible by means of a new type system based on path-dependent
types \cite{odersky-et-al:fool10}.
\end{itemize}
There are several other languages that try to bridge the gap between
the functional and object oriented
paradigms. Smalltalk\cite{goldberg-robson:smalltalk-language},
Python\cite{rossum:python}, or Ruby\cite{matsumtoto:ruby} come to
mind. Unlike these languages, Scala has an advanced static type
system, which contains several innovative constructs.  This aspect
makes the Scala definition a bit more complicated than those of the
languages above. On the other hand, Scala enjoys the robustness,
safety and scalability benefits of strong static typing. Furthermore,
Scala incorporates recent advances in type inference, so that
excessive type annotations in user programs can usually be avoided.

% rest of this report is structured as follows. Chapters
%\ref{sec:simple-examples} to \ref{sec:concurrency} give an informal overview of
%Scala by means of a sequence of program examples.  The remaining
%chapters contain the language definition. The definition is formulated
%in prose but tries to be precise.