author Martin Odersky <odersky@gmail.com> 2016-03-15 11:22:53 +0100
committer Martin Odersky <odersky@gmail.com> 2016-04-01 11:20:18 +0200
commit 5ab9e4d025b14760cdeacb7aafd44d920729f538
tree c4a0efa8ad4ccd1815e49e63262c89be452d739f /docs/dotc-internals
parent a73a1d9f17d0de945262fbf1c61aa68b105a02d1
New article: dotc's overall structure
Diffstat (limited to 'docs/dotc-internals')
-rw-r--r-- docs/dotc-internals/overall-structure.md 165
1 files changed, 165 insertions, 0 deletions
diff --git a/docs/dotc-internals/overall-structure.md b/docs/dotc-internals/overall-structure.md
new file mode 100644
index 000000000..d5828ea13
--- /dev/null
+++ b/docs/dotc-internals/overall-structure.md
@@ -0,0 +1,165 @@
+# Dotc's Overall Structure
+
+The compiler code is found in package `dotty.tools`. It spans the
+following three sub-packages:
+
+    backend         Compiler backends (currently for JVM and JS)
+    dotc            The main compiler
+    io              Helper modules for file access and classpath handling.
+
+The `dotc` package contains some main classes that can be run as separate
+programs. The most important one is class `Main`. `Main` inherits from `Driver`, which
+contains the highest-level functions for starting a compiler and processing some sources.
+`Driver` in turn is based on two other high-level classes, `Compiler` and `Run`.
+
+## Package Structure
+
+Most functionality of `dotc` is implemented in subpackages of `dotc`. Here's a list of sub-packages
+and their focus.
+
+    ast                 Abstract syntax trees
+    config              Compiler configuration, settings, platform-specific definitions.
+    core                Core data structures and operations, with specific subpackages for:
+
+                        core.classfile       Reading of Java classfiles into core data structures
+                        core.tasty           Reading and writing of TASTY files to/from core data structures
+                        core.unpickleScala2  Reading of Scala2 symbol information into core data structures
+
+    parsing             Scanner and parser
+    printing            Pretty-printing trees, types and other data
+    repl                The interactive REPL
+    reporting           Reporting of error messages, warnings and other info.
+    rewrite             Helpers for rewriting Scala 2's constructs into dotty's.
+    transform           Miniphases and helpers for tree transformations.
+    typer               Type-checking and other frontend phases
+    util                General purpose utility classes and modules.
+
+## Contexts
+
+`dotc` has almost no global state (the only significant bit of global state is the name table,
+which is used to hash strings into unique names). Instead, all essential bits of information that
+can vary over a compiler run are collected in a [Context](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Context.scala). Most methods in `dotc` take a Context value as an implicit parameter.
+
+Contexts give a convenient way to customize values in some part of the
+call-graph. For example, to run some compiler function `f` at a given
+phase `phase`, we invoke `f` with an explicit context argument, like
+this:
+
+    f(/*normal args*/)(ctx.withPhase(phase))
+
+This assumes that `f` is defined in the way most compiler functions are:
+
+    def f(/*normal parameters*/)(implicit ctx: Context) ...
+
+Compiler code follows the convention that all implicit `Context`
+parameters are named `ctx`. This is important to avoid implicit
+ambiguities in the case where nested methods each contain a `Context`
+parameter. The common name then ensures that the implicit parameters
+properly shadow each other.
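+
+The shadowing convention can be illustrated with a minimal, self-contained
+sketch. The `Context` and `Phase` classes below are toy stand-ins written for
+this example and are not `dotc`'s real classes; only the `ctx` naming
+convention and the `withPhase` call shape are taken from the text above.
+
+```scala
+// Toy stand-ins for illustration only; these are not dotc's real classes.
+final case class Phase(name: String)
+final case class Context(phase: Phase) {
+  // Returns a copy of this context that reports `p` as the current phase.
+  def withPhase(p: Phase): Context = copy(phase = p)
+}
+
+object ContextDemo {
+  // Defined the way most compiler functions are: the context is the last, implicit parameter.
+  def currentPhaseName()(implicit ctx: Context): String = ctx.phase.name
+
+  def outer()(implicit ctx: Context): (String, String) = {
+    // The inner `ctx` shadows the outer one because both share the same name,
+    // so implicit resolution inside `inner` sees exactly one candidate.
+    def inner()(implicit ctx: Context): String = currentPhaseName()
+    // Passing the context explicitly customizes it for this call only.
+    val atErasure = inner()(ctx.withPhase(Phase("erasure")))
+    (currentPhaseName(), atErasure)
+  }
+}
+```
+
+If the inner parameter had a different name, both contexts would be eligible
+inside `inner` and the call to `currentPhaseName()` could become ambiguous.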
+
+Sometimes we want to make sure that implicit contexts are not captured
+in closures or other long-lived objects, be it because we want to
+enforce that nested methods each get their own implicit context, or
+because we want to avoid a space leak in the case where a closure can
+survive several compiler runs. A typical case is a completer for a
+symbol representing an external class, which produces the attributes
+of the symbol on demand, and which might never be invoked. In that
+case we follow the convention that any context parameter is explicit,
+not implicit, so we can track where it is used, and that it has a name
+different from `ctx`. Commonly used is `ictx` for "initialization
+context".
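+
+A rough sketch of the `ictx` convention follows. All names here (`Symbol`,
+`setCompleter`, `enterExternalClass`, `runId`) are hypothetical and chosen for
+this example; `dotc`'s actual completer machinery is different. The point is
+only that the initialization context is explicit, has a distinct name, and is
+not captured by the long-lived closure.
+
+```scala
+// Illustrative stand-ins only; dotc's real Context, Symbol and completer types differ.
+final case class Context(runId: Int)
+
+final class Symbol(val name: String) {
+  private var completer: Option[Context => String] = None
+  private var cached: Option[String] = None
+
+  def setCompleter(complete: Context => String): Unit = completer = Some(complete)
+
+  // The context is passed in when the info is demanded, possibly in a later run.
+  def info(implicit ctx: Context): String = cached.getOrElse {
+    val computed = completer.map(c => c(ctx)).getOrElse("<no info>")
+    cached = Some(computed)
+    computed
+  }
+}
+
+object CompleterDemo {
+  // `ictx` is explicit and deliberately not named `ctx`: it is used once, during
+  // initialization, and is not captured by the completer closure.
+  def enterExternalClass(name: String)(ictx: Context): Symbol = {
+    val sym = new Symbol(name)
+    val enteredIn = ictx.runId // copy out the plain value we need, not the context itself
+    sym.setCompleter(ctx => s"$name (entered in run $enteredIn, completed in run ${ctx.runId})")
+    sym
+  }
+}
+```
+
+Because the closure only holds an `Int`, a symbol that outlives its run does
+not keep the whole initialization context alive.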
+
+With these two conventions, it has turned out that the use of implicit
+contexts as a dependency injection and bulk parameterization device
+has worked exceptionally well. There have not been very many bugs related to
+passing the wrong context by accident.
+
+## Compiler Phases
+
+Seen from a temporal perspective, the `dotc` compiler consists of a list of phases.
+The current list of phases is specified in class [Compiler] as follows:
+
+```scala
+ def phases: List[List[Phase]] = List(
+ List(new FrontEnd), // Compiler frontend: scanner, parser, namer, typer
+ List(new PostTyper), // Additional checks and cleanups after type checking
+ List(new Pickler), // Generate TASTY info
+ List(new FirstTransform, // Some transformations to put trees into a canonical form
+ new CheckReentrant), // Internal use only: Check that compiled program has no data races involving global vars
+ List(new RefChecks, // Various checks mostly related to abstract members and overriding
+ new CheckStatic, // Check restrictions that apply to @static members
+ new ElimRepeated, // Rewrite vararg parameters and arguments
+ new NormalizeFlags, // Rewrite some definition flags
+ new ExtensionMethods, // Expand methods of value classes with extension methods
+ new ExpandSAMs, // Expand single abstract method closures to anonymous classes
+ new TailRec, // Rewrite tail recursion to loops
+ new LiftTry, // Put try expressions that might execute on non-empty stacks into their own methods
+ new ClassOf), // Expand `Predef.classOf` calls.
+ List(new PatternMatcher, // Compile pattern matches
+ new ExplicitOuter, // Add accessors to outer classes from nested ones.
+ new ExplicitSelf, // Make references to non-trivial self types explicit as casts
+ new CrossCastAnd, // Normalize selections involving intersection types.
+ new Splitter), // Expand selections involving union types into conditionals
+ List(new VCInlineMethods, // Inlines calls to value class methods
+ new SeqLiterals, // Express vararg arguments as arrays
+ new InterceptedMethods, // Special handling of `==`, `!=`, `getClass` methods
+ new Getters, // Replace non-private vals and vars with getter defs (fields are added later)
+ new ElimByName, // Expand by-name parameters and arguments
+ new AugmentScala2Traits, // Expand traits defined in Scala 2.11 to simulate old-style rewritings
+ new ResolveSuper), // Implement super accessors and add forwarders to trait methods
+ List(new Erasure), // Rewrite types to JVM model, erasing all type parameters, abstract types and refinements.
+ List(new ElimErasedValueType, // Expand erased value types to their underlying implementation types
+ new VCElideAllocations, // Peep-hole optimization to eliminate unnecessary value class allocations
+ new Mixin, // Expand trait fields and trait initializers
+ new LazyVals, // Expand lazy vals
+ new Memoize, // Add private fields to getters and setters
+ new LinkScala2ImplClasses, // Forward calls to the implementation classes of traits defined by Scala 2.11
+ new NonLocalReturns, // Expand non-local returns
+ new CapturedVars, // Represent vars captured by closures as heap objects
+ new Constructors, // Collect initialization code in primary constructors
+ // Note: constructors changes decls in transformTemplate, no InfoTransformers should be added after it
+ new FunctionalInterfaces,// Rewrites closures to implement @specialized types of Functions.
+ new GetClass), // Rewrites getClass calls on primitive types.
+ List(new LambdaLift, // Lifts out nested functions to class scope, storing free variables in environments
+ // Note: in this mini-phase block scopes are incorrect. No phases that rely on scopes should be here
+ new ElimStaticThis, // Replace `this` references to static objects by global identifiers
+ new Flatten, // Lift all inner classes to package scope
+ new RestoreScopes), // Repair scopes rendered invalid by moving definitions in prior phases of the group
+ List(new ExpandPrivate, // Widen private definitions accessed from nested classes
+ new CollectEntryPoints, // Find classes with main methods
+ new LabelDefs), // Converts calls to labels to jumps
+ List(new GenSJSIR), // Generate .js code
+ List(new GenBCode) // Generate JVM bytecode
+ )
+```
+
+Note that phases are grouped, so the `phases` value is a
+`List[List[Phase]]`. The idea is that all phases in a group are
+*fused* into a single tree traversal. That way, phases can be kept
+small (most phases perform a single function) without requiring an
+excessive number of tree traversals (which are costly, because they
+generally have poor cache locality).
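+
+The idea of fusion can be shown with a toy model. The tree type, the
+`MiniPhase` trait and the `fuse` function below are all hypothetical names
+invented for this sketch; `dotc`'s real miniphase framework is considerably
+more elaborate. The sketch assumes that each mini-phase rewrites a single node
+whose children have already been transformed.
+
+```scala
+// A toy model of phase fusion; Tree, MiniPhase and fuse are hypothetical names for this sketch.
+sealed trait Tree
+final case class Num(value: Int) extends Tree
+final case class Add(lhs: Tree, rhs: Tree) extends Tree
+final case class Neg(arg: Tree) extends Tree
+
+// A mini-phase only says how to rewrite one node whose children are already transformed.
+trait MiniPhase { def transformNode(tree: Tree): Tree }
+
+object ElimNeg extends MiniPhase {
+  def transformNode(tree: Tree): Tree = tree match {
+    case Neg(Num(a)) => Num(-a)
+    case other       => other
+  }
+}
+
+object FoldConstants extends MiniPhase {
+  def transformNode(tree: Tree): Tree = tree match {
+    case Add(Num(a), Num(b)) => Num(a + b)
+    case other               => other
+  }
+}
+
+object Fusion {
+  // One bottom-up traversal; at each node every mini-phase of the group runs in order.
+  def fuse(group: List[MiniPhase])(tree: Tree): Tree = {
+    val rebuilt = tree match {
+      case Add(l, r) => Add(fuse(group)(l), fuse(group)(r))
+      case Neg(a)    => Neg(fuse(group)(a))
+      case n: Num    => n
+    }
+    group.foldLeft(rebuilt)((t, phase) => phase.transformNode(t))
+  }
+}
+```
+
+Running the two mini-phases unfused would walk the tree twice; `fuse` visits
+each node once and applies both rewrites there.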
+
+Phases fall into 4 categories:
+
+ - Frontend phases: `FrontEnd`, `PostTyper` and `Pickler`. `FrontEnd` parses the source programs and generates
+   untyped abstract syntax trees, which are then typechecked and transformed into typed abstract syntax trees.
+   `PostTyper` performs checks and cleanups that require a fully typed program. In particular, it:
+
+    - creates super accessors representing `super` calls in traits
+    - creates implementations of synthetic (compiler-implemented) methods
+    - avoids storing parameters passed unchanged from subclass to superclass in duplicate fields.
+
+   Finally, `Pickler` serializes the typed syntax trees produced by the frontend as TASTY data structures.
+
+ - High-level transformations: All phases from `FirstTransform` to `Erasure`. Most of these phases transform
+   syntax trees, expanding high-level constructs to more primitive ones. The last phase in the group, `Erasure`,
+   translates all types into types supported directly by the JVM. To do this, it performs another type checking
+   pass, but using the rules of the JVM's type system instead of Scala's.
+
+ - Low-level transformations: All phases from `ElimErasedValueType` to `LabelDefs`. These
+   further transform trees until they are just a structured version of Java bytecode.
+
+ - Code generators: These map the transformed trees to Java classfiles or JavaScript files.
+
+