diff options
Diffstat (limited to 'docs/dotc-internals/overall-structure.md')
-rw-r--r-- | docs/dotc-internals/overall-structure.md | 174 |
1 files changed, 174 insertions, 0 deletions
diff --git a/docs/dotc-internals/overall-structure.md b/docs/dotc-internals/overall-structure.md new file mode 100644 index 000000000..a80c35b4c --- /dev/null +++ b/docs/dotc-internals/overall-structure.md @@ -0,0 +1,174 @@ +# Dotc's Overall Structure + +The compiler code is found in package [dotty.tools](https://github.com/lampepfl/dotty/tree/master/src/dotty/tools). It spans the +following three sub-packages: + + backend Compiler backends (currently for JVM and JS) + dotc The main compiler + io Helper modules for file access and classpath handling. + +The [dotc](https://github.com/lampepfl/dotty/tree/master/src/dotty/tools/dotc) +package contains some main classes that can be run as separate +programs. The most important one is class +[Main](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Main.scala). +`Main` inherits from +[Driver](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Driver.scala) which +contains the highest level functions for starting a compiler and processing some sources. +`Driver` in turn is based on two other high-level classes, +[Compiler](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Compiler.scala) and +[Run](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Run.scala). + +## Package Structure + +Most functionality of `dotc` is implemented in subpackages of `dotc`. Here's a list of sub-packages +and their focus. + + ast Abstract syntax trees, + config Compiler configuration, settings, platform specific definitions. + core Core data structures and operations, with specific subpackages for: + + core.classfile Reading of Java classfiles into core data structures + core.tasty Reading and writing of TASTY files to/from core data structures + core.unpickleScala2 Reading of Scala2 symbol information into core data structures + + parsing Scanner and parser + printing Pretty-printing trees, types and other data + repl The interactive REPL + reporting Reporting of error messages, warnings and other info. + rewrite Helpers for rewriting Scala 2's constructs into dotty's. + transform Miniphases and helpers for tree transformations. + typer Type-checking and other frontend phases + util General purpose utility classes and modules. + +## Contexts + +`dotc` has almost no global state (the only significant bit of global state is the name table, +which is used to hash strings into unique names). Instead, all essential bits of information that +can vary over a compiler run are collected in a +[Context](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Contexts.scala). +Most methods in `dotc` take a Context value as an implicit parameter. + +Contexts give a convenient way to customize values in some part of the +call-graph. To run, e.g. some compiler function `f` at a given +phase `phase`, we invoke `f` with an explicit context parameter, like +this + + f(/*normal args*/)(ctx.withPhase(phase)) + +This assumes that `f` is defined in the way most compiler functions are: + + def f(/*normal parameters*/)(implicit ctx: Context) ... + +Compiler code follows the convention that all implicit `Context` +parameters are named `ctx`. This is important to avoid implicit +ambiguities in the case where nested methods contain each a Context +parameters. The common name ensures then that the implicit parameters +properly shadow each other. + +Sometimes we want to make sure that implicit contexts are not captured +in closures or other long-lived objects, be it because we want to +enforce that nested methods each get their own implicit context, or +because we want to avoid a space leak in the case where a closure can +survive several compiler runs. A typical case is a completer for a +symbol representing an external class, which produces the attributes +of the symbol on demand, and which might never be invoked. In that +case we follow the convention that any context parameter is explicit, +not implicit, so we can track where it is used, and that it has a name +different from `ctx`. Commonly used is `ictx` for "initialization +context". + +With these two conventions in place, it has turned out that implicit +contexts work amazingly well as a device for dependency injection and +bulk parameterization. There is of course always the danger that +an unexpected implicit will be passed, but in practice this has not turned out to +be much of a problem. + +## Compiler Phases + +Seen from a temporal perspective, the `dotc` compiler consists of a list of phases. +The current list of phases is specified in class [Compiler](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Compiler.scala) as follows: + +```scala + def phases: List[List[Phase]] = List( + List(new FrontEnd), // Compiler frontend: scanner, parser, namer, typer + List(new PostTyper), // Additional checks and cleanups after type checking + List(new Pickler), // Generate TASTY info + List(new FirstTransform, // Some transformations to put trees into a canonical form + new CheckReentrant), // Internal use only: Check that compiled program has no data races involving global vars + List(new RefChecks, // Various checks mostly related to abstract members and overriding + new CheckStatic, // Check restrictions that apply to @static members + new ElimRepeated, // Rewrite vararg parameters and arguments + new NormalizeFlags, // Rewrite some definition flags + new ExtensionMethods, // Expand methods of value classes with extension methods + new ExpandSAMs, // Expand single abstract method closures to anonymous classes + new TailRec, // Rewrite tail recursion to loops + new LiftTry, // Put try expressions that might execute on non-empty stacks into their own methods + new ClassOf), // Expand `Predef.classOf` calls. + List(new PatternMatcher, // Compile pattern matches + new ExplicitOuter, // Add accessors to outer classes from nested ones. + new ExplicitSelf, // Make references to non-trivial self types explicit as casts + new CrossCastAnd, // Normalize selections involving intersection types. + new Splitter), // Expand selections involving union types into conditionals + List(new VCInlineMethods, // Inlines calls to value class methods + new SeqLiterals, // Express vararg arguments as arrays + new InterceptedMethods, // Special handling of `==`, `|=`, `getClass` methods + new Getters, // Replace non-private vals and vars with getter defs (fields are added later) + new ElimByName, // Expand by-name parameters and arguments + new AugmentScala2Traits, // Expand traits defined in Scala 2.11 to simulate old-style rewritings + new ResolveSuper), // Implement super accessors and add forwarders to trait methods + List(new Erasure), // Rewrite types to JVM model, erasing all type parameters, abstract types and refinements. + List(new ElimErasedValueType, // Expand erased value types to their underlying implementation types + new VCElideAllocations, // Peep-hole optimization to eliminate unnecessary value class allocations + new Mixin, // Expand trait fields and trait initializers + new LazyVals, // Expand lazy vals + new Memoize, // Add private fields to getters and setters + new LinkScala2ImplClasses, // Forward calls to the implementation classes of traits defined by Scala 2.11 + new NonLocalReturns, // Expand non-local returns + new CapturedVars, // Represent vars captured by closures as heap objects + new Constructors, // Collect initialization code in primary constructors + // Note: constructors changes decls in transformTemplate, no InfoTransformers should be added after it + new FunctionalInterfaces,// Rewrites closures to implement @specialized types of Functions. + new GetClass), // Rewrites getClass calls on primitive types. + List(new LambdaLift, // Lifts out nested functions to class scope, storing free variables in environments + // Note: in this mini-phase block scopes are incorrect. No phases that rely on scopes should be here + new ElimStaticThis, // Replace `this` references to static objects by global identifiers + new Flatten, // Lift all inner classes to package scope + new RestoreScopes), // Repair scopes rendered invalid by moving definitions in prior phases of the group + List(new ExpandPrivate, // Widen private definitions accessed from nested classes + new CollectEntryPoints, // Find classes with main methods + new LabelDefs), // Converts calls to labels to jumps + List(new GenSJSIR), // Generate .js code + List(new GenBCode) // Generate JVM bytecode + ) +``` + +Note that phases are grouped, so the `phases` method is of type +`List[List[Phase]]`. The idea is that all phases in a group are +*fused* into a single tree traversal. That way, phases can be kept +small (most phases perform a single function) without requiring an +excessive number of tree traversals (which are costly, because they +have generally bad cache locality). + +Phases fall into four categories: + + - Frontend phases: `Frontend`, `PostTyper` and `Pickler`. `FrontEnd` parses the source programs and generates + untyped abstract syntax trees, which are then typechecked and transformed into typed abstract syntax trees. + `PostTyper` performs checks and cleanups that require a fully typed program. In particular, it + + - creates super accessors representing `super` calls in traits + - creates implementations of synthetic (compiler-implemented) methods + - avoids storing parameters passed unchanged from subclass to superclass in duplicate fields. + + Finally `Pickler` serializes the typed syntax trees produced by the frontend as TASTY data structures. + + - High-level transformations: All phases from `FirstTransform` to `Erasure`. Most of these phases transform + syntax trees, expanding high-level constructs to more primitive ones. The last phase in the group, `Erasure` + translates all types into types supported directly by the JVM. To do this, it performs another type checking + pass, but using the rules of the JVM's type system instead of Scala's. + + - Low-level transformations: All phases from `ElimErasedValueType` to `LabelDefs`. These + further transform trees until they are essentially a structured version of Java bytecode. + + - Code generators: These map the transformed trees to Java classfiles or Javascript files. + + |