From 2ffc7cfaf05217394708d2c00ab85ab07663d03c Mon Sep 17 00:00:00 2001
From: Felix Mulder
Date: Fri, 7 Oct 2016 14:35:35 +0200
Subject: Migrate dotty.epfl.ch to static site in repository

---
 docs/docs/internals/backend.md           | 127 +++++++++
 docs/docs/internals/benchmarks.md        |   5 +
 docs/docs/internals/classpaths.md        |  42 +++
 docs/docs/internals/contexts.md          |  55 ++++
 docs/docs/internals/dotc-scalac.md       | 104 +++++++
 docs/docs/internals/higher-kinded-v2.md  | 461 +++++++++++++++++++++++++++++++
 docs/docs/internals/overall-structure.md | 191 +++++++++++++
 docs/docs/internals/periods.md           |  96 +++++++
 docs/docs/internals/type-system.md       | 134 +++++++++
 9 files changed, 1215 insertions(+)
 create mode 100644 docs/docs/internals/backend.md
 create mode 100644 docs/docs/internals/benchmarks.md
 create mode 100644 docs/docs/internals/classpaths.md
 create mode 100644 docs/docs/internals/contexts.md
 create mode 100644 docs/docs/internals/dotc-scalac.md
 create mode 100644 docs/docs/internals/higher-kinded-v2.md
 create mode 100644 docs/docs/internals/overall-structure.md
 create mode 100644 docs/docs/internals/periods.md
 create mode 100644 docs/docs/internals/type-system.md

(limited to 'docs/docs/internals')

diff --git a/docs/docs/internals/backend.md b/docs/docs/internals/backend.md
new file mode 100644
index 000000000..1fb9bba26
--- /dev/null
+++ b/docs/docs/internals/backend.md
@@ -0,0 +1,127 @@
+---
+layout: default
+title: "Backend Internals"
+---
+
+Backend Internals
+=================
+The code for the backend is split up by functionality and assembled in the
+object `GenBCode`.
+
+```none
+object GenBCode --- [defines] --> PlainClassBuilder    GenBCode also defines class BCodePhase, the compiler phase
+   |                                 |
+ [extends]                        [extends]
+   |                                 |
+BCodeSyncAndTry  ----------------> SyncAndTryBuilder
+   |                                 |
+BCodeBodyBuilder ----------------> PlainBodyBuilder
+   |                                 |
+BCodeSkelBuilder ----------------> PlainSkelBuilder
+   |                              /  |  \
+ BCodeHelpers  ----------------> BCClassGen BCAnnotGen ... (more components)
+   |  \     \
+   |   \     \-------------> helper methods
+   |    \     \------------> JMirrorBuilder, JBeanInfoBuilder (uses some components, e.g. BCInnerClassGen)
+   |     \
+   |      BytecodeWriters ---------> methods and classes to write byte code files
+   |
+ BCodeTypes    ----------------> maps and fields for common BTypes, class Tracked, methods to collect information on classes, tests for BTypes (conforms), ...
+   |
+BCodeIdiomatic ----------------> utilities for code generation, e.g. genPrimitiveArithmetic
+   |
+ BCodeGlue     ----------------> BType class, predefined BTypes
+```
+
+### Data Flow ###
+The compiler creates a `BCodePhase` and calls `runOn(compilationUnits)`, which:
+ * initializes fields of `GenBCode` defined in `BCodeTypes` (BType maps,
+   common BTypes like `StringReference`)
+ * initializes the `primitives` map defined in `scalaPrimitives` (maps primitive
+   members, like `int.+`, to bytecode instructions)
+ * creates `BytecodeWriter`, `JMirrorBuilder` and `JBeanInfoBuilder` instances
+   (on each compiler run)
+ * `buildAndSendToDisk(units)`: uses work queues, see below.
+   - `BCodePhase.addToQ1` adds class trees to `q1`
+   - `Worker1.visit` creates ASM `ClassNodes`, adds to `q2`. It creates one
+     `PlainClassBuilder` for each compilation unit.
+   - `Worker2.addToQ3` adds byte arrays (one for each class) to `q3`
+   - `BCodePhase.drainQ3` writes byte arrays to disk
+
+
+### Architecture ###
+The architecture of `GenBCode` is the same as in Scalac. It can be partitioned
+into weakly coupled components (called "subsystems" below):
+
+
+#### (a) The queue subsystem ####
+Queues mediate between processors; the queues themselves don't know what each
+processor does.
+
+The first queue contains AST trees for compilation units, the second queue
+contains ASM ClassNodes, and finally the third queue contains byte arrays,
+ready for serialization to disk.
+
+Currently the queue subsystem is all sequential, but as can be seen in
+http://magarciaepfl.github.io/scala/ the above design enables overlapping (a.1)
+building of ClassNodes, (a.2) intra-method optimizations, and (a.3)
+serialization to disk.
+
+This subsystem is described in detail in GenBCode.scala.
+
+#### (b) Bytecode-level types, BType ####
+The previous bytecode emitter goes to great lengths to reason about
+bytecode-level types in terms of Symbols.
+
+GenBCode uses BType as a more direct representation. A BType is immutable, and
+a value class (once the rest of GenBCode is merged from
+http://magarciaepfl.github.io/scala/ ).
+
+Whether a value class or not, its API is the same. That API doesn't reach into
+the type checker. Instead, each method on a BType answers a question that can
+be answered based on the BType itself. Sounds too simple to be good? It's a
+good building block, that's what it is.
+
+The internal representation of a BType is based on what the JVM uses: internal
+names (e.g. `Ljava/lang/String;`) and method descriptors, as defined in the JVM
+spec (that's why they aren't documented in GenBCode, just read the spec).
+
+All things BType can be found in BCodeGlue.scala.
+
+#### (c) Utilities offering a more "high-level" API to bytecode emission ####
+Bytecode can be emitted one opcode at a time, but there are recurring patterns
+that call for a simpler API.
+
+For example, when emitting a load-constant, a dedicated instruction exists for
+emitting load-zero. Similarly, emitting a switch can be done according to one
+of two strategies.
+
+All these utilities are encapsulated in the file BCodeIdiomatic.scala. They know
+nothing about the type checker (because, just between us, they don't need to).
+
+#### (d) Mapping between type-checker types and BTypes ####
+So that (c) can remain oblivious to what AST trees contain, some bookkeepers
+are needed:
+
+ - Tracked: for a bytecode class (BType), its superclass, directly declared
+   interfaces, and inner classes.
+
+To understand how it's built, see:
+
+```scala
+final def exemplar(csym0: Symbol): Tracked = { ... }
+```
+
+Details in BCodeTypes.scala.
+
+#### (e) More "high-level" utilities for bytecode emission ####
+In the spirit of BCodeIdiomatic, utilities are added in BCodeHelpers for
+emitting:
+
+- bean info classes
+- mirror classes and their forwarders
+- android-specific creator classes
+- annotations
+
+
+#### (f) Building an ASM ClassNode given an AST TypeDef ####
+This is done by PlainClassBuilder.
diff --git a/docs/docs/internals/benchmarks.md b/docs/docs/internals/benchmarks.md
new file mode 100644
index 000000000..4d24ec0ff
--- /dev/null
+++ b/docs/docs/internals/benchmarks.md
@@ -0,0 +1,5 @@
+The regression benchmark infrastructure is still under construction.
+
+A preview can be found below:
+
+- [d-d.me/tnc/dotty/web/](https://d-d.me/tnc/dotty/web/)
\ No newline at end of file
diff --git a/docs/docs/internals/classpaths.md b/docs/docs/internals/classpaths.md
new file mode 100644
index 000000000..0038b5de0
--- /dev/null
+++ b/docs/docs/internals/classpaths.md
@@ -0,0 +1,42 @@
+When run from the `dotty` script, this is the classloader stack:
+
+```
+=====================================================
+class sun.misc.Launcher$AppClassLoader <= corresponds to java.class.path
+sun.misc.Launcher$AppClassLoader@591ce4fe
+file:/mnt/data-local/Work/Workspace/dev-2.11/dotty/target/scala-2.11.0-M7/dotty_2.11.0-M7-0.1-SNAPSHOT.jar:file:/home/sun/.ivy2/cache/org.scala-lang/scala-library/jars/scala-library-2.11.0-M7.jar:file:/home/sun/.ivy2/cache/org.scala-lang/scala-reflect/jars/scala-reflect-2.11.0-M7.jar
+=====================================================
+class sun.misc.Launcher$ExtClassLoader <= corresponds to sun.boot.class.path
+sun.misc.Launcher$ExtClassLoader@77fe0d66
+file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunpkcs11.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/localedata.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/zipfs.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunec.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunjce_provider.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/dnsns.jar
+=====================================================
+```
+
+When running from sbt or Eclipse, the classloader stack is:
+
+```
+=====================================================
+class sbt.classpath.ClasspathUtilities$$anon$1
+sbt.classpath.ClasspathUtilities$$anon$1@22a29f97
+file:/mnt/data-local/Work/Workspace/dev-2.11/dotty/target/scala-2.11.0-M7/classes/:file:/home/sun/.ivy2/cache/org.scala-lang/scala-library/jars/scala-library-2.11.0-M7.jar:file:/home/sun/.ivy2/cache/org.scala-lang/scala-reflect/jars/scala-reflect-2.11.0-M7.jar:file:/home/sun/.ivy2/cache/org.scala-lang.modules/scala-xml_2.11.0-M7/bundles/scala-xml_2.11.0-M7-1.0.0-RC7.jar
+=====================================================
+class java.net.URLClassLoader
+java.net.URLClassLoader@2167c879
+file:/home/sun/.ivy2/cache/org.scala-lang/scala-library/jars/scala-library-2.11.0-M7.jar:file:/home/sun/.ivy2/cache/org.scala-lang/scala-compiler/jars/scala-compiler-2.11.0-M7.jar:file:/home/sun/.ivy2/cache/org.scala-lang/scala-reflect/jars/scala-reflect-2.11.0-M7.jar:file:/home/sun/.ivy2/cache/org.scala-lang.modules/scala-xml_2.11.0-M6/bundles/scala-xml_2.11.0-M6-1.0.0-RC6.jar:file:/home/sun/.ivy2/cache/org.scala-lang.modules/scala-parser-combinators_2.11.0-M6/bundles/scala-parser-combinators_2.11.0-M6-1.0.0-RC4.jar:file:/home/sun/.ivy2/cache/jline/jline/jars/jline-2.11.jar
+=====================================================
+class xsbt.boot.BootFilteredLoader
+xsbt.boot.BootFilteredLoader@73c74402
+not a URL classloader
+=====================================================
+class sun.misc.Launcher$AppClassLoader <= corresponds to java.class.path
+sun.misc.Launcher$AppClassLoader@612dcb8c
+file:/home/sun/.sbt/.lib/0.13.0/sbt-launch.jar
+=====================================================
+class sun.misc.Launcher$ExtClassLoader <= corresponds to sun.boot.class.path
+sun.misc.Launcher$ExtClassLoader@58e862c
+file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunpkcs11.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/localedata.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/zipfs.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunec.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunjce_provider.jar:file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/dnsns.jar
+=====================================================
+```
+Since scala/dotty only pick up `java.class.path` and `sun.boot.class.path`,
+it's clear why dotty crashes in sbt and Eclipse unless we set the boot
+classpath explicitly.
diff --git a/docs/docs/internals/contexts.md b/docs/docs/internals/contexts.md
new file mode 100644
index 000000000..e2111029c
--- /dev/null
+++ b/docs/docs/internals/contexts.md
@@ -0,0 +1,55 @@
+---
+title: Contexts
+layout: default
+---
+
+Contexts
+========
+The `Context` contains the state of the compiler, for example:
+ * `settings`
+ * `freshNames` (`FreshNameCreator`)
+ * `period` (run and phase id)
+ * `compilationUnit`
+ * `phase`
+ * `tree` (current tree)
+ * `typer` (current typer)
+ * `mode` (type checking mode)
+ * `typerState` (for example undetermined type variables)
+ * ...
+
+### Contexts in the typer ###
+The type checker passes contexts through all methods and adapts fields where
+necessary, e.g.
+
+```scala
+case tree: untpd.Block => typedBlock(desugar.block(tree), pt)(ctx.fresh.withNewScope)
+```
+
+A number of fields in the context are typer-specific (`mode`, `typerState`).
+
+### In other phases ###
+Other phases need a context for many things, for example to access the
+denotation of a symbol (which depends on the period). However, they typically
+don't need to modify / extend the context while traversing the AST. For these
+phases the context can simply be an implicit class parameter that is then
+available in all members.
+
+**Careful**: beware of memory leaks. Don't hold on to contexts in long-lived
+objects.
+
+### Using contexts ###
+Nested contexts should be named `ctx` to enable implicit shadowing:
+
+```scala
+scala> class A
+
+scala> def foo(implicit a: A) { def bar(implicit b: A) { println(implicitly[A]) } }
+<console>:8: error: ambiguous implicit values:
+ both value a of type A
+ and value b of type A
+ match expected type A
+       def foo(implicit a: A) { def bar(implicit b: A) { println(implicitly[A]) } }
+
+scala> def foo(implicit a: A) { def bar(implicit a: A) { println(implicitly[A]) } }
+foo: (implicit a: A)Unit
+```
diff --git a/docs/docs/internals/dotc-scalac.md b/docs/docs/internals/dotc-scalac.md
new file mode 100644
index 000000000..cf668cbb8
--- /dev/null
+++ b/docs/docs/internals/dotc-scalac.md
@@ -0,0 +1,104 @@
+---
+layout: default
+title: "Scalac vs Dotty"
+---
+
+Differences between Scalac and Dotty
+====================================
+An overview of how symbols, named types and denotations hang together:
+[Denotations.scala:22]
+
+### Denotation ###
+Comment with a few details: [Denotations.scala:70]
+
+A `Denotation` is the result of a name lookup during a given period.
+
+* Most properties of symbols are now in the denotation (name, type, owner,
+  etc.)
+* Denotations usually have a reference to the selected symbol
+* Denotations may be overloaded (`MultiDenotation`). In this case the symbol
+  may be `NoSymbol` (the two alternative denotations do have symbols).
+* Non-overloaded denotations have an `info`
+
+Denotations of methods have a signature ([Signature.scala:7]), which
+uniquely identifies overloaded methods.
+
+#### Denotation vs. SymDenotation ####
+A `SymDenotation` is an extended denotation that has symbol-specific properties
+(which may change over phases):
+* `flags`
+* `annotations`
+* `info`
+
+`SymDenotation` implements lazy types (similar to scalac). The type completer
+assigns the denotation's `info`.
+
+#### Implicit Conversion ####
+There is an implicit conversion:
+```scala
+core.Symbols.toDenot(sym: Symbol)(implicit ctx: Context): SymDenotation
+```
+
+Because the class `Symbol` is defined in the object `core.Symbols`, the
+implicit conversion does **not** need to be imported; it is part of the
+implicit scope of the type `Symbol` (check the Scala spec). However, it can
+only be applied if an implicit `Context` is in scope.
+
+### Symbol ###
+* `Symbol` instances have a `SymDenotation`
+* Most symbol properties in scalac are now in the denotation (in dotc)
+
+Most of the `isFooBar` properties in scalac don't exist anymore in dotc. Use
+flag tests instead, for example:
+
+```scala
+if (sym.isPackageClass)         // scalac
+if (sym is Flags.PackageClass)  // dotc (*)
+```
+
+`(*)` Symbols are implicitly converted to their denotation, see above. Each
+`SymDenotation` has flags that can be queried using the `is` method.
+
+### Flags ###
+* Flags are instances of the value class `FlagSet`, which encapsulates a
+  `Long`
+* Each flag is either valid for types, terms, or both
+
+```
+000..0001000..01
+ ^         ^^
+ flag      | \
+           |  valid for term
+           valid for type
+```
+
+* Example: `Module` is valid for both module values and module classes,
+  `ModuleVal` / `ModuleClass` for either of the two.
+* `flags.is(Method | Param)`: true if `flags` has either of the two
+* `flags.is(allOf(Method | Deferred))`: true if `flags` has both. `allOf`
+  creates a `FlagConjunction`, so a different overload of `is` is chosen.
+  - Careful: `flags.is(Method & Deferred)` is always true, because `Method &
+    Deferred` is empty.
+
+### Tree ###
+* Trees don't have symbols
+  - `tree.symbol` is `tree.denot.symbol`
+  - `tree.denot` is `tree.tpe.denot` where the `tpe` is a `NamedType` (see
+    next point)
+* Subclasses of `DenotingTree` (`Template`, `ValDef`, `DefDef`, `Select`,
+  `Ident`, etc.) have a `NamedType`, which has a `denot` field. The
+  denotation has a symbol.
+  - The `denot` of a `NamedType` (prefix + name) for the current period is
+    obtained from the symbol that the type refers to. This symbol is searched
+    using `prefix.member(name)`.
+
+
+### Type ###
+ * `MethodType(paramSyms, resultType)` from scalac =>
+   `mt @ MethodType(paramNames, paramTypes)`. The result type is `mt.resultType`.
+
+`@todo`
+
+[Denotations.scala:22]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Denotations.scala#L22
+[Denotations.scala:70]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Denotations.scala#L70
+[Signature.scala:7]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Signature.scala#L7
diff --git a/docs/docs/internals/higher-kinded-v2.md b/docs/docs/internals/higher-kinded-v2.md
new file mode 100644
index 000000000..3019e3031
--- /dev/null
+++ b/docs/docs/internals/higher-kinded-v2.md
@@ -0,0 +1,461 @@
+---
+layout: default
+title: "Higher-Kinded Types in Dotty"
+---
+
+**This page is out of date and preserved for posterity. Please see
+[Implementing Higher-Kinded Types in
+Dotty](http://guillaume.martres.me/publications/dotty-hk.pdf) for a more
+up-to-date version.**
+
+Higher-Kinded Types in Dotty V2
+===============================
+This note outlines how we intend to represent higher-kinded types in Dotty.
+The principal idea is to collapse the four previously disparate features of
+refinements, type parameters, existentials and higher-kinded types into just
+one: refinements of type members. All other features will be encoded using
+these refinements.
+
+The complexity of type systems tends to grow exponentially with the number of
+independent features, because there are an exponential number of possible
+feature interactions. Consequently, a reduction from 4 to 1 fundamental
+features achieves a dramatic reduction of complexity. It also adds some nice
+usability improvements, notably in the area of partial type application.
+
+This is a second version of the scheme which differs in a key aspect from the
+first one: following Adriaan's idea, we use traits with type members to model
+type lambdas and type applications. This is both more general and more robust
+than the intersections with type constructor traits that we had in the first
+version.
+
+The duality
+-----------
+The core idea: a parameterized class such as
+
+```scala
+class Map[K, V]
+```
+
+is treated as equivalent to a type with type members:
+
+```scala
+class Map { type Map$K; type Map$V }
+```
+
+The type members are name-mangled (i.e. `Map$K`) so that they do not conflict
+with other members or parameters named `K` or `V`.
+
+A type-instance such as `Map[String, Int]` would then be treated as equivalent
+to:
+
+```scala
+Map { type Map$K = String; type Map$V = Int }
+```
+
+Named type parameters
+---------------------
+Type parameters can have unmangled names. This is achieved by adding the `type`
+keyword to a type parameter declaration, analogous to how `val` indicates a
+named field. For instance,
+
+```scala
+class Map[type K, type V]
+```
+
+is treated as equivalent to
+
+```scala
+class Map { type K; type V }
+```
+
+The parameters are made visible as fields.
+
+Wildcards
+---------
+A wildcard type such as `Map[_, Int]` is equivalent to:
+
+```scala
+Map { type Map$V = Int }
+```
+
+I.e., a `_` omits the corresponding parameter from being instantiated. Wildcard
+arguments can have bounds. E.g.
+
+```scala
+Map[_ <: AnyRef, Int]
+```
+
+is equivalent to:
+
+```scala
+Map { type Map$K <: AnyRef; type Map$V = Int }
+```
+
+Type parameters in the encodings
+--------------------------------
+The notion of type parameters makes sense even for encoded types, which do not
+contain parameter lists in their syntax. Specifically, the type parameters of a
+type are a sequence of type fields that correspond to parameters in the
+unencoded type. They are determined as follows.
+
+* The type parameters of a class or trait type are those parameter fields declared in the class
+  that are not yet instantiated, in the order they are given. Type parameter fields of parents
+  are not considered.
+* The type parameters of an abstract type are the type parameters of its upper bound.
+* The type parameters of an alias type are the type parameters of its right-hand side.
+* The type parameters of every other type are the empty sequence.
+
+Partial applications
+--------------------
+The definition of type parameters in the previous section leads to a simple
+model of partial applications. Consider for instance:
+
+```scala
+type Histogram = Map[_, Int]
+```
+
+`Histogram` is a higher-kinded type that still has one type parameter.
+`Histogram[String]` would be a possible type instance, and it would be
+equivalent to `Map[String, Int]`.
+
+
+Modelling polymorphic type declarations
+---------------------------------------
+The partial application scheme gives us a new -- and quite elegant -- way to do
+certain higher-kinded types. But how do we interpret the polymorphic types that
+exist in current Scala?
+
+More concretely, current Scala allows us to write parameterized type
+definitions, abstract types, and type parameters. In the new scheme, only
+classes (and traits) can have parameters and these are treated as equivalent to
+type members. Type aliases and abstract types do not allow the definition of
+parameterized types, so we have to interpret polymorphic type aliases and
+abstract types specially.
+
+Modelling polymorphic type aliases: simple case
+-----------------------------------------------
+A polymorphic type alias such as:
+
+```scala
+type Pair[T] = Tuple2[T, T]
+```
+
+where `Tuple2` is declared as
+
+```scala
+class Tuple2[T1, T2] ...
+```
+
+is expanded to a monomorphic type alias like this:
+
+```scala
+type Pair = Tuple2 { type Tuple2$T2 = Tuple2$T1 }
+```
+
+More generally, each type parameter of the left-hand side must appear as a type
+member of the right-hand side type. Type members must appear in the same order
+as their corresponding type parameters. References to the type parameter are
+then translated to references to the type member. The type member itself is
+left uninstantiated.
+
+This technique can expand most polymorphic type aliases appearing in Scala
+codebases but not all of them. For instance, the following alias cannot be
+expanded, because the parameter type `T` is not a type member of the right-hand
+side `List[List[T]]`:
+
+```scala
+type List2[T] = List[List[T]]
+```
+
+We scanned the Scala standard library for occurrences of polymorphic type
+aliases and determined that only two occurrences could not be expanded. In
+`io/Codec.scala`:
+
+```scala
+type Configure[T] = (T => T, Boolean)
+```
+
+And in `collection/immutable/HashMap.scala`:
+
+```scala
+private type MergeFunction[A1, B1] = ((A1, B1), (A1, B1)) => (A1, B1)
+```
+
+For these cases, we use a fall-back scheme that models a parameterized alias as
+a `Lambda` type.
+
+Modelling polymorphic type aliases: general case
+------------------------------------------------
+A polymorphic type alias such as:
+
+```scala
+type List2D[T] = List[List[T]]
+```
+
+is represented as a monomorphic type alias of a type lambda. Here's the
+expanded version of the definition above:
+
+```scala
+type List2D = Lambda$I { type Apply = List[List[$hkArg$0]] }
+```
+
+Here, `Lambda$I` is a standard trait defined as follows:
+
+```scala
+trait Lambda$I[type $hkArg$0] { type +Apply }
+```
+
+The `I` suffix of the `Lambda` trait indicates that it has one invariant type
+parameter (named `$hkArg$0`). Other suffixes are `P` for covariant type
+parameters, and `N` for contravariant type parameters. Lambda traits can have
+more than one type parameter. For instance, here is a trait with contravariant
+and covariant type parameters:
+
+```scala
+trait Lambda$NP[type -$hkArg$0, +$hkArg$1] { type +Apply } extends Lambda$IP with Lambda$NI
+```
+
+Aside: the `+` prefix in front of `Apply` indicates that `Apply` is a covariant
+type field. Dotty admits variance annotations on type members.
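+
+For instance, a hypothetical declaration (illustration only, not from the
+Dotty sources) with a covariant type member would look like this:
+
+```scala
+trait Box { type +Elem }
+// With a covariant type member, refinements conform covariantly, e.g.
+// Box { type Elem = String } <: Box { type Elem = AnyRef }
+// This is exactly what makes the covariant `Apply` field useful below.
+```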
+
+The definition of `Lambda$NP` shows that `Lambda` traits form a subtyping
+hierarchy: traits which have covariant or contravariant type parameters are
+subtypes of traits which don't. The supertraits of `Lambda$NP` would themselves
+be written as follows:
+
+```scala
+trait Lambda$IP[type $hkArg$0, +$hkArg$1] { type +Apply } extends Lambda$II
+trait Lambda$NI[type -$hkArg$0, $hkArg$1] { type +Apply } extends Lambda$II
+trait Lambda$II[type $hkArg$0, $hkArg$1] { type +Apply }
+```
+
+`Lambda` traits are special in that they influence how type applications are
+expanded: if the standard type application `T[X1, ..., Xn]` leads to a subtype
+`S` of a type instance
+
+```scala
+LambdaXYZ { type Arg1 = T1; ...; type ArgN = Tn; type Apply ... }
+```
+
+where all argument fields `Arg1, ..., ArgN` are concretely defined and the
+definition of the `Apply` field may be either abstract or concrete, then the
+application is further expanded to `S # Apply`.
+
+For instance, the type instance `List2D[String]` would be expanded to
+
+```scala
+Lambda$I { type $hkArg$0 = String; type Apply = List[List[String]] } # Apply
+```
+
+which in turn simplifies to `List[List[String]]`.
+
+2nd Example: Consider the two aliases
+
+```scala
+type RMap[K, V] = Map[V, K]
+type RRMap[K, V] = RMap[V, K]
+```
+
+These expand as follows:
+
+```scala
+type RMap  = Lambda$II { self1 => type Apply = Map[self1.$hkArg$1, self1.$hkArg$0] }
+type RRMap = Lambda$II { self2 => type Apply = RMap[self2.$hkArg$1, self2.$hkArg$0] }
+```
+
+Substituting the definition of `RMap` and expanding the type application gives:
+
+```scala
+type RRMap = Lambda$II { self2 => type Apply =
+  Lambda$II { self1 => type Apply = Map[self1.$hkArg$1, self1.$hkArg$0] }
+    { type $hkArg$0 = self2.$hkArg$1; type $hkArg$1 = self2.$hkArg$0 } # Apply }
+```
+
+Substituting the definitions for `self1.$hkArg${0,1}` gives:
+
+```scala
+type RRMap = Lambda$II { self2 => type Apply =
+  Lambda$II { self1 => type Apply = Map[self2.$hkArg$0, self2.$hkArg$1] }
+    { type $hkArg$0 = self2.$hkArg$1; type $hkArg$1 = self2.$hkArg$0 } # Apply }
+```
+
+Simplifying the `# Apply` selection gives:
+
+```scala
+type RRMap = Lambda$II { self2 => type Apply = Map[self2.$hkArg$0, self2.$hkArg$1] }
+```
+
+This can be regarded as the eta-expanded version of `Map`. It has the same expansion as
+
+```scala
+type IMap[K, V] = Map[K, V]
+```
+
+Modelling higher-kinded types
+-----------------------------
+The encoding of higher-kinded types again uses the `Lambda` traits to represent
+type constructors. Consider the higher-kinded type declaration
+
+```scala
+type Rep[T]
+```
+
+We expand this to
+
+```scala
+type Rep <: Lambda$I
+```
+
+The type parameters of `Rep` are the type parameters of its upper bound, so
+`Rep` is a unary type constructor.
+
+More generally, a higher-kinded type declaration
+
+```scala
+type T[v1 X1 >: S1 <: U1, ..., vn XN >: SN <: UN] >: SR <: UR
+```
+
+is encoded as
+
+```scala
+type T <: LambdaV1...Vn { self =>
+  type v1 $hkArg$0 >: s(S1) <: s(U1)
+  ...
+  type vn $hkArg$N >: s(SN) <: s(UN)
+  type Apply >: s(SR) <: s(UR)
+}
+```
+
+where `s` is the substitution `[XI := self.$hkArg$I | I = 1,...,N]`.
+
+If we instantiate `Rep` with a type argument, this is expanded as was explained
+before:
+
+```scala
+Rep[String]
+```
+
+would expand to
+
+```scala
+Rep { type $hkArg$0 = String } # Apply
+```
+
+If we instantiate the higher-kinded type with a concrete type constructor (i.e.
+a parameterized trait or class), we have to do one extra adaptation to make it
+work.
+The parameterized trait or class has to be eta-expanded so that it
+conforms to the `Lambda` bound. For instance,
+
+```scala
+type Rep = Set
+```
+
+would expand to:
+
+```scala
+type Rep = Lambda1 { type Apply = Set[$hkArg$0] }
+```
+
+Or,
+
+```scala
+type Rep = Map[String, _]
+```
+
+would expand to
+
+```scala
+type Rep = Lambda1 { type Apply = Map[String, $hkArg$0] }
+```
+
+Full example
+------------
+Consider the higher-kinded `Functor` type class
+
+```scala
+class Functor[F[_]] {
+  def map[A, B](f: A => B): F[A] => F[B]
+}
+```
+
+This would be represented as follows:
+
+```scala
+class Functor[F <: Lambda1] {
+  def map[A, B](f: A => B): F { type $hkArg$0 = A } # Apply => F { type $hkArg$0 = B } # Apply
+}
+```
+
+The type `Functor[List]` would be represented as follows:
+
+```scala
+Functor {
+  type F = Lambda1 { type Apply = List[$hkArg$0] }
+}
+```
+
+Now, assume we have a value
+
+```scala
+val ml: Functor[List]
+```
+
+Then `ml.map` would have type
+
+```scala
+s(F { type $hkArg$0 = A } # Apply => F { type $hkArg$0 = B } # Apply)
+```
+
+where `s` is the substitution `[F := Lambda1 { type Apply = List[$hkArg$0] }]`.
+This gives:
+
+```scala
+Lambda1 { type Apply = List[$hkArg$0] } { type $hkArg$0 = A } # Apply
+  => Lambda1 { type Apply = List[$hkArg$0] } { type $hkArg$0 = B } # Apply
+```
+
+This type simplifies to:
+
+```scala
+List[A] => List[B]
+```
+
+Status of `#`
+-------------
+In the scheme above we have silently assumed that `#` "does the right thing",
+i.e. that the types are well-formed and we can collapse a type alias with a `#`
+projection, thereby giving us a form of beta reduction.
+
+In Scala 2.x, this would not work, because `T#X` means `x.X forSome { val x: T
+}`. Hence, two occurrences of `Rep[Int]`, say, would not be recognized to be
+equal because the existential would be opened afresh each time.
+
+In pre-existentials Scala, this would not have worked either. There, `T#X` was
+a fundamental type constructor, but was restricted to alias types or classes
+for both `T` and `X`. Roughly, `#` was meant to encode Java's inner classes.
+In Java, given the classes
+
+```scala
+class Outer { class Inner }
+class Sub1 extends Outer
+class Sub2 extends Outer
+```
+
+the types `Outer#Inner`, `Sub1#Inner` and `Sub2#Inner` would all exist and be
+regarded as equal to each other. But if `Outer` had abstract type members this
+would not work, since an abstract type member could be instantiated differently
+in `Sub1` and `Sub2`. Assuming that `Sub1#Inner = Sub2#Inner` could then lead
+to a soundness hole. To avoid soundness problems, the types in `X#Y` were
+restricted so that `Y` was (an alias of) a class type and `X` was (an alias of)
+a class type with no abstract type members.
+
+I believe we can go back to regarding `T#X` as a fundamental type constructor,
+the way it was done in pre-existential Scala, but with the following relaxed
+restriction:
+
+> In a type selection `T#X`, `T` is not allowed to have any abstract members different from `X`.
+
+This would typecheck the higher-kinded type examples, because they only
+project with `# Apply` once all `$hkArg$` type members are fully instantiated.
+
+It would be good to study this rule formally, trying to verify its soundness.
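+
+As a small illustration of the proposed rule, reusing the `Rep` declaration
+from above and written in this note's own notation (hypothetical code, not
+from the text):
+
+```scala
+type Rep <: Lambda$I
+
+// Accepted under the relaxed rule: $hkArg$0 is instantiated, so Apply is
+// the only abstract member of the prefix, and Apply is what we project.
+type Ok = Rep { type $hkArg$0 = Int } # Apply
+
+// Rejected under the rule: the prefix still has the abstract member
+// $hkArg$0, which differs from the projected member Apply.
+// type Bad = Rep # Apply
+```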
diff --git a/docs/docs/internals/overall-structure.md b/docs/docs/internals/overall-structure.md
new file mode 100644
index 000000000..214e47aa5
--- /dev/null
+++ b/docs/docs/internals/overall-structure.md
@@ -0,0 +1,191 @@
+---
+layout: default
+title: "Project Structure"
+---
+
+Dotc Overall Structure
+======================
+The compiler code is found in package [dotty.tools]. It spans the
+following three sub-packages:
+
+```none
+backend  Compiler backends (currently for JVM and JS)
+   dotc  The main compiler
+     io  Helper modules for file access and classpath handling.
+```
+
+The [dotc] package contains some main classes that can be run as separate
+programs. The most important one is class [Main]. `Main` inherits from
+[Driver], which contains the highest-level functions for starting a compiler
+and processing some sources. `Driver` in turn is based on two other high-level
+classes, [Compiler] and [Run].
+
+Package Structure
+-----------------
+Most functionality of `dotc` is implemented in subpackages of `dotc`. Here's a
+list of sub-packages and their focus.
+
+```none
+.
+├── ast        // Abstract syntax trees
+├── config     // Compiler configuration, settings, platform specific definitions.
+├── core       // Core data structures and operations, with specific subpackages for:
+│   ├── classfile      // Reading of Java classfiles into core data structures
+│   ├── tasty          // Reading and writing of TASTY files to/from core data structures
+│   └── unpickleScala2 // Reading of Scala2 symbol information into core data structures
+├── parsing    // Scanner and parser
+├── printing   // Pretty-printing trees, types and other data
+├── repl       // The interactive REPL
+├── reporting  // Reporting of error messages, warnings and other info.
+├── rewrite    // Helpers for rewriting Scala 2's constructs into dotty's.
+├── transform  // Miniphases and helpers for tree transformations.
+├── typer      // Type-checking and other frontend phases
+└── util       // General purpose utility classes and modules.
+```
+
+Contexts
+--------
+`dotc` has almost no global state (the only significant bit of global state is
+the name table, which is used to hash strings into unique names). Instead, all
+essential bits of information that can vary over a compiler run are collected
+in a [Context]. Most methods in `dotc` take a `Context` value as an implicit
+parameter.
+
+Contexts give a convenient way to customize values in some part of the
+call-graph. To run, e.g., some compiler function `f` at a given phase `phase`,
+we invoke `f` with an explicit context parameter, like this:
+
+```scala
+f(/*normal args*/)(ctx.withPhase(phase))
+```
+
+This assumes that `f` is defined in the way most compiler functions are:
+
+```scala
+def f(/*normal parameters*/)(implicit ctx: Context) ...
+```
+
+Compiler code follows the convention that all implicit `Context` parameters are
+named `ctx`. This is important to avoid implicit ambiguities in the case where
+nested methods each contain a `Context` parameter. The common name then ensures
+that the implicit parameters properly shadow each other.
+
+Sometimes we want to make sure that implicit contexts are not captured in
+closures or other long-lived objects, be it because we want to enforce that
+nested methods each get their own implicit context, or because we want to avoid
+a space leak in the case where a closure can survive several compiler runs. A
+typical case is a completer for a symbol representing an external class, which
+produces the attributes of the symbol on demand, and which might never be
+invoked.
In that case we follow the convention that any context parameter is +explicit, not implicit, so we can track where it is used, and that it has a +name different from `ctx`. Commonly used is `ictx` for "initialization +context". + +With these two conventions in place, it has turned out that implicit contexts +work amazingly well as a device for dependency injection and bulk +parameterization. There is of course always the danger that an unexpected +implicit will be passed, but in practice this has not turned out to be much of +a problem. + +Compiler Phases +--------------- +Seen from a temporal perspective, the `dotc` compiler consists of a list of +phases. The current list of phases is specified in class [Compiler] as follows: + +```scala + def phases: List[List[Phase]] = List( + List(new FrontEnd), // Compiler frontend: scanner, parser, namer, typer + List(new PostTyper), // Additional checks and cleanups after type checking + List(new Pickler), // Generate TASTY info + List(new FirstTransform, // Some transformations to put trees into a canonical form + new CheckReentrant), // Internal use only: Check that compiled program has no data races involving global vars + List(new RefChecks, // Various checks mostly related to abstract members and overriding + new CheckStatic, // Check restrictions that apply to @static members + new ElimRepeated, // Rewrite vararg parameters and arguments + new NormalizeFlags, // Rewrite some definition flags + new ExtensionMethods, // Expand methods of value classes with extension methods + new ExpandSAMs, // Expand single abstract method closures to anonymous classes + new TailRec, // Rewrite tail recursion to loops + new LiftTry, // Put try expressions that might execute on non-empty stacks into their own methods + new ClassOf), // Expand `Predef.classOf` calls. + List(new PatternMatcher, // Compile pattern matches + new ExplicitOuter, // Add accessors to outer classes from nested ones. + new ExplicitSelf, // Make references to non-trivial self types explicit as casts + new CrossCastAnd, // Normalize selections involving intersection types. + new Splitter), // Expand selections involving union types into conditionals + List(new VCInlineMethods, // Inlines calls to value class methods + new SeqLiterals, // Express vararg arguments as arrays + new InterceptedMethods, // Special handling of `==`, `|=`, `getClass` methods + new Getters, // Replace non-private vals and vars with getter defs (fields are added later) + new ElimByName, // Expand by-name parameters and arguments + new AugmentScala2Traits, // Expand traits defined in Scala 2.11 to simulate old-style rewritings + new ResolveSuper), // Implement super accessors and add forwarders to trait methods + List(new Erasure), // Rewrite types to JVM model, erasing all type parameters, abstract types and refinements. 
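+    // Phases up to and including Erasure above still see Scala-level types;
+    // the groups below operate on JVM-level types only (see the discussion
+    // of phase categories after this listing).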
+    List(new ElimErasedValueType, // Expand erased value types to their underlying implementation types
+         new VCElideAllocations,  // Peep-hole optimization to eliminate unnecessary value class allocations
+         new Mixin,               // Expand trait fields and trait initializers
+         new LazyVals,            // Expand lazy vals
+         new Memoize,             // Add private fields to getters and setters
+         new LinkScala2ImplClasses, // Forward calls to the implementation classes of traits defined by Scala 2.11
+         new NonLocalReturns,     // Expand non-local returns
+         new CapturedVars,        // Represent vars captured by closures as heap objects
+         new Constructors,        // Collect initialization code in primary constructors
+                                  // Note: constructors changes decls in transformTemplate, no InfoTransformers should be added after it
+         new FunctionalInterfaces, // Rewrites closures to implement @specialized types of Functions.
+         new GetClass),           // Rewrites getClass calls on primitive types.
+    List(new LambdaLift,          // Lifts out nested functions to class scope, storing free variables in environments
+                                  // Note: in this mini-phase block scopes are incorrect. No phases that rely on scopes should be here
+         new ElimStaticThis,      // Replace `this` references to static objects by global identifiers
+         new Flatten,             // Lift all inner classes to package scope
+         new RestoreScopes),      // Repair scopes rendered invalid by moving definitions in prior phases of the group
+    List(new ExpandPrivate,       // Widen private definitions accessed from nested classes
+         new CollectEntryPoints,  // Find classes with main methods
+         new LabelDefs),          // Converts calls to labels to jumps
+    List(new GenSJSIR),           // Generate .js code
+    List(new GenBCode)            // Generate JVM bytecode
+  )
+```
+
+Note that phases are grouped, so the `phases` method is of type
+`List[List[Phase]]`. The idea is that all phases in a group are *fused* into a
+single tree traversal. That way, phases can be kept small (most phases perform
+a single function) without requiring an excessive number of tree traversals
+(which are costly, because they generally have bad cache locality).
+
+Phases fall into four categories:
+
+* Frontend phases: `FrontEnd`, `PostTyper` and `Pickler`. `FrontEnd` parses the
+  source programs and generates untyped abstract syntax trees, which are then
+  typechecked and transformed into typed abstract syntax trees. `PostTyper`
+  performs checks and cleanups that require a fully typed program. In
+  particular, it
+
+  - creates super accessors representing `super` calls in traits
+  - creates implementations of synthetic (compiler-implemented) methods
+  - avoids storing parameters passed unchanged from subclass to superclass in
+    duplicate fields.
+
+  Finally `Pickler` serializes the typed syntax trees produced by the frontend
+  as TASTY data structures.
+
+* High-level transformations: all phases from `FirstTransform` to `Erasure`.
+  Most of these phases transform syntax trees, expanding high-level constructs
+  to more primitive ones. The last phase in the group, `Erasure`, translates
+  all types into types supported directly by the JVM. To do this, it performs
+  another type checking pass, but using the rules of the JVM's type system
+  instead of Scala's.
+
+* Low-level transformations: all phases from `ElimErasedValueType` to
+  `LabelDefs`. These further transform trees until they are essentially a
+  structured version of Java bytecode.
+
+* Code generators: these map the transformed trees to Java classfiles or
+  JavaScript files.
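+
+To make the fusion idea above concrete, here is a minimal, self-contained toy
+sketch (illustration only, not dotc code; all names are made up): a
+"miniphase" is reduced to a function on leaf values, and fusing composes the
+functions so the tree is traversed once instead of once per phase.
+
+```scala
+// Toy model of miniphase fusion.
+object FusionSketch {
+  sealed trait Tree
+  case class Leaf(n: Int) extends Tree
+  case class Node(l: Tree, r: Tree) extends Tree
+
+  type Mini = Int => Int // a "miniphase", radically simplified
+
+  def traverse(t: Tree, f: Mini): Tree = t match {
+    case Leaf(n)    => Leaf(f(n))
+    case Node(l, r) => Node(traverse(l, f), traverse(r, f))
+  }
+
+  // Unfused: one full tree traversal per miniphase.
+  def runUnfused(t: Tree, phases: List[Mini]): Tree =
+    phases.foldLeft(t)((acc, f) => traverse(acc, f))
+
+  // Fused: compose the miniphases first, then traverse once.
+  // (Assumes the group is non-empty, as dotc's phase groups are.)
+  def runFused(t: Tree, phases: List[Mini]): Tree =
+    traverse(t, phases.reduceLeft(_ andThen _))
+}
+```
+
+Both functions compute the same result; the fused variant simply does the
+work of the whole group in a single pass, which is why dotc groups its
+miniphases this way.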
+
+[dotty.tools]: https://github.com/lampepfl/dotty/tree/master/src/dotty/tools
+[dotc]: https://github.com/lampepfl/dotty/tree/master/src/dotty/tools/dotc
+[Main]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Main.scala
+[Driver]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Driver.scala
+[Compiler]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Compiler.scala
+[Run]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Run.scala
+[Context]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Contexts.scala
diff --git a/docs/docs/internals/periods.md b/docs/docs/internals/periods.md
new file mode 100644
index 000000000..fe788915d
--- /dev/null
+++ b/docs/docs/internals/periods.md
@@ -0,0 +1,96 @@
+---
+layout: default
+title: "Periods"
+toc: true
+---
+
+Dotc's concept of time
+======================
+Conceptually, the `dotc` compiler's job is to maintain views of various
+artifacts associated with source code at all points in time. But what is
+*time* for `dotc`? In fact, it is a combination of compiler runs and compiler
+phases.
+
+The *hours* of the compiler's clocks are measured in compiler [runs]. Every run
+creates a new hour, which follows all the compiler runs (hours) that happened
+before. `dotc` is designed to be used as an incremental compiler that can
+support incremental builds, as well as interactions in an IDE and a REPL. This
+means that new runs can occur quite frequently. At the extreme, every
+keystroke in an editor or REPL can potentially launch a new compiler run, so
+potentially an "hour" of compiler time might take only a fraction of a second
+in real time.
+
+The *minutes* of the compiler's clocks are measured in phases. At every
+compiler run, the compiler cycles through a number of [phases]. The list of
+phases is defined in the [Compiler] object. There are currently about 60 phases
+per run, so the minutes/hours analogy works out roughly. After every phase the
+view the compiler has of the world changes: trees are transformed, types are
+gradually simplified from Scala types to JVM types, definitions are rearranged,
+and so on.
+
+Many pieces of information in the compiler are time-dependent. For instance, a
+Scala symbol representing a definition has a type, but that type will usually
+change as one goes from the higher-level Scala view of things to the
+lower-level JVM view. There are different ways to deal with this. Many
+compilers change the type of a symbol destructively according to the "current
+phase". Another, more functional, approach might be to have different symbols
+representing the same definition at different phases, with each symbol
+carrying a different immutable type. `dotc` employs yet another scheme, which
+is inspired by functional reactive programming (FRP): symbols carry not a
+single type, but a function from compiler phase to type. So the type of a
+symbol is a time-indexed function, where time ranges over compiler phases.
+
+Typically, the definition of a symbol or other quantity remains stable for a
+number of phases. This leads us to the concept of a [period]. Conceptually, a
+period is an interval of some given phases in a given compiler run. Periods
+are represented by three pieces of information:
+
+* the ID of the current run,
+* the ID of the phase starting the period, and
+* the number of phases in the period.
+
+All three pieces of information are encoded in a value class over a 32-bit
+integer.
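+
+For intuition, here is one possible packing (a sketch only; the field widths
+and layout are assumptions made for illustration, not dotc's actual encoding):
+
+```scala
+// Illustrative only: pack (runId, firstPhaseId, phaseCount) into one Int.
+// The 6-bit widths here are made up for the sketch.
+object PeriodSketch {
+  def encode(runId: Int, firstPhaseId: Int, phaseCount: Int): Int =
+    (runId << 12) | (firstPhaseId << 6) | phaseCount
+
+  def runId(code: Int): Int        = code >>> 12
+  def firstPhaseId(code: Int): Int = (code >>> 6) & 0x3f
+  def phaseCount(code: Int): Int   = code & 0x3f
+}
+```
+
+Packing all three fields into one word makes periods cheap to copy and compare,
+which matters because denotations are validated against periods constantly.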
+Here's the API for class `Period`:
+
+```scala
+class Period(val code: Int) extends AnyVal {
+  def runId: RunId           // The run identifier of this period.
+  def firstPhaseId: PhaseId  // The first phase of this period
+  def lastPhaseId: PhaseId   // The last phase of this period
+  def phaseId: PhaseId       // The phase identifier of this single-phase period
+
+  def containsPhaseId(id: PhaseId): Boolean
+  def contains(that: Period): Boolean
+  def overlaps(that: Period): Boolean
+
+  def & (that: Period): Period
+  def | (that: Period): Period
+}
+```
+
+We can access the parts of a period using `runId`, `firstPhaseId`,
+`lastPhaseId`, or using `phaseId` for periods consisting only of a single
+phase. They return `RunId` or `PhaseId` values, which are aliases of `Int`.
+`containsPhaseId`, `contains` and `overlaps` test whether a period contains a
+phase or a period as a sub-interval, or whether the interval overlaps with
+another period. Finally, `&` and `|` produce the intersection and the union of
+two period intervals (the union operation `|` takes as `runId` the `runId` of
+its left operand, as periods spanning different `runId`s cannot be
+constructed).
+
+Periods are constructed using two `apply` methods:
+
+```scala
+object Period {
+  /** The single-phase period consisting of given run id and phase id */
+  def apply(rid: RunId, pid: PhaseId): Period
+
+  /** The period consisting of given run id, and lo/hi phase ids */
+  def apply(rid: RunId, loPid: PhaseId, hiPid: PhaseId): Period
+}
+```
+
+As a sentinel value there's `Nowhere`, a period that is empty.
+
+[runs]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Run.scala
+[phases]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Phases.scala
+[period]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Periods.scala
diff --git a/docs/docs/internals/type-system.md b/docs/docs/internals/type-system.md
new file mode 100644
index 000000000..191c107cf
--- /dev/null
+++ b/docs/docs/internals/type-system.md
@@ -0,0 +1,134 @@
+---
+layout: default
+---
+
+Type System
+===========
+The types are defined in [dotty/tools/dotc/core/Types.scala][1].
+
+## Class diagram ##
+- [PDF][2], generated with [a fork of scaladiagrams][3]
+
+## Proxy types and ground types ##
+A type which inherits `TypeProxy` is a proxy for another type accessible using
+the `underlying` method; other types are called _ground_ types and inherit
+`CachedGroundType` or `UncachedGroundType`.
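+
+For example, a hedged usage sketch (assuming the usual `Types` and `Contexts`
+imports and an implicit `Context` in scope; `stripProxies` is a name invented
+here for illustration):
+
+```scala
+import dotty.tools.dotc.core.Types._
+import dotty.tools.dotc.core.Contexts.Context
+
+// Follow `underlying` until a ground type is reached.
+def stripProxies(tp: Type)(implicit ctx: Context): Type = tp match {
+  case tp: TypeProxy => stripProxies(tp.underlying)
+  case _             => tp
+}
+```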
+
+```
+Type -+- ProxyType --+- NamedType ----+--- TypeRef
+      |              |                 \
+      |              +- SingletonType-+-+- TermRef
+      |              |                |
+      |              |                +--- ThisType
+      |              |                +--- SuperType
+      |              |                +--- ConstantType
+      |              |                +--- MethodParam
+      |              |                +--- RefinedThis
+      |              +- PolyParam
+      |              +- RefinedType
+      |              +- TypeBounds
+      |              +- ExprType
+      |              +- AnnotatedType
+      |              +- TypeVar
+      |
+      +- GroundType -+- AndType
+                     +- OrType
+                     +- MethodType -----+- ImplicitMethodType
+                     |                  +- JavaMethodType
+                     +- PolyType
+                     +- ClassInfo
+                     |
+                     +- NoType
+                     +- NoPrefix
+                     +- ErrorType
+                     +- WildcardType
+
+```
+
+## Representations of types ##
+ Type                       | Representation
+ -------------------------- | -----------------------------
+ `p.x.type`                 | `TermRef(p, x)`
+ `p#T`                      | `TypeRef(p, T)`
+ `p.x.T` == `p.x.type#T`    | `TypeRef(TermRef(p, x), T)`
+ `this.type`                | `ThisType`
+ `A & B`                    | `AndType(A, B)`
+ `A | B`                    | `OrType(A, B)`
+ `=> T`                     | `ExprType(T)`
+ `p { refinedName }`        | `RefinedType(p, refinedName)`
+ type of the value `super`  | `SuperType`
+ `type T >: A <: B`         | `TypeRef` with underlying type `RealTypeBounds(A, B)`
+ `type T = A`               | `TypeRef` with underlying type `TypeAlias(A)`
+ `class p.C ...`            | `ClassInfo(p, C, ...)`
+
+### Representation of methods ###
+```scala
+def f[A, B <: Ord[A]](x: A, y: B): Unit
+```
+
+is represented as:
+
+```scala
+val p = PolyType(List("A", "B"))(
+  List(TypeBounds(Nothing, Any),
+       TypeBounds(Nothing,
+         RefinedType(Ordering,
+           scala$math$Ordering$$T, TypeAlias(PolyParam(p, 0))))),
+  m)
+
+val m = MethodType(List("x", "y"),
+  List(PolyParam(p, 0), PolyParam(p, 1)))(Unit)
+```
+
+(This is a slightly simplified version, e.g. we write `Unit` instead of
+`TypeRef(TermRef(ThisType(TypeRef(NoPrefix,<empty>)),scala),Unit)`.)
+
+Note that a `PolyParam` refers to a type parameter using its index (here `A`
+is 0 and `B` is 1).
+
+## Subtyping checks ##
+`topLevelSubType(tp1, tp2)` in [dotty/tools/dotc/core/TypeComparer.scala][4]
+checks if `tp1` is a subtype of `tp2`.
+
+### Type rebasing ###
+**FIXME**: This section is no longer accurate because
+https://github.com/lampepfl/dotty/pull/331 changed the handling of refined
+types.
+
+Consider [tests/pos/refinedSubtyping.scala][5]:
+```scala
+class Test {
+
+  class C { type T; type Coll }
+
+  type T1 = C { type T = Int }
+
+  type T11 = T1 { type Coll = Set[Int] }
+
+  type T2 = C { type Coll = Set[T] }
+
+  type T22 = T2 { type T = Int }
+
+  var x: T11 = _
+  var y: T22 = _
+
+  x = y
+  y = x
+
+}
+```
+We want to do the subtyping checks recursively, since it would be nice if we
+could check whether `T22 <: T11` by first checking whether `T2 <: T1`. To
+achieve this recursive subtyping check, we remember that `T2#T` is really
+`T22#T`. This procedure is called rebasing and is done by storing refined
+names in `pendingRefinedBases` and looking them up using `rebase`.
+
+## Type caching ##
+TODO
+
+## Type inference via constraint solving ##
+TODO
+
+[1]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Types.scala
+[2]: https://github.com/samuelgruetter/dotty/blob/classdiagrampdf/dotty-types.pdf
+[3]: https://github.com/samuelgruetter/scaladiagrams/tree/print-descendants
+[4]: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/TypeComparer.scala
+[5]: https://github.com/lampepfl/dotty/blob/master/tests/pos/refinedSubtyping.scala
--
cgit v1.2.3