From b41a09426c03167a883cfb8a3ac3165d99aac056 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Tue, 10 Apr 2018 12:11:33 -0700 Subject: update changelog for 0.2.0 --- docs/pages/1 - Intro to Mill.md | 58 ++++- docs/pages/3 - Common Project Layouts.md | 44 ++++ docs/pages/3 - Tasks.md | 341 ----------------------------- docs/pages/4 - Modules.md | 158 -------------- docs/pages/4 - Tasks.md | 341 +++++++++++++++++++++++++++++ docs/pages/5 - Cross Builds.md | 162 -------------- docs/pages/5 - Modules.md | 158 ++++++++++++++ docs/pages/6 - Cross Builds.md | 162 ++++++++++++++ docs/pages/6 - Extending Mill.md | 183 ---------------- docs/pages/7 - Extending Mill.md | 183 ++++++++++++++++ docs/pages/7 - Mill Internals.md | 357 ------------------------------- docs/pages/8 - Mill Internals.md | 357 +++++++++++++++++++++++++++++++ readme.md | 48 +++-- 13 files changed, 1327 insertions(+), 1225 deletions(-) delete mode 100644 docs/pages/3 - Tasks.md delete mode 100644 docs/pages/4 - Modules.md create mode 100644 docs/pages/4 - Tasks.md delete mode 100644 docs/pages/5 - Cross Builds.md create mode 100644 docs/pages/5 - Modules.md create mode 100644 docs/pages/6 - Cross Builds.md delete mode 100644 docs/pages/6 - Extending Mill.md create mode 100644 docs/pages/7 - Extending Mill.md delete mode 100644 docs/pages/7 - Mill Internals.md create mode 100644 docs/pages/8 - Mill Internals.md diff --git a/docs/pages/1 - Intro to Mill.md b/docs/pages/1 - Intro to Mill.md index bfbb3d9a..c11d9de1 100644 --- a/docs/pages/1 - Intro to Mill.md +++ b/docs/pages/1 - Intro to Mill.md @@ -1,4 +1,4 @@ -[Mill](https://github.com/lihaoyi/mill) is your shiny new Scala build tool! +[Mill](https://github.com/lihaoyi/mill) is your shiny new Java/Scala build tool! [Scared of SBT](http://www.lihaoyi.com/post/SowhatswrongwithSBT.html)? Melancholy over Maven? Grumbling about Gradle? Baffled by Bazel? Give Mill a try! 
@@ -33,6 +33,28 @@ Arch Linux has an [AUR package for mill](https://aur.archlinux.org/packages/mill pacaur -S mill ``` +### Windows + +To get started, download Mill from: https://github.com/lihaoyi/mill/releases/download/0.1.8/0.1.8, +and save it as `mill.bat`. + +Mill also works on a sh environment on Windows (e.g., +[MSYS2](https://www.msys2.org), +[Cygwin](https://www.cygwin.com), +[Git-Bash](https://gitforwindows.org), +[WSL](https://docs.microsoft.com/en-us/windows/wsl)); +to get started, follow the instructions in the [manual](#manual) section below. Note that: + +* In some environments (such as WSL), mill has to be run using interactive mode (`-i`) + +* Git-Bash: run the instruction in administrator mode instead of `sudo` + +* Cygwin: run the following after downloading mill: + + ```bash + sed -i '0,/-cp "\$0"/{s/-cp "\$0"/-cp `cygpath -w "\$0"`/}; 0,/-cp "\$0"/{s/-cp "\$0"/-cp `cygpath -w "\$0"`/}' /usr/local/bin/mill + ``` + ### Manual To get started, download Mill and install it into your system via the following @@ -54,6 +76,17 @@ questions or say hi! ## Getting Started +The simplest Mill build for a Java project looks as follows: + +```scala +// build.sc +import mill._, mill.scalalib._ + +object foo extends JavaModule { + +} +``` + The simplest Mill build for a Scala project looks as follows: ```scala @@ -66,13 +99,14 @@ object foo extends ScalaModule { } ``` -This would build a project laid out as follows: +Both of these would build a project laid out as follows: ``` build.sc foo/ src/ - Main.scala + FileA.java + FileB.scala resources/ ...
out/ @@ -100,7 +134,7 @@ $ mill foo.launcher # prepares a foo/launcher/dest/run you can ru $ mill foo.jar # bundle the classfiles into a jar $ mill foo.assembly # bundle classfiles and all dependencies into a jar - + $ mill -i foo.console # start a Scala console within your project (in interactive mode: "-i") $ mill -i foo.repl # start an Ammonite REPL within your project (in interactive mode: "-i") @@ -141,10 +175,20 @@ respective `out/foo/bar/` folder. ## Multiple Modules +### Java Example +```scala +// build.sc +import mill._, mill.scalalib._ +object foo extends JavaModule +object bar extends JavaModule { + def moduleDeps = Seq(foo) +} +``` + +### Scala Example +```scala +// build.sc +import mill._, mill.scalalib._ object foo extends ScalaModule { def scalaVersion = "2.12.4" } @@ -155,7 +199,7 @@ object bar extends ScalaModule { ``` You can define multiple modules the same way you define a single module, using -`def moduleDeps` to define the relationship between them. The above build +`def moduleDeps` to define the relationship between them. The above builds expect the following project layout: ``` diff --git a/docs/pages/3 - Common Project Layouts.md b/docs/pages/3 - Common Project Layouts.md index a0e3afbe..a53fb1b1 100644 --- a/docs/pages/3 - Common Project Layouts.md +++ b/docs/pages/3 - Common Project Layouts.md @@ -4,6 +4,50 @@ Earlier, we have shown how to work with the Mill default Scala module layout.
Here we will explore some other common project layouts that you may want in your Scala build: +### Java Project with Test Suite + +```scala +trait JUnitTests extends TestModule{ + def testFrameworks = Seq("com.novocode.junit.JUnitFramework") + def ivyDeps = Agg(ivy"com.novocode:junit-interface:0.11") +} + +object core extends JavaModule{ + object test extends Tests with JUnitTests +} +object app extends JavaModule{ + def moduleDeps = Seq(core) + object test extends Tests with JUnitTests +} +``` + +This build is a two-module Java project with JUnit test suites. It expects the +following filesystem layout: + +```text +build.sc +app/ + src/hello/ + Main.java + test/src/hello/ + MyAppTests.java +core/ + src/hello/ + Core.java + test/src/hello/ + MyCoreTests.java +``` + +You can then run the JUnit tests using `mill app.test` or `mill core.test`, and +configure which exact tests you want to run using the flags defined on the +[JUnit Test Interface](https://github.com/sbt/junit-interface#junit-interface). + +For a more complex, real-world example of a Java build, check out our +example build for the popular [Caffeine](https://github.com/ben-manes/caffeine) +project: + +- [Example Build](https://github.com/lihaoyi/mill/blob/master/integration/test/resources/caffeine/build.sc) + ### Cross Scala-Version Modules ```scala diff --git a/docs/pages/3 - Tasks.md b/docs/pages/3 - Tasks.md deleted file mode 100644 index 71974177..00000000 --- a/docs/pages/3 - Tasks.md +++ /dev/null @@ -1,341 +0,0 @@ -One of Mill's core abstractions is it's *Task Graph*: this is how Mill defines, -orders and caches work it needs to do, and exists independently of any support -for building Scala.
- -The following is a simple self-contained example using Mill to compile Java: - -```scala -import ammonite.ops._, mill._ - -// sourceRoot -> allSources -> classFiles -// | -// v -// resourceRoot ----> jar - -def sourceRoot = T.sources{ pwd / 'src } - -def resourceRoot = T.sources{ pwd / 'resources } - -def allSources = T{ sourceRoot().flatMap(p => ls.rec(p.path)).map(PathRef(_)) } - -def classFiles = T{ - mkdir(T.ctx().dest) - import ammonite.ops._ - %("javac", sources().map(_.path.toString()), "-d", T.ctx().dest)(wd = T.ctx().dest) - PathRef(T.ctx().dest) -} - -def jar = T{ Jvm.createJar(Loose.Agg(classFiles().path) ++ resourceRoot().map(_.path)) } - -def run(mainClsName: String) = T.command{ - %%('java, "-cp", classFiles().path, mainClsName) -} -``` - -Here, we have two `T.source`s, `sourceRoot` and `resourceRoot`, which act as the -roots of our task graph. `allSources` depends on `sourceRoot` by calling -`sourceRoot()` to extract it's value, `classFiles` depends on `allSources` the -same way, and `jar` depends on both `classFiles` and `resourceRoot`. - -Filesystem o1perations in Mill are done using the -[Ammonite-Ops](http://ammonite.io/#Ammonite-Ops) library. - -The above build defines the following task graph: - -``` -sourceRoot -> allSources -> classFiles - | - v - resourceRoot ----> jar -``` - -When you first evaluate `jar` (e.g. via `mill jar` at the command line), it will -evaluate all the defined targets: `sourceRoot`, `allSources`, `classFiles`, -`resourceRoot` and `jar`. 
- -Subsequent `mill jars` will evaluate only as much as is necessary, depending on -what input sources changed: - -- If the files in `sourceRoot` change, it will re-evaluate `allSources`, - compiling to `classFiles`, and building the `jar` - -- If the files in `resourceRoot` change, it will only re-evaluate `jar` and use - the cached output of `allSources` and `classFiles` - -Apart from the `foo()` call-sites which define what each targets depend on, the -code within each `T{...}` wrapper is arbirary Scala code that can compute an -arbitrary result from it's inputs. - -## Different Kinds of Tasks - -There are four primary kinds of *Tasks* that you should care about: - -- [Targets](#targets), defined using `T{...}` -- [Sources](#sources), defined using `T.source{...}` -- [Commands](#commands), defined using `T.command{...}` - -### Targets - -```scala -def allSources = T{ ls.rec(sourceRoot().path).map(PathRef(_)) } -``` - -`Target`s are defined using the `def foo = T{...}` syntax, and dependencies on -other targets are defined using `foo()` to extract the value from them. Apart -from the `foo()` calls, the `T{...}` block contains arbitrary code that does -some work and returns a result. - -Each target e.g. `classFiles` is assigned a path on disk as scratch space & to -store it's output files at `out/classFiles/dest/`, and it's returned metadata is -automatically JSON-serialized and stored at `out/classFiles/meta.json`. The -return-value of targets has to be JSON-serializable via -[uPickle](https://github.com/lihaoyi/upickle). - -If you want to return a file or a set of files as the result of a `Target`, -write them to disk within your `T.ctx().dest` available through the -[Task Context API](#task-context-api) and return a `PathRef` to the files you -wrote. - -If a target's inputs change but it's output does not, e.g. someone changes a -comment within the source files that doesn't affect the classfiles, then -downstream targets do not re-evaluate. 
This is determined using the `.hashCode` -of the Target's return value. For target's returning `ammonite.ops.Path`s that -reference files on disk, you can wrap the `Path` in a `PathRef` (shown above) -whose `.hashCode()` will include the hashes of all files on disk at time of -creation. - -The graph of inter-dependent targets is evaluated in topological order; that -means that the body of a target will not even begin to evaluate if one of it's -upstream dependencies has failed. This is unlike normal Scala functions: a plain -old function `foo` would evaluate halfway and then blow up if one of `foo`'s -dependencies throws an exception. - -Targets cannot take parameters and must be 0-argument `def`s defined directly -within a `Module` body - -### Sources - -```scala -def sourceRootPath = pwd / 'src - -def sourceRoots = T.sources{ sourceRootPath } -``` - -`Source`s are defined using `T.source{ ... }`, taking one-or-more -`ammonite.ops.Path`s as arguments. A `Source` is a subclass of -`Target[Seq[PathRef]]`: this means that it's build signature/`hashCode` depends -not just on the path it refers to (e.g. `foo/bar/baz`) but also the MD5 hash of -the filesystem tree under that path. - -`T.source` also has an overload which takes `Seq[PathRef]`, to let you -override-and-extend source lists the same way you would any other `T{...}` -definition: - -```scala -def additionalSources = T.sources{ pwd / 'additionalSources } -def sourceRoots = T.sources{ super.sourceRoots() ++ additionalSources() } -``` - -### Commands - -```scala -def run(mainClsName: String) = T.command{ - %%('java, "-cp", classFiles().path, mainClsName) -} -``` - -Defined using `T.command{ ... }` syntax, `Command`s can run arbitrary code, with -dependencies declared using the same `foo()` syntax (e.g. `classFiles()` above). -Commands can be parametrized, but their output is not cached, so they will -re-evaluate every time even if none of their inputs have changed. 
- -Like [Targets](#targets), a command only evaluates after all it's upstream -dependencies have completed, and will not begin to run if any upstream -dependency has failed. - -Commands are assigned the same scratch/output directory `out/run/dest/` as -Targets are, and it's returned metadata stored at the same `out/run/meta.json` -path for consumption by external tools. - -Commands can only be defined directly within a `Module` body. - -## Task Context API - -There are several APIs available to you within the body of a `T{...}` or -`T.command{...}` block to help your write the code implementing your Target or -Command: - -### mill.util.Ctx.Dest - -- `T.ctx().dest` -- `implicitly[mill.util.Ctx.Dest]` - -This is the unique `out/classFiles/dest/` path or `out/run/dest/` path that is -assigned to every Target or Command. It is cleared before your task runs, and -you can use it as a scratch space for temporary files or a place to put returned -artifacts. This is guaranteed to be unique for every `Target` or `Command`, so -you can be sure that you will not collide or interfere with anyone else writing -to those same paths. - -### mill.util.Ctx.Log - -- `T.ctx().log` -- `implicitly[mill.util.Ctx.Log]` - -This is the default logger provided for every task. While your task is running, -`System.out` and `System.in` are also redirected to this logger. The logs for a -task are streamed to standard out/error as you would expect, but each task's -specific output is also streamed to a log file on disk e.g. `out/run/log` or -`out/classFiles/log` for you to inspect later. - -### mill.util.Ctx.Env - -- `T.ctx().env` -- `implicitly[mill.util.Ctx.Env]` - -Mill keeps a long-lived JVM server to avoid paying the cost of recurrent -classloading. Because of this, running `System.getenv` in a task might not yield -up to date environment variables, since it will be initialised when the server -starts, rather than when the client executes. 
To circumvent this, mill's client -sends the environment variables to the server as it sees them, and the server -makes them available as a `Map[String, String]` via the `Ctx` API. - -If the intent is to always pull the latest environment values, the call should -be wrapped in an `Input` as such : - -```scala -def envVar = T.input { T.ctx().env.get("ENV_VAR") } -``` - -## Other Tasks - -- [Anonymous Tasks](#anonymous-tasks), defined using `T.task{...}` -- [Persistent Targets](#persistent-targets) -- [Inputs](#inputs) -- [Workers](#workers) - - -### Anonymous Tasks - -```scala -def foo(x: Int) = T.task{ ... x ... bar() ... } -``` - -You can define anonymous tasks using the `T.task{ ... }` syntax. These are not -runnable from the command-line, but can be used to share common code you find -yourself repeating in `Target`s and `Command`s. - -```scala -def downstreamTarget = T{ ... foo() ... } -def downstreamCommand = T.command{ ... foo() ... } -``` -Anonymous tasks's output does not need to be JSON-serializable, their output is -not cached, and they can be defined with or without arguments. Unlike -[Targets](#targets) or [Commands](#commands), anonymous tasks can be defined -anywhere and passed around any way you want, until you finally make use of them -within a downstream target or command. - -While an anonymous task `foo`'s own output is not cached, if it is used in a -downstream target `bar` and the upstream targets's `baz` `qux` haven't changed, -`bar`'s cached output will be used and `foo`'s evaluation will be skipped -altogether. - -### Persistent Targets -```scala -def foo = T.persistent{ ... } -``` - -Identical to [Targets](#targets), except that the `dest/` directory is not -cleared in between runs. 
- -This is useful if you are running external incremental-compilers, such as -Scala's [Zinc](https://github.com/sbt/zinc), Javascript's -[WebPack](https://webpack.js.org/), which rely on filesystem caches to speed up -incremental execution of their particular build step. - -Since Mill no longer forces a "clean slate" re-evaluation of `T.persistent` -targets, it is up to you to ensure your code (or the third-party incremental -compilers you rely on!) are deterministic. They should always converge to the -same outputs for a given set of inputs, regardless of what builds and what -filesystem states existed before. - -### Inputs - -```scala -def foo = T.input{ ... } -``` - -A generalization of [Sources](#sources), `T.input`s are tasks that re-evaluate -*every time* (Unlike [Anonymous Tasks](#anonymous-tasks)), containing an -arbitrary block of code. - -Inputs can be used to force re-evaluation of some external property that may -affect your build. For example, if I have a [Target](#targets) `bar` that makes -use of the current git version: - -```scala -def bar = T{ ... %%("git", "rev-parse", "HEAD").out.string ... } -``` - -`bar` will not know that `git rev-parse` can change, and will -not know to re-evaluate when your `git rev-parse HEAD` *does* change. This means -`bar` will continue to use any previously cached value, and `bar`'s output will -be out of date! - -To fix this, you can wrap your `git rev-parse HEAD` in a `T.input`: - -```scala -def foo = T.input{ %%("git", "rev-parse", "HEAD").out.string } -def bar = T{ ... foo() ... } -``` - -This makes `foo` will always re-evaluate every build; if `git rev-parse HEAD` -does not change, that will not invalidate `bar`'s caches. But if `git rev-parse -HEAD` *does* change, `foo`'s output will change and `bar` will be correctly -invalidated and re-compute using the new version of `foo`. - -Note that because `T.input`s re-evaluate every time, you should ensure that the -code you put in `T.input` runs quickly. 
Ideally it should just be a simple check -"did anything change?" and any heavy-lifting can be delegated to downstream -targets. - -### Workers - -```scala -def foo = T.worker{ ... } -``` - -Most tasks dispose of their in-memory return-value every evaluation; in the case -of [Targets](#targets), this is stored on disk and loaded next time if -necessary, while [Commands](#commands) just re-compute them each time. Even if -you use `--watch` or the Build REPL to keep the Mill process running, all this -state is still discarded and re-built every evaluation. - -Workers are unique in that they store their in-memory return-value between -evaluations. This makes them useful for storing in-memory caches or references -to long-lived external worker processes that you can re-use. - -Mill uses workers to managed long-lived instances of the -[Zinc Incremental Scala Compiler](https://github.com/sbt/zinc) and the -[Scala.js Optimizer](https://github.com/scala-js/scala-js). This lets us keep -them in-memory with warm caches and fast incremental execution. - -Like [Persistent Targets](#persistent-targets), Workers inherently involve -mutable state, and it is up to the implementation to ensure that this mutable -state is only used for caching/performance and does not affect the -externally-visible behavior of the worker. 
- -## Cheat Sheet - -The following table might help you make sense of the small collection of -different Task types: - -| | Target | Command | Source/Input | Anonymous Task | Persistent Target | Worker | -|:-------------------------------|:-------|:--------|:-------------|:---------------|:------------------|:-------| -| Cached on Disk | X | X | | | X | | -| Must be JSON Writable | X | X | | | X | | -| Must be JSON Readable | X | | | | X | | -| Runnable from the Command Line | X | X | | | X | | -| Can Take Arguments | | X | | X | | | -| Cached between Evaluations | | | | | | X | - diff --git a/docs/pages/4 - Modules.md b/docs/pages/4 - Modules.md deleted file mode 100644 index c8d7378c..00000000 --- a/docs/pages/4 - Modules.md +++ /dev/null @@ -1,158 +0,0 @@ -Mill modules are `object`s extending `mill.Module`, and let you group related -tasks together to keep things neat and organized. Mill's comes with built in -modules such as `mill.scalalib.ScalaModule` and `mill.scalalib.CrossSbtModule`, -but you can use modules for other purposes as well. - -## Using Modules - -The path to a Mill module from the root of your build file corresponds to the -path you would use to run tasks within that module from the command line. e.g. -for the following build: - -```scala -object foo extends mill.Module{ - def bar = T{ "hello" } - object baz extends mill.Module{ - def qux = T{ "world" } - } -} -``` - -You would be able to run the two targets via `mill foo.bar` or `mill -foo.baz.qux`. You can use `mill show foo.bar` or `mill show foo.baz.qux` to -make Mill echo out the string value being returned by each Target. The two -targets will store their output metadata & files at `./out/foo/bar` and -`./out/foo/baz/qux` respectively. - -Modules also provide a way to define and re-use common collections of tasks, via -Scala `trait`s. 
For example, you can define your own `FooModule` trait: - -```scala -trait FooModule extends mill.Module{ - def bar = T{ "hello" } - def baz = T{ "world" } -} -``` - -And use it to define multiple modules with the same `bar` and `baz` targets, -along with any other customizations such as `qux`: - -```scala -object foo1 extends FooModule -object foo2 extends FooModule{ - def qux = T{ "I am Cow" } -} -``` - -This would make the following targets available from the command line - -- `mill show foo1.bar` -- `mill show foo1.baz` -- `mill show foo2.bar` -- `mill show foo2.baz` -- `mill show foo2.qux` - -The built in `mill.scalalib` package uses this to define -`mill.scalalib.ScalaModule`, `mill.scalalib.SbtModule` and -`mill.scalalib.TestScalaModule`, all of which contain a set of "standard" -operations such as `compile` `jar` or `assembly` that you may expect from a -typical Scala module. - -## Overriding Targets - -```scala -trait BaseModule extends Module { - def foo = T{ Seq("base") } - def cmd(i: Int) = T.command{ Seq("base" + i) } -} - -object canOverrideSuper with BaseModule { - def foo = T{ super.foo() ++ Seq("object") } - def cmd(i: Int) = T.command{ super.cmd(i)() ++ Seq("object" + i) } -} -``` - -You can override targets and commands to customize them or change what they do. -The overriden version is available via `super`. You can omit the `override` -keyword in Mill builds. - -## millSourcePath - -Each Module has a `millSourcePath` field that corresponds to the path that module -expects it's input files to be on disk. Re-visiting our examples above: - -```scala -object foo extends mill.Module{ - def bar = T{ "hello" } - object baz extends mill.Module{ - def qux = T{ "world" } - } -} -``` - -The `foo` module has a `millSourcePath` of `./foo`, while the `foo.baz` module has a -`millSourcePath` of `./foo/baz`. - -You can use `millSourcePath` to automatically set the source directories of your -modules to match the build structure. 
You are not forced to rigidly use -`millSourcePath` to define the source folders of all your code, but it can simplify -the common case where you probably want your build-layout on on-disk-layout to -be the same. - -e.g. for `mill.scalalib.ScalaModule`, the Scala source code is assumed by -default to be in `millSourcePath/"src"` while resources are automatically assumed to -be in `millSourcePath/"resources"`. - -You can override `millSourcePath`: - -```scala -object foo extends mill.Module{ - def millSourcePath = super.millSourcePath / "lols" - def bar = T{ "hello" } - object baz extends mill.Module{ - def qux = T{ "world" } - } -} -``` - -And any overrides propagate down to the module's children: in the above example, -module `foo` would have it's `millSourcePath` be `./foo/lols` while module` foo.baz` -would have it's `millSourcePath` be `./foo/lols/baz`. - -Note that `millSourcePath` is generally only used for a module's input source files. -Output is always in the `out/` folder and cannot be changed, e.g. even with the -overriden `millSourcePath` the output paths are still the default `./out/foo/bar` and -`./out/foo/baz/qux` folders. - -## External Modules - -Libraries for use in Mill can define `ExternalModule`s: `Module`s which are -shared between all builds which use that library: - -```scala -package foo -import mill._ - -object Bar extends mill.define.ExternalModule { - def baz = T{ 1 } - def qux() = T.command{ println(baz() + 1) } - - lazy val millDiscover = mill.define.Discover[this.type] -} -``` - -In the above example, `foo.Bar` is an `ExternalModule` living within the `foo` -Java package, containing the `baz` target and `qux` command. 
Those can be run -from the command line via: - -```bash -mill foo.Bar/baz -mill foo.Bar/qux -``` - -`ExternalModule`s are useful for someone providing a library for use with Mill -that is shared by the entire build: for example, -`mill.scalalib.ScalaWorkerApi/scalaWorker` provides a shared Scala compilation -service & cache that is shared between all `ScalaModule`s, and -`mill.scalalib.GenIdeaModule/idea` lets you generate IntelliJ projects without -needing to define your own `T.command` in your `build.sc` file \ No newline at end of file diff --git a/docs/pages/4 - Tasks.md b/docs/pages/4 - Tasks.md new file mode 100644 index 00000000..71974177 --- /dev/null +++ b/docs/pages/4 - Tasks.md @@ -0,0 +1,341 @@ +One of Mill's core abstractions is its *Task Graph*: this is how Mill defines, +orders and caches work it needs to do, and exists independently of any support +for building Scala. + +The following is a simple self-contained example using Mill to compile Java: + +```scala +import ammonite.ops._, mill._ + +// sourceRoot -> allSources -> classFiles +// | +// v +// resourceRoot ----> jar + +def sourceRoot = T.sources{ pwd / 'src } + +def resourceRoot = T.sources{ pwd / 'resources } + +def allSources = T{ sourceRoot().flatMap(p => ls.rec(p.path)).map(PathRef(_)) } + +def classFiles = T{ + mkdir(T.ctx().dest) + import ammonite.ops._ + %("javac", allSources().map(_.path.toString()), "-d", T.ctx().dest)(wd = T.ctx().dest) + PathRef(T.ctx().dest) +} + +def jar = T{ Jvm.createJar(Loose.Agg(classFiles().path) ++ resourceRoot().map(_.path)) } + +def run(mainClsName: String) = T.command{ + %%('java, "-cp", classFiles().path, mainClsName) +} +``` + +Here, we have two `T.source`s, `sourceRoot` and `resourceRoot`, which act as the +roots of our task graph. `allSources` depends on `sourceRoot` by calling +`sourceRoot()` to extract its value, `classFiles` depends on `allSources` the +same way, and `jar` depends on both `classFiles` and `resourceRoot`.
+ +Filesystem operations in Mill are done using the +[Ammonite-Ops](http://ammonite.io/#Ammonite-Ops) library. + +The above build defines the following task graph: + +``` +sourceRoot -> allSources -> classFiles + | + v + resourceRoot ----> jar +``` + +When you first evaluate `jar` (e.g. via `mill jar` at the command line), it will +evaluate all the defined targets: `sourceRoot`, `allSources`, `classFiles`, +`resourceRoot` and `jar`. + +Subsequent `mill jar` invocations will evaluate only as much as is necessary, depending on +what input sources changed: + +- If the files in `sourceRoot` change, it will re-evaluate `allSources`, + compiling to `classFiles`, and building the `jar` + +- If the files in `resourceRoot` change, it will only re-evaluate `jar` and use + the cached output of `allSources` and `classFiles` + +Apart from the `foo()` call-sites which define what each target depends on, the +code within each `T{...}` wrapper is arbitrary Scala code that can compute an +arbitrary result from its inputs. + +## Different Kinds of Tasks + +There are three primary kinds of *Tasks* that you should care about: + +- [Targets](#targets), defined using `T{...}` +- [Sources](#sources), defined using `T.source{...}` +- [Commands](#commands), defined using `T.command{...}` + +### Targets + +```scala +def allSources = T{ ls.rec(sourceRoot().path).map(PathRef(_)) } +``` + +`Target`s are defined using the `def foo = T{...}` syntax, and dependencies on +other targets are defined using `foo()` to extract the value from them. Apart +from the `foo()` calls, the `T{...}` block contains arbitrary code that does +some work and returns a result. + +Each target e.g. `classFiles` is assigned a path on disk as scratch space & to +store its output files at `out/classFiles/dest/`, and its returned metadata is +automatically JSON-serialized and stored at `out/classFiles/meta.json`. The +return-value of targets has to be JSON-serializable via +[uPickle](https://github.com/lihaoyi/upickle).
+ +If you want to return a file or a set of files as the result of a `Target`, +write them to disk within your `T.ctx().dest` available through the +[Task Context API](#task-context-api) and return a `PathRef` to the files you +wrote. + +If a target's inputs change but its output does not, e.g. someone changes a +comment within the source files that doesn't affect the classfiles, then +downstream targets do not re-evaluate. This is determined using the `.hashCode` +of the Target's return value. For targets returning `ammonite.ops.Path`s that +reference files on disk, you can wrap the `Path` in a `PathRef` (shown above) +whose `.hashCode()` will include the hashes of all files on disk at time of +creation. + +The graph of inter-dependent targets is evaluated in topological order; that +means that the body of a target will not even begin to evaluate if one of its +upstream dependencies has failed. This is unlike normal Scala functions: a plain +old function `foo` would evaluate halfway and then blow up if one of `foo`'s +dependencies throws an exception. + +Targets cannot take parameters and must be 0-argument `def`s defined directly +within a `Module` body. + +### Sources + +```scala +def sourceRootPath = pwd / 'src + +def sourceRoots = T.sources{ sourceRootPath } +``` + +`Source`s are defined using `T.source{ ... }`, taking one-or-more +`ammonite.ops.Path`s as arguments. A `Source` is a subclass of +`Target[Seq[PathRef]]`: this means that its build signature/`hashCode` depends +not just on the path it refers to (e.g. `foo/bar/baz`) but also the MD5 hash of +the filesystem tree under that path.
+ +`T.source` also has an overload which takes `Seq[PathRef]`, to let you +override-and-extend source lists the same way you would any other `T{...}` +definition: + +```scala +def additionalSources = T.sources{ pwd / 'additionalSources } +def sourceRoots = T.sources{ super.sourceRoots() ++ additionalSources() } +``` + +### Commands + +```scala +def run(mainClsName: String) = T.command{ + %%('java, "-cp", classFiles().path, mainClsName) +} +``` + +Defined using `T.command{ ... }` syntax, `Command`s can run arbitrary code, with +dependencies declared using the same `foo()` syntax (e.g. `classFiles()` above). +Commands can be parametrized, but their output is not cached, so they will +re-evaluate every time even if none of their inputs have changed. + +Like [Targets](#targets), a command only evaluates after all its upstream +dependencies have completed, and will not begin to run if any upstream +dependency has failed. + +Commands are assigned the same scratch/output directory `out/run/dest/` as +Targets are, and their returned metadata is stored at the same `out/run/meta.json` +path for consumption by external tools. + +Commands can only be defined directly within a `Module` body. + +## Task Context API + +There are several APIs available to you within the body of a `T{...}` or +`T.command{...}` block to help you write the code implementing your Target or +Command: + +### mill.util.Ctx.Dest + +- `T.ctx().dest` +- `implicitly[mill.util.Ctx.Dest]` + +This is the unique `out/classFiles/dest/` path or `out/run/dest/` path that is +assigned to every Target or Command. It is cleared before your task runs, and +you can use it as a scratch space for temporary files or a place to put returned +artifacts. This is guaranteed to be unique for every `Target` or `Command`, so +you can be sure that you will not collide or interfere with anyone else writing +to those same paths.
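+ +As a quick sketch of how this fits together (the target name `generated` and the file it writes are invented for illustration, following the `classFiles` pattern above), a target can write into its scratch space and return a `PathRef`: + +```scala +def generated = T{ + // T.ctx().dest is unique to this target, e.g. out/generated/dest/ + write(T.ctx().dest / "items.txt", "a\nb\nc") // `write` comes from ammonite.ops + PathRef(T.ctx().dest / "items.txt") +} +``` + +Returning a `PathRef` rather than a bare `Path` means downstream targets are invalidated when the file's contents change, not just its location.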
+
+### mill.util.Ctx.Log
+
+- `T.ctx().log`
+- `implicitly[mill.util.Ctx.Log]`
+
+This is the default logger provided for every task. While your task is running,
+`System.out` and `System.err` are also redirected to this logger. The logs for a
+task are streamed to standard out/error as you would expect, but each task's
+specific output is also streamed to a log file on disk e.g. `out/run/log` or
+`out/classFiles/log` for you to inspect later.
+
+### mill.util.Ctx.Env
+
+- `T.ctx().env`
+- `implicitly[mill.util.Ctx.Env]`
+
+Mill keeps a long-lived JVM server to avoid paying the cost of recurrent
+classloading. Because of this, running `System.getenv` in a task might not yield
+up-to-date environment variables, since it will be initialised when the server
+starts, rather than when the client executes. To circumvent this, mill's client
+sends the environment variables to the server as it sees them, and the server
+makes them available as a `Map[String, String]` via the `Ctx` API.
+
+If the intent is to always pull the latest environment values, the call should
+be wrapped in an `Input` as such:
+
+```scala
+def envVar = T.input { T.ctx().env.get("ENV_VAR") }
+```
+
+## Other Tasks
+
+- [Anonymous Tasks](#anonymous-tasks), defined using `T.task{...}`
+- [Persistent Targets](#persistent-targets)
+- [Inputs](#inputs)
+- [Workers](#workers)
+
+
+### Anonymous Tasks
+
+```scala
+def foo(x: Int) = T.task{ ... x ... bar() ... }
+```
+
+You can define anonymous tasks using the `T.task{ ... }` syntax. These are not
+runnable from the command-line, but can be used to share common code you find
+yourself repeating in `Target`s and `Command`s.
+
+```scala
+def downstreamTarget = T{ ... foo() ... }
+def downstreamCommand = T.command{ ... foo() ... }
+```
+
+An anonymous task's output does not need to be JSON-serializable, its output is
+not cached, and it can be defined with or without arguments.
Unlike
+[Targets](#targets) or [Commands](#commands), anonymous tasks can be defined
+anywhere and passed around any way you want, until you finally make use of them
+within a downstream target or command.
+
+While an anonymous task `foo`'s own output is not cached, if it is used in a
+downstream target `bar` and the upstream targets `baz` and `qux` haven't
+changed, `bar`'s cached output will be used and `foo`'s evaluation will be
+skipped altogether.
+
+### Persistent Targets
+
+```scala
+def foo = T.persistent{ ... }
+```
+
+Identical to [Targets](#targets), except that the `dest/` directory is not
+cleared in between runs.
+
+This is useful if you are running external incremental-compilers, such as
+Scala's [Zinc](https://github.com/sbt/zinc) or Javascript's
+[WebPack](https://webpack.js.org/), which rely on filesystem caches to speed up
+incremental execution of their particular build step.
+
+Since Mill no longer forces a "clean slate" re-evaluation of `T.persistent`
+targets, it is up to you to ensure your code (or the third-party incremental
+compilers you rely on!) is deterministic: it should always converge to the
+same outputs for a given set of inputs, regardless of what builds or what
+filesystem states existed before.
+
+### Inputs
+
+```scala
+def foo = T.input{ ... }
+```
+
+A generalization of [Sources](#sources), `T.input`s are tasks that re-evaluate
+*every time* (unlike [Anonymous Tasks](#anonymous-tasks)), containing an
+arbitrary block of code.
+
+Inputs can be used to force re-evaluation of some external property that may
+affect your build. For example, if I have a [Target](#targets) `bar` that makes
+use of the current git version:
+
+```scala
+def bar = T{ ... %%("git", "rev-parse", "HEAD").out.string ... }
+```
+
+`bar` will not know that `git rev-parse` can change, and will
+not know to re-evaluate when your `git rev-parse HEAD` *does* change.
This means
+`bar` will continue to use any previously cached value, and `bar`'s output will
+be out of date!
+
+To fix this, you can wrap your `git rev-parse HEAD` in a `T.input`:
+
+```scala
+def foo = T.input{ %%("git", "rev-parse", "HEAD").out.string }
+def bar = T{ ... foo() ... }
+```
+
+This makes `foo` re-evaluate on every build; if `git rev-parse HEAD`
+does not change, that will not invalidate `bar`'s caches. But if `git rev-parse
+HEAD` *does* change, `foo`'s output will change and `bar` will be correctly
+invalidated and re-computed using the new version of `foo`.
+
+Note that because `T.input`s re-evaluate every time, you should ensure that the
+code you put in `T.input` runs quickly. Ideally it should just be a simple check
+"did anything change?" and any heavy-lifting can be delegated to downstream
+targets.
+
+### Workers
+
+```scala
+def foo = T.worker{ ... }
+```
+
+Most tasks dispose of their in-memory return-value every evaluation; in the case
+of [Targets](#targets), this is stored on disk and loaded next time if
+necessary, while [Commands](#commands) just re-compute them each time. Even if
+you use `--watch` or the Build REPL to keep the Mill process running, all this
+state is still discarded and re-built every evaluation.
+
+Workers are unique in that they store their in-memory return-value between
+evaluations. This makes them useful for storing in-memory caches or references
+to long-lived external worker processes that you can re-use.
+
+Mill uses workers to manage long-lived instances of the
+[Zinc Incremental Scala Compiler](https://github.com/sbt/zinc) and the
+[Scala.js Optimizer](https://github.com/scala-js/scala-js). This lets us keep
+them in-memory with warm caches and fast incremental execution.
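+
+For illustration, here is a minimal sketch (this exact worker is not part of
+Mill; the names `hashCache` and `hashed` are made up) of a worker that keeps an
+in-memory cache alive between evaluations:
+
+```scala
+def hashCache = T.worker{
+  // this map lives for as long as the Mill process does, so values
+  // computed in one evaluation can be re-used in the next
+  new java.util.concurrent.ConcurrentHashMap[String, Int]()
+}
+
+def hashed = T{
+  // downstream tasks receive the same map instance every evaluation
+  hashCache().computeIfAbsent("key", k => k.hashCode)
+}
+```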
+ +Like [Persistent Targets](#persistent-targets), Workers inherently involve +mutable state, and it is up to the implementation to ensure that this mutable +state is only used for caching/performance and does not affect the +externally-visible behavior of the worker. + +## Cheat Sheet + +The following table might help you make sense of the small collection of +different Task types: + +| | Target | Command | Source/Input | Anonymous Task | Persistent Target | Worker | +|:-------------------------------|:-------|:--------|:-------------|:---------------|:------------------|:-------| +| Cached on Disk | X | X | | | X | | +| Must be JSON Writable | X | X | | | X | | +| Must be JSON Readable | X | | | | X | | +| Runnable from the Command Line | X | X | | | X | | +| Can Take Arguments | | X | | X | | | +| Cached between Evaluations | | | | | | X | + diff --git a/docs/pages/5 - Cross Builds.md b/docs/pages/5 - Cross Builds.md deleted file mode 100644 index f92678d5..00000000 --- a/docs/pages/5 - Cross Builds.md +++ /dev/null @@ -1,162 +0,0 @@ -Mill handles cross-building of all sorts via the `Cross[T]` module. - - -## Defining Cross Modules - -You can use this as follows: - -```scala -object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12") -class FooModule(crossVersion: String) extends Module{ - def suffix = T{ crossVersion } - def bigSuffix = T{ suffix().toUpperCase() } -} -``` - -This defines three copies of `FooModule`: `"210"`, `"211"` and `"212"`, each of -which has their own `suffix` target. 
You can then run them via - -```bash -mill show foo[2.10].suffix -mill show foo[2.10].bigSuffix -mill show foo[2.11].suffix -mill show foo[2.11].bigSuffix -mill show foo[2.12].suffix -mill show foo[2.12].bigSuffix -``` - -The modules each also have a `millSourcePath` of - -```text -foo/2.10 -foo/2.11 -foo/2.12 -``` - -And the `suffix` targets will have the corresponding output paths for their -metadata and files: - -```text -foo/2.10/suffix -foo/2.10/bigSuffix -foo/2.11/suffix -foo/2.11/bigSuffix -foo/2.12/suffix -foo/2.12/bigSuffix -``` - -You can also have a cross-build with multiple inputs: - -```scala -val crossMatrix = for{ - crossVersion <- Seq("210", "211", "212") - platform <- Seq("jvm", "js", "native") - if !(platform == "native" && crossVersion != "212") -} yield (crossVersion, platform) - -object foo extends mill.Cross[FooModule](crossMatrix:_*) -class FooModule(crossVersion: String, platform: String) extends Module{ - def suffix = T{ crossVersion + "_" + platform } -} -``` - -Here, we define our cross-values programmatically using a `for`-loop that spits -out tuples instead of individual values. Our `FooModule` template class then -takes two parameters instead of one. This creates the following modules each -with their own `suffix` target: - -```bash -mill show foo[210,jvm].suffix -mill show foo[211,jvm].suffix -mill show foo[212,jvm].suffix -mill show foo[210,js].suffix -mill show foo[211,js].suffix -mill show foo[212,js].suffix -mill show foo[212,native].suffix -``` - -## Using Cross Modules from Outside - -You can refer to targets defined in cross-modules as follows: - -```scala -object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12") -class FooModule(crossVersion: String) extends Module{ - def suffix = T{ crossVersion } -} - -def bar = T{ "hello " + foo("2.10").suffix } -``` - -Here, `foo("2.10")` references the `"2.10"` instance of `FooModule`. 
You can -refer to whatever versions of the cross-module you want, even using multiple -versions of the cross-module in the same target: - -```scala -object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12") -class FooModule(crossVersion: String) extends Module{ - def suffix = T{ crossVersion } -} - -def bar = T{ "hello " + foo("2.10").suffix + " world " + foo("2.12").suffix } -``` - -## Using Cross Modules from other Cross Modules - -Targets in cross-modules can depend on one another the same way that external -targets: - -```scala -object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12") -class FooModule(crossVersion: String) extends Module{ - def suffix = T{ crossVersion } -} - -object bar extends mill.Cross[BarModule]("2.10", "2.11", "2.12") -class BarModule(crossVersion: String) extends Module{ - def bigSuffix = T{ foo(crossVersion).suffix().toUpperCase() } -} -``` - -Here, you can run: - -```bash -mill show foo[2.10].suffix -mill show foo[2.11].suffix -mill show foo[2.12].suffix -mill show bar[2.10].bigSuffix -mill show bar[2.11].bigSuffix -mill show bar[2.12].bigSuffix -``` - - -## Cross Resolvers - -You can define an implicit `mill.define.Cross.Resolve` within your -cross-modules, which would let you use a shorthand `foo()` syntax when referring -to other cross-modules with an identical set of cross values: - -```scala -trait MyModule extends Module{ - def crossVersion: String - implicit object resolver extends mill.define.Cross.Resolver[MyModule]{ - def resolve[V <: MyModule](c: Cross[V]): V = c.itemMap(List(crossVersion)) - } -} - -object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12") -class FooModule(val crossVersion: String) extends MyModule{ - def suffix = T{ crossVersion } -} - -object bar extends mill.Cross[BarModule]("2.10", "2.11", "2.12") -class BarModule(val crossVersion: String) extends MyModule{ - def longSuffix = T{ "_" + foo().suffix() } -} -``` - -While the example `resolver` simply looks up the target `Cross` value 
for the
-cross-module instance with the same `crossVersion`, you can make the resolver
-arbitrarily complex. e.g. the `resolver` for `mill.scalalib.CrossSbtModule`
-looks for a cross-module instance whose `scalaVersion` is binary compatible
-(e.g. 2.10.5 is compatible with 2.10.3) with the current cross-module.
\ No newline at end of file
diff --git a/docs/pages/5 - Modules.md b/docs/pages/5 - Modules.md
new file mode 100644
index 00000000..c8d7378c
--- /dev/null
+++ b/docs/pages/5 - Modules.md
@@ -0,0 +1,158 @@
+Mill modules are `object`s extending `mill.Module`, and let you group related
+tasks together to keep things neat and organized. Mill comes with built-in
+modules such as `mill.scalalib.ScalaModule` and `mill.scalalib.CrossSbtModule`,
+but you can use modules for other purposes as well.
+
+## Using Modules
+
+The path to a Mill module from the root of your build file corresponds to the
+path you would use to run tasks within that module from the command line. e.g.
+for the following build:
+
+```scala
+object foo extends mill.Module{
+  def bar = T{ "hello" }
+  object baz extends mill.Module{
+    def qux = T{ "world" }
+  }
+}
+```
+
+You would be able to run the two targets via `mill foo.bar` or `mill
+foo.baz.qux`. You can use `mill show foo.bar` or `mill show foo.baz.qux` to
+make Mill echo out the string value being returned by each Target. The two
+targets will store their output metadata & files at `./out/foo/bar` and
+`./out/foo/baz/qux` respectively.
+
+Modules also provide a way to define and re-use common collections of tasks, via
+Scala `trait`s.
For example, you can define your own `FooModule` trait:
+
+```scala
+trait FooModule extends mill.Module{
+  def bar = T{ "hello" }
+  def baz = T{ "world" }
+}
+```
+
+And use it to define multiple modules with the same `bar` and `baz` targets,
+along with any other customizations such as `qux`:
+
+```scala
+object foo1 extends FooModule
+object foo2 extends FooModule{
+  def qux = T{ "I am Cow" }
+}
+```
+
+This would make the following targets available from the command line:
+
+- `mill show foo1.bar`
+- `mill show foo1.baz`
+- `mill show foo2.bar`
+- `mill show foo2.baz`
+- `mill show foo2.qux`
+
+The built-in `mill.scalalib` package uses this to define
+`mill.scalalib.ScalaModule`, `mill.scalalib.SbtModule` and
+`mill.scalalib.TestScalaModule`, all of which contain a set of "standard"
+operations such as `compile`, `jar` or `assembly` that you may expect from a
+typical Scala module.
+
+## Overriding Targets
+
+```scala
+trait BaseModule extends Module {
+  def foo = T{ Seq("base") }
+  def cmd(i: Int) = T.command{ Seq("base" + i) }
+}
+
+object canOverrideSuper extends BaseModule {
+  def foo = T{ super.foo() ++ Seq("object") }
+  def cmd(i: Int) = T.command{ super.cmd(i)() ++ Seq("object" + i) }
+}
+```
+
+You can override targets and commands to customize them or change what they do.
+The overridden version is available via `super`. You can omit the `override`
+keyword in Mill builds.
+
+## millSourcePath
+
+Each Module has a `millSourcePath` field that corresponds to the path where that
+module expects its input files to be on disk. Re-visiting our examples above:
+
+```scala
+object foo extends mill.Module{
+  def bar = T{ "hello" }
+  object baz extends mill.Module{
+    def qux = T{ "world" }
+  }
+}
+```
+
+The `foo` module has a `millSourcePath` of `./foo`, while the `foo.baz` module
+has a `millSourcePath` of `./foo/baz`.
+
+You can use `millSourcePath` to automatically set the source directories of your
+modules to match the build structure.
You are not forced to rigidly use
+`millSourcePath` to define the source folders of all your code, but it can
+simplify the common case where you probably want your build-layout and
+on-disk-layout to be the same.
+
+e.g. for `mill.scalalib.ScalaModule`, the Scala source code is assumed by
+default to be in `millSourcePath/"src"` while resources are automatically
+assumed to be in `millSourcePath/"resources"`.
+
+You can override `millSourcePath`:
+
+```scala
+object foo extends mill.Module{
+  def millSourcePath = super.millSourcePath / "lols"
+  def bar = T{ "hello" }
+  object baz extends mill.Module{
+    def qux = T{ "world" }
+  }
+}
+```
+
+And any overrides propagate down to the module's children: in the above example,
+module `foo` would have its `millSourcePath` be `./foo/lols` while module
+`foo.baz` would have its `millSourcePath` be `./foo/lols/baz`.
+
+Note that `millSourcePath` is generally only used for a module's input source
+files. Output is always in the `out/` folder and cannot be changed, e.g. even
+with the overridden `millSourcePath` the output paths are still the default
+`./out/foo/bar` and `./out/foo/baz/qux` folders.
+
+## External Modules
+
+Libraries for use in Mill can define `ExternalModule`s: `Module`s which are
+shared between all builds which use that library:
+
+```scala
+package foo
+import mill._
+
+object Bar extends mill.define.ExternalModule {
+  def baz = T{ 1 }
+  def qux() = T.command{ println(baz() + 1) }
+
+  lazy val millDiscover = mill.define.Discover[this.type]
+}
+```
+
+In the above example, `foo.Bar` is an `ExternalModule` living within the `foo`
+Java package, containing the `baz` target and `qux` command.
Those can be run
+from the command line via:
+
+```bash
+mill foo.Bar/baz
+mill foo.Bar/qux
+```
+
+`ExternalModule`s are useful for someone providing a library for use with Mill
+that is shared by the entire build: for example,
+`mill.scalalib.ScalaWorkerApi/scalaWorker` provides a Scala compilation
+service & cache that is shared between all `ScalaModule`s, and
+`mill.scalalib.GenIdeaModule/idea` lets you generate IntelliJ projects without
+needing to define your own `T.command` in your `build.sc` file.
\ No newline at end of file
diff --git a/docs/pages/6 - Cross Builds.md b/docs/pages/6 - Cross Builds.md
new file mode 100644
index 00000000..f92678d5
--- /dev/null
+++ b/docs/pages/6 - Cross Builds.md
@@ -0,0 +1,162 @@
+Mill handles cross-building of all sorts via the `Cross[T]` module.
+
+
+## Defining Cross Modules
+
+You can use this as follows:
+
+```scala
+object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12")
+class FooModule(crossVersion: String) extends Module{
+  def suffix = T{ crossVersion }
+  def bigSuffix = T{ suffix().toUpperCase() }
+}
+```
+
+This defines three copies of `FooModule`: `"2.10"`, `"2.11"` and `"2.12"`, each
+of which has its own `suffix` target.
You can then run them via
+
+```bash
+mill show foo[2.10].suffix
+mill show foo[2.10].bigSuffix
+mill show foo[2.11].suffix
+mill show foo[2.11].bigSuffix
+mill show foo[2.12].suffix
+mill show foo[2.12].bigSuffix
+```
+
+The modules each also have a `millSourcePath` of
+
+```text
+foo/2.10
+foo/2.11
+foo/2.12
+```
+
+And the `suffix` targets will have the corresponding output paths for their
+metadata and files:
+
+```text
+foo/2.10/suffix
+foo/2.10/bigSuffix
+foo/2.11/suffix
+foo/2.11/bigSuffix
+foo/2.12/suffix
+foo/2.12/bigSuffix
+```
+
+You can also have a cross-build with multiple inputs:
+
+```scala
+val crossMatrix = for{
+  crossVersion <- Seq("210", "211", "212")
+  platform <- Seq("jvm", "js", "native")
+  if !(platform == "native" && crossVersion != "212")
+} yield (crossVersion, platform)
+
+object foo extends mill.Cross[FooModule](crossMatrix:_*)
+class FooModule(crossVersion: String, platform: String) extends Module{
+  def suffix = T{ crossVersion + "_" + platform }
+}
+```
+
+Here, we define our cross-values programmatically using a `for`-loop that spits
+out tuples instead of individual values. Our `FooModule` template class then
+takes two parameters instead of one. This creates the following modules, each
+with their own `suffix` target:
+
+```bash
+mill show foo[210,jvm].suffix
+mill show foo[211,jvm].suffix
+mill show foo[212,jvm].suffix
+mill show foo[210,js].suffix
+mill show foo[211,js].suffix
+mill show foo[212,js].suffix
+mill show foo[212,native].suffix
+```
+
+## Using Cross Modules from Outside
+
+You can refer to targets defined in cross-modules as follows:
+
+```scala
+object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12")
+class FooModule(crossVersion: String) extends Module{
+  def suffix = T{ crossVersion }
+}
+
+def bar = T{ "hello " + foo("2.10").suffix() }
+```
+
+Here, `foo("2.10")` references the `"2.10"` instance of `FooModule`.
You can
+refer to whatever versions of the cross-module you want, even using multiple
+versions of the cross-module in the same target:
+
+```scala
+object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12")
+class FooModule(crossVersion: String) extends Module{
+  def suffix = T{ crossVersion }
+}
+
+def bar = T{ "hello " + foo("2.10").suffix() + " world " + foo("2.12").suffix() }
+```
+
+## Using Cross Modules from other Cross Modules
+
+Targets in cross-modules can depend on one another the same way that external
+targets can:
+
+```scala
+object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12")
+class FooModule(crossVersion: String) extends Module{
+  def suffix = T{ crossVersion }
+}
+
+object bar extends mill.Cross[BarModule]("2.10", "2.11", "2.12")
+class BarModule(crossVersion: String) extends Module{
+  def bigSuffix = T{ foo(crossVersion).suffix().toUpperCase() }
+}
+```
+
+Here, you can run:
+
+```bash
+mill show foo[2.10].suffix
+mill show foo[2.11].suffix
+mill show foo[2.12].suffix
+mill show bar[2.10].bigSuffix
+mill show bar[2.11].bigSuffix
+mill show bar[2.12].bigSuffix
+```
+
+
+## Cross Resolvers
+
+You can define an implicit `mill.define.Cross.Resolve` within your
+cross-modules, which would let you use a shorthand `foo()` syntax when referring
+to other cross-modules with an identical set of cross values:
+
+```scala
+trait MyModule extends Module{
+  def crossVersion: String
+  implicit object resolver extends mill.define.Cross.Resolver[MyModule]{
+    def resolve[V <: MyModule](c: Cross[V]): V = c.itemMap(List(crossVersion))
+  }
+}
+
+object foo extends mill.Cross[FooModule]("2.10", "2.11", "2.12")
+class FooModule(val crossVersion: String) extends MyModule{
+  def suffix = T{ crossVersion }
+}
+
+object bar extends mill.Cross[BarModule]("2.10", "2.11", "2.12")
+class BarModule(val crossVersion: String) extends MyModule{
+  def longSuffix = T{ "_" + foo().suffix() }
+}
+```
+
+While the example `resolver` simply looks up the target `Cross` value
for the +cross-module instance with the same `crossVersion`, you can make the resolver +arbitrarily complex. e.g. the `resolver` for `mill.scalalib.CrossSbtModule` +looks for a cross-module instance whose `scalaVersion` is binary compatible +(e.g. 2.10.5 is compatible with 2.10.3) with the current cross-module. \ No newline at end of file diff --git a/docs/pages/6 - Extending Mill.md b/docs/pages/6 - Extending Mill.md deleted file mode 100644 index 75b7643a..00000000 --- a/docs/pages/6 - Extending Mill.md +++ /dev/null @@ -1,183 +0,0 @@ -There are many different ways of extending Mill, depending on how much -customization and flexibility you need. This page will go through your options -from the easiest/least-flexible to the hardest/most-flexible. - -## Custom Targets & Commands - -The simplest way of adding custom functionality to Mill is to define a custom -Target or Command: - -```scala -def foo = T{ ... } -def bar(x: Int, s: String) = T.command{ ... } -``` - -These can depend on other Targets, contain arbitrary code, and be placed -top-level or within any module. If you have something you just want to *do* that -isn't covered by the built-in `ScalaModule`s/`ScalaJSModule`s, simply write a -custom Target (for cached computations) or Command (for un-cached actions) and -you're done. - -For subprocess/filesystem operations, you can use the -[Ammonite-Ops](http://ammonite.io/#Ammonite-Ops) library that comes bundled with -Mill, or even plain `java.nio`/`java.lang.Process`. Each target gets it's own -[T.ctx().dest](http://www.lihaoyi.com/mill/page/tasks#millutilctxdestctx) folder -that you can use to place files without worrying about colliding with other -targets. - -This covers use cases like: - -### Compile some Javascript with Webpack and put it in your runtime classpath: - -```scala -def doWebpackStuff(sources: Seq[PathRef]): PathRef = ??? 
- -def javascriptSources = T.sources{ millSourcePath / "js" } -def compiledJavascript = T{ doWebpackStuff(javascriptSources()) } -object foo extends ScalaModule{ - def runClasspath = T{ super.runClasspath() ++ compiledJavascript() } -} -``` - -### Deploy your compiled assembly to AWS - -```scala -object foo extends ScalaModule{ - -} - -def deploy(assembly: PathRef, credentials: String) = ??? - -def deployFoo(credentials: String) = T.command{ deployFoo(foo.assembly()) } -``` - - -## Custom Workers - -[Custom Targets & Commands](#custom-targets--commands) are re-computed from -scratch each time; sometimes you want to keep values around in-memory when using -`--watch` or the Build REPL. e.g. you may want to keep a webpack process running -so webpack's own internal caches are hot and compilation is fast: - -```scala -def webpackWorker = T.worker{ - // Spawn a process using java.lang.Process and return it -} - -def javascriptSources = T.sources{ millSourcePath / "js" } - -def doWebpackStuff(webpackProcess: Process, sources: Seq[PathRef]): PathRef = ??? - -def compiledJavascript = T{ doWebpackStuff(webpackWorker(), javascriptSources()) } -``` - -Mill itself uses `T.worker`s for it's built-in Scala support: we keep the Scala -compiler in memory between compilations, rather than discarding it each time, in -order to improve performance. - -## Custom Modules - -```scala -trait FooModule extends mill.Module{ - def bar = T{ "hello" } - def baz = T{ "world" } -} -``` - -Custom modules are useful if you have a common set of tasks that you want to -re-used across different parts of your build. 
You simply define a `trait` -inheriting from `mill.Module`, and then use that `trait` as many times as you -want in various `object`s: - -```scala -object foo1 extends FooModule -object foo2 extends FooModule{ - def qux = T{ "I am Cow" } -} -``` - -You can also define a `trait` extending the built-in `ScalaModule` if you have -common configuration you want to apply to all your `ScalaModule`s: - -```scala -trait FooModule extends ScalaModule{ - def scalaVersion = "2.11.11" - object test extends Tests{ - def ivyDeps = Agg(ivy"org.scalatest::scalatest:3.0.4") - def testFrameworks = Seq("org.scalatest.tools.Framework") - } -} -``` - -## import $file - -If you want to define some functionality that can be used both inside and -outside the build, you can create a new `foo.sc` file next to your `build.sc`, -`import $file.foo`, and use it in your `build.sc` file: - -```scala -// foo.sc -def fooValue() = 31337 -``` -```scala -// build.sc -import $file.foo -def printFoo() = T.command{ println(foo.fooValue()) } -``` - -Mill's `import $file` syntax supports the full functionality of -[Ammonite Scripts](http://ammonite.io/#ScalaScripts) - -## import $ivy - -If you want to pull in artifacts from the public repositories (e.g. Maven -Central) for use in your build, you can simple use `import $ivy`: - -```scala -// build.sc -import $ivy.`com.lihaoyi::scalatags:0.6.2` - - -def generatedHtml = T{ - import scalatags.Text.all._ - html( - head(), - body( - h1("Hello"), - p("World") - ) - ).render -} -``` - -This creates the `generatedHtml` target which can then be used however you would -like: written to a file, further processed, etc. - -If you want to publish re-usable libraries that *other* people can use in their -builds, simply publish your code as a library to maven central. 
- -For more information, see Ammonite's -[Ivy Dependencies documentation](http://ammonite.io/#import$ivy) - -## Evaluator Commands - -You can define a command that takes in the current `Evaluator` as an argument, -which you can use to inspect the entire build, or run arbitrary tasks. For -example, here is the `mill.scalalib.GenIdea/idea` command which uses this to -traverse the module-tree and generate an Intellij project config for your build. - -```scala -def idea(ev: Evaluator[Any]) = T.command{ - mill.scalalib.GenIdea( - implicitly, - ev.rootModule, - ev.discover - ) -} -``` - -Many built-in tools are implemented as custom evaluator commands: -[all](intro.html#all), [inspect](intro.html#inspect), -[resolve](intro.html#resolve), [show](intro.html#show). If you want a way to run Mill -commands and programmatically manipulate the tasks and outputs, you do so with -your own evaluator command. diff --git a/docs/pages/7 - Extending Mill.md b/docs/pages/7 - Extending Mill.md new file mode 100644 index 00000000..75b7643a --- /dev/null +++ b/docs/pages/7 - Extending Mill.md @@ -0,0 +1,183 @@ +There are many different ways of extending Mill, depending on how much +customization and flexibility you need. This page will go through your options +from the easiest/least-flexible to the hardest/most-flexible. + +## Custom Targets & Commands + +The simplest way of adding custom functionality to Mill is to define a custom +Target or Command: + +```scala +def foo = T{ ... } +def bar(x: Int, s: String) = T.command{ ... } +``` + +These can depend on other Targets, contain arbitrary code, and be placed +top-level or within any module. If you have something you just want to *do* that +isn't covered by the built-in `ScalaModule`s/`ScalaJSModule`s, simply write a +custom Target (for cached computations) or Command (for un-cached actions) and +you're done. 
+
+For subprocess/filesystem operations, you can use the
+[Ammonite-Ops](http://ammonite.io/#Ammonite-Ops) library that comes bundled with
+Mill, or even plain `java.nio`/`java.lang.Process`. Each target gets its own
+[T.ctx().dest](http://www.lihaoyi.com/mill/page/tasks#millutilctxdestctx) folder
+that you can use to place files without worrying about colliding with other
+targets.
+
+This covers use cases like:
+
+### Compile some Javascript with Webpack and put it in your runtime classpath:
+
+```scala
+def doWebpackStuff(sources: Seq[PathRef]): PathRef = ???
+
+def javascriptSources = T.sources{ millSourcePath / "js" }
+def compiledJavascript = T{ doWebpackStuff(javascriptSources()) }
+object foo extends ScalaModule{
+  def runClasspath = T{ super.runClasspath() ++ compiledJavascript() }
+}
+```
+
+### Deploy your compiled assembly to AWS
+
+```scala
+object foo extends ScalaModule{
+
+}
+
+def deploy(assembly: PathRef, credentials: String) = ???
+
+def deployFoo(credentials: String) = T.command{ deploy(foo.assembly(), credentials) }
+```
+
+
+## Custom Workers
+
+[Custom Targets & Commands](#custom-targets--commands) are re-computed from
+scratch each time; sometimes you want to keep values around in-memory when using
+`--watch` or the Build REPL. e.g. you may want to keep a webpack process running
+so webpack's own internal caches are hot and compilation is fast:
+
+```scala
+def webpackWorker = T.worker{
+  // Spawn a process using java.lang.Process and return it
+}
+
+def javascriptSources = T.sources{ millSourcePath / "js" }
+
+def doWebpackStuff(webpackProcess: Process, sources: Seq[PathRef]): PathRef = ???
+
+def compiledJavascript = T{ doWebpackStuff(webpackWorker(), javascriptSources()) }
+```
+
+Mill itself uses `T.worker`s for its built-in Scala support: we keep the Scala
+compiler in memory between compilations, rather than discarding it each time, in
+order to improve performance.
+
+## Custom Modules
+
+```scala
+trait FooModule extends mill.Module{
+  def bar = T{ "hello" }
+  def baz = T{ "world" }
+}
+```
+
+Custom modules are useful if you have a common set of tasks that you want to
+re-use across different parts of your build. You simply define a `trait`
+inheriting from `mill.Module`, and then use that `trait` as many times as you
+want in various `object`s:
+
+```scala
+object foo1 extends FooModule
+object foo2 extends FooModule{
+  def qux = T{ "I am Cow" }
+}
+```
+
+You can also define a `trait` extending the built-in `ScalaModule` if you have
+common configuration you want to apply to all your `ScalaModule`s:
+
+```scala
+trait FooModule extends ScalaModule{
+  def scalaVersion = "2.11.11"
+  object test extends Tests{
+    def ivyDeps = Agg(ivy"org.scalatest::scalatest:3.0.4")
+    def testFrameworks = Seq("org.scalatest.tools.Framework")
+  }
+}
+```
+
+## import $file
+
+If you want to define some functionality that can be used both inside and
+outside the build, you can create a new `foo.sc` file next to your `build.sc`,
+`import $file.foo`, and use it in your `build.sc` file:
+
+```scala
+// foo.sc
+def fooValue() = 31337
+```
+```scala
+// build.sc
+import $file.foo
+def printFoo() = T.command{ println(foo.fooValue()) }
+```
+
+Mill's `import $file` syntax supports the full functionality of
+[Ammonite Scripts](http://ammonite.io/#ScalaScripts).
+
+## import $ivy
+
+If you want to pull in artifacts from the public repositories (e.g. Maven
+Central) for use in your build, you can simply use `import $ivy`:
+
+```scala
+// build.sc
+import $ivy.`com.lihaoyi::scalatags:0.6.2`
+
+
+def generatedHtml = T{
+  import scalatags.Text.all._
+  html(
+    head(),
+    body(
+      h1("Hello"),
+      p("World")
+    )
+  ).render
+}
+```
+
+This creates the `generatedHtml` target which can then be used however you would
+like: written to a file, further processed, etc.
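+
+For example (a sketch — the target name `htmlFile` is made up for
+illustration), you could write the generated HTML out to a file:
+
+```scala
+def htmlFile = T{
+  val dest = T.ctx().dest / "index.html"
+  // write the rendered page into this target's dedicated output folder
+  ammonite.ops.write(dest, generatedHtml())
+  PathRef(dest)
+}
+```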
+
+If you want to publish re-usable libraries that *other* people can use in their
+builds, simply publish your code as a library to Maven Central.
+
+For more information, see Ammonite's
+[Ivy Dependencies documentation](http://ammonite.io/#import$ivy).
+
+## Evaluator Commands
+
+You can define a command that takes in the current `Evaluator` as an argument,
+which you can use to inspect the entire build, or run arbitrary tasks. For
+example, here is the `mill.scalalib.GenIdea/idea` command, which uses this to
+traverse the module-tree and generate an IntelliJ project config for your build.
+
+```scala
+def idea(ev: Evaluator[Any]) = T.command{
+  mill.scalalib.GenIdea(
+    implicitly,
+    ev.rootModule,
+    ev.discover
+  )
+}
+```
+
+Many built-in tools are implemented as custom evaluator commands:
+[all](intro.html#all), [inspect](intro.html#inspect),
+[resolve](intro.html#resolve), [show](intro.html#show). If you want a way to run
+Mill commands and programmatically manipulate the tasks and outputs, you can do
+so with your own evaluator command.
diff --git a/docs/pages/7 - Mill Internals.md b/docs/pages/7 - Mill Internals.md
deleted file mode 100644
index 3700c9df..00000000
--- a/docs/pages/7 - Mill Internals.md
+++ /dev/null
@@ -1,357 +0,0 @@
-
-## Mill Design Principles
-
-A lot of mills design principles are intended to fix SBT's flaws, as described
-in the blog post
-[What's wrong with SBT](http://www.lihaoyi.com/post/SowhatswrongwithSBT.html),
-building on the best ideas from tools like [CBT](https://github.com/cvogt/cbt)
-and [Bazel](https://bazel.build/), and the ideas from my blog post
-[Build Tools as
-Pure Functional Programs](http://www.lihaoyi.com/post/BuildToolsasPureFunctionalPrograms.html).
-Before working on Mill, read through that post to understand where it is coming
-from!
-
-### Dependency graph first
-
-Mill's most important abstraction is the dependency graph of `Task`s.
-Constructed using the `T{...}` `T.task{...}` `T.command{...}` syntax, these -track the dependencies between steps of a build, so those steps can be executed -in the correct order, queried, or parallelized. - -While Mill provides helpers like `ScalaModule` and other things you can use to -quickly instantiate a bunch of related tasks (resolve dependencies, find -sources, compile, package into jar, ...) these are secondary. When Mill -executes, the dependency graph is what matters: any other mode of organization -(hierarchies, modules, inheritence, etc.) is only important to create this -dependency graph of `Task`s. - -### Builds are hierarchical - -The syntax for running targets from the command line `mill Foo.bar.baz` is -the same as referencing a target in Scala code, `Foo.bar.baz` - -Everything that you can run from the command line lives in an object hierarchy -in your `build.sc` file. Different parts of the hierarchy can have different -`Target`s available: just add a new `def foo = T{...}` somewhere and you'll be -able to run it. - -Cross builds, using the `Cross` data structure, are just another kind of node in -the object hierarchy. The only difference is syntax: from the command line you'd -run something via `mill core.cross[a].printIt` while from code you use -`core.cross("a").printIt` due to different restrictions in Scala/Bash syntax. - -### Caching by default - -Every `Target` in a build, defined by `def foo = T{...}`, is cached by default. -Currently this is done using a `foo/meta.json` file in the `out/` folder. The -`Target` is also provided a `foo/` path on the filesystem dedicated to it, for -it to store output files etc. - -This happens whether you want it to or not. Every `Target` is cached, not just -the "slow" ones like `compile` or `assembly`. - -Caching is keyed on the `.hashCode` of the returned value. 
For `Target`s -returning the contents of a file/folder on disk, they return `PathRef` instances -whose hashcode is based on the hash of the disk contents. Serialization of the -returned values is tentatively done using uPickle. - -### Short-lived build processes - -The Mill build process is meant to be run over and over, not only as a -long-lived daemon/console. That means we must minimize the startup time of the -process, and that a new process must be able to re-construct the in-memory data -structures where a previous process left off, in order to continue the build. - -Re-construction is done via the hierarchical nature of the build: each `Target` -`foo.bar.baz` has a fixed position in the build hierarchy, and thus a fixed -position on disk `out/foo/bar/baz/meta.json`. When the old process dies and a -new process starts, there will be a new instance of `Target` with the same -implementation code and same position in the build hierarchy: this new `Target` -can then load the `out/foo/bar/baz/meta.json` file and pick up where the -previous process left off. - -Minimizing startup time means aggressive caching, as well as minimizing the -total amount of bytecode used: Mill's current 1-2s startup time is dominated by -JVM classloading. In future, we may have a long lived console or -nailgun/drip-based server/client models to speed up interactive usage, but we -should always keep "cold" startup as fast as possible. - -### Static dependency graph and Applicative tasks - -`Task`s are *Applicative*, not *Monadic*. There is `.map`, `.zip`, but no -`.flatMap` operation. That means that we can know the structure of the entire -dependency graph before we start executing `Task`s. 
This lets us perform all -sorts of useful operations on the graph before running it: - -- Given a Target the user wants to run, pre-compute and display what targets - will be evaluated ("dry run"), without running them - -- Automatically parallelize different parts of the dependency graph that do not - depend on each other, perhaps even distributing it to different worker - machines like Bazel/Pants can - -- Visualize the dependency graph easily, e.g. by dumping to a DOT file - -- Query the graph, e.g. "why does this thing depend on that other thing?" - -- Avoid running tasks "halfway": if a Target's upstream Targets fail, we can - skip the Target completely rather than running halfway and then bailing out - with an exception - -In order to avoid making people using `.map` and `.zip` all over the place when -defining their `Task`s, we use the `T{...}`/`T.task{...}`/`T.command{...}` -macros which allow you to use `Task#apply()` within the block to "extract" a -value. - -```scala -def test() = T.command{ - TestRunner.apply( - "mill.UTestFramework", - runDepClasspath().map(_.path) :+ compile().path, - Seq(compile().path) - -} -``` - -This is roughly to the following: - -```scala -def test() = T.command{ T.zipMap(runDepClasspath, compile, compile){ - (runDepClasspath1, compile2, compile3) => - TestRunner.apply( - "mill.UTestFramework", - runDepClasspath1.map(_.path) :+ compile2.path, - Seq(compile3.path) - ) -} -``` - -This is similar to SBT's `:=`/`.value` macros, or `scala-async`'s -`async`/`await`. Like those, the `T{...}` macro should let users program most of -their code in a "direct" style and have it "automatically" lifted into a graph -of `Task`s. - -## How Mill aims for Simple - -Why should you expect that the Mill build tool can achieve simple, easy & -flexible, where other build tools in the past have failed? - -Build tools inherently encompass a huge number of different concepts: - -- What "Tasks" depends on what? -- How do I define my own tasks? 
-- Where do source files come from? -- What needs to run in what order to do what I want? -- What can be parallelized and what can't? -- How do tasks pass data to each other? What data do they pass? -- What tasks are cached? Where? -- How are tasks run from the command line? -- How do you deal with the repetition inherent a build? (e.g. compile, run & - test tasks for every "module") -- What is a "Module"? How do they relate to "Tasks"? -- How do you configure a Module to do something different? -- How are cross-builds (across different configurations) handled? - -These are a lot of questions to answer, and we haven't even started talking -about the actually compiling/running any code yet! If each such facet of a build -was modelled separately, it's easy to have an explosion of different concepts -that would make a build tool hard to understand. - -Before you continue, take a moment to think: how would you answer to each of -those questions using an existing build tool you are familiar with? Different -tools like [SBT](http://www.scala-sbt.org/), -[Fake](https://fake.build/legacy-index.html), [Gradle](https://gradle.org/) or -[Grunt](https://gruntjs.com/) have very different answers. - -Mill aims to provide the answer to these questions using as few, as familiar -core concepts as possible. The entire Mill build is oriented around a few -concepts: - -- The Object Hierarchy -- The Call Graph -- Instantiating Traits & Classes - -These concepts are already familiar to anyone experienced in Scala (or any other -programming language...), but are enough to answer all of the complicated -build-related questions listed above. - -## The Object Hierarchy - -The module hierarchy is the graph of objects, starting from the root of the -`build.sc` file, that extend `mill.Module`. At the leaves of the hierarchy are -the `Target`s you can run. - -A `Target`'s position in the module hierarchy tells you many things. 
For -example, a `Target` at position `core.test.compile` would: - -- Cache output metadata at `out/core/test/compile/meta.json` - -- Output files to the folder `out/core/test/compile/dest/` - -- Source files default to a folder in `core/test/`, `core/test/src/` - -- Be runnable from the command-line via `mill core.test.compile` - -- Be referenced programmatically (from other `Target`s) via `core.test.compile` - -From the position of any `Target` within the object hierarchy, you immediately -know how to run it, find its output files, find any caches, or refer to it from -other `Target`s. You know up-front where the `Target`'s data "lives" on disk, and -are sure that it will never clash with any other `Target`'s data. - -## The Call Graph - -The Scala call graph of "which target references which other target" is core to -how Mill operates. This graph is reified via the `T{...}` macro to make it -available to the Mill execution engine at runtime. The call graph tells you: - -- Which `Target`s depend on which other `Target`s - -- For a given `Target` to be built, what other `Target`s need to be run and in - what order - -- Which `Target`s can be evaluated in parallel - -- What source files need to be watched when using `--watch` on a given target (by - tracing the call graph up to the `Source`s) - -- What a given `Target` makes available for other `Target`s to depend on (via - its return value) - -- Defining your own task that depends on others is as simple as `def foo = - T{...}` - -The call graph within your Scala code is essentially a data-flow graph: by -defining a snippet of code: - -```scala -val b = ... -val c = ... -val d = ... -val a = f(b, c, d) -``` - -you are telling everyone that the value `a` depends on the values of `b` `c` and -`d`, processed by `f`. A build tool needs exactly the same data structure: -knowing what `Target` depends on what other `Target`s, and what processing it -does on its inputs! 
- -With Mill, you can take the Scala call graph, wrap everything in the `T{...}` -macro, and get a `Target`-dependency graph that matches exactly the call-graph -you already had: - -```scala -val b = T{ ... } -val c = T{ ... } -val d = T{ ... } -val a = T{ f(b(), c(), d()) } -``` - -Thus, if you are familiar with how data flows through a normal Scala program, -you already know how data flows through a Mill build! The Mill build evaluation -may be incremental, it may cache things, it may read and write from disk, but -the fundamental syntax, and the data-flow that syntax represents, is unchanged -from your normal Scala code. - -## Instantiating Traits & Classes - -Classes and traits are a common way of re-using common data structures in Scala: -if you have a bunch of fields which are related and you want to make multiple -copies of those fields, you put them in a class/trait and instantiate it over -and over. - -In Mill, inheriting from traits is the primary way for re-using common parts of -a build: - -- Scala "project"s with multiple related `Target`s within them, are just a - `Trait` you instantiate - -- Replacing the default `Target`s within a project, making them do new - things or depend on new `Target`s, is simply `override`-ing them during - inheritence. - -- Modifying the default `Target`s within a project, making use of the old value - to compute the new value, is simply `override`ing them and using `super.foo()` - -- Required configuration parameters within a `project` are `abstract` members. - -- Cross-builds are modelled as instantiating a (possibly anonymous) class - multiple times, each instance with its own distinct set of `Target`s - -In normal Scala, you bundle up common fields & functionality into a `class` you -can instantiate over and over, and you can override the things you want to -customize. 
Similarly, in Mill, you bundle up common parts of a build into -`trait`s you can instantiate over and over, and you can override the things you -want to customize. "Subprojects", "cross-builds", and many other concepts are -reduced to simply instantiating a `trait` over and over, with tweaks. - -## Prior Work - -### SBT - -Mill is built as a substitute for SBT, whose problems are -[described here](http://www.lihaoyi.com/post/SowhatswrongwithSBT.html). -Nevertheless, Mill takes on some parts of SBT (builds written in Scala, Task -graph with an Applicative "idiom bracket" macro) where it makes sense. - -### Bazel - -Mill is largely inspired by [Bazel](https://bazel.build/). In particular, the -single-build-hierarchy, where every Target has an on-disk-cache/output-directory -according to their position in the hierarchy, comes from Bazel. - -Bazel is a bit odd in it’s own right. The underlying data model is good -(hierarchy + cached dependency graph) but getting there is hell. It (like SBT) is -also a 3-layer interpretation model, but layers 1 & 2 are almost exactly the -same: mutable python which performs global side effects (layer 3 is the same -dependency-graph evaluator as SBT/mill). - -You end up having to deal with a non-trivial python codebase where everything -happens via: - -```python -do_something(name="blah") -``` - -or - -```python -do_other_thing(dependencies=["blah"]) - -``` -where `"blah"` is a global identifier that is often constructed programmatically -via string concatenation and passed around. This is quite challenging. - -Having the two layers be “just python” is great since people know python, but I -think unnecessary two have two layers ("evaluating macros" and "evaluating rule -impls") that are almost exactly the same, and I think making them interact via -return values rather than via a global namespace of programmatically-constructed -strings would make it easier to follow. 
- -With Mill, I’m trying to collapse Bazel’s Python layer 1 & 2 into just 1 layer -of Scala, and have it define its dependency graph/hierarchy by returning -values, rather than by calling global-side-effecting APIs. I've had trouble -trying to teach people how-to-bazel at work, and am pretty sure we can make -something that's easier to use. - -### Scala.Rx - -Mill's "direct-style" applicative syntax is inspired by my old -[Scala.Rx](https://github.com/lihaoyi/scala.rx) project. While there are -differences (Mill captures the dependency graph lexically using Macros, Scala.Rx -captures it at runtime, they are pretty similar. - -The end-goal is the same: to write code in a "direct style" and have it -automatically "lifted" into a dependency graph, which you can introspect and use -for incremental updates at runtime. - -Scala.Rx is itself build upon the 2010 paper -[Deprecating the Observer Pattern](https://infoscience.epfl.ch/record/148043/files/DeprecatingObserversTR2010.pdf). - -### CBT - -Mill looks a lot like [CBT](https://github.com/cvogt/cbt). The inheritance based -model for customizing `Module`s/`ScalaModule`s comes straight from there, as -does the "command line path matches Scala selector path" idea. 
Most other things
-are different though: the reified dependency graph, the execution model, the
-caching module all follow Bazel more than they do CBT
diff --git a/docs/pages/8 - Mill Internals.md b/docs/pages/8 - Mill Internals.md
new file mode 100644
index 00000000..3700c9df
--- /dev/null
+++ b/docs/pages/8 - Mill Internals.md
@@ -0,0 +1,357 @@
+
+## Mill Design Principles
+
+A lot of Mill's design principles are intended to fix SBT's flaws, as described
+in the blog post
+[What's wrong with SBT](http://www.lihaoyi.com/post/SowhatswrongwithSBT.html),
+building on the best ideas from tools like [CBT](https://github.com/cvogt/cbt)
+and [Bazel](https://bazel.build/), and the ideas from my blog post
+[Build Tools as
+Pure Functional Programs](http://www.lihaoyi.com/post/BuildToolsasPureFunctionalPrograms.html).
+Before working on Mill, read through that post to understand where it is coming
+from!
+
+### Dependency graph first
+
+Mill's most important abstraction is the dependency graph of `Task`s.
+Constructed using the `T{...}`, `T.task{...}`, and `T.command{...}` syntax,
+these track the dependencies between steps of a build, so those steps can be
+executed in the correct order, queried, or parallelized.
+
+While Mill provides helpers like `ScalaModule` and other things you can use to
+quickly instantiate a bunch of related tasks (resolve dependencies, find
+sources, compile, package into jar, ...), these are secondary. When Mill
+executes, the dependency graph is what matters: any other mode of organization
+(hierarchies, modules, inheritance, etc.) is only important to create this
+dependency graph of `Task`s.
+
+### Builds are hierarchical
+
+The syntax for running targets from the command line, `mill Foo.bar.baz`, is
+the same as for referencing a target in Scala code, `Foo.bar.baz`.
+
+Everything that you can run from the command line lives in an object hierarchy
+in your `build.sc` file. 
Different parts of the hierarchy can have different
+`Target`s available: just add a new `def foo = T{...}` somewhere and you'll be
+able to run it.
+
+Cross builds, using the `Cross` data structure, are just another kind of node in
+the object hierarchy. The only difference is syntax: from the command line you'd
+run something via `mill core.cross[a].printIt` while from code you use
+`core.cross("a").printIt` due to different restrictions in Scala/Bash syntax.
+
+### Caching by default
+
+Every `Target` in a build, defined by `def foo = T{...}`, is cached by default.
+Currently this is done using a `foo/meta.json` file in the `out/` folder. The
+`Target` is also provided a `foo/` path on the filesystem dedicated to it, for
+it to store output files, etc.
+
+This happens whether you want it to or not. Every `Target` is cached, not just
+the "slow" ones like `compile` or `assembly`.
+
+Caching is keyed on the `.hashCode` of the returned value. `Target`s that
+return the contents of a file/folder on disk return `PathRef` instances, whose
+hashcode is based on the hash of the disk contents. Serialization of the
+returned values is tentatively done using uPickle.
+
+### Short-lived build processes
+
+The Mill build process is meant to be run over and over, not only as a
+long-lived daemon/console. That means we must minimize the startup time of the
+process, and that a new process must be able to re-construct the in-memory data
+structures where a previous process left off, in order to continue the build.
+
+Re-construction is done via the hierarchical nature of the build: each `Target`
+`foo.bar.baz` has a fixed position in the build hierarchy, and thus a fixed
+position on disk `out/foo/bar/baz/meta.json`. 
When the old process dies and a
+new process starts, there will be a new instance of `Target` with the same
+implementation code and same position in the build hierarchy: this new `Target`
+can then load the `out/foo/bar/baz/meta.json` file and pick up where the
+previous process left off.
+
+Minimizing startup time means aggressive caching, as well as minimizing the
+total amount of bytecode used: Mill's current 1-2s startup time is dominated by
+JVM classloading. In the future, we may have a long-lived console or
+nailgun/drip-based server/client models to speed up interactive usage, but we
+should always keep "cold" startup as fast as possible.
+
+### Static dependency graph and Applicative tasks
+
+`Task`s are *Applicative*, not *Monadic*. There is `.map`, `.zip`, but no
+`.flatMap` operation. That means that we can know the structure of the entire
+dependency graph before we start executing `Task`s. This lets us perform all
+sorts of useful operations on the graph before running it:
+
+- Given a Target the user wants to run, pre-compute and display what targets
+  will be evaluated ("dry run"), without running them
+
+- Automatically parallelize different parts of the dependency graph that do not
+  depend on each other, perhaps even distributing it to different worker
+  machines like Bazel/Pants can
+
+- Visualize the dependency graph easily, e.g. by dumping to a DOT file
+
+- Query the graph, e.g. "why does this thing depend on that other thing?"
+
+- Avoid running tasks "halfway": if a Target's upstream Targets fail, we can
+  skip the Target completely rather than running halfway and then bailing out
+  with an exception
+
+In order to avoid making people use `.map` and `.zip` all over the place when
+defining their `Task`s, we use the `T{...}`/`T.task{...}`/`T.command{...}`
+macros which allow you to use `Task#apply()` within the block to "extract" a
+value.
+
+```scala
+def test() = T.command{
+  TestRunner.apply(
+    "mill.UTestFramework",
+    runDepClasspath().map(_.path) :+ compile().path,
+    Seq(compile().path)
+  )
+}
+```
+
+This is roughly equivalent to the following:
+
+```scala
+def test() = T.command{ T.zipMap(runDepClasspath, compile, compile){
+  (runDepClasspath1, compile2, compile3) =>
+    TestRunner.apply(
+      "mill.UTestFramework",
+      runDepClasspath1.map(_.path) :+ compile2.path,
+      Seq(compile3.path)
+    )
+  }
+}
+```
+
+This is similar to SBT's `:=`/`.value` macros, or `scala-async`'s
+`async`/`await`. Like those, the `T{...}` macro should let users program most of
+their code in a "direct" style and have it "automatically" lifted into a graph
+of `Task`s.
+
+## How Mill aims for Simple
+
+Why should you expect that the Mill build tool can achieve simplicity, ease &
+flexibility, where other build tools in the past have failed?
+
+Build tools inherently encompass a huge number of different concepts:
+
+- What "Tasks" depend on what?
+- How do I define my own tasks?
+- Where do source files come from?
+- What needs to run in what order to do what I want?
+- What can be parallelized and what can't?
+- How do tasks pass data to each other? What data do they pass?
+- What tasks are cached? Where?
+- How are tasks run from the command line?
+- How do you deal with the repetition inherent in a build? (e.g. compile, run &
+  test tasks for every "module")
+- What is a "Module"? How do they relate to "Tasks"?
+- How do you configure a Module to do something different?
+- How are cross-builds (across different configurations) handled?
+
+These are a lot of questions to answer, and we haven't even started talking
+about actually compiling/running any code yet! If each such facet of a build
+were modelled separately, it's easy to have an explosion of different concepts
+that would make a build tool hard to understand.
+
+Before you continue, take a moment to think: how would you answer each of
+those questions using an existing build tool you are familiar with? Different
+tools like [SBT](http://www.scala-sbt.org/),
+[Fake](https://fake.build/legacy-index.html), [Gradle](https://gradle.org/) or
+[Grunt](https://gruntjs.com/) have very different answers.
+
+Mill aims to answer these questions using as few, and as familiar, core
+concepts as possible. The entire Mill build is oriented around a few
+concepts:
+
+- The Object Hierarchy
+- The Call Graph
+- Instantiating Traits & Classes
+
+These concepts are already familiar to anyone experienced in Scala (or any other
+programming language...), but are enough to answer all of the complicated
+build-related questions listed above.
+
+## The Object Hierarchy
+
+The module hierarchy is the graph of objects, starting from the root of the
+`build.sc` file, that extend `mill.Module`. At the leaves of the hierarchy are
+the `Target`s you can run.
+
+A `Target`'s position in the module hierarchy tells you many things. For
+example, a `Target` at position `core.test.compile` would:
+
+- Cache output metadata at `out/core/test/compile/meta.json`
+
+- Output files to the folder `out/core/test/compile/dest/`
+
+- Source files default to a folder in `core/test/`, `core/test/src/`
+
+- Be runnable from the command-line via `mill core.test.compile`
+
+- Be referenced programmatically (from other `Target`s) via `core.test.compile`
+
+From the position of any `Target` within the object hierarchy, you immediately
+know how to run it, find its output files, find any caches, or refer to it from
+other `Target`s. You know up-front where the `Target`'s data "lives" on disk, and
+are sure that it will never clash with any other `Target`'s data.
+
+## The Call Graph
+
+The Scala call graph of "which target references which other target" is core to
+how Mill operates. 
This graph is reified via the `T{...}` macro to make it +available to the Mill execution engine at runtime. The call graph tells you: + +- Which `Target`s depend on which other `Target`s + +- For a given `Target` to be built, what other `Target`s need to be run and in + what order + +- Which `Target`s can be evaluated in parallel + +- What source files need to be watched when using `--watch` on a given target (by + tracing the call graph up to the `Source`s) + +- What a given `Target` makes available for other `Target`s to depend on (via + its return value) + +- Defining your own task that depends on others is as simple as `def foo = + T{...}` + +The call graph within your Scala code is essentially a data-flow graph: by +defining a snippet of code: + +```scala +val b = ... +val c = ... +val d = ... +val a = f(b, c, d) +``` + +you are telling everyone that the value `a` depends on the values of `b` `c` and +`d`, processed by `f`. A build tool needs exactly the same data structure: +knowing what `Target` depends on what other `Target`s, and what processing it +does on its inputs! + +With Mill, you can take the Scala call graph, wrap everything in the `T{...}` +macro, and get a `Target`-dependency graph that matches exactly the call-graph +you already had: + +```scala +val b = T{ ... } +val c = T{ ... } +val d = T{ ... } +val a = T{ f(b(), c(), d()) } +``` + +Thus, if you are familiar with how data flows through a normal Scala program, +you already know how data flows through a Mill build! The Mill build evaluation +may be incremental, it may cache things, it may read and write from disk, but +the fundamental syntax, and the data-flow that syntax represents, is unchanged +from your normal Scala code. 
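The static, applicative structure described earlier (`.map`/`.zip` but no
`.flatMap`) can be sketched with a toy task type — an illustrative sketch, not
Mill's real `Task` — where every node declares its inputs up-front, so the graph
can be walked before anything runs:

```scala
// Toy applicative task graph (illustrative sketch, not Mill's implementation).
// Tasks declare their inputs up-front, so the full dependency graph is known
// statically, before any task is evaluated.
sealed trait Task[+A] {
  def inputs: Seq[Task[_]]
  def evaluate: A
  def map[B](f: A => B): Task[B] = Mapped(this, f)
  def zip[B](other: Task[B]): Task[(A, B)] = Zipped(this, other)
}
case class Value[A](a: A) extends Task[A] {
  def inputs = Nil
  def evaluate = a
}
case class Mapped[A, B](t: Task[A], f: A => B) extends Task[B] {
  def inputs = Seq(t)
  def evaluate = f(t.evaluate)
}
case class Zipped[A, B](ta: Task[A], tb: Task[B]) extends Task[(A, B)] {
  def inputs = Seq(ta, tb)
  def evaluate = (ta.evaluate, tb.evaluate)
}

// Walk the graph without running it, e.g. to count nodes for a "dry run"
def nodeCount(t: Task[_]): Int = 1 + t.inputs.map(nodeCount).sum

val b = Value(1)
val c = Value(2)
val d = Value(3)
val a = b.zip(c).zip(d).map { case ((x, y), z) => x + y + z }
```

Here `a` is a seven-line data structure mirroring the `val a = f(b, c, d)`
data-flow above: `nodeCount(a)` inspects the graph without evaluating it, while
`a.evaluate` actually runs it.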
+
+## Instantiating Traits & Classes
+
+Classes and traits are a common way of re-using data structures in Scala:
+if you have a bunch of fields which are related and you want to make multiple
+copies of those fields, you put them in a class/trait and instantiate it over
+and over.
+
+In Mill, inheriting from traits is the primary way of re-using common parts of
+a build:
+
+- Scala "project"s with multiple related `Target`s within them are just a
+  `Trait` you instantiate
+
+- Replacing the default `Target`s within a project, making them do new
+  things or depend on new `Target`s, is simply `override`-ing them during
+  inheritance.
+
+- Modifying the default `Target`s within a project, making use of the old value
+  to compute the new value, is simply `override`ing them and using `super.foo()`
+
+- Required configuration parameters within a `project` are `abstract` members.
+
+- Cross-builds are modelled as instantiating a (possibly anonymous) class
+  multiple times, each instance with its own distinct set of `Target`s
+
+In normal Scala, you bundle up common fields & functionality into a `class` you
+can instantiate over and over, and you can override the things you want to
+customize. Similarly, in Mill, you bundle up common parts of a build into
+`trait`s you can instantiate over and over, and you can override the things you
+want to customize. "Subprojects", "cross-builds", and many other concepts are
+reduced to simply instantiating a `trait` over and over, with tweaks.
+
+## Prior Work
+
+### SBT
+
+Mill is built as a substitute for SBT, whose problems are
+[described here](http://www.lihaoyi.com/post/SowhatswrongwithSBT.html).
+Nevertheless, Mill takes on some parts of SBT (builds written in Scala, Task
+graph with an Applicative "idiom bracket" macro) where it makes sense.
+
+### Bazel
+
+Mill is largely inspired by [Bazel](https://bazel.build/). 
In particular, the
+single-build-hierarchy, where every Target has an on-disk-cache/output-directory
+according to its position in the hierarchy, comes from Bazel.
+
+Bazel is a bit odd in its own right. The underlying data model is good
+(hierarchy + cached dependency graph) but getting there is hell. It (like SBT) is
+also a 3-layer interpretation model, but layers 1 & 2 are almost exactly the
+same: mutable Python which performs global side effects (layer 3 is the same
+dependency-graph evaluator as SBT/mill).
+
+You end up having to deal with a non-trivial Python codebase where everything
+happens via:
+
+```python
+do_something(name="blah")
+```
+
+or
+
+```python
+do_other_thing(dependencies=["blah"])
+```
+
+where `"blah"` is a global identifier that is often constructed programmatically
+via string concatenation and passed around. This is quite challenging.
+
+Having the two layers be “just Python” is great since people know Python, but I
+think it is unnecessary to have two layers ("evaluating macros" and "evaluating
+rule impls") that are almost exactly the same, and I think making them interact
+via return values rather than via a global namespace of
+programmatically-constructed strings would make it easier to follow.
+
+With Mill, I’m trying to collapse Bazel’s Python layers 1 & 2 into just 1 layer
+of Scala, and have it define its dependency graph/hierarchy by returning
+values, rather than by calling global-side-effecting APIs. I've had trouble
+trying to teach people how-to-bazel at work, and am pretty sure we can make
+something that's easier to use.
+
+### Scala.Rx
+
+Mill's "direct-style" applicative syntax is inspired by my old
+[Scala.Rx](https://github.com/lihaoyi/scala.rx) project. While there are
+differences (Mill captures the dependency graph lexically using macros, Scala.Rx
+captures it at runtime), they are pretty similar. 
+
+The end-goal is the same: to write code in a "direct style" and have it
+automatically "lifted" into a dependency graph, which you can introspect and use
+for incremental updates at runtime.
+
+Scala.Rx is itself built upon the 2010 paper
+[Deprecating the Observer Pattern](https://infoscience.epfl.ch/record/148043/files/DeprecatingObserversTR2010.pdf).
+
+### CBT
+
+Mill looks a lot like [CBT](https://github.com/cvogt/cbt). The inheritance-based
+model for customizing `Module`s/`ScalaModule`s comes straight from there, as
+does the "command line path matches Scala selector path" idea. Most other things
+are different though: the reified dependency graph, the execution model, and the
+caching model all follow Bazel more than they do CBT.
diff --git a/readme.md b/readme.md
index 1eec0e66..e0f34278 100644
--- a/readme.md
+++ b/readme.md
@@ -328,30 +328,44 @@ rm -rf out/
 
 ## Changelog
 
-### Master
+### 0.2.0
 
-- Universal (combined batch/sh) script generation for launcher, assembly, and release
+- Universal (combined batch/sh) script generation for launcher, assembly, and
+  release
 
 - For some shell (e.g., `ksh` or `fish`), a shebang line should be added, e.g.,
   using GNU sed:
-
-  ```bash
-  sed -i '1s;^;#!/usr/bin/env sh\n;' 
-  ```
-
-  Or download directly with shebang added as follows:
-
-  ```bash
-  sudo sh -c '(echo "#!/usr/bin/env sh" && curl -L ) > /usr/local/bin/mill && chmod +x /usr/local/bin/mill'
-  ```
-
-  On Windows, save `` as `mill.bat`
-
-  Windows client/server improvements
-- Windows repl support (note: MSYS2 subsystem/shell will be supported when jline3 v3.6.3 is released)
+- Windows repl support (note: MSYS2 subsystem/shell will be supported when jline3
+  v3.6.3 is released)
 
 - Fixed Java 9 support
 
+- Remove need for running `publishAll` using `--interactive` when on OSX and
+  your GPG key has a passphrase
+
+- First-class support for `JavaModule`s
+
+- Properly pass compiler plugins to Scaladoc ([#282](https://github.com/lihaoyi/mill/issues/282))
+
+- Support 
for ivy version-pinning via `ivy"...".forceVersion()` + +- Support for ivy excludes via `ivy"...".exclude()` ([#254](https://github.com/lihaoyi/mill/pull/254)) + +- Make `ivyDepsTree` properly handle transitive dependencies ([#226](https://github.com/lihaoyi/mill/issues/226)) + +- Fix handling of `runtime`-scoped ivy dependencies ([#173](https://github.com/lihaoyi/mill/issues/173)) + +- Make environment variables available to Mill builds ([#257](https://github.com/lihaoyi/mill/issues/257)) + +- Support ScalaCheck test runner ([#286](https://github.com/lihaoyi/mill/issues/286)) + +- Support for using Typelevel Scala ([#275](https://github.com/lihaoyi/mill/issues/275)) + +- If a module depends on multiple submodules with different versions of an + ivy dependency, only one version is resolved ([#273](https://github.com/lihaoyi/mill/issues/273)) + + + ### 0.1.7 - Windows batch (.bat) generation for launcher, assembly, and release -- cgit v1.2.3