Scala modularization and classpaths

eugene yokota edited this page Apr 18, 2019 · 6 revisions

Scala modularization, distribution, and classpaths

Scala modularization makes it necessary to deal with some existing issues that mainly relate to the (Java) boot classpath and to using Scala jars from a lib/.

Java boot classpath (bootstrap classpath)

The term "boot classpath" comes from Java's Bootstrap Classes, which are the classes that implement the Java Platform. By default, bootstrap classes are in rt.jar and several other jar files in the jre/lib directory, according to "How Classes are Found".

javac provides the -bootclasspath option for programmers who want to resolve boot class (or extension class) references against an alternative Java platform implementation.

The Scala compiler also provides a -javabootclasspath <path> flag to override the Java boot classpath.

For the Java boot classpath, the scala script just puts everything in lib/ on it. This is not the right thing to do because it exposes classes that are not on the user's classpath. It also forces a particular version of the config, JLine, and JAnsi libraries. (Currently Scala's JLine is in a custom namespace, but I believe a goal is to use the standard JLine.) I think it is closer to the right thing to do for the scalac script, because scalac does not run user code that configures a particular classpath. (This is not strictly true with macros anymore, but I expect macros are discouraged from using JLine.)

I assume the original reason for putting anything on the boot classpath is to avoid some JVM overhead somewhere, perhaps class file verification. One possibility is to include something on the boot classpath only if it is on the normal classpath. Some things might automatically be added to the normal classpath, like scala-library.jar, but that is separate from the boot classpath issue.
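The "only if it is on the normal classpath" idea can be sketched roughly as follows. This is an illustration, not what the scala script does: the object and method names are hypothetical, matching jars by file name is an assumption made for brevity, and -Xbootclasspath/a: is the standard JVM option for appending to the boot classpath.

```scala
import java.io.File

// Hypothetical helper, not part of the scala script or sbt.
object BootClasspathSketch {
  /** Jars from lib/ that also appear on the user's classpath.
    * Assumption: two jars are "the same" if their file names match. */
  def bootCandidates(libJars: Seq[File], userClasspath: Seq[File]): Seq[File] = {
    val userNames = userClasspath.map(_.getName).toSet
    libJars.filter(jar => userNames.contains(jar.getName))
  }

  /** Renders the selected jars as a JVM argument, or None if nothing qualifies. */
  def bootClasspathArg(jars: Seq[File]): Option[String] =
    if (jars.isEmpty) None
    else Some("-Xbootclasspath/a:" + jars.map(_.getPath).mkString(File.pathSeparator))
}
```

With this approach a jar such as JLine would stay off the boot classpath unless the user had put it on the normal classpath themselves.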

Selecting Scala jars

There are various situations where a tool might need to select a set of Scala jars:

  • default library classpath: for example, sbt adds scala-library.jar when autoScalaLibrary := true
  • compiler classpath: the jars needed to invoke scalac on a user's project
  • repl classpath
  • cached Scala classpath: sbt creates a class loader with the standard Scala jars needed and keeps it around. "needed" is loosely defined, but this would be the library and anything needed for scalac or the repl. It may or may not include actors or other things split off with modularization.
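For the default library classpath case, a build can opt out of the automatic addition and declare the library explicitly. The following build.sbt fragment is a minimal sketch of that, assuming a standard sbt project; it only uses the autoScalaLibrary setting mentioned above plus an ordinary managed dependency.

```scala
// build.sbt (sketch)

// Do not add scala-library.jar to the classpath automatically.
autoScalaLibrary := false

// Declare the library as a normal managed dependency instead.
libraryDependencies += "org.scala-lang" % "scala-library" % scalaVersion.value
```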

There isn't much of a problem when dealing with managed dependencies, only the decision of what jars to include. For the default library classpath, I'd propose that the default be the minimal core, that this module be called scala-library, and that there be a scala-library-all or something to pull in all library components. For the cached classpath, I'd propose that it be scala-compiler-all, which includes everything that has previously been included (repl, scaladoc, maybe adding scalap?). I'm not sure about things like scala-actors or other optional components. Whatever is included here has to be downloaded by every sbt user whether they use it or not.

When it comes to the lib/ directory of a locally built Scala or a Scala distribution, things are harder. There is just the lib/ directory, with no accompanying dependency information. In the completely unmanaged use case, there is no knowledge of dependencies. So, it isn't possible to say "all jars needed to run scalac" or "all jars in scala-library".
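To make the limitation concrete, here is a rough sketch of what a tool can do with a bare lib/ directory: list the jars, then guess by file name. The object name and the set of "compiler" jar names are assumptions for illustration; nothing in lib/ itself confirms that such a list is complete or correct, which is exactly the problem.

```scala
import java.io.File

// Hypothetical helper illustrating the unmanaged case.
object LibDirSketch {
  /** All jar files directly inside libDir, sorted by name.
    * listFiles returns null when libDir is not a directory, hence the Option. */
  def jarsIn(libDir: File): Seq[File] =
    Option(libDir.listFiles).map(_.toSeq).getOrElse(Seq.empty)
      .filter(f => f.isFile && f.getName.endsWith(".jar"))
      .sortBy(_.getName)

  /** Assumption: a hard-coded guess at which file names scalac needs. */
  val assumedCompilerJars: Set[String] =
    Set("scala-library.jar", "scala-reflect.jar", "scala-compiler.jar")

  /** The best available answer without metadata: filter by the guessed names. */
  def guessCompilerClasspath(libDir: File): Seq[File] =
    jarsIn(libDir).filter(f => assumedCompilerJars(f.getName))
}
```

Any hard-coded list like this goes stale as soon as modularization splits or renames a jar.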

In the case where someone is still using managed dependencies, such as taking an existing project and setting scalaHome to use a locally built Scala version, sbt will use the dependency information from the configured repositories, but substitute the jars from the local Scala. This of course will run into problems if the dependencies have changed in the locally built Scala version.
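For reference, pointing an existing project at a locally built Scala looks roughly like this in build.sbt. The path is a placeholder, not a real location.

```scala
// build.sbt (sketch)

// Use the jars from a locally built Scala instead of the resolved ones.
// The directory below is a hypothetical example path.
scalaHome := Some(file("/path/to/locally-built/scala"))
```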