Compiling without module identities
Richard S. Hall
heavy at ungoverned.org
Tue Feb 3 19:24:17 EST 2009
Alex Buckley wrote:
> So, with org.arrakis.* in /Users/atreides :
>
> javac -classpath /Users/atreides
> -d /Users/harkonnen
> com/harkonnen/Foo.java
>
> javac -classpath /Users/atreides
> -d /Users/harkonnen
> -moduleClasspath /Users/harkonnen
> com/harkonnen/FooImpl.java
>
> OK, there's a foreign directory at -classpath which holds Other
> Modules. This is not aligned with the current meaning of -classpath,
> which holds both source and classfiles from logically Any Module. Then
> there's -d, which unfortunately throws all classfiles into one place,
> so you have to divvy Spice classes out into atreides up front. Then
> there's -moduleClasspath (really -localModulepath) which you only need
> to set if the code in FooImpl.java refers to module-private types in
> any type previously compiled and now available on -localModulepath
> (regardless of whether it's also found in -d) and not found on
> -classpath.
>
> You have to get all those paths exactly right on every compilation,
> and still are limited to compiling one module at a time. (And the
> -classpath and -localModulepath directories must not be aliased on the
> filesystem.)
>
> This is supposed to be simple?
>
> I reject the idea that #2 - separate compilation of a module's source
> files - is uncommon. Surely you compile individual sources while
> developing a bundle? Perhaps I am missing something, because I can
> hardly believe someone would argue against separate compilation in Java.

I don't think Hal is arguing against separate compilation, especially
since his example demonstrated separate compilation.

If I understand Hal's proposal correctly, perhaps I can state it a
different way.

I will introduce a "current context" concept for javac, which means
something along the lines of "the directly relevant files for the .java
file(s) currently being compiled."

In standard javac (e.g., "javac foo/Foo.java"), the "current context" is
found by searching for needed types relative to the current directory,
which also happens to be the default class path. If you set the class
path, then the current context becomes whatever the class path value is.
This is because current javac does not distinguish between the "current
context" and the class path; it considers them to be the same thing.

So, the proposal is to make them not the same thing. We can still assume
the current context for javac is relative to the current directory, but
anything found on the current context is considered to be in the same
module. On the other hand, anything found on the class path is
considered to be from a different module. So, something like:

    javac foo/Foo.java

would have no entries in its class path, but its current context would
be types relative to the root of the current directory, and they would be
considered to be part of the same module. If you did something like:

    javac -modulepath /path/foo.jar foo/Foo.java

then the class path would still have no entries, but the module path
would set the current context to have one entry, where all types
reachable from it would be considered to be in the same module as
foo.Foo. If you did something like:

    javac -modulepath /path/foo.jar -classpath /path/bar.jar foo/Foo.java

then foo.jar types and foo.Foo would be the current context and thus in
the same module, but types in bar.jar would be in a different module.

For legacy code, this would have no impact, since by definition all
legacy code is either in no module or a "default" module, but it doesn't
make a difference which view you have since there would be no module
visibility associated with legacy code, only the standard visibility
modifiers which would still apply the same way whether the code is in
the same module or not.
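
To make the rule concrete, here is a minimal sketch in Java of the
classification described above. It is only a toy model and not anything
javac does today: the class ModuleContextSketch, the Origin enum, the
method names, and the types foo.Helper and bar.Bar are all hypothetical,
and checking the current context before the class path is my assumption,
since the proposal does not say which wins.

```java
import java.util.Set;

/** Toy model of the proposed rule; all names here are hypothetical. */
public class ModuleContextSketch {

    /** Where a referenced type was found while compiling. */
    public enum Origin { CURRENT_CONTEXT, CLASS_PATH }

    // Types reachable from the current directory or the module path.
    private final Set<String> contextTypes;
    // Types reachable from the class path.
    private final Set<String> classPathTypes;

    public ModuleContextSketch(Set<String> contextTypes, Set<String> classPathTypes) {
        this.contextTypes = contextTypes;
        this.classPathTypes = classPathTypes;
    }

    /** Assumption: the current context is consulted before the class path. */
    public Origin originOf(String typeName) {
        if (contextTypes.contains(typeName)) {
            return Origin.CURRENT_CONTEXT;
        }
        if (classPathTypes.contains(typeName)) {
            return Origin.CLASS_PATH;
        }
        throw new IllegalArgumentException("type not found: " + typeName);
    }

    /** Module-private members are usable only within the same module. */
    public boolean mayUseModulePrivate(String targetType) {
        return originOf(targetType) == Origin.CURRENT_CONTEXT;
    }

    public static void main(String[] args) {
        // Models: javac -modulepath /path/foo.jar -classpath /path/bar.jar foo/Foo.java
        ModuleContextSketch ctx = new ModuleContextSketch(
                Set.of("foo.Foo", "foo.Helper"), // current context (same module)
                Set.of("bar.Bar"));              // class path (other modules)
        System.out.println(ctx.mayUseModulePrivate("foo.Helper"));
        System.out.println(ctx.mayUseModulePrivate("bar.Bar"));
    }
}
```

In this model, legacy code simply never asks the mayUseModulePrivate
question, so it does not matter which of the two sets its types land in.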
-> richard
>
> With module membership in source, the compiler can place generated
> classfiles in a directory corresponding to the module. No manual
> divvying up, whether you compile one module's sources at a time or
> many. Leave -d alone; have -m take a directory under which a directory
> for the current module is created, and put classfiles in there.
>
> Then:
>
> Put "module M;" in individual .java files or, if you prefer, set
> membership for a whole package by putting "module M @ 1.0;" in
> package-info.java.
>
> javac -classpath /Users/atreides
> -m /Users/harkonnen
> com/harkonnen/Foo.java
>
> javac -classpath /Users/harkonnen
> -m /Users/harkonnen
> com/harkonnen/FooImpl.java
>
> Like today, -classpath and -m (modular version of -d) are often the
> same. javac knows the module of FooImpl and can get the module of Foo
> when it reads Foo.class from -classpath, like today.
>
> Again, I have no objection to a -module option on the command line
> that sets module identity for command line and -sourcepath files. But
> it's an additional mechanism to source membership.
>
> Alex
>
> Hal Hildebrand wrote:
>> So Richard Hall suggested that I make a detailed proposal of what I'm
>> talking about so that hopefully it will make the issue more clear.
>>
>> In my experience of developing modular code, I never compile more
>> than the module in question. I do this by mapping Maven projects
>> onto the produced OSGi bundle, one for one. But this can also be
>> easily done with Ant, Make or any other build tool. Thus, I'm
>> assuming that the *only* source code that will be compiled is to be
>> of a particular module, regardless of whether the source being
>> compiled is of the full module or a subset of the module. The reason
>> is quite simple. If I compile the source of more than one module,
>> then I'm adding more work to my life by having to sort out the
>> resulting class files into their individual modules. Divvying up a
>> mess of class files into separate jars is something that we used to
>> do in our builds here at House Harkonnen, and that process actually
>> caused a great deal of pain, as the splitting up of the compiled code
>> is hard work - especially hard to automate in a large build. We spent
>> a lot of effort to reorganize the source code and compilation such
>> that we didn't have to perform this step - i.e. effectively
>> modularizing the code. Organizing your source and compilation by
>> module is considered best practice and besides the positive benefits,
>> doing so is known to prevent other nasty things like circularities in
>> your code.
>>
>> So, if we can agree that any source we are currently compiling is
>> from a single module, then the issue is how to treat the
>> dependencies. The dependencies of the compiled source fall into two
>> categories:
>> 1. additional classes that are in the same module
>> 2. the dependencies of the module (aka the "imports", in OSGi slang)
>>
>>
>> What I was proposing is that the java compiler be extended to know
>> about the additional distinction of the module in the class path.
>> For example here is my module:
>>
>> com.harkonnen.Foo
>> com.harkonnen.impl.FooImpl
>>
>> With dependencies on:
>>
>> org.arrakis.Spice
>>
>> So, the scenario is that we want to compile Foo and FooImpl
>> separately, and somehow make use of a module private method on Foo
>> from FooImpl. Foo has dependency on Spice, FooImpl has dependency on
>> Foo. The dependency Spice, being in a separate module, will have
>> been previously compiled and is rooted in the directory
>> /Users/atreides. To accomplish this, I posit the idea of the module
>> class path that the java compiler now knows about. Let's say that
>> the new flag is "-moduleClasspath". So, the compilation of Foo looks
>> like:
>>
>> javac -classpath /Users/atreides -d /Users/harkonnen com/harkonnen/Foo.java
>>
>> To compile FooImpl:
>>
>> javac -classpath /Users/atreides -d /Users/harkonnen -moduleClasspath
>> /Users/harkonnen com/harkonnen/FooImpl.java
>>
>> Because the Spice is obtained from outside the moduleClasspath, the
>> compiler knows that it is *not* in the current module being compiled.
>> Because FooImpl *is* on the moduleClasspath, it knows that it is in
>> the current module being compiled. Thus, FooImpl can use a module
>> private method on Foo, but cannot use any module private methods on
>> Spice.
>>
>> In this example, it's important to realize that the *only* source
>> code that will be compiled is source that is in the current module.
>> Thus, the question as to "what module any source code is in" has
>> exactly one answer: "the current module".
>> It is also important to note that the additional "module class path"
>> is needed only to handle two edge cases of compilation:
>>
>> 1. libraries in the OSGi bundle class path
>> 2. separate compilation of a module's source files
>>
>>
>> I believe that the probability of #1 is far higher than the
>> probability of #2, given my experience in creating modular code, as
>> well as what I've seen in large scale commercial development here at
>> House Harkonnen and other corporations. Consequently, if you're
>> doing what comes naturally to developers compiling modules - i.e.
>> compiling the entire source of the module - then you would not need
>> any additional class path at all. The compiler knows that any source
>> it is compiling is a module and any dependency it finds from the
>> class path is considered to be in another module (can be multiple
>> modules as it does not matter to the semantics).