Compiling without module identities

Stuart McCulloch mcculls at gmail.com
Wed Feb 4 04:20:53 EST 2009


2009/2/4 Hal Hildebrand <hal.hildebrand at gmail.com>

>
> On Feb 3, 2009, at 6:56 PM, Alex Buckley wrote:
>
>> I did understand.
>> You compile the Spice source ahead of time and ensure
>> the resulting classes are in /Users/atreides by the time you compile
>> Foo/FooImpl. ("divvy Spice classes out into atreides up front.")
>>
>> That was the easy part! The hard part was having to remember to set the
>> current context separately from the current classpath:
>>
>> (Richard, tweaked because compiling FooImpl is the interesting case)
>> javac -classpath /path/bar.jar -modulepath /path/foo.jar
>>  foo/FooImpl.java
>>
>
> Alex, I find it literally impossible to believe that this is the "hard
> part".  I'm not sure, but it seems almost like you have never used a build
> system such as Ant, Make, Maven or Ivy before, because "remembering where
> your compilation dependencies are located" and "configuring the compilation
> target to use them" is hardly considered to be the "hard part" about
> development.   In an IDE, this is literally the easiest thing in the world.
>
> If your argument is that it's "hard to remember where your dependencies
> are, therefore your proposed solution is unacceptable", I will just have to
> say that's the biggest whale tale I've heard in a long time.  What you're
> saying is that the everyday task of a Java developer is simply too complex
> for you to fathom.
>

As an observer I have to say putting the module information in the source
(or package-info) does make more sense to me. Relying on setting the
moduleClasspath sounds decidedly non-declarative and fragile.

For example, if one of the source files in a dependent module happened
to have a later timestamp than its class file then javac would decide to
recompile it, and this could mess things up - a similar situation actually
happened in real-life with Maven and took a lot of cycles to debug:


http://mail-archives.apache.org/mod_mbox/maven-users/200809.mbox/%3C48DC5EFE.8030301@Sun.COM%3E
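
To make the timestamp hazard concrete, here is a minimal sketch of the kind
of "is the source newer than the class file?" check involved. By default,
javac reads the newer of the two when it finds both a source file and a
class file for a type (the -Xprefer:newer behaviour). The class and method
names below (StaleCheck, sourceIsNewer, wouldRecompile) are hypothetical and
only approximate that rule; this is not javac's actual implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of a "prefer the newer file" rule.
public class StaleCheck {

    // Pure comparison: the source wins only if it is strictly newer.
    public static boolean sourceIsNewer(long sourceMillis, long classMillis) {
        return sourceMillis > classMillis;
    }

    // Decide whether a compiler following this rule would pick the source
    // (and so recompile it) instead of reading the existing class file.
    public static boolean wouldRecompile(Path source, Path classFile) {
        try {
            if (!Files.exists(source)) {
                return false; // nothing to recompile from
            }
            if (!Files.exists(classFile)) {
                return true;  // no class file yet, the source is all we have
            }
            long src = Files.getLastModifiedTime(source).toMillis();
            long cls = Files.getLastModifiedTime(classFile).toMillis();
            return sourceIsNewer(src, cls);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java StaleCheck <source.java> <file.class>");
            return;
        }
        Path source = Paths.get(args[0]);
        Path classFile = Paths.get(args[1]);
        System.out.println(wouldRecompile(source, classFile)
                ? "source is newer: it would be recompiled"
                : "class file is up to date: it would be read as-is");
    }
}
```

If module membership is inferred from which path a type was found on, a
touched source file in a dependency can silently move that type into the
compilation. Membership declared in the source would not change with the
timestamps.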

Regarding refactoring, can you use package-info to override the original
module setting in the source? That is, you take the original files, add your
customized package-info files for the relevant packages, and rebuild?


>> (Hal)
>> javac -classpath /Users/atreides -moduleClasspath /Users/harkonnen
>>  com/hakonnen/FooImpl.java
>>
>> The idea of module equality without identity is clever, but javac is
>> already burdened by interacting command line options. (No-one has mentioned
>> -sourcepath yet and ITS interaction with -classpath.) So I am naturally
>> sceptical of more options, especially those which add further meaning to
>> -classpath.
>>
>
> And yet you're willing to completely destroy any chance of module packaging
> refactoring by forever burning the module identity into the source file at
> the drop of a hat.
>
>> Plus, in the model where only equality matters, the error messages will be
>> pretty poor. If FooImpl does try to access a module-private type on the
>> classpath (and not on the moduleClasspath), then the compiler can't say
>> anything about accessing and accessed module. This is weird, and quite
>> unlike every other static access control failure.
>>
>
> I'm not sure where I see the poor quality of the message.  I assume the
> compiler knows the class and method attempting to be accessed.  The message
> that that is not in the current module being compiled seems perfectly
> adequate.  "The method Foo.bozo attempted to access a module private method
> Spice.blue() in another module".  Maybe I'm just smarter than I should be,
> but I think I most Java programmers
>
> To summarize, your argument seems to be thus:
>
> a) Managing dependencies is hard!
> b) javac is way too complex already
> c) The obvious error message is terribly inadequate.
>
> And because of this line of reasoning, you think that the only solution is
> to encode the identity of the module in the source.
>
>
>> Hal, you said:
>>
>>> Yes, that is correct.  What I was attempting to say was that I
>>> believe that any added "complexity" in having to distinguish between
>>> the two types of dependencies is the less common case.  I was
>>> attempting to show that it was easy to handle the edge case Alex
>>> presented where the compilation of the module source was not achieved
>>> with a single compilation.  In my experience, multiple source
>>> directories which combine into a single logical module are extremely
>>> uncommon.  Incremental compilation, however, is something handled
>>> automagically by the IDE - i.e. we don't do that by hand any more.
>>> Perhaps Alex's experience is different.  It doesn't change the model
>>> in any way, however.
>>>
>>
>> I am proposing a model where multiple source directories (really the
>> packages therein) naturally combine into a single logical module. Types in
>> those packages will naturally be compiled separately, and will commonly need
>> access to module-private artifacts elsewhere in the module. It is not an
>> edge case. It is the common case, so javac should not require complicated
>> options to achieve it.
>>
>> Putting module membership in source helps javac achieve this common case,
>> and permits multiple modules' sources to be compiled in one invocation, and
>> - most important of all - makes it clear to the programmer what the boundary
>> of module-private accessibility is. Millions of people will be learning
>> about module-privacy and immediately asking "so, what's a module?". The
>> answer should be more explicit than "it depends on how you invoke javac /
>> configure your IDE". Sorry if you disagree.
>>
>> Alex
>>
>> P.S. You would need true module identities for accessing and accessed
>> types in order to define hierarchical module privacy. (Types in module M.N
>> can see module-private types in module M.) I'm not suggesting 294 defines
>> that, but it shows that focusing on module equality because it works for
>> module privacy today could backfire.
>>
>> Hal Hildebrand wrote:
>>
>>> On Feb 3, 2009, at 4:24 PM, Richard S. Hall wrote:
>>>
>>>> Alex Buckley wrote:
>>>>
>>>>> So, with org.arrakis.* in /Users/atreides :
>>>>> javac -classpath /Users/atreides -d /Users/harkonnen
>>>>> com/hakonnen/Foo.java
>>>>> javac -classpath /Users/atreides -d /Users/harkonnen -moduleClasspath
>>>>> /Users/harkonnen com/hakonnen/FooImpl.java
>>>>> OK, there's a foreign directory at -classpath which holds Other
>>>>> Modules. This is not aligned with the current meaning of
>>>>> -classpath, which holds both source and classfiles from logically
>>>>> Any Module. Then there's -d, which unfortunately throws all
>>>>> classfiles into one place, so you have to divvy Spice classes out
>>>>> into atreides up front.
>>>>>
>>> There seems to be some confusion on this point, Alex.  The Spice
>>> classes (i.e. dependencies that are in another module) are
>>> *previously* compiled - i.e. they are accessed by the compiler as
>>> class files, not source files.  That was my first assumption in
>>> stating that *all* source being compiled is considered to be in the
>>> "current" module.  And the way I understand javac working is that any
>>> dependency which is a class file is *not* thrown into the output
>>> directory.  Consequently, the only thing in the output directory
>>> would be the compilation of the module source.
>>> There is no compilation of *multiple* module source files in the
>>> model I presented.
>>> In actuality, there's no need to use -d at all.  Consequently, you
>>> can remove the directive from my example, reducing the perceived
>>> complexity and nothing changes.
>>>
>>>>> Then there's -moduleClasspath (really -localModulepath) which you
>>>>> only need to set if the code in FooImpl.java refers to
>>>>> module-private types in any type previously compiled and now
>>>>> available on -localModulepath (regardless of whether it's also
>>>>> found in -d) and not found on -classpath.
>>>>> You have to get all those paths exactly right on every
>>>>> compilation, and still are limited to compiling one module at a
>>>>> time. (And the -classpath and -localModulepath directories must
>>>>> not be aliased on the filesystem.)
>>>>> This is supposed to be simple?
>>>>> I reject the idea that #2 - separate compilation of a module's
>>>>> source files - is uncommon. Surely you compile individual sources
>>>>> while developing a bundle? Perhaps I am missing something,
>>>>> because I can hardly believe someone would argue against separate
>>>>> compilation in Java.
>>>>>
>>>> I don't think Hal is arguing against separate compilation,
>>>> especially since his example demonstrated separate compilation.
>>>>
>>> Yes, that is correct.  What I was attempting to say was that I
>>> believe that any added "complexity" in having to distinguish between
>>> the two types of dependencies is the less common case.  I was
>>> attempting to show that it was easy to handle the edge case Alex
>>> presented where the compilation of the module source was not achieved
>>> with a single compilation.  In my experience, multiple source
>>> directories which combine into a single logical module are extremely
>>> uncommon. Incremental compilation, however, is something handled
>>> automagically by the IDE - i.e. we don't do that by hand any more.
>>> Perhaps Alex's experience is different.  It doesn't change the model
>>> in any way, however.
>>>
>>>> If I understand Hal's proposal correctly, perhaps I can state it a
>>>> different way.
>>>> I will introduce a "current context" concept for javac, which means
>>>> something along the lines of "the directly relevant files for the
>>>> .java file(s) currently being compiled."
>>>> In standard javac (e.g., "javac foo/Foo.java") the "current
>>>> context" is to search for needed types relative to the current
>>>> directory, which also happens to be the default for class path.
>>>> If you set the class path, then the current context becomes
>>>> whatever the class path value is. This is because current javac
>>>> does not distinguish between the "current context" and the class
>>>> path...it considers them to be the same thing.
>>>>
>>> Exactly.
>>>
>>>> So, the proposal is to make them not the same thing. We can still assume
>>>> the current context for javac is relative to the current directory, but
>>>> anything found on the current context is considered
>>>> to be in the same module. On the other hand, anything found on the
>>>> class path is considered to be from a different module.  So,
>>>> something like:
>>>> javac foo/Foo.java
>>>> Would have no entries in its class path, but its current context
>>>> would be types relative to the root of the current directory and
>>>> they would be considered to be part of the same module. If you did
>>>> something like:
>>>> javac -modulepath /path/foo.jar foo/Foo.java
>>>> Then the class path would still have no entries, but the module
>>>> path would set the current context to have one entry where all
>>>> types reachable from it would be considered to be in the same
>>>> module as foo.Foo. If you did something like:
>>>> javac -modulepath /path/foo.jar -classpath /path/bar.jar
>>>> foo/Foo.java
>>>> Then foo.jar types and foo.Foo would be the current context and
>>>> thus in the same module, but types in bar.jar would be in a
>>>> different module.
>>>> For legacy code, this would have no impact, since by definition all
>>>> legacy code is either in no module or a "default" module, but it doesn't
>>>> make a difference which view you have since there would be
>>>> no module visibility associated with legacy code, only the standard
>>>> visibility modifiers which would still apply the same way whether
>>>> the code is in the same module or not.
>>>>
>>> Yes, this is precisely what I was trying to get across.
>>>
>>>>> With module membership in source, the compiler can place
>>>>> generated classfiles in a directory corresponding to the module.
>>>>> No manual divvying up, whether you compile one module's sources
>>>>> at a time or many. Leave -d alone; have -m take a directory under
>>>>> which a directory for the current module is created, and put
>>>>> classfiles in there.
>>>>> Then:
>>>>> Put "module M;" in individual .java files or, if you prefer, set
>>>>> membership for a whole package by putting "module M @ 1.0;" in
>>>>> package-info.java.
>>>>> javac -classpath /Users/atreides -m /Users/harkonnen
>>>>> com/hakonnen/Foo.java
>>>>> javac -classpath /Users/harkonnen -m /Users/harkonnen
>>>>> com/hakonnen/FooImpl.java
>>>>> Like today, -classpath and -m (modular version of -d) are often
>>>>> the same. javac knows the module of FooImpl and can get the
>>>>> module of Foo when it reads Foo.class from -classpath, like
>>>>> today.
>>>>> Again, I have no objection to a -module option on the command
>>>>> line that sets module identity for command line and -sourcepath
>>>>> files. But it's an additional mechanism to source membership.
>>>>> Alex
>>>>> Hal Hildebrand wrote:
>>>>>
>>>>>> So Richard Hall suggested that I make a detailed proposal of
>>>>>> what I'm talking about so that hopefully it will make the issue
>>>>>> more clear.
>>>>>> In my experience of developing modular code, I never compile
>>>>>> more than the module in question.  I do this by mapping Maven
>>>>>> projects onto the produced OSGi bundle, one for one.  But this
>>>>>> can also be easily done with Ant, Make or any other build tool.
>>>>>> Thus, I'm assuming that the *only* source code that will be
>>>>>> compiled is to be of a particular module, regardless of whether
>>>>>> the source being compiled is of the full module or a subset of
>>>>>> the module.  The reason is quite simple.  If I compile the
>>>>>> source of more than one module, then I'm adding more work to my
>>>>>> life by having to sort out the resulting class files into their
>>>>>> individual modules.  Divvying up a mess of class files into
>>>>>> separate jars is something that we used to do in our builds
>>>>>> here at House Harkonnen, and that process actually caused a
>>>>>> great deal of pain, as the splitting up of the compiled code is
>>>>>> hard work - especially hard to automate in a large build. We
>>>>>> spent a lot of effort to reorganize the source code and compilation
>>>>>> such that we didn't have to perform this step -
>>>>>> i.e. effectively modularizing the code.  Organizing your source
>>>>>> and compilation by module is considered best practice and
>>>>>> besides the positive benefits, doing so is known to prevent
>>>>>> other nasty things like circularities in your code.
>>>>>> So, if we can agree that any source we are currently compiling
>>>>>> is from a single module, then the issue is how to treat the
>>>>>> dependencies.  The dependencies of the compiled source fall
>>>>>> into two categories:
>>>>>> 1. additional classes that are in the same module
>>>>>> 2. the dependencies of the module (aka the "imports", in
>>>>>> OSGi slang)
>>>>>> What I was proposing is that the java compiler be extended to
>>>>>> know about the additional distinction of the module in the
>>>>>> class path. For example here is my module:
>>>>>> com.harkonnen.Foo
>>>>>> com.harkonnen.impl.FooImpl
>>>>>> With dependencies on:
>>>>>> org.arrakis.Spice
>>>>>> So, the scenario is that we want to compile Foo and FooImpl
>>>>>> separately, and somehow make use of a module private method on
>>>>>> Foo from FooImpl.  Foo has dependency on Spice, FooImpl has
>>>>>> dependency on Foo.  The dependency Spice, being in a separate
>>>>>> module, will have been previously compiled and is rooted in the
>>>>>> directory /Users/atreides.  To accomplish this, I posit the
>>>>>> idea of the module class path that the java compiler now knows
>>>>>> about.  Let's say that the new flag is "-moduleClasspath".  So,
>>>>>> the compilation of Foo looks like:
>>>>>> javac -classpath /Users/atreides -d /Users/harkonnen
>>>>>> com/hakonnen/Foo.java
>>>>>> To compile FooImpl:
>>>>>> javac -classpath /Users/atreides -d /Users/harkonnen
>>>>>> -moduleClasspath /Users/harkonnen com/hakonnen/FooImpl.java
>>>>>> Because the Spice is obtained from outside the moduleClasspath,
>>>>>> the compiler knows that it is *not* in the current module being
>>>>>> compiled.  Because FooImpl *is* on the moduleClasspath, it
>>>>>> knows that it is in the current module being compiled.  Thus,
>>>>>> FooImpl can use a module private method on Foo, but cannot use
>>>>>> any module private methods on Spice.
>>>>>> In this example, it's important to realize that the *only*
>>>>>> source code that will be compiled is source that is in the
>>>>>> current module. Thus, the question as to "what module any
>>>>>> source code is in" has exactly one answer: "the current
>>>>>> module". It is also important to note that the additional
>>>>>> "module class path" is needed only to handle two edge cases of
>>>>>> compilation:
>>>>>> 1. libraries in the OSGi bundle class path
>>>>>> 2. separate compilation of a module's source files
>>>>>> I believe that the probability of #1 is far higher than the
>>>>>> probability of #2, given my experience in creating modular
>>>>>> code, as well as what I've seen in large scale commercial
>>>>>> development here at House Harkonnen and other corporations.
>>>>>> Consequently, if you're doing what comes naturally to
>>>>>> developers compiling modules - i.e. compiling the entire source
>>>>>> of the module - then you would not need any additional class
>>>>>> path at all.  The compiler knows that any source it is
>>>>>> compiling is a module and any dependency it finds from the
>>>>>> class path is considered to be in another module (can be multiple
>>>>>> modules as it does not matter to the semantics).
>>>>>>
>>>>>
>


-- 
Cheers, Stuart