Compiling without module identities

Richard S. Hall heavy at ungoverned.org
Tue Feb 3 23:05:20 EST 2009


Alex Buckley wrote:
> I did understand. You compile the Spice source ahead of time and ensure
> the resulting classes are in /Users/atreides by the time you compile
> Foo/FooImpl. ("divvy Spice classes out into atreides up front.")
>
> That was the easy part! The hard part was having to remember to set the
> current context separately from the current classpath:

Actually, I think it sounds quite simple. Anything defined to be in 
the current module is listed on the module class path; everything 
external to the module is on the class path. Sounds straightforward to me.
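
To make the rule concrete, here is a minimal Java sketch of how a
compiler might classify a referenced type under this proposal. It is
only an illustration: the names ContextRule, Origin, classify and
mayUseModulePrivate are made up for this mail, the two paths are
modelled as plain sets of type names rather than real directories or
jars, and the "module path wins" lookup order is my assumption. None
of this is existing javac behaviour.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the proposed rule; not part of javac.
final class ContextRule {
    enum Origin { CURRENT_MODULE, OTHER_MODULE, NOT_FOUND }

    private final Set<String> modulePath; // types found via the module class path
    private final Set<String> classPath;  // types found via the ordinary class path

    ContextRule(Set<String> modulePath, Set<String> classPath) {
        this.modulePath = modulePath;
        this.classPath = classPath;
    }

    // Assumption: the module path is consulted first, so a type present
    // on both paths is treated as belonging to the current module.
    Origin classify(String typeName) {
        if (modulePath.contains(typeName)) return Origin.CURRENT_MODULE;
        if (classPath.contains(typeName)) return Origin.OTHER_MODULE;
        return Origin.NOT_FOUND;
    }

    // The source being compiled is always in the current module, so
    // only the origin of the referenced type matters.
    boolean mayUseModulePrivate(String typeName) {
        return classify(typeName) == Origin.CURRENT_MODULE;
    }

    public static void main(String[] args) {
        ContextRule rule = new ContextRule(
            new HashSet<>(Arrays.asList("com.harkonnen.Foo")),
            new HashSet<>(Arrays.asList("org.arrakis.Spice")));
        System.out.println(rule.classify("com.harkonnen.Foo"));
        System.out.println(rule.mayUseModulePrivate("org.arrakis.Spice"));
    }
}
```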

> (Richard, tweaked because compiling FooImpl is the interesting case)
> javac -classpath /path/bar.jar -modulepath /path/foo.jar
>   foo/FooImpl.java
>
> (Hal)
> javac -classpath /Users/atreides -moduleClasspath /Users/harkonnen
>   com/harkonnen/FooImpl.java
>
> The idea of module equality without identity is clever, but javac is 
> already burdened by interacting command line options. (No-one has 
> mentioned -sourcepath yet and ITS interaction with -classpath.) So I 
> am naturally sceptical of more options, especially those which add 
> further meaning to -classpath.

Well, that is why we are discussing it, to see how everything interacts. :-)

> Plus, in the model where only equality matters, the error messages 
> will be pretty poor. If FooImpl does try to access a module-private 
> type on the classpath (and not on the moduleClasspath), then the 
> compiler can't say anything about accessing and accessed module. This 
> is weird, and quite unlike every other static access control failure.

Probably not the biggest failing, but a valid point.
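
For what it is worth, the difference in diagnostics might look roughly
like the sketch below. The message wording and the class name
AccessDiagnostics are invented for illustration (this is not javac
output), and the module names "atreides" and "harkonnen" are only
placeholders borrowed from the directory names in the examples.

```java
// Hypothetical diagnostics; the wording is illustrative, not javac output.
final class AccessDiagnostics {
    // Equality-only model: the compiler knows "same module or not"
    // and nothing more, so it cannot name either module.
    static String equalityOnly(String accessedType) {
        return accessedType
            + " is module-private and is not in the current module";
    }

    // Identity model: both module names are known and can be reported.
    static String withIdentity(String accessedType,
                               String accessedModule,
                               String accessingModule) {
        return accessedType + " is module-private in module "
            + accessedModule + " and cannot be accessed from module "
            + accessingModule;
    }

    public static void main(String[] args) {
        System.out.println(equalityOnly("org.arrakis.Spice"));
        System.out.println(
            withIdentity("org.arrakis.Spice", "atreides", "harkonnen"));
    }
}
```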

> Hal, you said:
>> Yes, that is correct.  What I was attempting to say was that I
>> believe that any added "complexity" in having to distinguish between
>> the two types of dependencies is the less common case.  I was
>> attempting to show that it was easy to handle the edge case Alex
>> presented where the compilation of the module source was not achieved
>> with a single compilation.  In my experience, multiple source
>> directories which combine into a single logical module are extremely
>> uncommon.  Incremental compilation, however, is something handled
>> automagically by the IDE - i.e. we don't do that by hand any more.
>> Perhaps Alex's experience is different.  It doesn't change the model
>> in any way, however.
>
> I am proposing a model where multiple source directories (really the 
> packages therein) naturally combine into a single logical module. 
> Types in those packages will naturally be compiled separately, and 
> will commonly need access to module-private artifacts elsewhere in the 
> module. It is not an edge case. It is the common case, so javac should 
> not require complicated options to achieve it.

Do you really think it is the common case that a project has multiple 
"src" directories? I understand it happens, perhaps even semi-regularly, 
but I find it hard to believe it is the common case.

> Putting module membership in source helps javac achieve this common 
> case, and permits multiple modules' sources to be compiled in one 
> invocation, and - most important of all - makes it clear to the 
> programmer what the boundary of module-private accessibility is. 
> Millions of people will be learning about module-privacy and 
> immediately asking "so, what's a module?". The answer should be more 
> explicit than "it depends on how you invoke javac / configure your 
> IDE". Sorry if you disagree.

I hope we keep this level of concern for simplicity of explanation when 
we move on to other topics. :-)

> P.S. You would need true module identities for accessing and accessed 
> types in order to define hierarchical module privacy. (Types in module 
> M.N can see module-private types in module M.) I'm not suggesting 294 
> defines that, but it shows that focusing on module equality because it 
> works for module privacy today could backfire.

Possibly.
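
To see why equality alone would not be enough there, consider a rough
sketch of such a check. It is purely hypothetical: JSR 294 defines no
hierarchical privacy, the class name HierarchicalPrivacy is made up,
and I am assuming dot-separated module names where "M.N" is nested
inside "M" and the parent gets no view into the nested module.

```java
// Hypothetical check for hierarchical module privacy; JSR 294 does not
// define this. It needs both module names, not just "same or different".
final class HierarchicalPrivacy {
    // Assumption: module names are dot-separated and "M.N" is nested in "M".
    static boolean canSeeModulePrivate(String accessingModule,
                                       String accessedModule) {
        return accessingModule.equals(accessedModule)
            || accessingModule.startsWith(accessedModule + ".");
    }

    public static void main(String[] args) {
        // Nested module looking at its parent.
        System.out.println(canSeeModulePrivate("M.N", "M"));
        // Parent looking at the nested module.
        System.out.println(canSeeModulePrivate("M", "M.N"));
        // Unrelated module whose name merely shares a prefix.
        System.out.println(canSeeModulePrivate("MX", "M"));
    }
}
```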

-> richard

>
> Hal Hildebrand wrote:
>>
>> On Feb 3, 2009, at 4:24 PM, Richard S. Hall wrote:
>>>
>>>
>>> Alex Buckley wrote:
>>>> So, with org.arrakis.* in /Users/atreides :
>>>>
>>>> javac -classpath /Users/atreides -d /Users/harkonnen 
>>>> com/harkonnen/Foo.java
>>>>
>>>> javac -classpath /Users/atreides -d /Users/harkonnen 
>>>> -moduleClasspath /Users/harkonnen com/harkonnen/FooImpl.java
>>>>
>>>> OK, there's a foreign directory at -classpath which holds Other 
>>>> Modules. This is not aligned with the current meaning of
>>>> -classpath, which holds both source and classfiles from logically
>>>> Any Module. Then there's -d, which unfortunately throws all
>>>> classfiles into one place, so you have to divvy Spice classes out
>>>> into atreides up front.
>>
>> There seems to be some confusion on this point, Alex.  The Spice
>> classes (i.e. dependencies that are in another module) are
>> *previously* compiled - i.e. they are accessed by the compiler as
>> class files, not source files.  That was my first assumption in
>> stating that *all* source being compiled is considered to be in the
>> "current" module.  And the way I understand javac working is that any
>> dependency which is a class file is *not* thrown into the output
>> directory.  Consequently, the only thing in the output directory
>> would be the compilation of the module source.
>>
>> There is no compilation of *multiple* module source files in the
>> model I presented.
>>
>> In actuality, there's no need to use -d at all.  Consequently, you
>> can remove the directive from my example, reducing the perceived
>> complexity and nothing changes.
>>
>>>> Then there's -moduleClasspath (really -localModulepath) which you
>>>>  only need to set if the code in FooImpl.java refers to
>>>> module-private types in any type previously compiled and now
>>>> available on -localModulepath (regardless of whether it's also
>>>> found in -d) and not found on -classpath.
>>>>
>>>> You have to get all those paths exactly right on every
>>>> compilation, and still are limited to compiling one module at a
>>>> time. (And the -classpath and -localModulepath directories must
>>>> not be aliased on the filesystem.)
>>>>
>>>> This is supposed to be simple?
>>
>>>> I reject the idea that #2 - separate compilation of a module's
>>>> source files - is uncommon. Surely you compile individual sources
>>>> while developing a bundle? Perhaps I am missing something,
>>>> because I can hardly believe someone would argue against separate
>>>> compilation in Java.
>>>
>>> I don't think Hal is arguing against separate compilation,
>>> especially since his example demonstrated separate compilation.
>>
>> Yes, that is correct.  What I was attempting to say was that I
>> believe that any added "complexity" in having to distinguish between
>> the two types of dependencies is the less common case.  I was
>> attempting to show that it was easy to handle the edge case Alex
>> presented where the compilation of the module source was not achieved
>> with a single compilation.  In my experience, multiple source
>> directories which combine into a single logical module are extremely
>> uncommon. Incremental compilation, however, is something handled
>> automagically by the IDE - i.e. we don't do that by hand any more.
>> Perhaps Alex's experience is different.  It doesn't change the model
>> in any way, however.
>>
>>> If I understand Hal's proposal correctly, perhaps I can state it a
>>>  different way.
>>>
>>> I will introduce a "current context" concept for javac, which means
>>>  something along the lines of "the directly relevant files for the
>>>  .java file(s) currently being compiled."
>>>
>>> In standard javac (e.g., "javac foo/Foo.java") the "current
>>> context" is to search for needed types relative to the current
>>> directory, which also happens to be the default for the class path.
>>> If you set the class path, then the current context becomes
>>> whatever the class path value is. This is because current javac
>>> does not distinguish between the "current context" and the class
>>> path...it considers them to be the same thing.
>>
>> Exactly.
>>
>>> So, the proposal is to make them not the same thing. We can still 
>>> assume the current context for javac is relative to the current 
>>> directory, but anything found on the current context is considered
>>> to be in the same module. On the other hand, anything found on the
>>> class path is considered to be from a different module.  So,
>>> something like:
>>>
>>> javac foo/Foo.java
>>>
>>> Would have no entries in its class path, but its current context
>>> would be types relative to the root of the current directory and
>>> they would be considered to be part of the same module. If you did
>>> something like:
>>>
>>> javac -modulepath /path/foo.jar foo/Foo.java
>>>
>>> Then the class path would still have no entries, but the module
>>> path would set the current context to have one entry where all
>>> types reachable from it would be considered to be in the same
>>> module as foo.Foo. If you did something like:
>>>
>>> javac -modulepath /path/foo.jar -classpath /path/bar.jar
>>> foo/Foo.java
>>>
>>> Then foo.jar types and foo.Foo would be the current context and
>>> thus in the same module, but types in bar.jar would be in a
>>> different module.
>>>
>>> For legacy code, this would have no impact, since by definition all
>>>  legacy code is either in no module or a "default" module, but it 
>>> doesn't make a difference which view you have since there would be
>>> no module visibility associated with legacy code, only the standard
>>>  visibility modifiers which would still apply the same way whether
>>> the code is in the same module or not.
>>
>> Yes, this is precisely what I was trying to get across.
>>
>>>> With module membership in source, the compiler can place
>>>> generated classfiles in a directory corresponding to the module.
>>>> No manual divvying up, whether you compile one module's sources
>>>> at a time or many. Leave -d alone; have -m take a directory under
>>>> which a directory for the current module is created, and put
>>>> classfiles in there.
>>>>
>>>> Then:
>>>>
>>>> Put "module M;" in individual .java files or, if you prefer, set
>>>>  membership for a whole package by putting "module M @ 1.0;" in 
>>>> package-info.java.
>>>>
>>>> javac -classpath /Users/atreides -m /Users/harkonnen 
>>>> com/harkonnen/Foo.java
>>>>
>>>> javac -classpath /Users/harkonnen -m /Users/harkonnen 
>>>> com/harkonnen/FooImpl.java
>>>>
>>>> Like today, -classpath and -m (modular version of -d) are often
>>>> the same. javac knows the module of FooImpl and can get the
>>>> module of Foo when it reads Foo.class from -classpath, like
>>>> today.
>>>>
>>>> Again, I have no objection to a -module option on the command
>>>> line that sets module identity for command line and -sourcepath
>>>> files. But it's an additional mechanism to source membership.
>>>>
>>>> Alex
>>>>
>>>> Hal Hildebrand wrote:
>>>>> So Richard Hall suggested that I make a detailed proposal of
>>>>> what I'm talking about so that hopefully it will make the issue
>>>>> more clear.
>>>>>
>>>>> In my experience of developing modular code, I never compile
>>>>> more than the module in question.  I do this by mapping Maven
>>>>> projects onto the produced OSGi bundle, one for one.  But this
>>>>> can also be easily done with Ant, Make or any other build tool.
>>>>> Thus, I'm assuming that the *only* source code that will be
>>>>> compiled is to be of a particular module, regardless of whether
>>>>> the source being compiled is of the full module or a subset of
>>>>> the module.  The reason is quite simple.  If I compile the
>>>>> source of more than one module, then I'm adding more work to my
>>>>> life by having to sort out the resulting class files into their
>>>>> individual modules.  Divvying up a mess of class files into
>>>>> separate jars is something that we used to do in our builds
>>>>> here at House Harkonnen, and that process actually caused a
>>>>> great deal of pain, as the splitting up of the compiled code is
>>>>> hard work - especially hard to automate in a large build. We
>>>>> spent a lot of effort to reorganize the source code and 
>>>>> compilation such that we didn't have to perform this step -
>>>>> i.e. effectively modularizing the code.  Organizing your source
>>>>> and compilation by module is considered best practice and
>>>>> besides the positive benefits, doing so is known to prevent
>>>>> other nasty things like circularities in your code.
>>>>>
>>>>> So, if we can agree that any source we are currently compiling
>>>>> is from a single module, then the issue is how to treat the 
>>>>> dependencies.  The dependencies of the compiled source fall
>>>>> into two categories:
>>>>>
>>>>> 1. additional classes that are in the same module
>>>>> 2. the dependencies of the module (aka the "imports", in
>>>>>    OSGi slang)
>>>>>
>>>>>
>>>>> What I was proposing is that the java compiler be extended to
>>>>> know about the additional distinction of the module in the
>>>>> class path. For example here is my module:
>>>>>
>>>>> com.harkonnen.Foo
>>>>> com.harkonnen.impl.FooImpl
>>>>>
>>>>> With dependencies on:
>>>>>
>>>>> org.arrakis.Spice
>>>>>
>>>>> So, the scenario is that we want to compile Foo and FooImpl 
>>>>> separately, and somehow make use of a module private method on
>>>>> Foo from FooImpl.  Foo has dependency on Spice, FooImpl has
>>>>> dependency on Foo.  The dependency Spice, being in a separate
>>>>> module, will have been previously compiled and is rooted in the
>>>>> directory /Users/atreides.  To accomplish this, I posit the
>>>>> idea of the module class path that the java compiler now knows
>>>>> about.  Let's say that the new flag is "-moduleClasspath".  So,
>>>>> the compilation of Foo looks like:
>>>>>
>>>>> javac -classpath /Users/atreides -d /Users/harkonnen
>>>>> com/harkonnen/Foo.java
>>>>>
>>>>> To compile FooImpl:
>>>>>
>>>>> javac -classpath /Users/atreides -d /Users/harkonnen
>>>>> -moduleClasspath /Users/harkonnen com/harkonnen/FooImpl.java
>>>>>
>>>>> Because Spice is obtained from outside the moduleClasspath,
>>>>> the compiler knows that it is *not* in the current module being
>>>>> compiled.  Because Foo *is* on the moduleClasspath, the compiler
>>>>> knows that it is in the current module being compiled.  Thus,
>>>>> FooImpl can use a module private method on Foo, but cannot use
>>>>> any module private methods on Spice.
>>>>>
>>>>> In this example, it's important to realize that the *only*
>>>>> source code that will be compiled is source that is in the
>>>>> current module. Thus, the question as to "what module any
>>>>> source code is in" has exactly one answer: "the current
>>>>> module". It is also important to note that the additional
>>>>> "module class path" is needed only to handle two edge cases of
>>>>> compilation:
>>>>>
>>>>> 1. libraries in the OSGi bundle class path
>>>>> 2. separate compilation of a module's source files
>>>>>
>>>>>
>>>>> I believe that the probability of #1 is far higher than the 
>>>>> probability of #2, given my experience in creating modular
>>>>> code, as well as what I've seen in large scale commercial
>>>>> development here at House Harkonnen and other corporations.
>>>>> Consequently, if you're doing what comes naturally to
>>>>> developers compiling modules - i.e. compiling the entire source
>>>>> of the module - then you would not need any additional class
>>>>> path at all.  The compiler knows that any source it is
>>>>> compiling belongs to the current module, and any dependency it
>>>>> finds on the class path is considered to be in another module
>>>>> (possibly several modules; the distinction does not matter to
>>>>> the semantics).
>>

