
I was trying to reduce the time it takes for an Ant build to complete. Most of the build time is taken by the GWT compiler.

The following Ant script is modeled on the scripts found in the official GWT examples. Notice that two GWT modules are passed to the compiler. When you run this script, the GWT compiler compiles the two modules sequentially.

<target name="gwtc" description="GWT compile to JavaScript">
    <java failonerror="true" fork="true" classname="com.google.gwt.dev.Compiler">
        ........
        ........

        <arg value="com.af.gwtmodules.dashboard.Dashboard" />
        <arg value="com.af.gwtmodules.administration.Administration" />
        <arg line=" -localWorkers 16" />
    </java>
</target>

I changed the target to run two compile tasks in parallel, passing only one GWT module to the compiler in each task.

<target name="gwtc" description="GWT compile to JavaScript">
    <parallel threadsperprocessor="16">
        <java failonerror="true" fork="true" classname="com.google.gwt.dev.Compiler">
            ........
            ........

            <arg value="com.af.gwtmodules.dashboard.Dashboard" />
            <arg line=" -localWorkers 16" />
        </java>

        <java failonerror="true" fork="true" classname="com.google.gwt.dev.Compiler">
            ........
            ........

            <arg value="com.af.gwtmodules.administration.Administration" />
            <arg line=" -localWorkers 16" />
        </java>
    </parallel>
</target>

This does run faster, as expected. However, I wonder whether the GWT compiler can do a better job of code optimization if it is given all modules at once instead of each module separately. For example, the two modules share a lot of common code. If the compiler could see the entire code base at once, it might find more redundant code. In theory, it could create a single JS artifact for the common code and separate JS artifacts for the code that is not common. This would reduce download time for a user who accesses both modules, because the common JS artifact would be downloaded only once.

As far as I understand, GWT modules are independent, so there would be no cross-module optimizations. But the fact that the GWT compiler does not parallelize across modules internally makes me think that there could be some cross-module optimizations, or other ramifications, because of which the Google engineers decided against parallelizing it beyond a certain limit.

I would like to know whether parallelizing the compile the way I have done has any effect on the quality of the generated code.

Dojo

1 Answer


If your CPU runs at 100% or you use all available memory, it does not matter how many tasks you run in parallel. In fact, pushing more tasks in parallel may slow the build down rather than speed it up.

You already set localWorkers to 16. That's a lot of parallel threads. By running two tasks you simply double the number of threads. If you get any performance improvement from increasing this number, go for it, although your results look surprising (either your app is very small or your computer is a monster).

There are no optimization benefits from compiling modules sequentially vs. in parallel, as far as I know. You can always verify this by looking at the size of the compiled code.
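One way to do that size check from Ant itself is sketched below. This is only an illustration: the project name, target name, property names, and the two output directories are hypothetical, and it assumes the sequential and parallel builds were each written to their own directory. It uses Ant's built-in `<length>` task in `mode="all"` to sum the sizes of all files under each directory.

```xml
<project name="compare-gwt-output" default="compare-size">
    <!-- Hypothetical locations: one directory per build variant -->
    <property name="out.sequential" location="build/war-sequential" />
    <property name="out.parallel" location="build/war-parallel" />

    <target name="compare-size" description="Print total output size of both builds">
        <!-- mode="all" sums the lengths of every file in the fileset -->
        <length property="size.sequential" mode="all">
            <fileset dir="${out.sequential}" />
        </length>
        <length property="size.parallel" mode="all">
            <fileset dir="${out.parallel}" />
        </length>

        <echo message="Sequential build: ${size.sequential} bytes" />
        <echo message="Parallel build:   ${size.parallel} bytes" />
    </target>
</project>
```

If the two totals match, that is at least consistent with the compiler producing the same output either way; it does not prove the files are identical byte for byte.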

You may find this post interesting:

GWT Compilation Performance

Andrei Volgin
  • The CPU does not run at 100% in the first case. The task is CPU intensive, but it does not use all CPU threads on the machine. It maxes out only a few cores/threads while the others remain idle. That's why running more tasks in parallel has reduced the compile time in the second case. Thanks for the link, and yes, I should probably just compare the CRC of the output files in each case. – Dojo Jun 12 '14 at 17:07
  • I don't get any improvement beyond 8 localWorkers, and I have a very powerful computer with 32GB of RAM. – Andrei Volgin Jun 12 '14 at 17:24
  • The improvement is not because of the local workers defined in the compiler argument; it's because Ant runs two compiler tasks in parallel. This is on a dual quad-core Xeon machine with 16 CPU threads. – Dojo Jun 13 '14 at 04:41
  • You say that 32 threads in two compiler tasks run significantly faster than 32 threads would run in a single compiler task. I guess it's possible, for example, if you don't provide enough memory to a single compiler task. – Andrei Volgin Jun 13 '14 at 05:07
  • I know what you are talking about: the context-switch overhead. Don't take these numbers seriously. I just copy-pasted the same fragment of code twice when making it parallel. I was concerned about not utilizing 100% of all cores, and not about the context-switching overhead until that problem was fixed. Now that the second task is added, it hits 100% on all cores, so I might change the number of workers to an optimal value to reduce unnecessary context switching. – Dojo Jun 13 '14 at 06:42
  • Yes, because the compiler compiles only one module at a time. There is some parallelism when compiling a single module too, but that utilizes only a few threads regardless of how many you configure. Not all jobs can be broken down into parallel sub-jobs, so the compiler is able to use only a few of the available threads. Running two compiler tasks at once helps utilize the other cores. – Dojo Jun 13 '14 at 06:51
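
The worker tuning mentioned in the last comments could be sketched roughly as follows. This is a fragment, not a complete build file: the property name `gwt.workers.per.task` is hypothetical, and the value 8 is only a starting point (16 hardware threads split across two tasks, and also the point beyond which one commenter saw no further gain). The classpath and JVM arguments elided in the question are left out here as well.

```xml
<!-- Hypothetical property: worker threads per compiler process -->
<property name="gwt.workers.per.task" value="8" />

<target name="gwtc" description="GWT compile to JavaScript">
    <parallel>
        <java failonerror="true" fork="true" classname="com.google.gwt.dev.Compiler">
            <!-- classpath and JVM arguments as in the original target -->
            <arg value="-localWorkers" />
            <arg value="${gwt.workers.per.task}" />
            <arg value="com.af.gwtmodules.dashboard.Dashboard" />
        </java>

        <java failonerror="true" fork="true" classname="com.google.gwt.dev.Compiler">
            <!-- classpath and JVM arguments as in the original target -->
            <arg value="-localWorkers" />
            <arg value="${gwt.workers.per.task}" />
            <arg value="com.af.gwtmodules.administration.Administration" />
        </java>
    </parallel>
</target>
```

Keeping the worker count in one property makes it easy to try several values and time each build. Since each forked compiler has its own heap, the memory given to each process may need checking too.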