We have a product with about 50 assemblies (DLL files), most of which are needed and loaded at startup of the main executable. As a result, even on a moderately fast machine, loading the assemblies and JIT-compiling them takes 2 to 3 seconds, which in our case is an unacceptable overhead.

If we load the program once and run it multiple times from the same, still-running executable, the timings start at several ms. For end users this is not an option, though: they will run it from the command line.

I would like to speed up loading when the executable is first started. There is a speedup between a cold start (after a reboot of Windows) and a warm start, but only a marginal one. What techniques or tools are available in .NET that we can use to speed up loading? (Note: we tried ILMerge, which helps by only about 30%, and NGEN is not an option, since it needs to run on a variety of systems and architectures.)

I was considering creating a service and/or a specific CLR hosting environment, but hopefully there is a simpler, more trivial solution. I have not tried GAC'ing yet.

Abel
  • I think this question has been asked before, check out this post: http://stackoverflow.com/questions/5175743/pre-load-all-assemblies-jit – Zeus82 Jul 24 '14 at 12:05
  • @Jeeve, thanks, but not quite. I updated my question to include NGEN. Also, `Assembly.Load` does not quite help, it still leaves us with quite a startup time (in fact, it changes nothing). And ILMERGE we already tried and it helps a little. – Abel Jul 24 '14 at 12:25
  • Why can't you use NGEN? "it needs to be run at a variety of systems and architectures" I don't understand that point? – usr Jul 24 '14 at 12:30
  • Consider using profile-based multicore JIT in .NET 4.5 – usr Jul 24 '14 at 12:31
  • @usr: NGEN is a trade-off. It does not do certain optimizations and creates a binary that can run on multiple versions of a certain architecture. One possibility to research, though, is to NGEN it on each particular system at first startup. Whether or not the missing optimizations pose a problem then, I'll have to see. I am not aware of profile-based multicore JIT, reading about it now. – Abel Jul 24 '14 at 12:34
  • @usr: we are still in .NET 4.0, improvement with multicore JIT is considered about 20-30%, which helps a bit, true. But we're not yet ready with upgrading. Reference: http://blogs.msdn.com/b/dotnet/archive/2012/10/18/an-easy-solution-for-improving-app-launch-performance.aspx – Abel Jul 24 '14 at 12:41
  • Look into how DevArt Code Compare does it: They keep the compare app always running because it indeed takes seconds to launch it. This is probably the best you can do. AFAIK the slow .NET startup times partially led to the Vista restart. Vista was supposed to have lots of .NET components. In the RTM Vista I believe the CLR didn't even need to load on boot. Too slow. – usr Jul 24 '14 at 13:03
  • @usr: not sure what you are getting at. We typically use Windows 7, 8 or 2012 Server. Cold start (after reboot) is not so interesting (can be as high as 6s), but warm start is (from 2-3s). The "app always running" for commandline apps can, I think, only be done with a service, or a custom CLR host (acting in conjunction with a service). – Abel Jul 24 '14 at 13:10
  • Have the app always running in the background. When launched from the command line just relay the command to the already running background process and forward the results. That makes the command line app tiny and launch quickly. It certainly does not need to load 50 assemblies. – usr Jul 24 '14 at 13:17
  • The next version of the jitter, called RyuJIT, will speed up jitting, but it won't help if the bottleneck is actually something else. It's also only a CTP and not production ready. – CodesInChaos Jul 24 '14 at 13:23
  • @CodesInChaos: yes, I've been following the compiler-as-a-service and RyuJIT, it is certainly promising. @Downvoter: care to elaborate why my question is unclear so that I can improve it? – Abel Jul 24 '14 at 13:51

2 Answers

Use NGEN and multicore JIT. Refactor your application so that it needs fewer assemblies and less code at startup. Use ILMerge to reduce the number of assemblies and hopefully even trim away some unused code.

All of these are optimizations you can make. They do not offer groundbreaking improvements. .NET has no such options available.
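As a rough illustration of the multicore JIT part, here is a minimal sketch using `System.Runtime.ProfileOptimization`. It assumes .NET 4.5 or later (the question mentions the product is still on 4.0, where this API does not exist). The folder name `MyProduct`, the profile name `Startup.profile` and the `RunApplication` method are placeholders, not anything from the original product.

```csharp
using System;
using System.IO;
using System.Runtime;

static class Program
{
    static int Main(string[] args)
    {
        // Hypothetical location; any folder the current user can write to will do.
        string profileRoot = Path.Combine(
            Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
            "MyProduct", "JitProfiles");

        // The profile root folder has to exist before profiling is enabled.
        Directory.CreateDirectory(profileRoot);

        // The first run records which methods get JIT-compiled; later runs
        // replay that profile and compile those methods on background threads.
        ProfileOptimization.SetProfileRoot(profileRoot);
        ProfileOptimization.StartProfile("Startup.profile");

        return RunApplication(args);
    }

    // Placeholder for the real entry point of the application.
    static int RunApplication(string[] args)
    {
        Console.WriteLine("Started with {0} argument(s).", args.Length);
        return 0;
    }
}
```

The two calls should come as early as possible in `Main`, before the bulk of the assemblies is touched. Expect no benefit on the very first run or on a single-core machine.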

Look into how DevArt Code Compare does it: They keep the compare app always running because it indeed takes seconds to launch it.

Have the app always running in the background. When launched from the command line just relay the command to the already running background process and forward the results. That makes the command line app tiny and launch quickly. It certainly does not need to load 50 assemblies.

This is probably the best you can do because it almost eliminates startup entirely.
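A minimal sketch of that relay idea, assuming a named pipe as the transport (a service, sockets or any other IPC mechanism would work just as well). The pipe name `myproduct-relay`, the `--serve` switch and the `HandleCommand` method are made up for illustration. For brevity both roles live in one program here; in practice the client would be its own tiny executable that references none of the 50 assemblies.

```csharp
using System;
using System.IO;
using System.IO.Pipes;

static class Relay
{
    const string PipeName = "myproduct-relay"; // hypothetical pipe name

    static int Main(string[] args)
    {
        // "--serve" starts the long-running background process;
        // anything else is forwarded to it.
        if (args.Length == 1 && args[0] == "--serve")
        {
            Serve();
            return 0;
        }
        return Forward(args);
    }

    // Thin client: send the arguments, print whatever comes back.
    static int Forward(string[] args)
    {
        using (var pipe = new NamedPipeClientStream(".", PipeName, PipeDirection.InOut))
        {
            try { pipe.Connect(2000); } // wait at most 2 seconds
            catch (TimeoutException)
            {
                Console.Error.WriteLine("Background process is not running.");
                return 1;
            }

            var writer = new StreamWriter(pipe) { AutoFlush = true };
            var reader = new StreamReader(pipe);

            // Simplistic framing: argument count, then one argument per line.
            // Arguments that contain line breaks would need a real encoding.
            writer.WriteLine(args.Length);
            foreach (string arg in args) writer.WriteLine(arg);

            string line;
            while ((line = reader.ReadLine()) != null) Console.WriteLine(line);
        }
        return 0;
    }

    // Background process: all assemblies stay loaded and JIT-compiled here.
    static void Serve()
    {
        while (true)
        {
            using (var pipe = new NamedPipeServerStream(PipeName, PipeDirection.InOut))
            {
                pipe.WaitForConnection();
                var reader = new StreamReader(pipe);
                var writer = new StreamWriter(pipe) { AutoFlush = true };

                int count = int.Parse(reader.ReadLine());
                var args = new string[count];
                for (int i = 0; i < count; i++) args[i] = reader.ReadLine();

                writer.WriteLine(HandleCommand(args));
                pipe.WaitForPipeDrain(); // let the client read before closing
            }
        }
    }

    // Placeholder for the real work done by the already-warm application.
    static string HandleCommand(string[] args)
    {
        return "Handled: " + string.Join(" ", args);
    }
}
```

This sketch handles one request at a time and leaves out error handling, exit codes, security on the pipe and starting the background process on demand, all of which a real implementation would need.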

usr
  • Unfortunately, this project is pretty clean, there is (almost) no unused or redundant code in the 300k something lines (can be seen by profiler). The biggest gain is with ILMERGE, which we use. I think the service, or keep-executable-in-memory approach (aka background process) is the way to go indeed, as apparently .NET is not going to do that for me. And since we need and use a custom CLR host anyway, I can just as well build this into our "own" CLR. – Abel Jul 25 '14 at 00:23
  • I don't see how a CLR host could help with that because if the app is launched externally a new process will be spawned. Anyway, you sound like you've got the problem solved (as well as possible under the circumstances). – usr Jul 25 '14 at 00:33

You should still be able to use NGEN by making it part of your deployment package (Setup Project), so that it runs on each target machine during installation.
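One way this could look is a small helper that the installer runs as a post-install step. This is only a sketch: the class name `NgenAtInstall` is made up, it assumes the helper runs with administrator rights, and it assumes the helper runs under the same runtime version and bitness as the application, because `ngen.exe` is located through the runtime directory of the current process.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;

static class NgenAtInstall
{
    static int Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Usage: NgenAtInstall <path to main executable>");
            return 2;
        }

        // ngen.exe ships in the runtime directory of the installed framework.
        string ngen = Path.Combine(RuntimeEnvironment.GetRuntimeDirectory(), "ngen.exe");

        // "install" generates native images for the assembly and its dependencies.
        var startInfo = new ProcessStartInfo(ngen, "install \"" + args[0] + "\" /nologo")
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode; // non-zero means ngen reported a failure
        }
    }
}
```

A matching `ngen uninstall` step at uninstall time would keep the native image cache clean.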

MarkO
  • Was thinking in this direction as well. That's the only way, actually. NGEN is always executed on the target machine, not the build machine. – usr Jul 24 '14 at 13:02
  • Just re-profiled with and without NGEN. The difference is, unfortunately, negligible. The smallest load (the one that uses the fewest dependencies) has a warm startup time of 1.7s with NGEN and between 1.7s and 2.1s without. A full load takes 3.1s both with and without NGEN. On loaded reruns (from a different executable, calling the processor multiple times), the smallest runtime is 0.1s and the complex scenario takes 0.3s. So I guess that NGEN does not improve startup time as much as claimed. Looks like the only alternative is a service, which is interoped from a tiny wrapper application... – Abel Jul 24 '14 at 13:06