
We are using Roslyn (the NuGet package Microsoft.CodeAnalysis.CSharp, version 1.0.0.0-beta2) to compile generated code. We have 5,000 C# files as strings in memory and transform them into `SyntaxTree`s:

f => CSharpSyntaxTree.ParseText(f)
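As a side note, `ParseText` also accepts a `path` argument; passing it makes any later diagnostics point at the originating file instead of an anonymous location. A minimal sketch (`GetGeneratedSources` and the file names are hypothetical placeholders for however the strings are produced):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical: maps a generated file name to its source text.
IDictionary<string, string> sources = GetGeneratedSources();

// With path set, diagnostics read "Foo.cs(12,4): error CS0103: ..."
// rather than pointing at an unnamed tree.
List<SyntaxTree> trees = sources
    .Select(kv => CSharpSyntaxTree.ParseText(kv.Value, path: kv.Key))
    .ToList();
```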

Then we create a compilation:

    CSharpCompilation compilation = CSharpCompilation.Create(
        assemblyName,
        syntaxTrees: files.Keys,
        options: new CSharpCompilationOptions(
            OutputKind.DynamicallyLinkedLibrary,
            optimizationLevel: OptimizationLevel.Release));
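(For completeness: a compilation only binds usefully when it has metadata references; without at least mscorlib, every tree fails to resolve and the binder spends its time producing errors. A sketch of wiring them in, assuming the `MetadataReference.CreateFromFile` API available in recent Roslyn builds and the `typeof`-based trick for locating framework assemblies:)

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

var references = new[]
{
    // mscorlib, resolved via a type known to live there.
    MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
    // System.Core (LINQ), same trick.
    MetadataReference.CreateFromFile(typeof(Enumerable).Assembly.Location),
};

CSharpCompilation compilation = CSharpCompilation.Create(
    assemblyName,
    syntaxTrees: files.Keys,
    references: references,
    options: new CSharpCompilationOptions(
        OutputKind.DynamicallyLinkedLibrary,
        optimizationLevel: OptimizationLevel.Release));
```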

And then we compile:

compilation.Emit(
            Path.Combine(assemblyOutputPath, assemblyName + ".dll"), 
            Path.Combine(assemblyOutputPath, assemblyName + ".pdb"));
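One cheap first step: `Emit` returns an `EmitResult`, and logging its diagnostics can reveal whether the compiler is grinding through a flood of warnings or errors. A sketch, assuming the same `compilation`, `assemblyOutputPath`, and `assemblyName` as above:

```csharp
using System;
using System.IO;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Emit;

EmitResult result = compilation.Emit(
    Path.Combine(assemblyOutputPath, assemblyName + ".dll"),
    Path.Combine(assemblyOutputPath, assemblyName + ".pdb"));

Console.WriteLine("Emit succeeded: " + result.Success);
foreach (Diagnostic d in result.Diagnostics)
{
    if (d.Severity == DiagnosticSeverity.Warning ||
        d.Severity == DiagnosticSeverity.Error)
    {
        Console.WriteLine(d.ToString());
    }
}
```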

Sometimes this takes a couple of minutes (5 to 10). But sometimes we see compile times of multiple hours (2 up to 6).

How can I debug/trace this? Are there any APIs for getting verbose information from Roslyn?
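I am not aware of a public verbose-logging switch in this build, but one thing I could do myself is split the time between binding and code generation: `Compilation.GetDiagnostics()` forces the declaration/binding work without emitting IL, so timing it separately from `Emit` shows which phase is slow. A sketch (`dllPath`/`pdbPath` stand in for the paths above):

```csharp
using System;
using System.Diagnostics;
using Microsoft.CodeAnalysis;

var sw = Stopwatch.StartNew();
// Forces full binding/semantic analysis, no IL emitted.
var diagnostics = compilation.GetDiagnostics();
Console.WriteLine($"Bind phase: {sw.Elapsed}, {diagnostics.Length} diagnostics");

sw.Restart();
var emitResult = compilation.Emit(dllPath, pdbPath);
Console.WriteLine($"Emit phase: {sw.Elapsed}, success: {emitResult.Success}");
```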

Edit: We run this process on a virtual Windows Server 2012 R2 instance with 4 GB of RAM. I'm not sure what the impact of such a setup is.

Edit 2: Yes, the process consumes a lot of memory and CPU (I saw usage spiking to 1500 MB of RAM). But in the cases where it is slow, there is still plenty of memory available as far as I can see.

Edit 3: We have run the process with performance counters enabled. Memory usage seems fine, nowhere near any limits. CPU is also not near any limits, but we do see some spikes now and then:

*(screenshot of performance counters)*

Edit 4: To respond to @pharring: we are not compiling binaries, and I'm not sure whether it is directly related to the source code. These compile times are not always this high, and they occur only on the virtual machines; we haven't seen it on our development machines. On the other hand, it is a large DLL: the resulting file is 37 MB. I cannot share the source code for others to look at, but I will try to make a dump.

Michiel Overeem
  • A couple of questions: how do you process each file? Sequentially or in some sort of parallel approach? For the runs that take a long time, do you see lots of disk or CPU usage? – Jason Malinowski Feb 19 '15 at 18:13
  • @JasonMalinowski Normally I see a lot of CPU and RAM usage (which is fine). In this case, there was plenty of RAM available, but it seems to be waiting on something. – Michiel Overeem Feb 20 '15 at 07:22
  • And are you doing anything in parallel? – Jason Malinowski Feb 21 '15 at 01:20
  • No, but I think Roslyn does some things in parallel by default? I do see multiple cores in usage. – Michiel Overeem Feb 21 '15 at 08:56
  • Correct, we do some parallelization under the covers. If you were mixing that with some other parallelization, I could imagine you might end up with some very suboptimal threading behavior. But that doesn't seem to be the case here. – Jason Malinowski Feb 22 '15 at 05:36
  • @JasonMalinowski Any idea how I can get more info from Roslyn on what is going on? Is there a timing/logging API? – Michiel Overeem Mar 02 '15 at 08:52
  • Hm, can't you attach the debugger, wait for 30 minutes, and stop all the threads to see what is happening? – Erti-Chris Eelmaa Mar 02 '15 at 10:25
  • @ChrisEelmaa If I could reproduce it on a machine I can debug on, yes, but unfortunately... – Michiel Overeem Mar 02 '15 at 13:29
  • I've forwarded your question to our perf team, who might know of good ways to investigate this. Something like procdump can be used to create dumps after a certain amount of time; if you had a crash dump or profiler trace of the process, that would be useful. But without solid data like that I won't be able to help, sorry. – Jason Malinowski Mar 07 '15 at 03:09

0 Answers