
I've just read a post about MPGO (Managed Profile Guided Optimization) and the process described is:

  1. Obtain a machine with Visual Studio 11 Ultimate Beta and your application installed.
  2. Run the MPGO tool (as an administrator) with the necessary parameters:
    MPGO -scenario MyLargeApp.exe -AssemblyList *.* -OutDir C:\Optimized\
    The optimized IL assemblies are created in the C:\Optimized folder.
  3. Run the NGen tool (as an administrator) with the necessary parameters for each application DLL:
    NGEN.exe myLargeApp.exe
  4. Run your application – it will now use the optimized native images.

This seems to imply that you have to perform the guiding scenarios on the binaries that go into your released product.

It doesn't make sense to me that manual intervention is needed during the build process. Is there a way to perform the guiding scenarios once and then commit the data generated so it will be automatically inserted into the compiled assemblies in future builds?
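
For reference, the sequence above can be scripted rather than typed by hand. This is only a minimal batch sketch reusing the post's placeholder names (MyLargeApp.exe, C:\Optimized\); it assumes MPGO.exe and ngen.exe are on the PATH of an elevated command prompt and uses NGen's usual "install" action:

    REM Run the guiding scenario and emit optimized IL assemblies.
    MPGO.exe -scenario MyLargeApp.exe -AssemblyList *.* -OutDir C:\Optimized\

    REM Generate native images from the optimized IL (one call per assembly/EXE).
    ngen.exe install C:\Optimized\MyLargeApp.exe

    REM Launch the app; the CLR should pick up the native images from the NGen cache.
    C:\Optimized\MyLargeApp.exe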

Motti

1 Answer


Years ago I worked in a build lab at Microsoft that handled a lot of managed code. Let me stress that this was many years ago, before managed MPGO was public. But back then they would use old profile data (usually from the day before, but sometimes up to a week old) to 'partially optimize' a set of internal binaries. I can't speak to numbers, but we wouldn't have done it if it didn't have some benefit. Those 'partially optimized' binaries would only be used for automated smoke testing and were only for internal use. Only fully optimized binaries (whose profile data was collected from the same build) would ever be released.

I'm not an expert, but from what I understand MPGO guidance data uses method signatures (like those used by debug symbols) and file offsets, neither of which is stable between builds. So the question becomes: what percentage of the data stays stable enough to still provide some benefit?

Let's say a method that is used a lot gets renamed. Then, of course, the 'hot' pages in the old binary (the ones containing that method) won't be found in the new binary, and the code that actually gets used a lot will probably end up at the 'end' of the optimized binary, next to the code that is never used. On the other side of the coin: what percentage of methods get renamed from one daily build to the next (or even more frequently with CI)? I'd guess less than 1%.

Let me jump back to the internal builds. Of course, gathering new perf profile data took a while, so time-sensitive internal functions (the ones that needed to run just after the build) would use the partially-optimized build flavor, because that flavor would complete hours before the fully-optimized one. Let me explain why it took so long. IIRC we used profile 'passes': core library scenarios were run first and those binaries optimized, then the optimized core libraries were used in later 'end-to-end' scenarios (e.g. a server-side web service, or client-side GUI scenarios). So the core libraries would get profiled and optimized multiple times. As you can guess, all this takes time, which is why the 'fully analyzed/optimized' build took a LONG time.
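
To make that concrete, here is a rough batch-style sketch of such a multi-pass pipeline. The scenario names, assembly names, and output paths are made up for illustration; only the MPGO/NGen invocation style already quoted in the question is assumed:

    REM Pass 1: profile the core libraries with their own scenario,
    REM then NGen the optimized output so later scenarios run on top of it.
    REM (CoreScenarios.exe, MyCoreLib.dll, C:\Pass1\ are hypothetical names.)
    MPGO.exe -scenario CoreScenarios.exe -AssemblyList MyCoreLib.dll -OutDir C:\Pass1\
    ngen.exe install C:\Pass1\MyCoreLib.dll

    REM Pass 2: run the end-to-end scenarios (server-side web service,
    REM client-side GUI) against the already-optimized core and profile
    REM the application assemblies.
    MPGO.exe -scenario MyLargeApp.exe -AssemblyList MyLargeApp.exe -OutDir C:\Pass2\
    ngen.exe install C:\Pass2\MyLargeApp.exe

    REM Every additional pass re-runs scenarios and re-optimizes, which is
    REM why the fully-optimized flavor finished hours after the partial one.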

I hope this was helpful.

P.S. This question reminded me of 32-bit DLL rebase issues.

yzorg