When to create and distribute "reference assemblies"?

Question

C# 7.1 introduced a few new command-line parameters to help create "reference assemblies". According to the documentation, reference assemblies:

have their method bodies replaced with a single throw null body, but include all members except anonymous types.

I've also found a note saying that a reference assembly is more stable across changes:

That means it changes less often than the full assembly--many common development activities don't change the interface, only the implementation. That means that incremental builds can be much faster- ...

and that it is probably necessary for Roslyn itself:

We will be introducing a second concept, which is "reference assemblies" (also called skeleton assemblies). [---] They will be used for build scenarios.

...whatever those "build scenarios" are for Roslyn.

I understand that for ordinary .NET assembly users, such an assembly is probably smaller and slightly faster to load for reflection. OK, but:

usually you also care about execution and the implementation assembly already contains all the data from reference assembly,
quite often you don't care about that minor performance difference on loading,
and most importantly - usually you don't have that stripped-down reference assembly available (distributed) at all.

Its usefulness seems rather niche.

So I wonder about the assembly producer's side of things: when should one consider explicitly using those new compiler flags to create a reference assembly? Does it have any practical use outside Roslyn itself?

You have been using such assemblies for the past 8 years. That is how the .NET 4.x framework reference assemblies are implemented, stored in the c:\program files (x86)\reference assemblies directory. They provide a basic guarantee that the (many) changes that Microsoft has made to the framework code do not break any existing program. Such breakage can be excessively painful, and it has happened in the past: the WaitHandle.WaitOne(int) overload got added in a .NET 2.0 service pack. Kaboom on a machine that wasn't updated. — Hans Passant, Apr 11 '18 at 14:29
This matters a lot more to Microsoft than it does to us. We can simply increment the [AssemblyVersion], forcing our clients to rebuild their program. — Hans Passant, Apr 11 '18 at 14:33
In the old PCL days, some NuGet packages used their own tricks to generate assemblies for reference purposes only, for PCL projects to consume, alongside real assemblies for each target platform. They might have benefited from the Roslyn feature, but with the death of PCL, .NET Standard is a far better alternative. — Lex Li, Apr 12 '18 at 00:12

score 3 · Accepted Answer · answered Apr 11 '18 at 16:19

The motivation for this feature is indeed build scenarios, but they're not specific to Roslyn; they're your build scenarios, too.

When you build your project, the build engine (MSBuild) needs to decide whether each output of the build is up to date with respect to its inputs. For example, if you don't change anything and just run build twice in a row, the second time doesn't need to invoke the C# compiler: the assembly was already correct.
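
To make the up-to-date decision concrete, here is a toy model in Python. It is only a sketch of the idea (compare an output's timestamp against its inputs), not MSBuild's actual code; the function name `is_up_to_date`, the file names, and the timestamps are all made up for illustration.

```python
from typing import Dict, Iterable


def is_up_to_date(output: str, inputs: Iterable[str], mtimes: Dict[str, float]) -> bool:
    """Return True when `output` exists and is at least as new as every input."""
    if output not in mtimes:
        return False  # the output has never been built
    return all(mtimes[name] <= mtimes[output] for name in inputs)


# Hypothetical timestamps: App.exe was built after both of its inputs.
mtimes = {"Program.cs": 100.0, "Lib.dll": 150.0, "App.exe": 200.0}
first = is_up_to_date("App.exe", ["Program.cs", "Lib.dll"], mtimes)

# Rebuilding the library gives Lib.dll a newer timestamp, so App.exe is stale.
mtimes["Lib.dll"] = 250.0
second = is_up_to_date("App.exe", ["Program.cs", "Lib.dll"], mtimes)

print(first, second)
```

In this model the compiler only needs to run when the function returns False.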

Reference assemblies allow skipping the compile step for assemblies in more scenarios, so your builds can be faster. I think an example would help illustrate.

Suppose you have a solution containing B.exe that depends on A.dll.

The compiler command line for B would look something like

csc.exe /out:B.exe /r:..\A\bin\A.dll Program.cs

And its inputs would be

The source for B (Program.cs)
The assembly for A.

If you change the source of A and build your solution, the compiler must run for A, producing a new A.dll. Then, since A.dll is an input to the compilation of B, B has to be recompiled, too.

Using a reference assembly for A changes this slightly:

csc.exe /out:B.exe /r:..\A\bin\ref\A.dll Program.cs

The input that represents A in B's compilation is now A's reference assembly, rather than its normal implementation assembly.

Since the reference assembly is smaller than the full assembly, that has a minor effect on build time all by itself. But that's not enough to justify this feature. What's important is that the compiler only cares about the public API surface of the passed-in references. If an internal implementation detail of the assembly has changed, assemblies that reference it do not need to be recompiled to pick up the new behavior. As @Hans Passant mentions in comments, this is how the .NET Framework itself can deliver compatible performance improvements and bug fixes on unchanged user code.

The benefit of the reference assemblies feature comes from the MSBuild work done to use them. Suppose you change an internal implementation detail in A but don't change its public interface. On the next build,

The compiler must run for A, because source files for A changed.
The compiler emits both A.dll (with the changed implementation) and ref\A.dll, which is identical to the previous reference assembly.
Since ref\A.dll is identical to the previous output, it does not get copied to A's output folder.
When it is time for B's compiler to run, it sees that none of its inputs have changed--neither B's own code, nor A's reference assembly, so the compiler doesn't have to run.
B then copies the updated A.dll to its output and is ready to run with the new behavior.
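
The steps above can be sketched as a small Python simulation. This is a toy model under stated assumptions, not how the compiler or MSBuild is implemented: `compile_project` and `build_a` are hypothetical helpers, a project is represented as a dictionary of signature to body, and comparing SHA-256 hashes stands in for whatever check the real build uses to decide that the reference assembly is identical.

```python
import hashlib


def compile_project(source: dict) -> tuple:
    """Toy compiler: return (implementation bytes, reference bytes).

    `source` maps a public signature to its method body. The reference
    output keeps every signature but replaces each body with `throw null`.
    """
    impl = "\n".join(f"{sig} {{ {body} }}" for sig, body in sorted(source.items()))
    ref = "\n".join(f"{sig} {{ throw null; }}" for sig in sorted(source))
    return impl.encode(), ref.encode()


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_a(source: dict, published_ref):
    """Compile A and replace the published reference only if its content changed."""
    impl, ref = compile_project(source)
    ref_changed = published_ref is None or digest(ref) != digest(published_ref)
    return impl, (ref if ref_changed else published_ref), ref_changed


# First build of A: nothing has been published yet.
src = {"public int Add(int x, int y)": "return x + y;"}
impl1, ref1, changed1 = build_a(src, None)

# Implementation-only change: A.dll differs, ref\A.dll does not.
src = {"public int Add(int x, int y)": "return checked(x + y);"}
impl2, ref2, changed2 = build_a(src, ref1)

# Public API change: the reference output differs too.
src["public int Sub(int x, int y)"] = "return x - y;"
impl3, ref3, changed3 = build_a(src, ref2)

# In this model, B's compiler has to run only when the flag is True.
print(changed1, changed2, changed3)
```

The second build is the interesting case: the implementation bytes change, but the published reference is left alone, so nothing downstream sees a new input.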

The effect of skipping downstream compilation can compound as you go along in a large solution--changing a comment in {ProjectName}.Utilities.dll no longer requires building everything!
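
To get a feel for how this compounds, here is a rough Python sketch that counts which projects need their compiler to run after an edit. The project names and the `projects_to_recompile` helper are hypothetical, and the model is deliberately simplified: it treats any public API change as dirtying everything downstream, whereas a real build could stop earlier if a consumer's own reference assembly comes out unchanged.

```python
def projects_to_recompile(deps, changed, api_changed, use_ref_assemblies):
    """Return the set of projects whose compiler must run.

    deps: project name -> list of projects it references.
    changed: the project whose source was edited.
    api_changed: whether the edit altered that project's public API.
    """
    dirty = {changed}
    # With reference assemblies, an implementation-only edit never reaches
    # consumers, because the input they compile against stays the same.
    if use_ref_assemblies and not api_changed:
        return dirty
    # Otherwise keep marking consumers of dirty projects until nothing changes.
    grew = True
    while grew:
        grew = False
        for project, refs in deps.items():
            if project not in dirty and any(r in dirty for r in refs):
                dirty.add(project)
                grew = True
    return dirty


# Hypothetical three-project chain: App -> Core -> Utilities.
deps = {"Utilities": [], "Core": ["Utilities"], "App": ["Core"]}

without_ref = projects_to_recompile(deps, "Utilities", api_changed=False, use_ref_assemblies=False)
with_ref = projects_to_recompile(deps, "Utilities", api_changed=False, use_ref_assemblies=True)

print(sorted(without_ref), sorted(with_ref))
```

For an implementation-only edit in this model, the whole chain recompiles without reference assemblies, and only the edited project recompiles with them.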

Many changes involve changing both the public API surface and the internal implementation, so this change doesn't speed up all builds, but it does speed up many builds.

Good explanation, makes perfect sense for build chains in large solutions. It seems it is less useful for non-framework assembly distribution scenarios, right? — Imre Pühvel, Apr 12 '18 at 07:43
That's right; _distributing_ these reference assemblies hasn't been a big factor in their design so far. They're useless at runtime (since they don't contain implementations) but can be used in build scenarios. NuGet packages can contain a `ref/` folder, as @Lex Li mentioned in a comment, but there's no current integration to do that for you. — Rainer Sigwald, Apr 13 '18 at 18:50
