In my Azure role startup code I instantiate a DCOM object to ensure that it can be instantiated and then immediately release it since I don't really need it at that moment.
I do that in a separate thread that actually news the corresponding C# RCW class, and the main thread calls Thread.Join() on that thread with a 30-second timeout. If the thread is still running after Thread.Join() returns, the DCOM object is taking suspiciously long to create, so Thread.Abort() is called and the role restarts. 30 seconds should be enough: the object is lightweight and doesn't do anything time-consuming on instantiation.
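A minimal sketch of the check described above, not the actual role code. The ProgID, the class name DcomStartupCheck, and the method name EnsureCreatable are placeholders invented for illustration, and the object is created through its ProgID rather than through the real RCW class, which is not shown here. It assumes the .NET Framework, where Thread.Abort() is supported.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

static class DcomStartupCheck
{
    // Hypothetical ProgID: stands in for the real RCW class.
    const string ProgId = "MyCompany.MyDcomObject";

    // The 30-second limit mentioned in the text.
    static readonly TimeSpan Timeout = TimeSpan.FromSeconds(30);

    public static void EnsureCreatable()
    {
        Exception failure = null;

        var worker = new Thread(() =>
        {
            try
            {
                // Create the COM object and release it right away;
                // only the ability to instantiate it is being tested.
                Type type = Type.GetTypeFromProgID(ProgId, true);
                object instance = Activator.CreateInstance(type);
                Marshal.ReleaseComObject(instance);
            }
            catch (Exception ex)
            {
                // Hand the error back to the calling thread.
                failure = ex;
            }
        });
        worker.IsBackground = true;
        worker.Start();

        // Join returns false if the worker is still running after the timeout.
        if (!worker.Join(Timeout))
        {
            worker.Abort(); // .NET Framework only
            throw new TimeoutException(
                "DCOM object creation did not finish within " + Timeout.TotalSeconds + " seconds.");
        }

        if (failure != null)
        {
            throw new InvalidOperationException("DCOM object creation failed.", failure);
        }
    }
}
```

Calling DcomStartupCheck.EnsureCreatable() from the role's startup path and letting the exception propagate would produce the restart behavior described here.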
That code worked just fine until I tried to scale up my service dramatically. I asked support to lift the Compute cores quota and tried to scale to 100 (one hundred) instances.
Most of the instances started fine, but some of them hit exactly the situation described above: the DCOM object creation took too long, so the code threw an exception, which caused the role to restart.
I repeated the test several times. Whenever I scale up by a few dozen instances, the problem reproduces in some of the newly started instances. Since all the instances are uniform, I have no idea what might be causing this behavior.
What might cause the DCOM object creation to take so long on only some of the instances?