I need to enumerate keys in the machine key container. Although this is generally an optional provider function, both MS_STRONG_PROV and MS_ENH_RSA_AES_PROV support it. I do not think I am doing anything wrong or unusual: first, acquiring a context handle with CryptAcquireContext(... CRYPT_MACHINE_KEYSET | CRYPT_VERIFYCONTEXT ...), then calling CryptGetProvParam(... PP_ENUMCONTAINERS ...) repeatedly until the enumeration is exhausted:

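#include <windows.h>
#include <wincrypt.h>  // CryptoAPI; link with advapi32.lib
#include <stdio.h>
#include <stdlib.h>

void enum_keys(HCRYPTPROV hprov) {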
  BYTE buf[1024];  // Max key name length we support.
  for (DWORD first_next = CRYPT_FIRST; 1; first_next = CRYPT_NEXT) {
    DWORD buf_len = sizeof buf;
    if (!CryptGetProvParam(hprov, PP_ENUMCONTAINERS, buf, &buf_len, first_next)) {
      if (GetLastError() == ERROR_NO_MORE_ITEMS) break;
      else exit(1);
    }
  }
}

void do_benchmark(DWORD enum_flags) {
  enum_flags |= CRYPT_VERIFYCONTEXT;
  HCRYPTPROV hprov;
  if (!CryptAcquireContext(&hprov, NULL, MS_ENH_RSA_AES_PROV_A,
                           PROV_RSA_AES, enum_flags))
    exit(1);

  int K = 100;
  ClockIn();  // Pseudocode
  for (int i = 0; i < K; ++i)
    enum_keys (hprov);
  ClockOut();  // Pseudocode.
  printf(" %f ms per pass\n", TimeElapsed() / K);

  CryptReleaseContext(hprov, 0);
}

int main(void) {
  printf("--- User key store access performance test... ");
  do_benchmark(0);
  printf("--- Machine key store access performance test... ");
  do_benchmark(CRYPT_MACHINE_KEYSET);
  return 0;
}

To benchmark the enumeration, I leave context acquisition and release out of the loop, clock only the enumeration, and repeat it 100 times. What I am noticing is that enumerating the machine key store is significantly slower for a normal user than for an administrator. When I run the test as myself (a member of Administrators with UAC enabled, not elevated), I get
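For reference, the ClockIn / ClockOut / TimeElapsed helpers marked as pseudocode above could look roughly like the sketch below. This is an assumption for illustration, not the code from the Gist: it uses QueryPerformanceCounter on Windows and falls back to clock_gettime elsewhere, and the globals g_start and g_stop are hypothetical names.

```c
/* Hypothetical timing helpers for the benchmark; not the Gist's code. */
#ifndef _WIN32
#define _POSIX_C_SOURCE 199309L  /* needed for clock_gettime in strict C modes */
#endif

#ifdef _WIN32
#include <windows.h>

static LARGE_INTEGER g_start, g_stop;

/* Mark the start and the end of the timed region. */
void ClockIn(void)  { QueryPerformanceCounter(&g_start); }
void ClockOut(void) { QueryPerformanceCounter(&g_stop); }

/* Milliseconds between the last ClockIn() and ClockOut(). */
double TimeElapsed(void) {
  LARGE_INTEGER freq;
  QueryPerformanceFrequency(&freq);  /* counter ticks per second */
  return (double)(g_stop.QuadPart - g_start.QuadPart) * 1000.0 /
         (double)freq.QuadPart;
}
#else
#include <time.h>

static struct timespec g_start, g_stop;

/* Mark the start and the end of the timed region. */
void ClockIn(void)  { clock_gettime(CLOCK_MONOTONIC, &g_start); }
void ClockOut(void) { clock_gettime(CLOCK_MONOTONIC, &g_stop); }

/* Milliseconds between the last ClockIn() and ClockOut(). */
double TimeElapsed(void) {
  return (double)(g_stop.tv_sec - g_start.tv_sec) * 1000.0 +
         (double)(g_stop.tv_nsec - g_start.tv_nsec) / 1e6;
}
#endif
```

With helpers like these, TimeElapsed() / K in do_benchmark gives the average time per enumeration pass in milliseconds.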

--- User key store access performance test...  3.317211 ms per pass
--- Machine key store access performance test...  78.051593 ms per pass

However, when I run the same test from an elevated prompt, the result is dramatically different:

--- User key store access performance test...  3.279580 ms per pass
--- Machine key store access performance test...  1.499939 ms per pass

Under the hood, more keys are reported to an admin than to a non-admin user, but that's expected and normal. What I do not understand is why the machine key store enumeration is ~50 times slower for a non-admin user (78 ms vs 1.5 ms per pass). Any pointers?

I am putting the full source of my test into a Gist. The test is run on a pretty generic Windows 7 machine without any crypto hardware.

Added: on a Server 2012 virtual machine on a Server 2012 Hyper-V host, the slowdown factor was even greater, over 130: 440 ms vs 3.3 ms. 440 ms is indeed a performance issue for me.

  • Please don't link to offsite content. If the code is important to your question (and it usually is), put it in your question. – IInspectable Sep 11 '15 at 11:07
  • @IInspectable: Thanks, I did not know links were not allowed in questions. In fact SO even presents an "insert a link" button when asking one. If you can give me a pointer to the rule, I'd appreciate that. But mainly, I am torn. This is a ~60 line program, and you can say that nothing is essential in it: saying "I'm enumerating keys the way one is supposed to" would be enough (but should elicit a "show me at least some of your code" follow-up); on the other hand, everything is essential in it for a quick reproduction. Would you advise posting the full source into the question then? – kkm inactive - support strike Sep 11 '15 at 19:03
  • Linking to offsite content is not strictly prohibited. Questions on stackoverflow, however, should be self-contained. Should the offsite resource become temporarily unavailable, or disappear for good, a question may cease to be useful. Guidelines can be found at [How do I ask a good question?](http://stackoverflow.com/help/how-to-ask) I would recommend adding the entire source code where you are linking to it now. The effect you are observing may or may not be caused by your code. Without seeing the code it is more difficult to provide good answers. – IInspectable Sep 11 '15 at 19:14
  • @IInspectable: Ah, now I think I understand the spirit of the rule. "Being self-contained" is the touchstone. I hope I got it right then (but let me know if I am mistaken please): I think my question *is* self-contained w/o the code, as the 1st paragraph describes what I do briefly, and the code is akin to an M&M disclosure that would be helpful to someone who wants (I hope!) to dig deeper. I have often wished I had the full source code when reproducing interesting questions, not just an MCVE! All in all, I appreciate your feedback highly. I'll add an MCVE, but keep the link to the Gist. Thanks much! – kkm inactive - support strike Sep 11 '15 at 20:05
  • How many keys are being enumerated? If it's only 1, there may be a problem, if there's 10 million, then 400ms sounds perfectly acceptable. – theB Sep 12 '15 at 11:44
  • @theB: About a dozen keys are returned to an admin on either machine, and only 1 of these to a user, the others being apparently protected from access. Why do you think the ×130 slowdown of non-admin versus admin enumeration would be "perfectly acceptable" if there were many keys? Is the access check really that computationally complex? – kkm inactive - support strike Sep 14 '15 at 17:39

1 Answer

Could it be related to this issue from Microsoft:

You experience poor performance when you call the CryptAcquireContext function in Windows Server 2008 R2 or in Windows 7

From the issue: "This issue occurs because of a change in the CryptAcquireContext function in Windows Server 2008 R2 and in Windows 7.

This change checks whether the function runs in a domain environment. However, the process is time-consuming and causes the increased running time of the CryptAcquireContext function."

dparnas