1
root@testpgmaster:/# puppetd --test
info: Creating a new SSL key for testpgmaster
warning: peer certificate won't be verified in this SSL session
info: Caching certificate for ca
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
info: Creating a new SSL certificate request for testpgmaster
info: Certificate Request fingerprint (md5): C9:83:59:EF:6A:B8:16:31:B6:92:5D:70:F1:67:39:4F
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
Exiting; no certificate found and waitforcert is disabled
root@testpgmaster:/# 

This is normal output, and the result is, as expected, the proper creation of the certificate request. The problem is that almost every time the warning is shown, there is a delay of about two minutes.

When running the same command with strace, then, in the chaotic strace output, I have, somewhere:

socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 6
fcntl(6, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
connect(6, {sa_family=AF_INET, sin_port=htons(8140), sin_addr=inet_addr("10.0.2.15")}, 16) = -1 EINPROGRESS (Operation now in progress)
...
[lots of intermediate output]
...
select(7, [6], [], [], {119, 999900})   <- The pause occurs here

AFAIU, this means that it blocks while waiting for the puppet master to respond to something.

The puppetmaster runs passenger on apache2, listening on port 8140. It also has a firewall, but port 8140 allows the agent. In fact the agent is an openvz container running inside the machine which acts as the master; the container uses NAT and the host does forwarding; there could certainly be a problem with the setup of all that, but the problem occurs only when the message "warning: peer certificate won't be verified in this SSL session"; in all other cases, everything runs without problem.

Antonis Christofides
  • 2,598
  • 2
  • 23
  • 35
  • Does it only occur when the client's generating its certificate for the first time? And can you provide a bit more context on the `strace` output - what's happening before and after the pause? My hunch right now would be a lack of entropy for certificate creation, but more information would be helpful. – Shane Madden Feb 14 '12 at 17:36
  • The delay occurs every single time the message "warning: peer certificate..." appears; the message appears, and then there is the two-minute delay before the next message. There is no delay when creating the certificate; i.e., the message "info: Creating a new SSL..." is immediately followed by the message showing the fingerprint. Then, three more warnings, three more delays. After all this finishes, I go to the master and sign the certificate; I rerun "puppetd --test" on the client, it shows me the same warning once, same delay, then the normal puppet output, and never again any delays. – Antonis Christofides Feb 15 '12 at 10:08
  • I've added lots more strace context at http://pastebin.com/4N4JAHpq. Search in the page for "pause" in order to go to the pause point. – Antonis Christofides Feb 15 '12 at 10:19

1 Answers1

2

The problem occurred when the puppet master was trying to resolve the name of the client from its IP address. Because of a firewall misconfiguration, incoming UDP packets were being filtered, which means that any DNS query eventually timed out.

I located the problem by running wireshark on the puppet master, as I suspected that it was a network timeout, and hoped to find the culprit by looking at the few packets that occurred just before the pause. Indeed it worked. It is strange that the total failure of DNS did not cause any other problem; one reason is that I was using an apt proxy, which means that apt-get did not make DNS requests.

Antonis Christofides
  • 2,598
  • 2
  • 23
  • 35