
Almost every time I run kitchen converge with the EC2 driver, it creates the server and establishes an SSH connection. Then, after detecting the Chef Omnibus installation, it tries to transfer files and fails with an unhelpful error. I've tried different versions of net-ssh and reinstalling ChefDK. It has converged successfully once in maybe 30 attempts, and I can't figure out what was different that time.

Has anybody else run into this problem?

-----> Starting Kitchen (v1.10.2)
-----> Creating <default-rhel7>...
       If you are not using an account that qualifies under the AWS
free-tier, you may be charged to run these suites. The charge
should be minimal, but neither Test Kitchen nor its maintainers
are responsible for your incurred costs.

       Instance <i-167bf188> requested.
       Polling AWS for existence, attempt 0...
       Attempting to tag the instance, 0 retries
       EC2 instance <i-167bf188> created.
       Waited 0/600s for instance <i-167bf188> to become ready.
       Waited 5/600s for instance <i-167bf188> to become ready.
       Waited 10/600s for instance <i-167bf188> to become ready.
       Waited 15/600s for instance <i-167bf188> to become ready.
       Waited 20/600s for instance <i-167bf188> to become ready.
       Waited 25/600s for instance <i-167bf188> to become ready.
       Waited 30/600s for instance <i-167bf188> to become ready.
       Waited 35/600s for instance <i-167bf188> to become ready.
       Waited 40/600s for instance <i-167bf188> to become ready.
       Waited 45/600s for instance <i-167bf188> to become ready.
       Waited 50/600s for instance <i-167bf188> to become ready.
       Waited 55/600s for instance <i-167bf188> to become ready.
       EC2 instance <i-167bf188> ready.
       Waiting for SSH service on 10.254.105.26:22, retrying in 3 seconds
       Waiting for SSH service on 10.254.105.26:22, retrying in 3 seconds
       Waiting for SSH service on 10.254.105.26:22, retrying in 3 seconds
       [SSH] Established
       Finished creating <default-rhel7> (1m56.70s).
-----> Converging <default-rhel7>...
       Preparing files for transfer
       Preparing dna.json
       Preparing current project directory as a cookbook
       Removing non-cookbook files before transfer
       Preparing validation.pem
       Preparing client.rb
-----> Chef Omnibus installation detected (install only if missing)
       Transferring files to <default-rhel7>
C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/ruby_compat.rb:25:in `select': closed stream (IOError)
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/ruby_compat.rb:25:in `io_select'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/transport/packet_stream.rb:75:in `available_for_read?'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/transport/packet_stream.rb:87:in `next_packet'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/transport/session.rb:193:in `block in poll_message'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/transport/session.rb:188:in `loop'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/transport/session.rb:188:in `poll_message'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/connection/session.rb:474:in `dispatch_incoming_packets'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/connection/session.rb:225:in `preprocess'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/connection/session.rb:206:in `process'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/connection/session.rb:170:in `block in loop'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/connection/session.rb:170:in `loop'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/connection/session.rb:170:in `loop'
    from C:/Users/AlexKiaie/AppData/Local/chefdk/gem/ruby/2.1.0/gems/net-ssh-3.2.0/lib/net/ssh/connection/session.rb:119:in `close'
    from C:/opscode/chefdk/embedded/lib/ruby/gems/2.1.0/gems/test-kitchen-1.10.2/lib/kitchen/transport/ssh.rb:115:in `close'
    from C:/opscode/chefdk/embedded/lib/ruby/gems/2.1.0/gems/test-kitchen-1.10.2/lib/kitchen/transport/ssh.rb:97:in `cleanup!'
    from C:/opscode/chefdk/embedded/lib/ruby/gems/2.1.0/gems/test-kitchen-1.10.2/lib/kitchen/instance.rb:274:in `cleanup!'
    from C:/opscode/chefdk/embedded/lib/ruby/gems/2.1.0/gems/test-kitchen-1.10.2/lib/kitchen/command.rb:209:in `run_action_in_thread'
    from C:/opscode/chefdk/embedded/lib/ruby/gems/2.1.0/gems/test-kitchen-1.10.2/lib/kitchen/command.rb:173:in `block (2 levels) in run_action'
    from C:/opscode/chefdk/embedded/lib/ruby/gems/2.1.0/gems/logging-2.1.0/lib/logging/diagnostic_context.rb:450:in `call'
    from C:/opscode/chefdk/embedded/lib/ruby/gems/2.1.0/gems/logging-2.1.0/lib/logging/diagnostic_context.rb:450:in `block in create_with_logging_context'
chicks
Alex
  • Are you on any kind of funky corp network that either intercepts traffic or insists on its own idle timeouts? – coderanger Jul 22 '16 at 07:53
  • If this is in a VPC, check the CPU on the Internet Gateways and see whether they're getting too close to max. – chicks Jul 23 '16 at 01:04

1 Answer


I had a similar issue. After a lot of digging, I found an interesting message in /var/log/secure on the instance:

"localhost sshd[1081]: error: no more sessions".

By default, sshd allows 10 sessions per network connection (the MaxSessions setting in sshd_config); these are sessions opened over one connection, not separate logged-in users. If sessions aren't closed properly, or too many are open at once, you will receive this error.
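To check whether you are hitting the same limit, something like the following can be run on the instance after a failed converge. It is only a sketch: the helper names count_no_more_sessions and show_max_sessions are made up for illustration, and the default log path assumes a RHEL/CentOS-style system that writes sshd messages to /var/log/secure.

```shell
#!/bin/sh
# Hypothetical helpers for diagnosing the sshd session limit.

# Count "no more sessions" errors in an sshd log file.
# The path defaults to /var/log/secure (an assumption; other
# distributions may log sshd messages elsewhere).
count_no_more_sessions() {
    log_file="${1:-/var/log/secure}"
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'error: no more sessions' "$log_file"
}

# Print the effective MaxSessions value of the installed sshd.
# sshd -T dumps the effective configuration and normally needs root.
show_max_sessions() {
    sshd -T | grep -i '^maxsessions'
}
```

For example, sourcing the file and running count_no_more_sessions as root prints how many times the error was logged; a non-zero count right after a failed file transfer suggests the same cause.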

I then went into my .kitchen.yml and added:

max_ssh_sessions: 1

to the transport section. So it now looks like:

transport:
  ssh_key: ./kitchen.pem
  # need to get this key from vault, then place it on the kitchen ecs container
  connection_timeout: 10
  connection_retries: 5
  max_ssh_sessions: 1
  username: centos

Test Kitchen runs noticeably slower with this setting, but it works 100% of the time. What I think is happening is that Kitchen opens multiple SSH sessions to speed up installing the required tools, e.g. yum for ansible/git/whatever and /tmp/install.sh for Chef.

Hope this helps someone. It took me a little while to find out.

Tejas Pandya