I recently added two new compute nodes (cn003 and cn004) to an HPE cluster, but I am unable to SSH into them from the head node. The older nodes, cn001 and cn002, still accept logins without a password prompt, as the session below shows.

(base) [root@hn001 ~]# su harender
[harender@hn001 ~/14-project-HWIS-dynamics]
$ ssh cn001
Warning: Permanently added 'cn001,10.218.96.91' (ECDSA) to the list of known hosts.
Last login: Wed Jul  6 15:34:24 2022 from hn001.iit
[harender@cn001 ~/14-project-HWIS-dynamics]
$ exit
logout
Connection to cn001 closed.
[harender@hn001 ~/14-project-HWIS-dynamics]
$ ssh cn002
Warning: Permanently added 'cn002,10.218.96.90' (ECDSA) to the list of known hosts.
Last login: Wed Jul  6 13:11:22 2022 from hn001.iit
[harender@cn002 ~/14-project-HWIS-dynamics]
$ exit
logout
Connection to cn002 closed.
[harender@hn001 ~/14-project-HWIS-dynamics]
$ ssh cn003
Warning: Permanently added 'cn003,10.218.96.95' (ECDSA) to the list of known hosts.
harender@cn003's password:
Permission denied, please try again.
harender@cn003's password:
[harender@hn001 ~/14-project-HWIS-dynamics]
$ ssh cn004
Warning: Permanently added 'cn004,10.218.96.96' (ECDSA) to the list of known hosts.
harender@cn004's password:
Permission denied, please try again.
harender@cn004's password:
[harender@hn001 ~/14-project-HWIS-dynamics]
$

Secondly, when I check the home directories on compute node 1 (cn001), which we are able to SSH into, we find these entries.

[Home Directory on Compute Node 001]

[root@cn001 home]# ls -l
total 60
drwx------ 18 admin    admin    4096 Jun  9 15:12 admin
drwxr-xr-x 20 akshay   student  4096 Jul  5 23:45 akshay
drwx------  6 atul     atul     4096 Mar  9  2021 atul
drwx------ 21 harender student  4096 Jul  4 08:24 harender
drwx------  6 hemant   Faculty  4096 Aug 11  2021 hemant
drwx------  2 root     root    16384 Jan 28  2021 lost+found
drwx------ 12 monika   student  4096 Jul  4 12:32 monika
drwx------ 18 navneet  student  4096 Jul  6 12:27 navneet
drwx------  6 pbs      pbs      4096 Mar  3  2021 pbs
drwx------ 20 rohan    student  4096 May 22 13:08 rohan
drwx------ 19 shobhna  student  4096 Jun 27 12:41 shobhna
drwxr-xr-x  2 root     root     4096 Mar 11  2021 temp

When I check compute nodes 3 and 4, which we are unable to SSH into from the head node, the listing shows numeric UIDs and GIDs in place of the owner and group names.

[Home directory on ComputeNode 003]

[root@cn003 log]# ls -la /home/
total 68
drwxr-xr-x  14 root  root   4096 Jul  6 12:27 .
dr-xr-xr-x. 22 root  root   4096 May 23 13:04 ..
drwx------  18 admin admin  4096 Jun  9 15:12 admin
drwxr-xr-x  20  1007  1005  4096 Jul  5 23:45 akshay
drwx------   6  1003  1004  4096 Mar  9  2021 atul
drwx------  21  1004  1005  4096 Jul  4 08:24 harender
drwx------   6  1010  1006  4096 Aug 11  2021 hemant
drwx------   2 root  root  16384 Jan 28  2021 lost+found
drwx------  12  1006  1005  4096 Jul  4 12:32 monika
drwx------  18  1009  1005  4096 Jul  6 15:10 navneet
drwx------   6  1002  1002  4096 Mar  3  2021 pbs
drwx------  20  1008  1005  4096 May 22 13:08 rohan
drwx------  19  1005  1005  4096 Jul  6 14:46 shobhna
drwxr-xr-x   2 root  root   4096 Mar 11  2021 temp
[root@cn003 log]#
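
`ls` prints a numeric UID or GID when the node it runs on cannot map that ID to a name. A minimal sketch for listing every entry under /home that is affected, assuming Python 3 is available on the node; the function names `uid_name`, `gid_name` and `unresolved_owners` are illustrative and not part of any cluster tooling:

```python
import grp
import os
import pwd


def uid_name(uid):
    """Return the user name for uid, or None if this host cannot resolve it."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None


def gid_name(gid):
    """Return the group name for gid, or None if this host cannot resolve it."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return None


def unresolved_owners(path, user_lookup=uid_name, group_lookup=gid_name):
    """List (entry, uid, gid) for entries whose owner or group has no name here."""
    results = []
    for entry in sorted(os.listdir(path)):
        st = os.lstat(os.path.join(path, entry))
        if user_lookup(st.st_uid) is None or group_lookup(st.st_gid) is None:
            results.append((entry, st.st_uid, st.st_gid))
    return results


if __name__ == "__main__":
    # /home is the path from the listings above; skip quietly if it is absent.
    if os.path.isdir("/home"):
        for entry, uid, gid in unresolved_owners("/home"):
            print(f"{entry}: uid={uid} gid={gid} has no name on this node")
```

Running it as root on cn001 and on cn003 and comparing the output would show which accounts and groups are known to one node but not the other.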

This is the output of ssh -v cn003 when I tried to connect in verbose mode.

[atul@hn001 root]$ ssh -V
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips  26 Jan 2017
[atul@hn001 root]$ exit
exit
(base) [root@hn001 ~]# su atul
[atul@hn001 root]$ ssh -v cn003
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips  26 Jan 2017
debug1: Reading configuration data /home/atul/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 58: Applying options for *
debug1: Connecting to cn003 [10.218.96.95] port 22.
debug1: Connection established.
debug1: identity file /home/atul/.ssh/id_rsa type 1
debug1: key_load_public: No such file or directory
debug1: identity file /home/atul/.ssh/id_rsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/atul/.ssh/id_dsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/atul/.ssh/id_dsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/atul/.ssh/id_ecdsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/atul/.ssh/id_ecdsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/atul/.ssh/id_ed25519 type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/atul/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_7.4
debug1: Remote protocol version 2.0, remote software version OpenSSH_7.4
debug1: match: OpenSSH_7.4 pat OpenSSH* compat 0x04000000
debug1: Authenticating to cn003:22 as 'atul'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: curve25519-sha256 need=64 dh_need=64
debug1: kex: curve25519-sha256 need=64 dh_need=64
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:f7VqrIHzOBmUfewXGUgyzymd4auSxcrmCtGidMTBFD8
Warning: Permanently added 'cn003,10.218.96.95' (ECDSA) to the list of known hosts.
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<rsa-sha2-256,rsa-sha2-512>
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug1: Next authentication method: gssapi-keyex
debug1: No valid Key exchange context
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure.  Minor code may provide more information
No Kerberos credentials available (default cache: KEYRING:persistent:1003)
debug1: Unspecified GSS failure.  Minor code may provide more information
No Kerberos credentials available (default cache: KEYRING:persistent:1003)
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /home/atul/.ssh/id_rsa
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug1: Trying private key: /home/atul/.ssh/id_dsa
debug1: Trying private key: /home/atul/.ssh/id_ecdsa
debug1: Trying private key: /home/atul/.ssh/id_ed25519
debug1: Next authentication method: password
atul@cn003's password:
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
Permission denied, please try again.
atul@cn003's password:
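
The verbose trace is long, so it can help to reduce it to the authentication methods the client tried and the keys it offered. A rough sketch, assuming the debug1 line formats of OpenSSH 7.4 shown above (newer releases word the "Offering ... public key" line differently); `summarize_auth` is a hypothetical helper name:

```python
import re

# Patterns match the client-side debug1 lines as printed by OpenSSH 7.4.
NEXT_METHOD = re.compile(r"^debug1: Next authentication method: (\S+)")
OFFERED_KEY = re.compile(r"^debug1: Offering (\S+) public key: (\S+)")


def summarize_auth(log_text):
    """Return the methods tried, key files offered, and count of denials."""
    methods, offered_keys, denied = [], [], 0
    for raw in log_text.splitlines():
        line = raw.strip()
        match = NEXT_METHOD.match(line)
        if match:
            methods.append(match.group(1))
            continue
        match = OFFERED_KEY.match(line)
        if match:
            offered_keys.append(match.group(2))
            continue
        if line.startswith("Permission denied"):
            denied += 1
    return {"methods": methods, "offered_keys": offered_keys, "denied": denied}
```

Fed the trace above, this would report that gssapi-keyex, gssapi-with-mic, publickey and password were tried in that order, that /home/atul/.ssh/id_rsa was offered, and that the password attempt was denied once.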
  • What did HPE support say when you logged this case with them? Also 'HPE CLuster' doesn't really narrow down what actual products you're using, can you add more detail please. – Chopper3 Jul 06 '22 at 09:35
  • We are actually using open-source software, so we are unable to log a case. Could you look at the images I shared and see whether you can explain the unusual behavior of /home on the two nodes listed above? – Aditya Kaushal Jul 06 '22 at 09:47
  • Check the sshd logs on the problematic servers, run ssh with verbose output. – Gerald Schneider Jul 06 '22 at 10:08
  • And please, don't post screenshots of text you could copy&paste. Just copy&paste it. You are just making it harder to extract information. – Gerald Schneider Jul 06 '22 at 10:08
  • Also, provide more information about the nodes. Are they using a central user directory, are the homes local or network mounted, etc. – Gerald Schneider Jul 06 '22 at 10:11
  • Hi Gerald, /home is created on the head node and is mounted on all the compute nodes. I just ran ssh in verbose mode from the head node after switching to one of the users and have posted the debug output. After I enter the password, it says Permission denied. – Aditya Kaushal Jul 06 '22 at 10:18
  • Moreover, /home is mounted on all the nodes via NFS. – Aditya Kaushal Jul 06 '22 at 12:00
