5

I'm configuring a new SAN for a set of Ubuntu 18.04-based servers. Each of the nodes can mount the ext4-formatted partition fine.

Being new to both multipath and iSCSI, I'm not sure if what I'm seeing is "normal". I'm having two problems so far.

  1. When I create a file with touch, the other nodes do not see it. I'm used to some kind of delay with NFS-mounted drives, but the other nodes never see it (i.e., I'm still waiting, and I'd guess an hour has passed already).

  2. More worrying: when I run ls on a copied file, or du on the directory it is in, I get the error "Bad message". I looked around the Internet and it seems the solution is to unmount the drive and then use fsck to check it. That is, data corruption might have occurred. However, on the computer I copied the file with (i.e., computer A), the file is fine. When I ls it from another computer (i.e., computer B), I get this error.

In the management software of the SAN, I don't see any disk errors.

All of the servers and the SAN are connected to a single switch for a local network. They are physically close to each other -- they are in the same rack.

Are these two situations "normal"? If not, any suggestions on what I can do?

Dave M
Ray

3 Answers

7

That is normal behavior for a non-clustered file system.

To use an iSCSI SAN with Ubuntu compute servers, a clustered file system should be used.

You should probably learn more about GPFS, GFS2, Lustre, GlusterFS, and OCFS2, and use one of them on top of the iSCSI SAN.

Edit: A good description of what’s going on can be found here:

https://forums.starwindsoftware.com/viewtopic.php?f=5&t=1392

BaronSamedi1958
A.Newgate
  • Thanks a lot for your list of suggestions! I don't know what is best or most popular, but I tried to pick whichever seemed easiest to set up (i.e., there was documentation I could follow). I ended up choosing OCFS2 and it seems to be working now. I've used NFS + ext4 many times; I didn't realise a SAN would be different. Thanks for your reply! – Ray Jul 17 '20 at 03:53
4

ext is not a cluster-aware filesystem, so the moment a second node mounts it, it will be corrupted. This is because there's no common block-locking mechanism, which a cluster-aware filesystem does provide.
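
To make the failure mode concrete, here is a deliberately oversimplified toy model in Python. It is not how ext4 is implemented; `SharedDisk` and `Node` are hypothetical names, and the single "directory block" stands in for all of the metadata a real node keeps cached. The only assumption it illustrates is that each node trusts its own in-memory copy and nothing coordinates access to the shared blocks.

```python
class SharedDisk:
    """Stands in for the iSCSI LUN: blocks that every node can read and write."""

    def __init__(self):
        self.blocks = {"dir": []}  # one block holding the directory listing


class Node:
    """A host that mounted a non-cluster-aware filesystem (toy model)."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        # At mount time the node reads the directory once and then assumes
        # nobody else will ever change the disk underneath it.
        self.cached_dir = list(disk.blocks["dir"])

    def touch(self, filename):
        self.cached_dir.append(filename)
        # The node writes back *its own* view of the directory block,
        # with no lock and no check for changes made by other nodes.
        self.disk.blocks["dir"] = list(self.cached_dir)

    def ls(self):
        # Served from the local cache; the disk is not consulted again.
        return list(self.cached_dir)


disk = SharedDisk()
a = Node("A", disk)
b = Node("B", disk)

a.touch("from_a.txt")
print(b.ls())               # B still shows an empty directory

b.touch("from_b.txt")
print(disk.blocks["dir"])   # B's write-back has replaced A's update on disk
print(a.ls())               # A still believes its file is there
```

In this model node B never sees A's file, and B's later write silently discards A's change on the shared disk, which loosely corresponds to the two symptoms in the question. A cluster-aware filesystem adds the coordination that is missing here.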

Use a cluster-aware filesystem.

Chopper3
  • Oh? That was such a newbie mistake... So, what should I use? Do you have any suggestions for an Ubuntu system? I was looking at [here](https://en.wikipedia.org/wiki/Clustered_file_system) just now, but I'm not sure what to choose... – Ray Jul 16 '20 at 11:25
  • Though I don't know what I'm doing, I looked around and found some information about `ocfs2` with Ubuntu. I will give that a try... Thank you! – Ray Jul 16 '20 at 13:32
  • Sorry, was away all yesterday (26th wedding anniversary!) - I would have suggested OCFS2 too, good spot, it's quite easy to configure too :) – Chopper3 Jul 17 '20 at 06:57
  • No worries and happy anniversary! :-) Good to know I probably made the right decision. The list of choices was a bit daunting but it did seem easy to configure. Thanks a lot! – Ray Jul 18 '20 at 14:45
3

Um...

A SAN is not NFS. Unless you're using a shared/cluster filesystem, you can't just mount something ext4 onto multiple hosts.
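
If you want a quick sanity check on each host, something along these lines could flag a shared device that is mounted with a non-cluster filesystem. This is only a sketch: the function name, the device paths, and the `CLUSTER_AWARE` set are assumptions made for illustration (extend the set to match whatever cluster filesystem you deploy), and it expects text in the format of `/proc/mounts`.

```python
# Filesystem type names treated as cluster-aware (an assumption; extend as needed).
CLUSTER_AWARE = {"ocfs2", "gfs2"}


def unsafe_shared_mounts(mounts_text, shared_devices):
    """Return (device, mountpoint, fstype) for shared devices whose
    filesystem type is not in CLUSTER_AWARE."""
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype = fields[:3]
        if device in shared_devices and fstype not in CLUSTER_AWARE:
            flagged.append((device, mountpoint, fstype))
    return flagged


# Hypothetical sample; on a real host you would pass open("/proc/mounts").read().
sample = """\
/dev/mapper/mpatha1 /mnt/san ext4 rw,relatime 0 0
/dev/sda1 / ext4 rw,relatime 0 0
/dev/mapper/mpathb1 /mnt/shared ocfs2 rw,relatime 0 0
"""

result = unsafe_shared_mounts(
    sample, {"/dev/mapper/mpatha1", "/dev/mapper/mpathb1"}
)
print(result)
```

The local root disk is ignored because it is not in the set of shared devices; only the ext4 mount on the multipath device is reported.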

ewwhite
  • *yikes* I did not realise this. How embarrassing... Do you have a suggestion as to what I should consider for Ubuntu? – Ray Jul 16 '20 at 11:27