My Linux server hangs, and when I restart it, it takes around 2 hours to come up and resume normal operations.

I checked the system log files and found this message: "mounting fs with errors, running e2fsck is recommended".

I found one solution here, which says to use the commands below:

tune2fs -c 100 /dev/sdx1

tune2fs -i 90d /dev/sdx1

I also found another solution:

/etc/fstab file looks like this:

file system           dir           type         options              dump  pass

UUID=123-456-ABC-DEF   /             ext4     defaults,noatime        0       0

We need to change the pass value from 0 to 1 so that the disk can be checked and cleaned up at boot.

I'm really afraid, as it's a production box, and if something goes wrong I would have to reconfigure it.

What is the best approach? Any suggestions are welcome.

Manju Kb

2 Answers

Have you tried running e2fsck (or fsck)? It may help. I would also recommend checking the hardware status of your disks. Try installing smartmontools and checking your disk's error log. You can use this command:

smartctl --all /dev/sdx
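Since `--all` prints a lot of output, a narrower triage can be easier to read. The following is only a sketch: the function name `check_disk` is made up for illustration, `/dev/sdx` is a placeholder for your real device, and the attribute names in the `grep` are the ones commonly reported by SATA disks (other drive types may use different names).

```shell
#!/bin/sh
# Sketch only: quick SMART triage for a single disk.
# check_disk is a hypothetical helper name; /dev/sdx is a placeholder.
check_disk() {
    dev="$1"
    # Overall health self-assessment reported by the drive
    smartctl -H "$dev"
    # Only the drive's error log
    smartctl -l error "$dev"
    # Vendor attributes, filtered to those that often indicate failing media
    smartctl -A "$dev" | grep -Ei 'Reallocated_Sector|Current_Pending|Offline_Uncorrectable'
}

# Usage (as root): check_disk /dev/sdx
```

Non-zero raw values for the filtered attributes, or a failed health assessment, would suggest dealing with the disk before attempting any filesystem repair.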

And I strongly recommend making a backup before doing anything.

user2986553

Make sure your backups are intact and workable before making major changes to the host filesystem. The following could possibly be classified as "major changes".

You're able to mount this filesystem, so it's not that damaged. You should definitely set the "pass" field (the sixth column) in fstab to a non-zero value so your filesystems get checked at boot: 1 for the root filesystem, 2 for any others. EXT4 is a filesystem that requires offline maintenance like this every so often. Always set a pass value for it.
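To see which entries are currently skipped, something like the following could help. It is a minimal sketch: `list_unchecked_ext` is a hypothetical helper name, and it assumes a conventionally formatted, whitespace-separated fstab. It only reads the file and changes nothing.

```shell
#!/bin/sh
# Sketch only: print the mount points of ext2/3/4 entries in an fstab file
# whose pass field (6th column) is 0 or missing, i.e. never checked at boot.
# list_unchecked_ext is a hypothetical helper name.
list_unchecked_ext() {
    awk '$1 !~ /^#/ && NF >= 3 && $3 ~ /^ext[234]$/ && ($6 == "" || $6 == 0) { print $2 }' \
        "${1:-/etc/fstab}"
}

# Usage: list_unchecked_ext            (reads /etc/fstab)
#        list_unchecked_ext /path/to/copy-of-fstab
```

For the fstab line quoted in the question, this would print `/`, which is the entry whose last column needs to change from 0 to 1.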

However, before you check and repair the filesystem on /, make sure the underlying disk is healthy. smartctl is a great tool for that, as user2986553 already mentioned. If your underlying block device has problems, running a filesystem repair has the potential of causing more problems than it would normally fix.

Once you've ensured your disk is healthy, run an offline check on your root filesystem. You will have to reboot for this, as EXT4 cannot be repaired while online, and read-only checks on an active mount will provide unreliable results. The easiest way to ensure this happens on your next reboot is to create a file named "forcefsck" on the root of the filesystem.

Run this as root (the leading # is the root prompt, not part of the command):

# touch /forcefsck

Reboot, and you should see the check run. When it completes and mounts the filesystem, it should delete that "forcefsck" file. You may want to make sure it's gone and delete it if it's not.
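After the reboot, a quick verification might look like the sketch below. `verify_fsck` is a hypothetical helper name and `/dev/sdx1` is the placeholder device from the question. The fields filtered from `tune2fs -l` include "Maximum mount count" and "Check interval", which are the values that the `tune2fs -c` and `tune2fs -i` commands in the question would set.

```shell
#!/bin/sh
# Sketch only: confirm the forced check happened and clean up a stale flag file.
# verify_fsck is a hypothetical helper name; /dev/sdx1 is a placeholder.
verify_fsck() {
    dev="$1"
    flag="${2:-/forcefsck}"
    # The flag file should be gone after a successful check; remove it if not
    if [ -e "$flag" ]; then
        echo "stale flag file $flag found, removing it"
        rm -f -- "$flag"
    fi
    # Show the superblock fields related to filesystem checks
    tune2fs -l "$dev" | grep -E '^(Filesystem state|Last checked|Mount count|Maximum mount count|Check interval):'
}

# Usage (as root): verify_fsck /dev/sdx1
```

A "Filesystem state" of clean and a "Last checked" timestamp close to the reboot time would indicate that the check ran.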

This will very likely go well and fix any problems. EXT4 can be automatically repaired very easily, and can be fixed after some pretty impressive failures.

Spooler