We use a Ceph cluster to store the images of our virtual machines. The cluster has 3 monitors, 4 storage nodes and 1 admin node.
CEPH OSD TREE
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 21.82190 root default
-2 5.45547 host ceph01
0 1.09109 osd.0 up 1.00000 1.00000
1 1.09109 osd.1 up 1.00000 1.00000
2 1.09109 osd.2 up 1.00000 1.00000
3 1.09109 osd.3 up 1.00000 1.00000
4 1.09109 osd.4 up 1.00000 1.00000
-3 5.45547 host ceph02
5 1.09109 osd.5 up 1.00000 1.00000
6 1.09109 osd.6 up 1.00000 1.00000
7 1.09109 osd.7 up 1.00000 1.00000
8 1.09109 osd.8 up 1.00000 1.00000
9 1.09109 osd.9 up 1.00000 1.00000
-4 5.45547 host ceph03
10 1.09109 osd.10 up 1.00000 1.00000
11 1.09109 osd.11 up 1.00000 1.00000
12 1.09109 osd.12 up 1.00000 1.00000
13 1.09109 osd.13 up 1.00000 1.00000
14 1.09109 osd.14 up 1.00000 1.00000
-5 5.45547 host ceph04
16 1.09109 osd.16 down 0 1.00000
17 1.09109 osd.17 down 0 1.00000
18 1.09109 osd.18 down 0 1.00000
19 1.09109 osd.19 down 0 1.00000
15 1.09109 osd.15 down 0 1.00000
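For reference, here is a minimal Python sketch that pulls the down OSDs out of a pasted tree like the one above, grouped by host. It assumes the whitespace-separated column layout shown here (ID, weight, then either `host <name>` or `osd.N <status>`); the function name `down_osds_by_host` and the `SAMPLE` excerpt are only illustrative and are not part of any Ceph tooling.

```python
# Sketch only: parse plain-text `ceph osd tree` output as pasted above.
# Assumption: columns are whitespace-separated, in the order shown in this post.

def down_osds_by_host(tree_text):
    """Return {host: [osd names]} for every OSD whose status column is 'down'."""
    result = {}
    host = None
    for line in tree_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # title lines or blank lines
        if fields[2] == "host":
            host = fields[3]  # remember the host the following OSDs belong to
        elif fields[2].startswith("osd.") and fields[3] == "down":
            result.setdefault(host, []).append(fields[2])
    return result


# Shortened excerpt of the tree from this post (hypothetical test input).
SAMPLE = """\
-1 21.82190 root default
-4 5.45547 host ceph03
14 1.09109 osd.14 up 1.00000 1.00000
-5 5.45547 host ceph04
16 1.09109 osd.16 down 0 1.00000
15 1.09109 osd.15 down 0 1.00000
"""

print(down_osds_by_host(SAMPLE))
```

On the full tree above, this should report only ceph04, with all five of its OSDs.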
First, since the last CentOS update, we can't synchronize our 4th server (ceph04). The other servers had no problems after the update. We tried to sync with:
- nodown option
- Running VMs
- Stopped VMs
- Change HDD
- Change HDD slot
Does anyone have an idea or a lead for resynchronizing it?
Currently, we're considering a fresh installation of CentOS on server ceph04.
Secondly, we want to update the cluster. Is it possible to do this without disrupting the use of the cluster (with the VMs on)?
More info
- OS: CentOS Linux release 7.7.1908 (Core)
- Ceph version: 10.2.11
- Some machines generate heavy disk I/O.