We have been working on restoring our Ceph cluster after losing a large number of OSDs. All PGs are now active except for 80 that are stuck in the "incomplete" state. These PGs reference osd.8, which we removed two weeks ago due to corruption.

We would like to abandon the "incomplete" PGs, since they are not recoverable. We have tried the following:

  1. Per the docs, we made sure min_size on the corresponding pools was set to 1. This did not clear the condition.
  2. Ceph would not let us issue "ceph osd lost N" because osd.8 had already been removed from the cluster.
  3. We also tried "ceph pg force_create_pg X" on all of the PGs. The 80 PGs moved to "creating" for a few minutes, but then all went back to "incomplete". (Steps 1 and 3 are sketched as commands just after this list.)
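
For reference, here is roughly what steps 1 and 3 looked like (the pool name and PG id are placeholders for ours):

ceph osd pool set <pool> min_size 1     # step 1, run for each affected pool
ceph pg force_create_pg <pgid>          # step 3, run for each of the 80 PGs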

How do we abandon these PGs to allow recovery to continue? Is there some way to force individual PGs to be marked as "lost"?
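
We are also aware of "ceph pg <pgid> mark_unfound_lost revert|delete", but as far as we can tell that applies to unfound objects, not to whole PGs stuck incomplete. When we query one of the stuck PGs, the recovery_state section is where osd.8 keeps turning up (the PG id here is just an example):

ceph pg 1.f2 query    # in our case the "recovery_state" section still references osd.8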

To remove the OSD, we used the manual procedure documented here:

http://docs.ceph.com/docs/jewel/rados/operations/add-or-rm-osds/#removing-osds-manual

Basically:

ceph osd crush remove 8
ceph auth del osd.8
ceph osd rm 8

Some miscellaneous data below:

djakubiec@dev:~$ ceph osd lost 8 --yes-i-really-mean-it
osd.8 is not down or doesn't exist
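
One idea we have not tried: since "ceph osd create" hands back the lowest free OSD id, it might be possible to re-create id 8 just long enough to mark it lost (sketch only, untested):

ceph osd create                           # should return 8, the lowest free id
ceph osd lost 8 --yes-i-really-mean-it
ceph osd rm 8                             # remove the placeholder id again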


djakubiec@dev:~$ ceph osd tree
ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 58.19960 root default
-2  7.27489     host node24
 1  7.27489         osd.1        up  1.00000          1.00000
-3  7.27489     host node25
 2  7.27489         osd.2        up  1.00000          1.00000
-4  7.27489     host node26
 3  7.27489         osd.3        up  1.00000          1.00000
-5  7.27489     host node27
 4  7.27489         osd.4        up  1.00000          1.00000
-6  7.27489     host node28
 5  7.27489         osd.5        up  1.00000          1.00000
-7  7.27489     host node29
 6  7.27489         osd.6        up  1.00000          1.00000
-8  7.27539     host node30
 9  7.27539         osd.9        up  1.00000          1.00000
-9  7.27489     host node31
 7  7.27489         osd.7        up  1.00000          1.00000

But even though osd.8 no longer exists, we still see lots of references to OSD 8 in various ceph dumps and queries.

Interestingly, we also still see odd entries in the CRUSH map (should we do something about these?):

# devices
device 0 device0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 device8
device 9 osd.9
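
For context, that device list came from decompiling the CRUSH map. If the device8 placeholder needed to be edited out, the usual decompile/recompile round trip would apply (a sketch; file names are arbitrary):

ceph osd getcrushmap -o crush.bin       # grab the binary CRUSH map
crushtool -d crush.bin -o crush.txt     # decompile to the text shown above
# edit crush.txt, then recompile and inject it:
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new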

And for what it is worth, here is the output of ceph -s:

cluster 10d47013-8c2a-40c1-9b4a-214770414234
 health HEALTH_ERR
        212 pgs are stuck inactive for more than 300 seconds
        93 pgs backfill_wait
        1 pgs backfilling
        101 pgs degraded
        63 pgs down
        80 pgs incomplete
        89 pgs inconsistent
        4 pgs recovery_wait
        1 pgs repair
        132 pgs stale
        80 pgs stuck inactive
        132 pgs stuck stale
        103 pgs stuck unclean
        97 pgs undersized
        2 requests are blocked > 32 sec
        recovery 4394354/46343776 objects degraded (9.482%)
        recovery 4025310/46343776 objects misplaced (8.686%)
        2157 scrub errors
        mds cluster is degraded
 monmap e1: 3 mons at {core=10.0.1.249:6789/0,db=10.0.1.251:6789/0,dev=10.0.1.250:6789/0}
        election epoch 266, quorum 0,1,2 core,dev,db
  fsmap e3627: 1/1/1 up {0=core=up:replay}
 osdmap e4293: 8 osds: 8 up, 8 in; 144 remapped pgs
        flags sortbitwise
  pgmap v1866639: 744 pgs, 10 pools, 7668 GB data, 20673 kobjects
        8339 GB used, 51257 GB / 59596 GB avail
        4394354/46343776 objects degraded (9.482%)
        4025310/46343776 objects misplaced (8.686%)
             362 active+clean
             112 stale+active+clean
              89 active+undersized+degraded+remapped+wait_backfill
              66 active+clean+inconsistent
              63 down+incomplete
              19 stale+active+clean+inconsistent
              17 incomplete
               5 active+undersized+degraded+remapped
               4 active+recovery_wait+degraded
               2 active+undersized+degraded+remapped+inconsistent+wait_backfill
               1 stale+active+clean+scrubbing+deep+inconsistent+repair
               1 active+remapped+inconsistent+wait_backfill
               1 active+clean+scrubbing+deep
               1 active+remapped+wait_backfill
               1 active+undersized+degraded+remapped+backfilling
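
If it helps with diagnosis, the stuck PGs can be enumerated with the stock commands (a sketch; output omitted here):

ceph pg dump_stuck inactive     # includes the 80 incomplete PGs
ceph health detail              # shows each incomplete PG and its last acting set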

DanJ