I have successfully installed a Ceph (v15.2.6) cluster with the cephadm tool on several VM nodes (CentOS 7). On each node a separate block device, /dev/sdb, has been configured as a dedicated Ceph device. Up to that point all Ceph activities looked OK.

After some time I had to re-format /dev/sdb on the second VM node. The Ceph cluster picked up the re-formatted device, but I got the health warning CEPHADM_REFRESH_FAILED. After some investigation I found the following:

> ceph health detail

[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
    host node02 ceph-volume inventory failed: cephadm exited with an error code: 1, stderr:INFO:cephadm:/bin/docker:stderr  stderr: blkid: error: /dev/fd0: No such device or address
INFO:cephadm:/bin/docker:stderr -->  KeyError: 'ceph.cluster_name'
Traceback (most recent call last):
  File "<stdin>", line 5203, in <module>
  File "<stdin>", line 1115, in _infer_fsid
  File "<stdin>", line 1198, in _infer_image
  File "<stdin>", line 3321, in command_ceph_volume
  File "<stdin>", line 877, in call_throws
RuntimeError: Failed command: /bin/docker run --rm --net=host --ipc=host --privileged --group-add=disk -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15.2.6 -e NODE_NAME=node02 -v /var/run/ceph/b2f1a03e-07c4-11eb-9602-005056010012:/var/run/ceph:z -v /var/log/ceph/b2f1a03e-07c4-11eb-9602-005056010012:/var/log/ceph:z -v /var/lib/ceph/b2f1a03e-07c4-11eb-9602-005056010012/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm --entrypoint /usr/sbin/ceph-volume docker.io/ceph/ceph:v15.2.6 inventory --format=json

Manually running the failed command "docker run ... ceph-volume inventory --format=json" on node02 gave the following log output:

[2020-11-27 15:00:46,486][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 151, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/inventory/main.py", line 38, in main
    self.format_report(Devices())
  File "/usr/lib/python3.6/site-packages/ceph_volume/inventory/main.py", line 48, in format_report
    print(json.dumps(inventory.json_report()))
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 51, in json_report
    output.append(device.json_report())
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 208, in json_report
    output['lvs'] = [lv.report() for lv in self.lvs]
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 208, in <listcomp>
    output['lvs'] = [lv.report() for lv in self.lvs]
  File "/usr/lib/python3.6/site-packages/ceph_volume/api/lvm.py", line 789, in report
    'cluster_name': self.tags['ceph.cluster_name'],
KeyError: 'ceph.cluster_name'

But if I run "docker run ... ceph-volume inventory --format=plain", I get no error:


Device Path               Size         rotates available Model name
/dev/fd0                  4.00 KB      True    False     
/dev/rbd0                 20.00 GB     False   False     
/dev/rbd1                 20.00 GB     False   False     
/dev/rbd2                 8.00 GB      False   False     
/dev/sda                  120.00 GB    True    False     Virtual disk
/dev/sdb                  500.00 GB    True    False     Virtual disk

In my opinion there is definitely a bug somewhere in /usr/lib/python3.6/site-packages/ceph_volume/api/lvm.py in the docker.io/ceph/ceph:v15.2.6 image.
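For illustration, here is a minimal sketch of the failure mode I think the traceback points at (line 789 does a plain dict lookup of 'ceph.cluster_name'), and the defensive lookup I would expect to avoid it. The toy tags dict and the 'ceph' fallback value are my own assumptions, not the actual upstream code or fix:

# Toy reproduction of what report() in ceph_volume/api/lvm.py seems to hit:
# an LV whose tags no longer contain ceph.cluster_name blows up on a plain
# dict lookup, while a .get() with a fallback would not.
tags_from_reformatted_lv = {'ceph.osd_id': '2'}   # hypothetical tag set after the re-format

try:
    cluster = tags_from_reformatted_lv['ceph.cluster_name']   # what lvm.py line 789 does
except KeyError as err:
    print('reproduces the inventory failure:', err)

cluster = tags_from_reformatted_lv.get('ceph.cluster_name', 'ceph')  # defensive variant
print('with .get() the report would fall back to:', cluster)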

Does anybody know how to overcome the ceph-volume inventory --format=json problem?

Or maybe someone can suggest how to debug the Python script to catch the bug?
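The only check I have come up with so far is to list the LV tags on node02 and see which logical volume lost the ceph.cluster_name tag. lvs and its JSON report format are standard LVM; the little filtering script is just my own sketch:

#!/usr/bin/env python3
# List every LV whose lv_tags do not contain ceph.cluster_name.
# Run on node02 (or inside the ceph container, which sees the same VGs).
import json
import subprocess

out = subprocess.check_output(
    ['lvs', '--reportformat', 'json', '-o', 'lv_name,vg_name,lv_tags'],
    universal_newlines=True,
)
for lv in json.loads(out)['report'][0]['lv']:
    tags = lv['lv_tags'].split(',') if lv['lv_tags'] else []
    if not any(t.startswith('ceph.cluster_name=') for t in tags):
        print('LV without ceph.cluster_name tag:', lv['vg_name'] + '/' + lv['lv_name'])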

Thank you!
