
I recently encountered a disk failure, got a replacement disk, successfully replaced the failed drive, and completed the resilvering process. I was able to use the raidz2 pool without issues via Samba, but when I then went to clear the errors as ZFS prompted me to, the command "froze" with no output. I let it run overnight, and as it was still showing nothing in the morning, I tried to terminate the command, which also did nothing, so I just closed my PuTTY SSH connection. Since then, I have been unable to access the data or import the pool no matter what I tried.

The system runs Ubuntu (host name UBUNTU-SERVER-KVM), and I have a virtual Ubuntu installed on it (host name ubuntu-server) which usually accesses the ZFS data via Samba shares (I never figured out how to make the pool directly accessible inside the virtual machines). I suspect I may have done part of the ZFS work on the UBUNTU-SERVER-KVM host and part of it in the virtual Ubuntu installation. I fear this caused my issues, and I'm now unable to import the pool in any way. The host's zpool.cache shows the "old" pool setup, with the dead drive and not the new one, which seems to confirm that I mistakenly did the resilver from the virtual ubuntu-server machine instead of from the UBUNTU-SERVER-KVM host that usually had the zpool.

I have a feeling that if I could somehow correct the device paths the pool is trying to import from, my data would still be there, since the resilvering process completed and I was able to access the data afterwards. The virtual Ubuntu refers to its drives as "/dev/vd*", while the UBUNTU-SERVER-KVM host running the virtual machine shows them as "/dev/sd*".
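If the problem really is just the recorded device names, one idea I was considering (an untested sketch only; the /tmp/zfs-devs directory and the sd*-to-vd* mapping below are my guesses, inferred from the positional order in the two listings further down, and would need to be verified against the guids in the `zdb -l` output) is to recreate the vd* names the labels expect as symlinks to the host's actual sd* devices, then tell `zpool import -d` to scan only that directory:

```shell
# Untested sketch: map each /dev/vd* name from the labels onto the host's
# /dev/sd* devices. The pairing below is a guess from the order the drives
# appear in `zpool import` vs. the label dump; verify each one by guid first.
mkdir -p /tmp/zfs-devs
ln -sf /dev/sdc1 /tmp/zfs-devs/vdc1
ln -sf /dev/sdf1 /tmp/zfs-devs/vdf1
ln -sf /dev/sde1 /tmp/zfs-devs/vde1
ln -sf /dev/sda1 /tmp/zfs-devs/vdg1
ln -sf /dev/sdd1 /tmp/zfs-devs/vdd1
ln -sf /dev/sdb1 /tmp/zfs-devs/vdb1
# then scan only that directory (not run here, needs the real disks):
# sudo zpool import -d /tmp/zfs-devs tank
```

Whether `zpool import -d` would follow the symlinks and accept the renamed paths is part of what I'm hoping someone can confirm.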

Does anyone have any ideas for what I can do next to recover the data? Yes, I do have the most critical parts backed up in the cloud, but plenty of other things would be lost that I'd much rather weren't :)

Here's some info that's hopefully helpful:

import attempts

    user@UBUNTU-SERVER-KVM ~> sudo zpool import
       pool: tank
         id: 3866261861707315207
      state: FAULTED
     status: One or more devices were being resilvered.
     action: The pool cannot be imported due to damaged devices or data.
            The pool may be active on another system, but can be imported using
            the '-f' flag.
     config:

            tank                  FAULTED  corrupted data
              raidz2-0                DEGRADED
                sdc                   ONLINE
                sdf                   ONLINE
                replacing-2           DEGRADED
                  352909589583250342  OFFLINE
                  sde                 ONLINE
                sda                   ONLINE
                sdd                   ONLINE
                sdb                   ONLINE
    user@UBUNTU-SERVER-KVM ~> sudo zpool import -f tank
    cannot import 'tank': I/O error
            Destroy and re-create the pool from
            a backup source.
    user@UBUNTU-SERVER-KVM ~> sudo zpool import -d /dev/disk/by-id/ tank
    cannot import 'tank': I/O error
            Destroy and re-create the pool from
            a backup source.

zdb

     user@UBUNTU-SERVER-KVM ~> sudo zdb
        tank:
            version: 5000
            name: 'tank'
            state: 0
            txg: 15981041
            pool_guid: 3866261861707315207
            errata: 0
            hostname: 'UBUNTU-SERVER-KVM'
            vdev_children: 1
            vdev_tree:
                type: 'root'
                id: 0
                guid: 3866261861707315207
                children[0]:
                    type: 'raidz'
                    id: 0
                    guid: 2520364627045826300
                    nparity: 2
                    metaslab_array: 34
                    metaslab_shift: 38
                    ashift: 12
                    asize: 36006962135040
                    is_log: 0
                    create_txg: 4
                    children[0]:
                        type: 'disk'
                        id: 0
                        guid: 4891017201165304687
                        path: '/dev/disk/by-id/ata-WDC_WD60EFRX-68L0BN1_WD-***-part1'
                        whole_disk: 1
                        DTL: 241
                        create_txg: 4
                    children[1]:
                        type: 'disk'
                        id: 1
                        guid: 7457881130536207675
                        path: '/dev/disk/by-id/ata-WDC_WD60EFRX-68L0BN1_WD-***-part1'
                        whole_disk: 1
                        DTL: 240
                        create_txg: 4
                    children[2]:
                        type: 'disk'
                        id: 2
                        guid: 352909589583250342
                        path: '/dev/vde1'
                        whole_disk: 1
                        not_present: 1
                        DTL: 159
                        create_txg: 4
                    children[3]:
                        type: 'disk'
                        id: 3
                        guid: 10598130582029967766
                        path: '/dev/disk/by-id/ata-WDC_WD60EFRX-68L0BN1_WD-***-part1'
                        whole_disk: 1
                        DTL: 239
                        create_txg: 4
                    children[4]:
                        type: 'disk'
                        id: 4
                        guid: 1949004718048415909
                        path: '/dev/disk/by-id/ata-WDC_WD60EFRX-68L0BN1_WD-***-part1'
                        whole_disk: 1
                        DTL: 238
                        create_txg: 4
                    children[5]:
                        type: 'disk'
                        id: 5
                        guid: 13752847360965334531
                        path: '/dev/disk/by-id/ata-WDC_WD60EFRX-68L0BN1_WD-***-part1'
                        whole_disk: 1
                        DTL: 237
                        create_txg: 4
            features_for_read:
                com.delphix:hole_birth
                com.delphix:embedded_data

zdb -l /dev/sde1

 user@UBUNTU-SERVER-KVM ~> sudo zdb -l /dev/sde1
    --------------------------------------------
    LABEL 0
    --------------------------------------------
        version: 5000
        name: 'tank'
        state: 0
        txg: 15981229
        pool_guid: 3866261861707315207
        errata: 0
        hostname: 'ubuntu-server'
        top_guid: 2520364627045826300
        guid: 1885359927145031384
        vdev_children: 1
        vdev_tree:
            type: 'raidz'
            id: 0
            guid: 2520364627045826300
            nparity: 2
            metaslab_array: 34
            metaslab_shift: 38
            ashift: 12
            asize: 36006962135040
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 4891017201165304687
                path: '/dev/vdc1'
                whole_disk: 1
                DTL: 241
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 7457881130536207675
                path: '/dev/vdf1'
                whole_disk: 1
                DTL: 240
                create_txg: 4
            children[2]:
                type: 'replacing'
                id: 2
                guid: 9514120513744452300
                whole_disk: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 352909589583250342
                    path: '/dev/vde1/old'
                    whole_disk: 1
                    not_present: 1
                    DTL: 159
                    create_txg: 4
                    offline: 1
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 1885359927145031384
                    path: '/dev/vde1'
                    whole_disk: 1
                    DTL: 231
                    create_txg: 4
                    resilver_txg: 15981226
            children[3]:
                type: 'disk'
                id: 3
                guid: 10598130582029967766
                path: '/dev/vdg1'
                whole_disk: 1
                DTL: 239
                create_txg: 4
            children[4]:
                type: 'disk'
                id: 4
                guid: 1949004718048415909
                path: '/dev/vdd1'
                whole_disk: 1
                DTL: 238
                create_txg: 4
            children[5]:
                type: 'disk'
                id: 5
                guid: 13752847360965334531
                path: '/dev/vdb1'
                whole_disk: 1
                DTL: 237
                create_txg: 4
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data
    --------------------------------------------
    LABEL 1
    --------------------------------------------
        version: 5000
        name: 'tank'
        state: 0
        txg: 15981229
        pool_guid: 3866261861707315207
        errata: 0
        hostname: 'ubuntu-server'
        top_guid: 2520364627045826300
        guid: 1885359927145031384
        vdev_children: 1
        vdev_tree:
            type: 'raidz'
            id: 0
            guid: 2520364627045826300
            nparity: 2
            metaslab_array: 34
            metaslab_shift: 38
            ashift: 12
            asize: 36006962135040
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 4891017201165304687
                path: '/dev/vdc1'
                whole_disk: 1
                DTL: 241
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 7457881130536207675
                path: '/dev/vdf1'
                whole_disk: 1
                DTL: 240
                create_txg: 4
            children[2]:
                type: 'replacing'
                id: 2
                guid: 9514120513744452300
                whole_disk: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 352909589583250342
                    path: '/dev/vde1/old'
                    whole_disk: 1
                    not_present: 1
                    DTL: 159
                    create_txg: 4
                    offline: 1
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 1885359927145031384
                    path: '/dev/vde1'
                    whole_disk: 1
                    DTL: 231
                    create_txg: 4
                    resilver_txg: 15981226
            children[3]:
                type: 'disk'
                id: 3
                guid: 10598130582029967766
                path: '/dev/vdg1'
                whole_disk: 1
                DTL: 239
                create_txg: 4
            children[4]:
                type: 'disk'
                id: 4
                guid: 1949004718048415909
                path: '/dev/vdd1'
                whole_disk: 1
                DTL: 238
                create_txg: 4
            children[5]:
                type: 'disk'
                id: 5
                guid: 13752847360965334531
                path: '/dev/vdb1'
                whole_disk: 1
                DTL: 237
                create_txg: 4
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data
    --------------------------------------------
    LABEL 2
    --------------------------------------------
        version: 5000
        name: 'tank'
        state: 0
        txg: 15981229
        pool_guid: 3866261861707315207
        errata: 0
        hostname: 'ubuntu-server'
        top_guid: 2520364627045826300
        guid: 1885359927145031384
        vdev_children: 1
        vdev_tree:
            type: 'raidz'
            id: 0
            guid: 2520364627045826300
            nparity: 2
            metaslab_array: 34
            metaslab_shift: 38
            ashift: 12
            asize: 36006962135040
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 4891017201165304687
                path: '/dev/vdc1'
                whole_disk: 1
                DTL: 241
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 7457881130536207675
                path: '/dev/vdf1'
                whole_disk: 1
                DTL: 240
                create_txg: 4
            children[2]:
                type: 'replacing'
                id: 2
                guid: 9514120513744452300
                whole_disk: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 352909589583250342
                    path: '/dev/vde1/old'
                    whole_disk: 1
                    not_present: 1
                    DTL: 159
                    create_txg: 4
                    offline: 1
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 1885359927145031384
                    path: '/dev/vde1'
                    whole_disk: 1
                    DTL: 231
                    create_txg: 4
                    resilver_txg: 15981226
            children[3]:
                type: 'disk'
                id: 3
                guid: 10598130582029967766
                path: '/dev/vdg1'
                whole_disk: 1
                DTL: 239
                create_txg: 4
            children[4]:
                type: 'disk'
                id: 4
                guid: 1949004718048415909
                path: '/dev/vdd1'
                whole_disk: 1
                DTL: 238
                create_txg: 4
            children[5]:
                type: 'disk'
                id: 5
                guid: 13752847360965334531
                path: '/dev/vdb1'
                whole_disk: 1
                DTL: 237
                create_txg: 4
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data
    --------------------------------------------
    LABEL 3
    --------------------------------------------
        version: 5000
        name: 'tank'
        state: 0
        txg: 15981229
        pool_guid: 3866261861707315207
        errata: 0
        hostname: 'ubuntu-server'
        top_guid: 2520364627045826300
        guid: 1885359927145031384
        vdev_children: 1
        vdev_tree:
            type: 'raidz'
            id: 0
            guid: 2520364627045826300
            nparity: 2
            metaslab_array: 34
            metaslab_shift: 38
            ashift: 12
            asize: 36006962135040
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 4891017201165304687
                path: '/dev/vdc1'
                whole_disk: 1
                DTL: 241
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 7457881130536207675
                path: '/dev/vdf1'
                whole_disk: 1
                DTL: 240
                create_txg: 4
            children[2]:
                type: 'replacing'
                id: 2
                guid: 9514120513744452300
                whole_disk: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 352909589583250342
                    path: '/dev/vde1/old'
                    whole_disk: 1
                    not_present: 1
                    DTL: 159
                    create_txg: 4
                    offline: 1
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 1885359927145031384
                    path: '/dev/vde1'
                    whole_disk: 1
                    DTL: 231
                    create_txg: 4
                    resilver_txg: 15981226
            children[3]:
                type: 'disk'
                id: 3
                guid: 10598130582029967766
                path: '/dev/vdg1'
                whole_disk: 1
                DTL: 239
                create_txg: 4
            children[4]:
                type: 'disk'
                id: 4
                guid: 1949004718048415909
                path: '/dev/vdd1'
                whole_disk: 1
                DTL: 238
                create_txg: 4
            children[5]:
                type: 'disk'
                id: 5
                guid: 13752847360965334531
                path: '/dev/vdb1'
                whole_disk: 1
                DTL: 237
                create_txg: 4
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data

EDIT1: Using zpool import -fFV (the -V option appears to be undocumented; I was unable to determine exactly what it does), I was able to import the pool. However, zpool iostat shows an empty table, I cannot run a scrub as it says "pool is currently unavailable", zpool history returns nothing, "zdb -u tank" returns "zdb: can't open 'tank': Input/output error", and I cannot detach the old dead hard drive. The status action has also changed from "The pool cannot be imported due to damaged devices or data" to "Wait for the resilver to complete", but the resilver is running at "1/s" and the config list shows none of the drives as resilvering. This has been going on for several days now without any change in the resilver count or percentage.

user@ubuntu-server ~> sudo zpool import -fFV
   pool: tank
     id: 3866261861707315207
  state: FAULTED
 status: One or more devices were being resilvered.
 action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
 config:

        tank                  FAULTED  corrupted data
          raidz2-0                DEGRADED
            vdc                   ONLINE
            vdf                   ONLINE
            replacing-2           DEGRADED
              352909589583250342  OFFLINE
              vde                 ONLINE
            vdg                   ONLINE
            vdd                   ONLINE
            vdb                   ONLINE

user@ubuntu-server ~> sudo zpool import -fFV tank
user@ubuntu-server ~> sudo zpool status
  pool: tank
 state: FAULTED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Nov  1 21:02:15 2018
    31.0T scanned out of 31.0T at 1/s, (scan is slow, no estimated time)
    5.17T resilvered, 100.02% done
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                  FAULTED      0     0     1  corrupted data
          raidz2-0                DEGRADED     0     0     6
            vdc                   ONLINE       0     0     0
            vdf                   ONLINE       0     0     0
            replacing-2           DEGRADED     0     0     0
              352909589583250342  OFFLINE      0     0     0  was /dev/vde1/old
              vde                 ONLINE       0     0     0
            vdg                   ONLINE       0     0     0
            vdd                   ONLINE       0     0     0
            vdb                   ONLINE       0     0     0



user@ubuntu-server /tank> sudo zpool detach tank 352909589583250342
cannot open 'tank': pool is unavailable
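The other avenue I have been considering but have not yet tried (the choice of flags here is my assumption of what might apply, not something I've verified) is a read-only import, on the theory that it would keep ZFS from trying to resume the stuck resilver or write anything to the labels while I copy data off:

```shell
# Untested idea: import the pool read-only so nothing gets written while
# inspecting it. Guarded so this is a harmless no-op on a machine that
# lacks root or the ZFS tools (e.g. when pasted somewhere for discussion).
if command -v zpool >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    zpool import -o readonly=on -f tank || echo "read-only import also failed"
else
    echo "skipping: needs root and the zfs tools on the pool's host"
fi
```

If a read-only import succeeded, my plan would be to copy everything off before attempting any further repair.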
  • It appears you have some bizarre situation where a VM and the host are both trying to access the same zpool at once. This is probably the cause of the problem. Find the VM that is trying to mount the pool, and remove ZFS from it entirely. Then stop your VMs, restart the host, and go back to attempting to recover the pool. – Michael Hampton Nov 05 '18 at 22:46
  • I had already tried turning off the virtual Ubuntu installation without it helping, but dared not remove ZFS from it in case that would remove some data that could prove crucial to restoring the pool. Are you certain this wouldn't be the case? – Chris6647 Nov 05 '18 at 23:13
  • You need to completely uninstall ZFS from the VM, if you were also running it on the host with the same block devices. Otherwise you risk even more data loss than you have already experienced. Already it is somewhat unlikely you will be able to recover anything; if you don't do this, it's pretty much guaranteed. – Michael Hampton Nov 05 '18 at 23:17
  • @MichaelHampton - Unfortunately that did not seem to change anything :( I'm still unable to import it with a variety of different settings, and it still complains with I/O Error. Any other suggestions? – Chris6647 Nov 06 '18 at 18:43
  • Move on to restoring from backup, and now you've learned to never, ever attempt to have two machines writing to the same zpool at once. It only results in data loss. – Michael Hampton Nov 06 '18 at 19:02
