
Edit 1: Unlike what I originally thought when creating this question, the problem was NOT directly caused by the unplanned power-cycle, but by a mistake I may have made with beadm in the past, which only took effect on the first reboot since then.

This was (and still is) the actual core of the question:

I'd think that most of the changes since that snapshot OUGHT to still be somewhere on the disk. Is there any remote chance (short of writing my own ZFS reader for the raw device) of accessing specific files that were changed since that last snapshot?
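For reference, a hedged sketch of the standard way to reach old file versions without any raw-device tooling (the dataset names are assumptions taken from the `zfs list` output below): every ZFS snapshot is browsable read-only under the hidden `.zfs/snapshot` directory at the dataset's mountpoint. This only covers state *as of* a snapshot; changes made after the last snapshot are not reachable this way.

```shell
# Sketch: enumerate snapshots, then browse their read-only views.
# Guarded so the commands are simply skipped where zfs is not present.
show_snapshot_views() {
  mountpoint=$1
  if command -v zfs >/dev/null 2>&1; then
    zfs list -t snapshot -r rpool/ROOT      # candidate snapshots of the root BEs
    ls "$mountpoint/.zfs/snapshot/"         # one subdirectory per snapshot
  else
    echo "zfs not available on this system"
  fi
}
# Usage on the affected host: show_snapshot_views /
```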

Further info, as requested in the comments:

beadm list

BE Name          Flags Mountpoint Space   Policy Created          
---------------- ----- ---------- ------- ------ ---------------- 
11.4.23.69.3     NR    /          73.93G  static 2020-07-27 12:08 
solaris          -     -          95.53G  static 2020-05-23 19:35 
solaris-backup-1 -     -          374.41M static 2020-07-22 14:37 
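A sketch of how to read the Flags column above: N marks the BE that is active now, R the BE that will be active on the next reboot. When N and R sit on different BEs, the next boot silently switches the root filesystem. The small awk filter below (an illustration, not a beadm feature) extracts the BEs carrying either flag from saved `beadm list` output:

```shell
# Print name and flags of any BE flagged N (active now) or R (active on reboot).
# NR > 2 skips the header and the dashed separator line.
active_bes() {
  awk 'NR > 2 && $2 ~ /[NR]/ { print $1, $2 }'
}

active_bes <<'EOF'
BE Name          Flags Mountpoint Space   Policy Created
---------------- ----- ---------- ------- ------ ----------------
11.4.23.69.3     NR    /          73.93G  static 2020-07-27 12:08
solaris          -     -          95.53G  static 2020-05-23 19:35
solaris-backup-1 -     -          374.41M static 2020-07-22 14:37
EOF
# prints: 11.4.23.69.3 NR
```

Here both flags sit on 11.4.23.69.3, so now that the reboot has happened, the listing no longer shows the pre-reboot N/R split that caused the surprise.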

zfs list:

NAME                               USED  AVAIL  REFER  MOUNTPOINT
rpool                              382G   713G  73.5K  /rpool
rpool/ROOT                         170G   713G    31K  none
rpool/ROOT/11.4.23.69.3           75.5G   713G  63.3G  /
rpool/ROOT/11.4.23.69.3/var       3.12G   713G  1.14G  /var
rpool/ROOT/solaris                94.2G   713G   143G  /
rpool/ROOT/solaris-backup-1       98.9M   713G  48.5G  /
rpool/ROOT/solaris-backup-1/var      1K   713G  1.13G  /var
rpool/ROOT/solaris/var             503M   713G  1.29G  /var
rpool/VARSHARE                     102M   713G  24.7M  /var/share
rpool/VARSHARE/kvol               27.7M   713G    31K  /var/share/kvol
rpool/VARSHARE/kvol/dump_summary  1.22M   713G  1.02M  -
rpool/VARSHARE/kvol/ereports      10.2M   713G  10.0M  -
rpool/VARSHARE/kvol/kernel_log    16.2M   713G  16.0M  -
rpool/VARSHARE/pkg                  63K   713G    32K  /var/share/pkg
rpool/VARSHARE/pkg/repositories     31K   713G    31K  /var/share/pkg/repositories
rpool/VARSHARE/sstore             30.0M   713G  30.0M  /var/share/sstore/repo
rpool/VARSHARE/tmp                20.0M   713G  20.0M  /var/tmp
rpool/VARSHARE/zones                31K   713G    31K  /system/zones
rpool/dump                        63.1G   713G  63.1G  -
rpool/export                      20.5G   713G    32K  /export
rpool/export/home                 20.5G   713G  7.26G  /export/home
rpool/export/home/avl             9.30G   713G  9.30G  /export/home/avl

(most of that, except my homedir, is what the machine came with)

Only the root filesystem seems to have been rolled back; my homedir still has all the recent files.

According to the output of `df -kl`, the root filesystem is currently rpool/ROOT/11.4.23.69.3, which already corresponds to the newest BE available.

I also hope to learn from the answers what really might have caused the apparent rollback. No, I don't remember my exact beadm invocations; I only remember that I changed the BE for the next boot, but then changed it back to the current BE and didn't reboot, until that electric power failure.

Maybe the answers here will also save someone else later.

avl42
  • How did the system lose power? – ewwhite Nov 02 '20 at 13:19
  • You haven't provided any real details. What does "switched back to latest snapshot" really mean? ZFS doesn't "switch back" - a ZFS file system can be rolled back to the last snapshot but that requires an explicit command. You need to precisely explain what happened and provide actual details. – Andrew Henle Nov 02 '20 at 14:20
  • @ewwhite "lose power": maybe some main switch was manually turned off in the server room, or maybe a bug got fried and caused a short circuit that tripped some safety switch... I have no idea, and don't understand the point of the question. – avl42 Nov 02 '20 at 16:31
  • @AndrewHenle "a ZFS file system can be rolled back" - well, apparently exactly that happened, but automatically, once the electrical power was up again. I'd provide more details, if only I understood what kind of details are missing. – avl42 Nov 02 '20 at 16:34
  • @avl42 What filesystem was rolled back? What zpool is it in? What's the status of that zpool? What snapshots and clones are in the zfs filesystem that was rolled back? – Andrew Henle Nov 02 '20 at 21:52
  • @AndrewHenle it was the root-filesystem, and (as I know now) unfortunately, it did not only contain the OS... – avl42 Nov 03 '20 at 07:20

2 Answers


The data is discarded. You might recover some traces with special tools, but there is no reliable method you can count on.

BaronSamedi1958
  • 1
    But a rollback wouldn't happen just because of a power loss. There's a lot left out of this question. FWIW, I wouldn't be surprised if multiple boot environments are involved, and someone created those BEs with things like user's home directories in them... – Andrew Henle Nov 02 '20 at 15:06
  • I'd be interested in such "special tools". Is there something like a "zfs-doctor" that would allow me to extract newer (than the snapshot) versions of files by name, or that would create an entry in .zfs/snapshot for the latest state of each file, except for those that are indeed no longer physically accessible? – avl42 Nov 02 '20 at 16:39
  • @AndrewHenle, I checked "beadm list" as well, and it has the same-dated entries as the list of snapshots had. It is even possible that some "experimenting" with beadm in the past led to the rollback now on reboot, but is there a chance to recover at least single files from that rollback? In particular it is possible that I once tried selecting a different BE for next boot, but then changed the BE back to what it was before... maybe that second change did not really undo the first one, but caused now a rollback to whatever was the freshest snapshot back then... – avl42 Nov 02 '20 at 16:44
  • 2
    There’s a bunch of so-called “ZFS recovery tools” both commercial and open source. For example... https://www.klennet.com/zfs-recovery/ – BaronSamedi1958 Nov 02 '20 at 19:18
  • @avl42 What's the output from `beadm list`? When you boot to a new boot environment, you won't see any of the changes written to the old boot environment after you created the new one. **Anything** in a boot environment will be affected - so if you put your user's home directories or your database system's data storage files in a filesystem that's part of the boot environment, you won't be able to see those changes in the new boot environment. If you know what you're doing, boot environments are a ***HELL OF A LOT*** better than "`yum update -y` (dang I ***HOPE*** nothing breaks!!!)" – Andrew Henle Nov 02 '20 at 21:56

All my data is still there...

Some friendly (and patient) guy explained the concept of boot environments and snapshots to me.

Essentially, for someone like me with more of a Linux than Solaris background, these boot environments appear like "alternative devices to mount at /", with some magic to share common files (as snapshots do), and some further magic shared with the system-update tools so that upgrades actually go into a different boot environment.

I did a system update back on July 27th, and the new system was installed into a new boot environment, 11.4.... (see the `zfs list` output in the question text), but through all these months we were still running the boot environment named "solaris", because I never rebooted after the upgrade.

With the reboot after the recent power failure, the system then mounted the "11.4..." root filesystem, which of course had none of the latest changes from the "solaris" BE.

In the meantime I've already reconstructed most of the lost changes, but for those remaining changes that I couldn't reconstruct from memory, I mounted the previous boot environment "solaris" on /mnt: `zfs mount -o ro,mountpoint=/mnt rpool/ROOT/solaris`. And there they are, my lost changes...
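With the old BE mounted like that, one way to make sure no lost change is overlooked while copying things back by hand is to diff the mounted old root against the running one. A hedged sketch (the /mnt and /etc paths are assumptions, substitute whatever directories matter on your system):

```shell
# List files that differ between the old BE's root and the current root.
# diff -rq recurses and reports both changed files and files present on
# only one side; errors (e.g. unreadable special files) are suppressed.
changed_files() {
  old_root=$1
  new_root=$2
  diff -rq "$old_root" "$new_root" 2>/dev/null | sort
}
# Usage on the affected host, e.g.: changed_files /mnt/etc /etc
```

Anything the old side reports as differing or "Only in" is a candidate to copy back from /mnt before unmounting it again.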

Maybe this will help someone else later on.

Moral: ZFS seems a pretty safe haven for data after all -- unless perhaps I ever get into the situation I originally thought I was in.

avl42
  • 1
    This is the answer to the problem I really had, unlike the problem I thought I had. - Therefore it might not really count as an answer to my question. Please guide me as to what answer is more appropriate to accept as answer. – avl42 Nov 03 '20 at 21:44