I need to copy a complex dataset (around 4 PB total, hundreds of millions of files) with rclone copy for a number of customers (each of them owns around 100 TB out of the 6 PB) to a set of disks, potentially over the SFTP protocol. We are electing not to go through a commercial cloud, to avoid potential network limits as well as egress/object rate limiting. We have a server with a 10 Gb/s connection doing an rclone copy from a WebDAV source to another server on the same local network, also at 10 Gb/s (the SFTP destination), which will ultimately host the customer-owned JBODs. We're breaking these datasets up, but some of the folders we transfer may contain upwards of 100k files in a single folder. File sizes range from a few KB to 1 TB.
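For a sense of scale, here is a rough back-of-envelope sketch of the per-customer transfer time at the figures above (100 TB per customer, 10 Gb/s link). The efficiency values are my own assumptions rather than measurements, units are decimal (1 TB = 10^12 bytes), and the estimate ignores per-file overhead, which will likely matter for the folders full of small files.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_tb: float, link_gbps: float, efficiency: float) -> float:
    """Estimate days needed to move size_tb terabytes over a link.

    efficiency is the assumed fraction of line rate actually achieved
    (a guess, not a measured value).
    """
    size_bits = size_tb * 1e12 * 8              # decimal terabytes to bits
    effective_bps = link_gbps * 1e9 * efficiency  # usable bits per second
    return size_bits / effective_bps / SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed efficiencies: ideal line rate, plus two lower guesses.
    for eff in (1.0, 0.7, 0.4):
        days = transfer_days(size_tb=100, link_gbps=10, efficiency=eff)
        print(f"{eff:.0%} of line rate: {days:.2f} days per 100 TB")
```

Even at full line rate this comes to a little under a day per 100 TB, so anything that roughly halves throughput (encryption, SATA write speed, small-file overhead) roughly doubles the time per customer.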
These copies are one-time only: each one goes to a set of disks for a single customer, which we will then courier to them.
My question is about how to set these disks up. Because they will be sent to a customer, they need to be secure, and therefore encrypted? However, I know this will probably hurt my performance. To save cost, it will be a JBOD setup rather than RAID, because we want to maximise usable disk space. We would do one customer at a time, so potentially a 100 TB+ JBOD on SATA. For encryption, I'm debating between ZFS native encryption and ext4 on LUKS, or potentially something else. I have used ext4 on LUKS in a previous setup, and while it's not as easy or out-of-the-box as ZFS, it is stable and reliable. Once the data is copied to these disks, they are shipped to the customer.
Has anyone done this before, and if so, do you have any suggestions or gotchas? Should I be leaning towards the ZFS setup rather than ext4? Is there a better way of doing this? Our focus is not just cost: data integrity comes first, with speed secondary. Any ideas would be appreciated. We are using rclone for this at the moment because we already use it for other archiving workflows, and we would like to stick with it if possible. Happy to provide any additional info if necessary. I know this is a loaded question, but I'm really just after general technical advice on the disk infrastructure we might need to stand up to enable the "encrypted disks shipped to the customer, to be mounted and used in their own on-premise Linux server" solution.
Appreciate it, Jane