4

I have an LTO-6 tape inserted:

tapeinfo -f /dev/st0
Product Type: Tape Drive
Vendor ID: 'QUANTUM '
Product ID: 'ULTRIUM 6       '
Revision: '4142'
Attached Changer API: No
SerialNumber: 'HU1322VW9U'
MinBlock: 1
MaxBlock: 16777215
SCSI ID: 0
SCSI LUN: 0
Ready: yes
BufferedMode: yes
Medium Type: Not Loaded
Density Code: 0x5a
BlockSize: 0
DataCompEnabled: yes
DataCompCapable: yes
DataDeCompEnabled: yes
CompType: 0x1
DeCompType: 0x1
BOP: yes
Block Position: 0
ActivePartition: 0
EarlyWarningSize: 0
NumPartitions: 0
MaxPartitions: 3

But when the backup reaches 2.27 TiB (the tape's compressed capacity is 6 TB), it exits with an error, as if the tape were not compressing the data:

2,27TiB 8:39:36 [75,6MiB/s] [                                                                        <=>                             ]
pv: write failed: No space left on device
error writing output file

I use tar for the backup, on Slackware 14.2:

tar cMpf - -X /etc/file.exclude / | openssl enc -e -aes256 -salt -pass file:filepass | (pv -p --timer --rate --bytes > /dev/st0)
elbarna

3 Answers

17

In your case it is the file level encryption that is preventing compression.

Encryption tries to make the data stream look as much like random noise as possible. Compression likewise increases the data's "density", which has the similar effect of limiting any further compression.
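To see the effect on a small scale, here is a minimal sketch that gzips the same data before and after encryption and compares the sizes. The sample data, the throwaway passphrase file, and the helper name `size_of` are assumptions for illustration; gzip stands in for the drive's hardware compression, which uses a different algorithm. If `openssl` is not installed, the sketch falls back to random bytes as a stand-in for ciphertext.

```shell
#!/bin/sh
# Sketch: gzip the same data before and after encryption and compare sizes.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical helper: size of a file in bytes, without padding spaces.
size_of() { wc -c < "$1" | tr -d ' '; }

# Repetitive sample data, a stand-in for real backup contents.
yes 'the quick brown fox jumps over the lazy dog' | head -n 20000 > "$workdir/plain"

# Throwaway passphrase file (illustrative only).
printf 'example-passphrase\n' > "$workdir/filepass"

if command -v openssl >/dev/null 2>&1; then
    # Same cipher as in the question; -aes-256-cbc is the explicit name for -aes256.
    # stderr is silenced only to hide key-derivation warnings in this demo.
    openssl enc -e -aes-256-cbc -salt -pass "file:$workdir/filepass" \
        -in "$workdir/plain" -out "$workdir/cipher" 2>/dev/null
else
    # Assumption: without openssl, random bytes approximate ciphertext.
    head -c "$(size_of "$workdir/plain")" /dev/urandom > "$workdir/cipher"
fi

gzip -c "$workdir/plain"  > "$workdir/plain.gz"
gzip -c "$workdir/cipher" > "$workdir/cipher.gz"

plain_size=$(size_of "$workdir/plain")
plain_gz_size=$(size_of "$workdir/plain.gz")
cipher_size=$(size_of "$workdir/cipher")
cipher_gz_size=$(size_of "$workdir/cipher.gz")

echo "plaintext:  $plain_size bytes, gzipped: $plain_gz_size bytes"
echo "ciphertext: $cipher_size bytes, gzipped: $cipher_gz_size bytes"
```

The plaintext should shrink dramatically, while the gzipped ciphertext should come out no smaller than the ciphertext itself.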

John Mahowald
  • 1
    Why doesn't tar ask me for a new tape? I use -M in the flags – elbarna Oct 12 '17 at 14:23
  • Of course, if you must encrypt in software for some reason, you can always compress in software before the encryption happens (at cost of some CPU usage). `lzop` should in theory give you similar compression ratios to what LTO compression does. The typical pipeline would be along the lines of `tar | compress | encrypt`. – Bob Oct 12 '17 at 16:19
  • 14
    @elbarna because `tar` doesn't know the tape has run out; it only knows that it's writing to a pipe, and the pipe broke. `pv` knows the tape filled up, but it doesn't have any code to handle that, so it reports the tape filled up and breaks the pipe. If you want tar to do tape handling, *you have to let it handle the tape*. – MadHatter Oct 12 '17 at 17:26
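The `tar | compress | encrypt` ordering from the comments above could look roughly like the sketch below. The function names `backup_stream` and `restore_list` and the demo directory are hypothetical, and gzip is swapped in for `lzop` only because it is more widely installed. `-M` is left out because, as noted above, tar cannot manage tape changes when it is writing to a pipe; the `-X` exclude list is omitted for brevity.

```shell
#!/bin/sh
# Sketch: compress first, then encrypt, then write to the output target.
set -eu

# backup_stream SRC_DIR PASSFILE OUT  (hypothetical helper)
#   SRC_DIR   directory to archive
#   PASSFILE  file holding the passphrase, as with -pass file:filepass
#   OUT       a regular file for testing, or a tape device such as /dev/st0
backup_stream() {
    tar cpf - -C "$1" . \
        | gzip -c \
        | openssl enc -e -aes-256-cbc -salt -pass "file:$2" \
        > "$3"
}

# restore_list PASSFILE IN: decrypt, decompress, and list the archive contents.
restore_list() {
    openssl enc -d -aes-256-cbc -pass "file:$1" -in "$2" | gzip -dc | tar tf -
}

# Small self-contained demo against a regular file instead of a tape.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/src"
yes 'compressible sample line' | head -n 20000 > "$demo/src/sample.txt"
printf 'example-passphrase\n' > "$demo/filepass"

backup_stream "$demo/src" "$demo/filepass" "$demo/backup.enc"
restore_list "$demo/filepass" "$demo/backup.enc"
```

For a real run the call would be something like `backup_stream / filepass /dev/st0`, with the compressor replaced by `lzop -c` if preferred. Because the data is already compressed when it reaches the drive, the drive's own compression adds nothing, and a single tape still holds at most its native capacity of compressed, encrypted output.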
5

Compression assumes it can work. tar files generally can not be compressed (they already are), so yes, you may end up not getting the "average compression ratio". Pure text files may compress a lot more. Compression targets are estimates.

BaronSamedi1958
TomTom
  • 33
    Bare `tar` files are *uncompressed*. That's why you often see `.tar.bz2` and friends. OP is encrypting plain uncompressed tar output, which destroys any hope of compression later in the chain. (AES-256-CBC is pretty close to a random function.) – user Oct 12 '17 at 14:55
  • 1
    Yeah, second piece of common knowledge: encryption has to create white-noise output or it is crackable. White-noise output is not compressible. Let your drive handle encryption. – TomTom Oct 12 '17 at 15:25
  • 3
    Just in case anyone was not aware of this, LTO and most tape drives auto-detect if compression is resulting in expansion and switch to uncompressed storage of data on the tape. So the limit for the encrypted data would be about 2.5 TB. – rcgldr Oct 12 '17 at 16:09
  • 6
    @TomTom: Consider editing your answer to remove the statement that tar files cannot be compressed - this is false and misleading; if you add the bit from your comment about encryption/noise it will be a much better answer :-) – psmears Oct 12 '17 at 16:41
  • 1
    @TomTom There is no problem with using OpenSSL to encrypt backups, and I would encourage its use. However, any compression needs to happen upstream of the encryption, as you noted. In fact, in many commercial-grade encryption products, compression is built into the pipeline to reduce the attack surface. – Aron Oct 13 '17 at 02:42
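The roughly 2.5 TB limit mentioned in the comments matches the 2,27TiB that `pv` reported in the question once the units are converted: tape capacities are quoted in decimal terabytes (10^12 bytes), while `pv` counts binary tebibytes (2^40 bytes). A quick check, with the variable names chosen here purely for illustration:

```shell
# Convert the LTO-6 native capacity from decimal TB to binary TiB.
native_tb=2.5   # LTO-6 native (uncompressed) capacity in TB, 10^12 bytes each
native_tib=$(LC_ALL=C awk -v tb="$native_tb" 'BEGIN { printf "%.2f", tb * 1e12 / 2^40 }')
echo "${native_tb} TB = ${native_tib} TiB"   # prints: 2.5 TB = 2.27 TiB
```

So the backup stopped exactly where an uncompressed tape would be full.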
2

Several kinds of files that are common nowadays will not compress well (achieving far less than the ~2.5:1 ratio the rated capacity assumes), even if they are stored in an uncompressed archive:

  • anything that is, at any level, compressed already, using any algorithm. This includes gzipped manual pages and documentation, some formats of app bundle, application plugin, or office document (which are pkzip containers internally), installers for software (these are, at their core, often self extracting archives - and often contain media files as described below).

  • As mentioned, anything encrypted

  • modern image and multimedia formats (anything more high-tech than BMP, uncompressed TIFF variants, or WAV audio). These use domain-specific data reduction methods that still result in data that behaves as if it had already been compressed with a format-agnostic method. Also, if these are embedded in other files (e.g. a TIFF or JPEG image embedded in a PDF, PostScript, or office document), they in turn make that file far less compressible than expected.

In some cases, trying to compress any of these can even result in a net increase in file size.

rackandboneman