Questions tagged [bzip2]

bzip2 is a Unix command used for compression and decompression of files. The main advantage of bzip2 is that it has a high compression ratio with reasonable speed.

bzip2 is one of the most widely used free compression programs for the terminal.

It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression than those techniques.

The current version is 1.0.6, released 20 Sept 2010.

327 questions
5
votes
0 answers

Unable to load bzip2 in native library of Hadoop

My environment is CentOS 7, Spark 1.6.1, and Hadoop 2.6.4, and I have two slave nodes in cluster mode. When I tried the hadoop command, I got WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes…
5
votes
1 answer

Bzip2 block header: 1AY&SY

This is a question about the bzip2 archive format. Any bzip2 archive consists of a file header, one or more blocks, and a tail structure. All blocks should start with "1AY&SY", 6 bytes of BCD-encoded digits of the number pi, 0x314159265359. According to…
osgx
  • 90,338
  • 53
  • 357
  • 513
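The header layout described in this excerpt can be checked with a small sketch. It assumes Python's standard `bz2` module and a made-up sample payload. Only the first block is byte-aligned, so the fixed slice below is not a general way to locate later blocks.

```python
import bz2

# Block magic quoted in the question: BCD digits of pi, 0x314159265359.
BLOCK_MAGIC = bytes.fromhex("314159265359")

# Hypothetical sample payload, compressed at level 9 (900k blocks).
compressed = bz2.compress(b"hello, bzip2", compresslevel=9)

stream_header = compressed[:4]        # b"BZh" followed by the level digit
first_block_magic = compressed[4:10]  # the first block starts right after it

print(stream_header)
print(first_block_magic)
```

The six magic bytes are the ASCII characters `1AY&SY`, which is why the block header shows up as readable text in a hex dump.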
5
votes
2 answers

What File Format Does PharData::extractTo Extract Files As?

I'm using the extractTo method of the PHP PharData class to examine the contents of a phar file and running into some strange results. I've reached the limits of my byte level detective work and was hoping someone here would be able to help me sort…
Alana Storm
  • 164,128
  • 91
  • 395
  • 599
4
votes
1 answer

How can I read from a corrupted tar.bz2 file in Python?

I have a program which saves its output to a tar.bz2 file as it works. I have a python script which processes that data. I'd like to be able to work with the output if the first program is interrupted — or just run the python script against it while…
mattdm
  • 2,082
  • 25
  • 39
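One possible approach to the situation in this excerpt is to decompress incrementally and keep whatever comes out before the data ends or turns invalid. The sketch below covers only the bzip2 layer, not the tar layer, and uses the standard `bz2.BZ2Decompressor`. The helper name `recover_prefix` and the random test data are hypothetical.

```python
import bz2
import random


def recover_prefix(damaged: bytes, chunk_size: int = 64 * 1024) -> bytes:
    """Return whatever decompresses cleanly from a truncated or corrupt stream."""
    decomp = bz2.BZ2Decompressor()
    recovered = []
    for start in range(0, len(damaged), chunk_size):
        if decomp.eof:
            break  # a complete stream has ended
        try:
            recovered.append(decomp.decompress(damaged[start:start + chunk_size]))
        except OSError:
            # Invalid data: keep what was decoded before the error.
            break
    return b"".join(recovered)


# Demo with made-up data: level 1 uses 100k blocks, so 400,000 random
# bytes span several blocks and a truncated file still holds whole ones.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(400_000))
compressed = bz2.compress(original, compresslevel=1)
truncated = compressed[: len(compressed) // 2]

recovered = recover_prefix(truncated)
print(len(original), len(recovered))
```

Because bzip2 works block by block, only fully written blocks can be recovered. A partially written final block yields nothing.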
4
votes
2 answers

How would I pipe data into bzip2 and get the resulting data from its stdout in C++ on Linux?

I am considering beginning work on a library for Linux that would provide a virtual file system to application developers where the files would be stored in an archive, and each file within the archive would be individually compressed so that…
coder543
  • 803
  • 6
  • 16
4
votes
1 answer

How does the ability to compress a stream affect a compression algorithm?

I recently backed up my soon-to-expire university home directory by sending it as a tar stream and compressing it on my end: ssh user@host "tar cf - my_dir/" | bzip2 > uni_backup.tar.bz2. This got me thinking: I only know the basics of how…
beta
  • 2,380
  • 21
  • 38
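The streaming behaviour this question asks about can be illustrated with an incremental compressor that never sees the whole input at once. This is a minimal sketch using the standard `bz2.BZ2Compressor`. The function name `compress_stream` and the sample chunks are hypothetical stand-ins for the tar stream in the excerpt.

```python
import bz2
from typing import Iterable, Iterator


def compress_stream(chunks: Iterable[bytes], level: int = 9) -> Iterator[bytes]:
    """Compress chunks as they arrive, without knowing the total size."""
    comp = bz2.BZ2Compressor(level)
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:  # the compressor may buffer input and return nothing yet
            yield out
    yield comp.flush()  # emit the final block and the stream trailer


# Made-up chunks standing in for data arriving over a pipe.
chunks = [b"tar header ", b"file contents " * 1000, b"trailer"]
compressed = b"".join(compress_stream(chunks))
print(len(compressed))
```

The compressor buffers input until a block fills up or `flush()` is called, so output can lag behind input by up to one block.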
4
votes
2 answers

tar: Error opening archive: Can't initialize filter; unable to run program "bzip2 -d"

I'm trying to run this code from https://github.com/pnnl/safekit using cmd on Windows 10. I already installed Python. When I type the command: tar -xjvf data_examples.tar.bz2 I keep getting the error: tar: Error opening archive: Can't…
user10842553
  • 41
  • 1
  • 3
4
votes
1 answer

Correctly building local python3, with bz2 support

I am trying to build a local version of python3 (specifically python3.7, but same issue with 3.6.6), but am running into problems with linking to some C libraries and/or headers (at least that is what I think the problem is). I am able to build…
djmac
  • 827
  • 5
  • 11
  • 27
4
votes
1 answer

Why is bzip2's maximum blocksize 900k?

bzip2 (i.e. this program by Julian Seward) lists available block sizes between 100k and 900k: $ bzip2 --help bzip2, a block-sorting file compressor. Version 1.0.6, 6-Sept-2010. usage: bzip2 [flags and input files in any order] -1 .. -9 …
saladi
  • 3,103
  • 6
  • 36
  • 61
4
votes
0 answers

Spark saveAsTextFile with BZip2Codec causing memory leak

I'm using Spark 2.1.0 on EMR 5.5.0 (java 1.8.0_121) with YARN as a resource manager. I've encountered an issue with Spark where after a few hours the executor gets killed by YARN after it has used up all of the configured container's physical…
drord
  • 41
  • 2
4
votes
1 answer

Call to undefined function bzdecompress PHP

I have an Ubuntu 16.04 server with PHP7 + nginx running. I already have a project in PHP Laravel 5.1 running in my local environment (Windows with Xampp) and everything is running great. I have a PHP script that uses the function bzdecompress of…
Sredny M Casanova
  • 4,735
  • 21
  • 70
  • 115
4
votes
4 answers

How do I handle a stream of data internal to a C-based app?

I am pulling data from a bzip2 stream within a C application. As chunks of data come out of the decompressor, they can be written to stdout: fwrite(buffer, 1, length, stdout); This works great. I get all the data when it is sent to stdout. Instead…
Alex Reynolds
  • 95,983
  • 54
  • 240
  • 345
4
votes
2 answers

How can we learn the size of uncompressed data of a bzip2 block?

bzip2 compresses the data in blocks, where each block starts with the magic number 1AY&SY. Can we determine the size of the uncompressed data behind each block? One way to do this is to decompress the bzip2 file block-by-block and then find the size of each…
Zeeshan
  • 539
  • 4
  • 19
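As a starting point for block-by-block work like this, the sketch below locates candidate block starts by scanning for the 48-bit magic at every bit offset, since blocks after the first are generally not byte-aligned. It does not compute uncompressed sizes. The helper name `find_block_offsets` and the test data are hypothetical, and a match could in principle be a chance occurrence inside compressed data.

```python
import bz2
import random
from typing import List

# The block magic 0x314159265359 ("1AY&SY") as a 48-character bit string.
BLOCK_MAGIC_BITS = format(0x314159265359, "048b")


def find_block_offsets(compressed: bytes) -> List[int]:
    """Return bit offsets at which the block magic appears."""
    # Render the whole file as a bit string, keeping leading zero bits.
    bits = format(int.from_bytes(compressed, "big"), f"0{len(compressed) * 8}b")
    offsets = []
    pos = bits.find(BLOCK_MAGIC_BITS)
    while pos != -1:
        offsets.append(pos)
        pos = bits.find(BLOCK_MAGIC_BITS, pos + 1)
    return offsets


# Made-up data: 400,000 random bytes at level 1 (100k blocks).
rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(400_000))
offsets = find_block_offsets(bz2.compress(data, compresslevel=1))
print(offsets)
```

The bit-string conversion keeps the sketch short but uses roughly eight times the file size in memory, so it is only suitable for small inputs.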
4
votes
2 answers

GoLang: Decompress bz2 in one goroutine, consume in other goroutine

I am a new-grad SWE learning Go (and loving it). I am building a parser for Wikipedia dump files - basically a huge bzip2-compressed XML file (~50GB uncompressed). I want to do both streaming decompression and parsing, which sounds simple enough.…
Manuel Menzella
  • 417
  • 1
  • 4
  • 9
4
votes
1 answer

Boost 1.59 not decompressing all bzip2 streams

I've been trying to decompress some .bz2 files on the fly, line by line so to speak, because the files I'm dealing with are massive when uncompressed (in the region of 100 GB), so I wanted to add a solution that saves disk space. I have no problems…
Primalfido
  • 53
  • 4
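The title suggests a file made of several concatenated bzip2 streams, where a decoder that stops at the first end-of-stream marker returns only part of the data. The sketch below shows that effect and one way around it in Python rather than Boost, using the standard `bz2.BZ2Decompressor`. The helper name `decompress_all_streams` and the sample payloads are hypothetical.

```python
import bz2


def decompress_all_streams(data: bytes) -> bytes:
    """Decompress every concatenated bzip2 stream in data."""
    parts = []
    while data:
        decomp = bz2.BZ2Decompressor()  # one decompressor per stream
        parts.append(decomp.decompress(data))
        if not decomp.eof:
            break  # the last stream is incomplete
        data = decomp.unused_data  # bytes after the end of this stream
    return b"".join(parts)


# Two independently compressed streams, concatenated into one file image.
two_streams = bz2.compress(b"first stream\n") + bz2.compress(b"second stream\n")

single = bz2.BZ2Decompressor().decompress(two_streams)  # stops after stream 1
everything = decompress_all_streams(two_streams)

print(single)
print(everything)
```

In Python, `bz2.decompress` and `bz2.open` already handle concatenated streams. The explicit loop is shown only to make the stream boundary visible.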