Questions tagged [lzo]

LZO is a lossless data compression algorithm from the Lempel-Ziv family which favours speed over compression ratio.

LZO is a data compression library suitable for real-time data compression and decompression. This means it favours speed over compression ratio.

LZO is written in ANSI C. Both the source code and the compressed data format are designed to be portable across platforms.

LZO implements a number of algorithms with the following features:

- Decompression is simple and very fast, and requires no memory.
- Compression is pretty fast and requires 64 kB of memory.
- You can dial up extra compression at a speed cost in the compressor; the speed of the decompressor is not reduced.
- Includes compression levels for generating pre-compressed data which achieve a quite competitive compression ratio.
- There is also a compression level which needs only 8 kB for compression.
- The algorithm is thread safe.
- The algorithm is lossless.
- LZO supports overlapping compression and in-place decompression.

LZO and the LZO algorithms and implementations are distributed under the terms of the GNU General Public License (GPL).
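The lossless round trip described above can be sketched with the third-party python-lzo binding. This is a minimal sketch, assuming the package exposes a `lzo` module with one-shot `compress`/`decompress` calls; the sketch degrades to a pass-through when the binding is not installed.

```python
# Minimal round-trip sketch using the third-party python-lzo binding.
# The module name `lzo` and its one-shot compress/decompress calls are
# assumptions based on the python-lzo package; if the binding is absent,
# the helper falls back to returning the input unchanged.
try:
    import lzo
except ImportError:
    lzo = None

def lzo_roundtrip(data: bytes) -> bytes:
    """Compress then decompress `data`; identity when the binding is missing."""
    if lzo is None:
        return data
    return lzo.decompress(lzo.compress(data))

sample = b"LZO favours speed over compression ratio. " * 10
assert lzo_roundtrip(sample) == sample  # lossless: output matches input
```

Because LZO is lossless, the round trip must reproduce the input byte-for-byte, which is what the final assertion checks.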

122 questions
1
vote
0 answers

How to install and load the native GPL library gplcompression on Mac in a Java application?

I have a Dropwizard application that uses the hadoop-lzo library to decompress files compressed with LZO. When I use it, it shows this error: ERROR [2017-01-11 09:00:21,887] com.hadoop.compression.lzo.GPLNativeCodeLoader: Could not load…
edwise
  • 869
  • 2
  • 9
  • 18
1
vote
1 answer

How to compress Twitter streaming using LZO in Linux/Python/Tweepy environment?

I'm receiving huge amounts of data streaming from Twitter using Tweepy (a Python Twitter API library). What I want to do is compress the stream of received tweets and store them in a file. The compression must be LZO, and I don't want to use Linux…
Ash
  • 3,428
  • 1
  • 34
  • 44
1
vote
1 answer

Apache Avro in File Processing

What is the use of Apache Avro in file processing? Can anybody explain whether it is useful if I need to process TBs of data in .LZO format? I have a choice between C++ and Java; which will fit better with Avro? My real purpose is to read…
1
vote
1 answer

How to decompress bytes in Python using lzo-1.0.8 (Python 2.7.9)?

I have a compressed byte array received from the network, and it is LZO-compressed. I need to decompress it using LZO. I already installed the python-lzo-1.0.8 package for Python and checked in the Python shell that it's properly installed, but I cannot find…
Biswarup Dass
  • 193
  • 1
  • 5
  • 19
1
vote
1 answer

Is there a Scalding source I can use for lzo-compressed binary data?

I am writing serialized Thrift records to a file using Elephant Bird's splittable LZO compression. To achieve this I am using their ThriftBlockWriter class. My Scalding job then uses the FixedPathLzoThrift source to process the records. This all…
fblundun
  • 987
  • 7
  • 19
1
vote
1 answer

LZO codec difference between Python and Java

I am running into a strange problem: I fail to inflate/decompress LZO-compressed data in Java that was deflated/compressed by the Python lzo module, although both seem to be using the same native LZO codec implementation. To give more details, I am…
user352951
  • 271
  • 1
  • 5
  • 11
1
vote
1 answer

LZO-Compress and Indexing Files on HDFS In-place?

Normally I'll do the following to use LZO: use the lzop command to compress the data file on local disk, put it into HDFS, then use the distributed LZO indexer to generate the .index files. I'm wondering whether there is a way to compress and index the raw files on…
Jerry
  • 789
  • 1
  • 13
  • 31
1
vote
1 answer

Faunus test failed on com.hadoop.compression.lzo.LzoCodec not found, HDP 1.3

I installed Faunus 0.32 on HDP 1.3. When I follow the getting-started test case in https://github.com/thinkaurelius/faunus/wiki/Getting-Started, I get the following errors: gremlin> g = FaunusFactory.open('bin/faunus.properties') …
yzhang
  • 165
  • 1
  • 12
1
vote
1 answer

How can I tell if a file is lzop or lzma?

I have files compressed with lzop and others with lzma, without proper filename extensions, e.g. "filename" instead of filename.lzo or filename.lzma. How can I tell which format each one is compressed in?
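For the format-sniffing question above, a stdlib-only heuristic might look like the sketch below: lzop files begin with a fixed magic sequence, while the legacy .lzma ("alone") container starts with a properties byte followed by a 4-byte dictionary size. The magic bytes and the properties/dictionary checks are assumptions taken from the published format descriptions, and the .lzma branch is a heuristic, not an authoritative detector.

```python
import lzma  # stdlib; used below only to produce a real .lzma header for the demo

# lzop's file magic, as documented for the lzop file format.
LZOP_MAGIC = b"\x89LZO\x00\r\n\x1a\n"

def sniff(header: bytes) -> str:
    """Heuristically guess the compression format from a file's first bytes."""
    if header.startswith(LZOP_MAGIC):
        return "lzop"
    # Legacy .lzma ("alone") container: a properties byte < 225 followed by a
    # 4-byte little-endian dictionary size, normally zero or a power of two.
    # This can false-positive on other binary data, so treat it as a guess.
    if len(header) >= 5 and header[0] < 225:
        dict_size = int.from_bytes(header[1:5], "little")
        if dict_size == 0 or dict_size & (dict_size - 1) == 0:
            return "lzma"
    return "unknown"

demo = lzma.compress(b"some streamed data", format=lzma.FORMAT_ALONE)
print(sniff(demo[:16]))                 # lzma
print(sniff(LZOP_MAGIC + b"\x10\x20"))  # lzop
```

In practice you would read the first dozen bytes of each extensionless file with `open(path, "rb").read(16)` and pass them to `sniff`; anything reported "unknown" needs a closer look (e.g. with `file(1)`).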
Sam Wong
  • 11
  • 2
  • 6
1
vote
1 answer

Pig Elephant-Bird Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

I am running Hadoop 2.0 with CDH4 and built the elephant-bird library with Oracle Java 1.6 r31. My Pig script: register elephant-bird-2.2.3.jar log = load 'loggy.log.lzo' using com.twitter.elephantbird.pig.store.LzoPigStorage(' '); limited = limit log…
Carl Sagan
  • 982
  • 1
  • 13
  • 34
1
vote
0 answers

Indexed .lzo log files performing slower than .gz compression

I have some log files compressed with LZO at level 7 and with gzip at its default compression level, and my results are as follows: MapReduce job over a 1 GB .gz file: 340 seconds; a 1 GB .lzo file, un-indexed: 410 seconds; a 1 GB .lzo file, indexed: 380 seconds. The…
Carl Sagan
  • 982
  • 1
  • 13
  • 34
1
vote
1 answer

Building Java Project with Hadoop-LZO but cannot find class

I'm trying to build a simple WordCount jar project which uses the Hadoop-LZO library, but I cannot seem to get the following command to work, even though the class I'm referencing is on the hadoop classpath: $ javac -cp `hadoop classpath`…
Carl Sagan
  • 982
  • 1
  • 13
  • 34
1
vote
0 answers

Hadoop lzopCodec pack

I'm trying to create a simple map-reduce example. Here's my code: public class Main { public static void main(String... args) throws IOException, ClassNotFoundException, InterruptedException { Job job = new Job(); …
user2281439
  • 673
  • 2
  • 11
  • 19
1
vote
0 answers

Manually splitting and compressing input for Amazon EMR

Instead of using hadoop-lzo to index my LZO input file, I decided to simply split it into chunks which, compressed with LZO, would be close to 128 MB (since that is the default block size on the Amazon distribution[1]). Is there anything wrong (from a cluster…
spacemonkey
  • 19,664
  • 14
  • 42
  • 62
1
vote
0 answers

Creating a Sequence File outside Java Hadoop Framework

I have been experimenting with generating sequence files for Hadoop outside the Java framework, in Python to be specific. There is a python-hadoop module which provides a mostly similar framework for doing this. I have successfully created sequence files…
Taro Sato
  • 1,444
  • 1
  • 15
  • 19