
I am using com.cloudera.crunch version: '0.3.0-3-cdh-5.2.1'.

I have a small program that reads some AVROs and filters out invalid data based on some criteria. I am using pipeline.write(PCollection, AvroFileTarget) to write the invalid data output. It works fine in a production run.

For unit testing this piece of code, I use a MemPipeline instance, but in that case it fails while writing the output.
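A minimal sketch of the kind of test described above (assuming JUnit 4; MemPipeline appears in the stack trace, but the other import paths, the collectionOf helper, the FilterFn body, and the output path are illustrative assumptions, not the exact code):

    import com.cloudera.crunch.FilterFn;
    import com.cloudera.crunch.PCollection;
    import com.cloudera.crunch.Pipeline;
    import com.cloudera.crunch.impl.mem.MemPipeline;
    import com.cloudera.crunch.io.avro.AvroFileTarget;
    import org.junit.Test;

    public class InvalidRecordFilterTest {

      @Test
      public void writesInvalidRecords() {
        Pipeline pipeline = MemPipeline.getInstance();

        // In-memory stand-in for the Avro input that the production job reads.
        PCollection<String> input = MemPipeline.collectionOf("good-record", "bad-record");

        // Keep only the records considered invalid (placeholder criterion).
        PCollection<String> invalid = input.filter(new FilterFn<String>() {
          @Override
          public boolean accept(String record) {
            return record.startsWith("bad");
          }
        });

        // Equivalent of pipeline.write(PCollection, AvroFileTarget) from the question;
        // under MemPipeline this is the call that ends in the UnsatisfiedLinkError.
        pipeline.write(invalid, new AvroFileTarget("target/test-output/invalid"));
        pipeline.done();
      }
    }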

I get this error:

java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V
    at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
    at org.apache.hadoop.util.NativeCrc32.calculateChunkedSumsByteArray(NativeCrc32.java:86)
    at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(DataChecksum.java:428)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:197)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:78)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:50)
    at java.io.DataOutputStream.writeBytes(DataOutputStream.java:276)
    at com.cloudera.crunch.impl.mem.MemPipeline.write(MemPipeline.java:159)

Any idea what's wrong?

Yogesh
  • I recall seeing some existing bugs in MemPipeline's AVRO handling; is your schema anything especially complex? Are you able to write any Avro records using that schema in a MemPipeline, or is it only the invalid records you're filtering out that throw this error? – Suriname0 Aug 22 '16 at 22:09
  • Hi, I am not able to write any records using MemPipeline. MemPipeline.write() always gives me this error. – Yogesh Aug 24 '16 at 03:24
  • It's probably a problem with your schema then. Try creating a simple test with a very basic Avro schema (e.g. a record with a single String field) and see if you are able to write materialized records to disk. If you can't, it's likely an issue with your dependencies; if you're using a tool like Maven, inspect the dependency tree and consider explicitly excluding some transitive dependencies that may be causing problems. – Suriname0 Aug 24 '16 at 14:57
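Following the suggestion in the comment above, a minimal sanity check could look like the sketch below: one GenericRecord with a single String field, written through MemPipeline. The Avros.generics/typedCollectionOf names, the schema, and the output path are assumptions for illustration and may differ in this Crunch version:

    import com.cloudera.crunch.PCollection;
    import com.cloudera.crunch.Pipeline;
    import com.cloudera.crunch.impl.mem.MemPipeline;
    import com.cloudera.crunch.io.avro.AvroFileTarget;
    import com.cloudera.crunch.type.avro.Avros;
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.junit.Test;

    public class MinimalAvroWriteTest {

      // A record with a single String field, per the comment's suggestion.
      private static final Schema SCHEMA = new Schema.Parser().parse(
          "{\"type\":\"record\",\"name\":\"Simple\","
          + "\"fields\":[{\"name\":\"value\",\"type\":\"string\"}]}");

      @Test
      public void writesOneRecord() {
        Pipeline pipeline = MemPipeline.getInstance();

        GenericData.Record record = new GenericData.Record(SCHEMA);
        record.put("value", "hello");

        // Attach the Avro PType to the in-memory collection.
        PCollection<GenericData.Record> data =
            MemPipeline.typedCollectionOf(Avros.generics(SCHEMA), record);

        // If even this fails, the problem is the environment/dependencies, not the schema.
        pipeline.write(data, new AvroFileTarget("target/test-output/simple"));
        pipeline.done();
      }
    }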

1 Answer


The HADOOP_HOME environment variable should be configured properly, and hadoop.dll and winutils.exe need to be available.

Also pass the JVM argument -Djava.library.path=HADOOP_HOME/lib/native when executing the MR job/application.
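For example, in a unit test you could point Hadoop at its home directory before the pipeline runs (a sketch; the C:\hadoop path is a placeholder for wherever hadoop.dll and winutils.exe live, and java.library.path itself still has to be passed as a JVM argument, e.g. -Djava.library.path=%HADOOP_HOME%\bin, because the JVM reads it at startup):

    import org.junit.BeforeClass;

    public class CrunchMemPipelineTestBase {

      @BeforeClass
      public static void configureHadoopHome() {
        // Hadoop's Shell utility checks the hadoop.home.dir system property
        // (falling back to the HADOOP_HOME environment variable) to locate winutils.exe.
        if (System.getProperty("hadoop.home.dir") == null) {
          System.setProperty("hadoop.home.dir", "C:\\hadoop");
        }
      }
    }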

isudarsan