I wrote code to save files into a Hadoop sequence file. The key is the file name and the value is the file's contents as a byte array. The output was a sequence file and a .crc file.

After that I tried to read from the sequence file, but I got a ChecksumException:

Exception in thread "main" org.apache.hadoop.fs.ChecksumException: Checksum error: file:/home/mosab/Desktop/output/ProcessWS/sequence.seq at 18873344
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:259)
    at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:276)
    at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:228)
    at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
    at java.io.DataInputStream.readFully(DataInputStream.java:195)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:70)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:120)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2436)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2335)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2381)
    at sequence.extractor.Extractor.main(Extractor.java:36)

I tried removing the .crc file and reading the sequence file again, but then I got an EOFException.

Any solution please?

Mosab Shaheen
  • Possible duplicate of [Checksum Exception when reading from or copying to hdfs in apache hadoop](https://stackoverflow.com/questions/15434709/checksum-exception-when-reading-from-or-copying-to-hdfs-in-apache-hadoop) – Eugene Lopatkin Oct 04 '18 at 07:52

1 Answer

Solution: The ChecksumException occurred because I forgot to close the sequence file writer after I finished writing/appending. Without the close, the sequence file on disk does not match its .crc file.
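To illustrate why a missing close matters, here is a minimal sketch that uses only plain `java.io`, not Hadoop. It is an analogy under stated assumptions and does not reproduce Hadoop's checksum file system: the class `UnclosedWriterDemo`, its methods, and the payload are all hypothetical names made up for the demo. It writes the same data through a buffered stream twice, once with the stream closed and once without, and compares what ends up on disk against a CRC32 of the intended data.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical demo class: plain java.io stands in for the Hadoop writer.
class UnclosedWriterDemo {
    static final int PAYLOAD_SIZE = 10_000;

    // CRC32 of a byte array; a stand-in for the checksum stored next to the data.
    static long crcOf(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    // Deterministic test data (assumption: any content would do).
    static byte[] payload() {
        byte[] data = new byte[PAYLOAD_SIZE];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        return data;
    }

    // Append in small chunks so the data passes through the stream's buffer.
    static void writeChunks(OutputStream out, byte[] data) throws IOException {
        for (int off = 0; off < data.length; off += 100) {
            out.write(data, off, Math.min(100, data.length - off));
        }
    }

    // Writes the payload and returns the bytes that are on disk right afterwards.
    static byte[] writeAndReadBack(boolean closeWriter) {
        try {
            Path file = Files.createTempFile("unclosed-writer-demo", ".bin");
            file.toFile().deleteOnExit();
            byte[] data = payload();
            if (closeWriter) {
                // try-with-resources closes the stream, which flushes the buffer.
                try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file))) {
                    writeChunks(out, data);
                }
            } else {
                // The mistake under study: never closed, so the tail stays in memory.
                OutputStream out = new BufferedOutputStream(Files.newOutputStream(file));
                writeChunks(out, data);
            }
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long expected = crcOf(payload());
        byte[] closed = writeAndReadBack(true);
        byte[] unclosed = writeAndReadBack(false);
        System.out.println("closed:   " + closed.length + " bytes, checksum matches: "
                + (crcOf(closed) == expected));
        System.out.println("unclosed: " + unclosed.length + " bytes, checksum matches: "
                + (crcOf(unclosed) == expected));
    }
}
```

In the Hadoop code itself, `SequenceFile.Writer` implements `java.io.Closeable`, so the same try-with-resources pattern (or a `close()` call in a `finally` block) should make sure the writer is closed once the last record has been appended.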

Mosab Shaheen