8

Is there any library or code in Java to calculate the 32-bit CRC of a stream of bytes in a way that's consistent with the cksum command in Unix?

Kowshik

3 Answers

6

Jacksum: http://www.jonelo.de/java/jacksum/index.html

cksum         algorithm:   POSIX 1003.2 CRC algorithm
              length:      32 bits
              type:        crc
              since:       Jacksum 1.0.0
              comment:     - under BeOS    it is /bin/cksum
                           - under FreeBSD it is /usr/bin/cksum
                           - under HP-UX   it is /usr/bin/cksum and
                             /usr/bin/sum -p
                           - under IBM AIX it is /usr/bin/cksum
                           - under Linux   it is /usr/bin/cksum 

It is open source, released under the GPL.
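For anyone who prefers not to add a third-party dependency, the POSIX cksum algorithm is small enough to sketch directly. The following is a minimal, unoptimized illustration and not Jacksum's code. It assumes the usual description of the algorithm: polynomial 0x04C11DB7, bits processed most-significant first, initial value 0, the byte count appended least-significant byte first, and the result complemented. The class name PosixCksum is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Checksum;

// Hypothetical class name; a bitwise sketch of the POSIX cksum CRC.
public class PosixCksum implements Checksum {
    private static final int POLY = 0x04C11DB7;

    private int crc = 0;      // running CRC, initial value 0
    private long length = 0;  // number of bytes seen so far

    // Feed one byte into the CRC, most-significant bit first.
    private static int step(int crc, int b) {
        crc ^= (b & 0xFF) << 24;
        for (int bit = 0; bit < 8; bit++) {
            // crc < 0 means the top bit is set.
            crc = (crc < 0) ? (crc << 1) ^ POLY : (crc << 1);
        }
        return crc;
    }

    @Override
    public void update(int b) {
        crc = step(crc, b);
        length++;
    }

    @Override
    public void update(byte[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            update(buf[i]);
        }
    }

    @Override
    public long getValue() {
        // Work on a copy so getValue() can be called more than once.
        int c = crc;
        // Append the byte count, least-significant byte first.
        for (long n = length; n != 0; n >>>= 8) {
            c = step(c, (int) (n & 0xFF));
        }
        return ~c & 0xFFFFFFFFL;
    }

    @Override
    public void reset() {
        crc = 0;
        length = 0;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        PosixCksum cs = new PosixCksum();
        cs.update(data, 0, data.length);
        // Prints "930766865 9", the same as: printf 123456789 | cksum
        System.out.println(cs.getValue() + " " + data.length);
    }
}
```

Because the sketch implements java.util.zip.Checksum, it can be used wherever a CRC32 instance is accepted. A table-driven version would be faster for large inputs.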

palacsint
  • @palacsint: Is there a code/algo which I can use in my java program, since I do not want any 3rd party utility – AabinGunz Sep 04 '12 at 12:58
  • Fun fact, jonelo.jacksum.algorithm.Cksum has the same interface as Java's CRC32 class but gives same result as unix cksum. – Carlos Rendon Jun 24 '13 at 23:20
  • I would consult corporate counsel before using that library in company code. The license is GPL ("copyleft"), which means you have to provide the source code of your project code to the public. If it was the LGPL ("Lesser General Public License") you would not be compelled to provide access to your proprietary code. – Chris Wolf Sep 24 '21 at 00:23
2

Have you tried the CRC32 class?

http://download.oracle.com/javase/7/docs/api/java/util/zip/CRC32.html

This is the CRC-32 that gzip uses.

Peter Lawrey
  • I read online at several places that unix cksum's crc-32 algorithm is not the same as gzip's. I've not tested this, but it is likely to be true. Using 'Jacksum' (see above) works fine for me. – Kowshik Oct 13 '11 at 05:47
  • @Kowshik, I can verify that Java's CRC32 is NOT the same as /usr/bin/cksum – Carlos Rendon Jun 24 '13 at 23:25
  • @Kowshik, check my answer, please. – Sully Aug 09 '17 at 17:32
1

The cksum command on macOS allows selecting historic algorithms, and algorithm 3 is the same as java.util.zip.CRC32, as @RobertTupelo-Schneck pointed out. In my test, the more compact CheckedInputStream variant yielded a different checksum, possibly because some bytes were read from the underlying stream before it was wrapped.

e.g.

$ cksum -o 3 /bin/ls
4187574503 38704 /bin/ls

Same as:

package com.elsevier.hmsearch.util;

import static java.lang.System.out;

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class Demo {

  static final String FILE = "/bin/ls";

  public static void main(String[] args) throws Exception {
    Checksum cs = new CRC32();
    byte[] buffer = new byte[4096];
    long totalBytes = 0;

    // try-with-resources closes the stream even if a read fails.
    try (InputStream is = Files.newInputStream(Paths.get(FILE))) {
      int bytesRead;
      // read() returns -1 at end of stream, so an empty file yields 0 bytes.
      while ((bytesRead = is.read(buffer)) != -1) {
        cs.update(buffer, 0, bytesRead);
        totalBytes += bytesRead;
      }
    }
    // A CheckedInputStream wrapping the stream also works, provided every read
    // goes through the wrapper; bytes read directly from the underlying stream
    // bypass the checksum.

    // Same layout as cksum output: checksum, byte count, file name.
    out.printf("%d %d %s%n", cs.getValue(), totalBytes, FILE);
  }
}
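The CheckedInputStream mismatch mentioned above may come from reading part of the file from the underlying stream before the wrapper is in place, since only bytes read through the wrapper are added to the checksum. Here is a minimal sketch, with a hypothetical class name and helper method, in which every read goes through the CheckedInputStream. It compares the result with a CRC32 updated directly, using an in-memory test string instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical class and method names, for illustration only.
public class CheckedDemo {

    // Drains the stream through a CheckedInputStream and returns its CRC-32.
    static long crc32Of(InputStream in) throws IOException {
        try (CheckedInputStream cis = new CheckedInputStream(in, new CRC32())) {
            byte[] buffer = new byte[4096];
            // Every read goes through the wrapper, so every byte is counted.
            while (cis.read(buffer, 0, buffer.length) != -1) {
                // nothing to do; the wrapper updates the checksum
            }
            return cis.getChecksum().getValue();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        CRC32 direct = new CRC32();
        direct.update(data, 0, data.length);

        long viaStream = crc32Of(new ByteArrayInputStream(data));

        // Prints "3421780262 3421780262": both approaches agree.
        System.out.println(direct.getValue() + " " + viaStream);
    }
}
```

For a file, the same helper could be called with Files.newInputStream(Paths.get(path)), as long as nothing reads from that stream before it is wrapped.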
Chris Wolf