I have a binary file that uses sets of 4 chars to form integers that have "meaning". For example, the 4 bytes '0005' equal 808464437 in integer form.

I'd rather represent these in my Java code as '0005' rather than 808464437. I also don't want to set constants like final int V0005 = 808464437.

In C++, I can do the following:

fileStream->read((byte*) &version, 4); //Get the next bytes and stuff them into the uint32 version

switch (version) {
    case '0005':
        //do something
    case '0006':
        //do something
    case '0007':
        //do something
}

In Java, my problem isn't reading the 4 bytes and putting them into some data type. My problem is comparing some data type to the const char array '0005'.

How do I compare an int, or whatever form, to a const character array like '0005' in Java? I need to do this as efficiently as possible!

crush
  • "I'd rather represent these in my Java code as '0005' rather than 808464437. I also don't want to set constants like final int V0005 = 808464437." Either those are contradictory sentences, or I don't understand what you mean. Or perhaps you need to look at Java's Enums? – Marvo Feb 22 '13 at 19:42
  • The only efficient way is to use an int, 0x30303035, or maybe 0x0005. – Joop Eggen Feb 22 '13 at 19:42
  • @JoopEggen '0005' != 0x0005. '0005' is ascii. – crush Feb 22 '13 at 19:49
  • Though 0x30 == '0' and if only having digits, make a `short convertBytesToBCD(byte[])` – Joop Eggen Feb 22 '13 at 19:51
  • '0005' is an int that is equivalent to: `(0x30 << 24) + (0x30 << 16) + (0x30 << 8) + 0x35` – crush Feb 22 '13 at 19:53
  • @Marvo can you explain about enums? Thanks. – crush Feb 22 '13 at 19:58
  • http://docs.oracle.com/javase/tutorial/java/javaOO/enum.html – Marvo Feb 22 '13 at 20:00
  • @Marvo, that doesn't seem like it would be much different than setting `static final int V0005 = 808464437`. Instead I'd have an Enum like `public enum FileVersion { V0005(808464437) }`? – crush Feb 22 '13 at 20:03
  • Perhaps. I wasn't clear on what you wanted when I mentioned that. But Java enums do provide some encapsulation of the data and provide some logic to go with it. Personally, I'm not excited about using character literals in my code as you're proposing, so I'd be searching for such a solution. – Marvo Feb 22 '13 at 20:30
  • If you're using this as part of versioning something (packets, perhaps) then you'd have a finite number of versions, and something like enums would work well for that. – Marvo Feb 22 '13 at 20:32
  • This is an archaic file format used by Sony. I'm not sure which versions exist. I've so far only encountered '0005' and '0006'. The file type is defined by the first four bytes, which equal 'TREE'. If you are familiar, yes, this is the TRE file archive Sony used for patches, which I have successfully parsed in C++. Just trying to make a Java equivalent. I'll just stick with `public static final int` after all. Thanks everyone. – crush Feb 22 '13 at 20:37

7 Answers

Thanks for explaining your answer.

Here's how to convert from 4 bytes to an int, which you can then compare to the int in question:

    int v0005 = ByteBuffer.wrap("0005".getBytes()).asIntBuffer().get();

This is unfortunately the only way I can see to do it... probably not as efficient as you want, but perhaps it's what you need nonetheless.

I would suggest setting up some of these as 'constants' in one of your classes like so:

public static final int V0005 = ByteBuffer.wrap("0005".getBytes()).asIntBuffer().get();
public static final int V0006 = ByteBuffer.wrap("0006".getBytes()).asIntBuffer().get();
public static final int V0007 = ByteBuffer.wrap("0007".getBytes()).asIntBuffer().get();
public static final int V0008 = ByteBuffer.wrap("0008".getBytes()).asIntBuffer().get();

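And then compare against V0005, etc. Note that because these values are computed when the class is initialized, they are not compile-time constant expressions, so Java will not accept them as `case` labels in a `switch`. Use an `if`/`else if` chain with them, or declare the constants as literals (e.g. `0x30303035`) if you want to switch on a primitive int, which is efficient.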
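
As a rough sketch of the conversion described above (not necessarily how the asker's file should be read): the class name `VersionTag` and the helpers `toInt` and `describe` are made up for illustration, and which byte order the real file uses is an assumption you would have to check. Literal constants are used for the `case` labels because Java only accepts compile-time constants there.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class VersionTag {
    // Literal constants are compile-time constants, so they are legal case labels.
    static final int V0005 = 0x30303035; // ASCII '0','0','0','5'
    static final int V0006 = 0x30303036; // ASCII '0','0','0','6'

    // Packs the first four bytes of the array into an int using the given byte order.
    static int toInt(byte[] fourBytes, ByteOrder order) {
        return ByteBuffer.wrap(fourBytes).order(order).getInt();
    }

    // Hypothetical dispatch: switches on the primitive int.
    static String describe(int version) {
        switch (version) {
            case V0005: return "version 5";
            case V0006: return "version 6";
            default:    return "unknown";
        }
    }

    public static void main(String[] args) {
        byte[] raw = "0005".getBytes(StandardCharsets.US_ASCII);
        int bigEndian = toInt(raw, ByteOrder.BIG_ENDIAN);
        int littleEndian = toInt(raw, ByteOrder.LITTLE_ENDIAN);
        System.out.println(bigEndian);                          // 808464437
        System.out.println(Integer.toHexString(littleEndian));  // 35303030
        System.out.println(describe(bigEndian));                // version 5
    }
}
```

The same four bytes give a different int depending on the byte order passed to `order(...)`, so the constants only match if they are written for the order you actually read with.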

vikingsteve
  • '0005' is ASCII representation of the 4 bytes. Notice they are characters. I'm using Little Endian byte order, and they map to 808464437. – crush Feb 22 '13 at 19:50
  • OK, I understand. I think I have a solution... editing my answer. – vikingsteve Feb 22 '13 at 19:55
  • I think you might be right that there is no choice but using a `static final int`, even though it is not what I want. In that case, I would simply use the 0x30303035 approach that another answer showed. +1 – crush Feb 22 '13 at 20:10

A char array in Java can easily be turned into a String. You can switch on strings since Java 7, but I don't know how efficient the comparisons are.

Because only the last element of your char array seems to have a meaning, you could do a switch case with it. Something like

private static final int VERSION_INDEX = 3;

...
    char[] version = // get it somehow

    switch (version[VERSION_INDEX]) {
         case '5': 
            break;
         // etc
    }
...

EDIT More object oriented version.

  public interface Command {
       void execute();
  }

  public class Version {
       private final Integer versionRepresentation;

       private Version (Integer versionRep) {
            this.versionRepresentation = versionRep;
       }

       public static Version get(char[] version) {
            return new Version(Integer.valueOf(new String(version)));
       }

       @Override
       public int hashCode() {
            return this.versionRepresentation.hashCode();
       }
       @Override
       public boolean equals(Object obj) {
            if (obj instanceof Version) {
                Version that = (Version) obj;

                return that.versionRepresentation.equals(this.versionRepresentation);
            }
            return false;
       }
  }

  public class VersionOrientedCommandSet {
       private final Command EMPTY_COMMAND = new Command() { public void execute() {}};
       private final Map<Version, Command> versionOrientedCommands;

       public VersionOrientedCommandSet() {
           this.versionOrientedCommands = new HashMap<Version, Command>();
       }

       public void add(Version version, Command  command) {
           this.versionOrientedCommands.put(version, command);
       }

       public void execute(Version version) throws CommandNotFoundException {
           Command command = this.versionOrientedCommands.get(version);

           if (command != null)  {
               command.execute();
           } else {
               throw new CommandNotFoundException("No command registered for version " + version);
           }

       }
  }

  // register your commands to VersionOrientedCommandSet
  char[] versionChar = // got it from somewhere
  Version version = Version.get(versionChar);
  versionOrientedCommandSet.execute(version);

Lots of code, hehe. You will pay a small warm-up cost, but if your program is executed multiple times, you will gain efficiency with the map :P

Caesar Ralf
  • So far this is the closest thing as you seem to be the only person comprehending that `0x0005` is not the same as `'0005'`. Unfortunately, what happens when the versions grow to `'0010'`. – crush Feb 22 '13 at 19:56
  • yeah, I saw problem, but this is the only quick solution I saw to your problem right now. You could process the array of chars and transform it to an int. GOnna edit and post it. – Caesar Ralf Feb 22 '13 at 20:00
  • I thought it over and it will be not very efficient. It would be something similar to vikingsteve solution. – Caesar Ralf Feb 22 '13 at 20:29
  • will write something not so very efficient, but you could use later so you wont need to have a switch case. – Caesar Ralf Feb 22 '13 at 20:31
  • done, but I don't think you will gain time with my solution hehe. Just a little more object oriented now. – Caesar Ralf Feb 22 '13 at 20:56

Borrowing a little from some of the other answers (@marvo and @vikingsteve), I came up with this:

public enum Version {

  V0005 {

    @Override
    public void doStuff() {
      // do something
    }
  },
  V0006 {
  @Override
    public void doStuff() {
    // do something else
    }

  };

private final int value;
private static final Map<Integer, Version> versions = new HashMap<>();

private Version() {
    final byte[] bytes = name().substring(1).getBytes(); //remove the initial 'V'
    this.value = ByteBuffer.wrap(bytes).asIntBuffer().get();
}

static {
    for (Version v : values()) {
        versions.put(v.value, v);
    }
}

public abstract void doStuff();


public static Version valueOf(int i){
    return versions.get(i);
}
}

This offers a good object-oriented approach, since the action is encapsulated with the data representation. Each constant decides what to do when doStuff is called, avoiding the switch in client code. The usage would be:

int code = readFromSomewhere();
Version.valueOf(code).doStuff();

The valueOf method finds the Version in a map which is populated when loading the class. I don't know if this is as efficient as you need, since the int gets boxed to an Integer but you'll only find out when you profile, everything else is pure speculation. Also note that the integer values are calculated for you in the constructor, so you just have to define the enums with the right name and it'll be done automatically.

Chirlo
  • I like the map solution, too. I thought about adding that to my answer but got lazy. Wouldn't have done it as elegantly as your answer, either. – Marvo Feb 23 '13 at 01:56
  • The fact that it works off the name of the Enum is beautiful. – Marvo Feb 23 '13 at 21:23

Are you looking for something like value1.intValue() == value2.intValue()? Or you could just type cast.

char myChar = 'A';
int myInt = (int) myChar;

The same goes the other way around.

int myInt = 6;
char myChar = (char) myInt;
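
A single cast only covers one char. For four chars, a minimal sketch would shift each one into its own byte of the int. The class name `PackChars` and the method `pack` are hypothetical, and the sketch assumes plain ASCII chars with the first char going into the most significant byte.

```java
class PackChars {
    // Packs four ASCII chars into one int, first char in the most significant byte.
    static int pack(char[] c) {
        if (c.length != 4) {
            throw new IllegalArgumentException("expected exactly 4 chars");
        }
        return ((c[0] & 0xFF) << 24)
             | ((c[1] & 0xFF) << 16)
             | ((c[2] & 0xFF) << 8)
             |  (c[3] & 0xFF);
    }

    public static void main(String[] args) {
        int packed = pack("0005".toCharArray());
        System.out.println(packed == 0x30303035); // true
    }
}
```

The parentheses around each shift matter, because `+` and `|` would otherwise be applied in an order you probably do not intend.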
pattmorter
  • thats the proper way to typecast, but I think he needs to cast into the higher bits as well... – vikingsteve Feb 22 '13 at 20:04
  • Yeah, a single `char` seems to be able to cast no problem, which is sort of why I'm confused that a `char[]` of size 4 can't be cast into an `int`. I guess it's just because of the way arrays are handled in Java. – crush Feb 22 '13 at 20:06

The only way would be a very inefficient byte[] to String conversion.

switch (new String(version, "US-ASCII")) {
case "0005":
    break;
...
}

Using a (Hash-)Map instead would make it faster again.

But I think you will not be satisfied by this solution.
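
For completeness, here is a self-contained sketch of the String switch idea, assuming Java 7 or later. The class name `StringSwitchDemo` and the method `handle` are made up, and the byte array in `main` is only a stand-in for whatever you read from the file. It uses the `String(byte[], Charset)` constructor with `StandardCharsets.US_ASCII`, which avoids the checked exception thrown by the variant that takes a charset name.

```java
import java.nio.charset.StandardCharsets;

class StringSwitchDemo {
    // Decodes the four raw bytes as ASCII and dispatches on the resulting String.
    static String handle(byte[] version) {
        String tag = new String(version, StandardCharsets.US_ASCII);
        switch (tag) {
            case "0005": return "handling 0005";
            case "0006": return "handling 0006";
            default:     return "unknown version " + tag;
        }
    }

    public static void main(String[] args) {
        byte[] fromFile = {0x30, 0x30, 0x30, 0x35}; // stand-in for bytes read from the file
        System.out.println(handle(fromFile)); // handling 0005
    }
}
```

This allocates a String per lookup, so whether it is fast enough is something only profiling on the real workload would tell you.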

Joop Eggen
  • So, what you are really saying is that there is no such thing as a const char array in Java, like `'0005'`? And that I would have to use a string object to compare it to unless I wanted to store the value of in an int in a `final int`? – crush Feb 22 '13 at 19:57
  • I don't seem to have this constructor that you show for String. Are you sure this is the valid format? – crush Feb 22 '13 at 20:11
  • There is a constructor `String(byte[] data, String encoding)`. And yes, the 4-byte char 'xxxx' (originally seen on Mac OS) has no counterpart in Java. Only playthings like 0xcafebabe (the Java magic cookie). – Joop Eggen Feb 22 '13 at 20:26
  • This is cool, but I don't think it's as efficient as I need. I guess I will go with the static final int after all. Thanks for the information! – crush Feb 22 '13 at 20:33

The most efficient way would be to work with integers - something like final int V0005 = 0x30303035 (a combination of a couple of different ideas seen here) would be manageable. Java just doesn't have the kind of integer literals you are looking for.

If you really want to work with strings, you need to read the characters in a byte array first, then convert that to a String (the default charset normally works as it is a superset of ASCII). However, it would be less efficient.

You could write a function to calculate the constant values from the simple numbers (like 5) for better readability, but that's probably overkill.
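
Such a function might look roughly like the sketch below. The names `VersionCodes` and `code` are hypothetical, and it assumes versions are four zero-padded decimal digits stored as ASCII with the first digit in the most significant byte.

```java
class VersionCodes {
    // Builds the int whose four bytes are the zero-padded ASCII digits of n,
    // most significant byte first, e.g. 5 -> '0','0','0','5' -> 0x30303035.
    static int code(int n) {
        if (n < 0 || n > 9999) {
            throw new IllegalArgumentException("version must be between 0 and 9999");
        }
        int result = 0;
        int divisor = 1000;
        for (int i = 0; i < 4; i++) {
            int digit = (n / divisor) % 10;
            result = (result << 8) | ('0' + digit);
            divisor /= 10;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(code(5) == 0x30303035);  // true
        System.out.println(code(10) == 0x30303130); // true
    }
}
```

Values produced this way are computed at runtime, so they can be compared with `==` or used as map keys, but not used as `case` labels.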

  • It's not a string in C++. It's a const character array which evaluates to an integer because it is a sequence of 4 bytes. It is a valid literal in C++. – crush Feb 22 '13 at 19:57
  • I stand corrected - it outputs a warning but works. I have edited the answer. – aditsu quit because SE is EVIL Feb 22 '13 at 20:02
  • It doesn't work if int is 16bit instead of 32bit, since you can't fit 4 bytes into 2 bytes of course. So there is a warning in case you are on a 16bit architecture - it is not part of the standard, but is common practice. – crush Feb 22 '13 at 20:04
  • I'm on a 64bit arch, but the warning is about using a "multi-character character constant" – aditsu quit because SE is EVIL Feb 22 '13 at 20:06
  • Never seen that warning before. What compiler are you using? – crush Feb 22 '13 at 20:07
  • gcc 4.6.3 (just running g++ with no special options) – aditsu quit because SE is EVIL Feb 22 '13 at 20:08
  • Interesting, I don't get that warning. Also using GCC/G++ on 64-bit Debian. I think I like your idea of expressing them in `final int V0005 = 0x30303035` form the best since I can't find a solution that uses the actual character constant `'0005'`. Thanks for information. – crush Feb 22 '13 at 20:12

Given that this appears to be a version string in a packet or something, the number of values is finite and reasonably small, so they can be encapsulated in an enum. The enum can encapsulate efficient code for converting from the four bytes in the input data to the representative enum. (My code for performing that conversion isn't necessarily the most efficient. Maybe, given a small number of versions, a nice internal array or mapping of ints to enums would work.)

enum Version {

    UNKNOWN(0),
    V0005(0x30303035),  // whatever the int value is
    V0006(0x30303036);

    private final int id;
    private Version(int value) {
        id = value;
    }

    public static Version valueOf(int id) {
        for (Version version : values()) {
            if (version.id == id) {
                return version;
            }
        }
        return UNKNOWN;
    }

    // Named fromString because every enum already declares an implicit valueOf(String).
    public static Version fromString(String sId) {
        return valueOf(computeId(sId));
    }

    public static int computeId(String sId) {
        // pack the characters into an int, first character in the highest byte
        int id = 0;
        for (int i = 0; i < sId.length(); i++) {
            id = (id << 8) | (sId.charAt(i) & 0xFF);
        }
        return id;
    }
}

Then in my switch :

String s = methodThatReadsFourChars(...);
switch (Version.fromString(s)) {
case V0005 : doV0005Stuff(); break;
case V0006 : doV0006Stuff(); break;
default :
    throw new UnknownVersionException();
}
Marvo
  • You could improve that by adding a `doStuff()` method on the enum directly, like: ` V0005(0x803030d2){ void doStuff(){ ...} }` so that you encapsulate the action with the value. That way, you can save yourself the switch and just call `Version.valueOf(s).doStuff()` – Chirlo Feb 22 '13 at 20:48
  • I'd love to see an example of that. I'm unfamiliar with that syntax. – Marvo Feb 22 '13 at 21:03
  • I added an answer with an example of that, check it out if you're interested. – Chirlo Feb 23 '13 at 00:03