Converting packed ascii to ascii and back

Question

I'm developing an application to communicate with a serial-port device. The device itself is developed according to the HART protocol.

Some documentation that I found related to this:

Packed ASCII is a subset of full ASCII and uses only 64 of the 256 possible characters. These 64 characters are the capitalized alphabet, numbers 0 through 9, and a few punctuation marks. Many HART parameters need only this limited ASCII set, which means that data can be compressed to 3/4 of normal.

The rule for converting from ASCII to Packed ASCII is just to remove bits 6 and 7 (two most significant).
The rules for conversion from packed ASCII back to ASCII are (1) set bit 7 = 0 and (2) set bit 6 = complement of packed ASCII bit 5.
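
The two quoted rules can be sketched per character as follows. This is only a minimal illustration, not HART reference code: the class and method names (`PackedAsciiChar`, `toSixBit`, `toAscii`) are made up for this example, and it assumes the input is already in the range 0x20 to 0x5F (space, digits, punctuation, upper-case letters).

```java
// Hypothetical helper illustrating the quoted per-character rules.
class PackedAsciiChar {

    // ASCII -> 6-bit code: drop bits 7 and 6, keep bits 5..0.
    static int toSixBit(char c) {
        return c & 0x3F;
    }

    // 6-bit code -> ASCII: bit 7 = 0, bit 6 = complement of bit 5.
    static char toAscii(int code) {
        int sixBits = code & 0x3F;
        int bit5 = (sixBits >> 5) & 1;
        int bit6 = bit5 ^ 1; // complement of bit 5
        return (char) ((bit6 << 6) | sixBits);
    }

    public static void main(String[] args) {
        for (char c : "RECIPE 1".toCharArray()) {
            int code = toSixBit(c);
            System.out.printf("'%c' -> 0x%02X -> '%c'%n", c, code, toAscii(code));
        }
    }
}
```

Characters outside that range do not survive the round trip. For example, lower-case 'a' (0x61) becomes 0x21, which decodes as '!'.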

So what I'm trying to do is get a value from the device and later send it back. The value is 6 bytes long and contains 8 characters (letters and digits). The value I get from unpacking is obviously incorrect, and I don't know whether the issue in this test is in the packing or the unpacking. I'm not that familiar with working with bytes, so there might be some very obvious errors.

import java.nio.ByteBuffer;
import java.util.Arrays;

public class AsciiTest{

    public static void main(String []args){

        String text = "ABCDEF12";
        byte[] bytesFromText = text.getBytes();
        byte[] packed = packAscii(bytesFromText);
        byte[] unpacked = unpackAscii(packed);

    }


    //Function to pack bytes from 8 byte to 6 byte

    private static byte[] packAscii(byte[] bytes) {

        byte[] shorten = new byte[6];
        ByteBuffer packedAscii = ByteBuffer.wrap(shorten);
        int index = 0;

        for (int i = 0; i < bytes.length; i = i + 8) {
            int result = (int)(
                ((bytes[index] & 0x3F) << 42) +
                ((bytes[index + 1] & 0x3F) << 36) +
                ((bytes[index + 2] & 0x3F) << 30) +
                ((bytes[index + 3] & 0x3F) << 24) +
                ((bytes[index + 4] & 0x3F) << 18) +
                ((bytes[index + 5] & 0x3F) << 12) +
                ((bytes[index + 6] & 0x3F) << 6) +
                (bytes[index + 7] & 0x3F)
                );

            byte[] packedTemp = ByteBuffer.allocate(6).putInt(result).array();

            for (int j = 0; j < 6; j++)
            {
                packedAscii.put(packedTemp[j]);
            }

            index += 8;

        }

        byte[] combined = packedAscii.array(); 
        System.out.println(bytesToHex(combined)); // As hex string: "C41470920000"
        return combined;

    }

    // Function to unpack bytes from 6 bytes to 8 bytes

    private static byte[] unpackAscii(byte[] bytes) {
        byte[] slice = bytes;
        byte[] nbytes = new byte[8];
        byte[] first = new byte[1];

        ByteBuffer buffer = ByteBuffer.wrap(nbytes);
        buffer.put(first);
        byte second = slice[0];
        second = (byte)~second;
        buffer.put(second);
        buffer.put(slice);

        byte[] combined = buffer.array();

        String hex = bytesToHex(combined); 
        System.out.println(hex); // As hex string: "003BC41470920000"

        StringBuilder output = new StringBuilder();
        for (int i = 0; i < hex.length(); i+=2) {
            String str = hex.substring(i, i+2);
            output.append((char)Integer.parseInt(str, 16));
        } 
        
        System.out.println(output); // Output of string builder: ";Äp’"
        return combined; 
    }

    private final static char[] hexArray = "0123456789ABCDEF".toCharArray();
    public static String bytesToHex(byte[] bytes) {
        char[] hexChars = new char[bytes.length * 2];
        for ( int j = 0; j < bytes.length; j++ ) {
            int v = bytes[j] & 0xFF;
            hexChars[j * 2] = hexArray[v >>> 4];
            hexChars[j * 2 + 1] = hexArray[v & 0x0F];
        }
        return new String(hexChars);
    }
}

EDIT: I should have started with this, but here's the packed data from the device.

I unpacked the bits by hand and then checked what the characters are.

543210   7654 3210

010010 | 0101 0010 R
000101 | 0100 0101 E
000011 | 0100 0011 C
001001 | 0100 1001 I
010000 | 0101 0000 P
000101 | 0100 0101 E
100000 | 0010 0000 Space
110001 | 0011 0001 1

Byte array from the packed bytes [72, 80, -55, 64, 88, 49]
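
Java bytes are signed, so -55 in the array above is the bit pattern 0xC9. The following sketch (class and method names are made up for this example) prints the same array in hex and binary and then splits the 48-bit stream into 6-bit groups, so it can be compared with the hand-made table by eye.

```java
class DeviceBytesDump {
    // Bytes received from the device, as listed in the question.
    static final byte[] PACKED = {72, 80, -55, 64, 88, 49};

    // Render one byte as 8 binary digits; & 0xFF avoids sign extension.
    static String toBinary(byte b) {
        return String.format("%8s", Integer.toBinaryString(b & 0xFF)).replace(' ', '0');
    }

    // Concatenate all bytes into one string of bits, first byte first.
    static String bitStream(byte[] bytes) {
        StringBuilder bits = new StringBuilder();
        for (byte b : bytes) {
            bits.append(toBinary(b));
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        for (byte b : PACKED) {
            System.out.printf("%4d = 0x%02X = %s%n", b, b & 0xFF, toBinary(b));
        }
        String bits = bitStream(PACKED);
        // Split the stream into 6-bit groups, first character first.
        for (int i = 0; i + 6 <= bits.length(); i += 6) {
            System.out.println(bits.substring(i, i + 6));
        }
    }
}
```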

So assuming that I got the binary representation correct (which the actual word suggests), at least I have the correct idea of what to do. I'll update the code when I make progress there.

And the application is made for Android phones.

When you pack an ASCII character, what goes in the two highest bit locations? Is it bits 0 and 1 of the next character? The two highest bits? Something else? — markspace, Oct 05 '20 at 16:41
With ABCD = 01000001 01000010 01000011 01000100, truncate the two highest, the first two from left. So that would become 000001 000010 000011 000100 and packed would be 00000100 00100000 11000100, right? And this is done with bitwise (AND) 000001 & 0x3F and shift that result to the left with <<. Isn't that the idea? So the answer for your question would be the two highest bits of the next truncated byte. — , Oct 06 '20 at 07:35
`Isn't that the idea` I don't know, I asked you. Can you verify this is correct? Can you get some raw output from the device and post it so we can see the bit patterns? The answer to my question is that the digit is first shifted to the left and then the bits of the next character go into the *lowest* bits, according to your explanation. — markspace, Oct 06 '20 at 15:12
Updated the post with the information from the device and manual testing — , Oct 06 '20 at 16:44
In the `packAscii` method, when you shift a 6-bit value to the left by more than 26 bits, the information is lost because an `int` in Java is only 32 bits wide. It would make more sense to use `long result`, which would fit all 48 bits. Another point is that the bytes are additionally reversed: the lowest bits/first chars are moved higher. — Nowhere Man, Oct 06 '20 at 22:36
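
The shift problem raised in the last comment can be reproduced in isolation. In Java, the shift distance for an `int` operand is taken modulo 32, so `<< 42` on an `int` behaves like `<< 10`; widening to `long` before shifting keeps all 48 bits. A small sketch (the class and method names are invented for this example):

```java
class ShiftWidthDemo {

    // (b & 0x3F) is an int, so only the low 5 bits of the distance
    // are used: 42 % 32 == 10.
    static long shiftAsInt(byte b) {
        return (b & 0x3F) << 42;
    }

    // Widening to long first makes the full shift distance of 42 apply.
    static long shiftAsLong(byte b) {
        return ((long) (b & 0x3F)) << 42;
    }

    public static void main(String[] args) {
        byte a = (byte) 'A'; // 0x41, low six bits = 0x01
        System.out.printf("int  shift: 0x%X%n", shiftAsInt(a));  // 0x400
        System.out.printf("long shift: 0x%X%n", shiftAsLong(a)); // 0x40000000000
    }
}
```

This wrap-around is consistent with the `C41470920000` hex string printed by the question's `packAscii`: the first two characters end up in the low bits of the 32-bit result instead of the top.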

Nowhere Man · Answer 1 · 2020-10-07T00:15:40.107

I did not actually check for issues in your code and decided to implement this from scratch. The solution works but may not be very clean.

So, in packing mode, the first resulting byte contains the 6 bits of the 1st unpacked byte plus the 2 lower bits of the 2nd unpacked byte (shifted left by 6).
The 2nd resulting byte contains the 4 higher bits of the 2nd unpacked byte (shifted right by 2) plus the 4 lower bits of the 3rd unpacked byte (shifted left by 4).
Finally, the 3rd resulting byte contains the 2 higher bits of the 3rd unpacked byte (shifted right by 4) plus the 4th unpacked byte shifted left by 2.

In unpacking mode the process is reversed.
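
For a single group of four characters, the three-byte layout described above can be written out without loops. This is only an illustrative sketch of that description (the class and method names are invented here). It keeps the low 6 bits of each character rather than subtracting 32, and it assumes exactly four input characters in the range 0x20 to 0x5F.

```java
class LsbFirstGroup {

    // Pack 4 characters into 3 bytes, first character in the lowest bits.
    static byte[] packGroup(String four) {
        int c0 = four.charAt(0) & 0x3F;
        int c1 = four.charAt(1) & 0x3F;
        int c2 = four.charAt(2) & 0x3F;
        int c3 = four.charAt(3) & 0x3F;
        byte b0 = (byte) (c0 | (c1 << 6));        // 6 bits of c0 + low 2 bits of c1
        byte b1 = (byte) ((c1 >> 2) | (c2 << 4)); // high 4 bits of c1 + low 4 bits of c2
        byte b2 = (byte) ((c2 >> 4) | (c3 << 2)); // high 2 bits of c2 + all 6 bits of c3
        return new byte[] {b0, b1, b2};
    }

    // Reverse of packGroup; bit 6 is restored for codes below 0x20.
    static String unpackGroup(byte[] three) {
        int b0 = three[0] & 0xFF;
        int b1 = three[1] & 0xFF;
        int b2 = three[2] & 0xFF;
        int[] codes = {
            b0 & 0x3F,
            (b0 >> 6) | ((b1 & 0x0F) << 2),
            (b1 >> 4) | ((b2 & 0x03) << 4),
            b2 >> 2
        };
        StringBuilder sb = new StringBuilder(4);
        for (int code : codes) {
            sb.append((char) (code < 0x20 ? code + 0x40 : code));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] packed = packGroup("RECI");
        System.out.printf("0x%X 0x%X 0x%X%n", packed[0] & 0xFF, packed[1] & 0xFF, packed[2] & 0xFF);
        System.out.println(unpackGroup(packed));
    }
}
```

Note that this layout puts the first character in the lowest bits, so "RECI" starts with 0x52 here, whereas the device data in the question starts with 0x48.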

In packing mode, 32 may be subtracted from each byte, and in unpacking this constant is added back, so that A, B, ..., Z fit into 6 bits properly.

update
Subtraction of 32 may be omitted; an alternative is to restore the "lost" bit 6 when unpacking by adding 0x40 to values below 0x20.

Thus, the packing method is as follows:

private static byte[] packAscii(byte[] unpacked) {
    for (int i = 0; i < unpacked.length; i++) {
        // unpacked[i] = (byte)((unpacked[i] - 32) &0x3F);
        unpacked[i] = (byte)(unpacked[i] & 0x3F); // keep 6 lower bits as is
    }
    int size = (int) Math.ceil(unpacked.length * 3.0 / 4.0);
    byte[] result = new byte[size];
    // i - index of input unpacked array, r - index of result packed array 
    for (int i = 0, r = 0; i < unpacked.length;) {
        for (int j = 0; j < 4 && i < unpacked.length; j++, i++, r++) {
            int rightShift = j * 2;
            int leftShift = (8 - rightShift) % 8;
            switch(j) {
                case 0:
                    result[r]  = unpacked[i];
                    break;
                case 1:
                case 2:
                    result[r-1] |= (byte)(unpacked[i] << leftShift);
                    result[r]  = (byte)(unpacked[i] >> rightShift);
                    break;
                case 3:
                    result[r-1] |= (byte)(unpacked[i] << leftShift);
                    break;
            }
        }
        r--; // correction for the last packed byte
    }
    System.out.println("packed =" + printArr(result));
    return result;
}

Unpacking method is as follows:

final static int[][] masks = {
    {0x3F, 0x3F},
    {0x03, 0x3C},
    {0x0F, 0x30},
    {0x3F, 0x3F}
};

private static byte[] unpackAscii(byte[] packed) {
    int size = (int) Math.floor(packed.length * 4.0 / 3.0);
    byte[] result = new byte[size];
    System.out.println("unpacked size=" + size);
    for (int i = 0, r = 0; i < packed.length;) { // r index of unpacked array, i - packed
        for (int j = 0, rightShift = 6; j < 3 && i < packed.length; i++, j++, r++, rightShift -= 2) {
            int mask = masks[j][1];
            int leftShift = j * 2;
                
            if (j == 0) {
                result[r]  = (byte)(packed[i] & mask);
            } else {
                result[r] |= (byte)((packed[i] << leftShift) & mask);
            }
    
            if (r < size - 1) {
                mask = masks[j + 1][0];
                result[r + 1] = (byte)((packed[i] >> rightShift) & mask);
            }
        }
        r++; // correction for the last packed byte
    }    
    for (int i = 0; i < result.length; i++) {
        // result[i] += 32;  // restore after unpacking // previous implementation
        if (result[i] < 0x20) {
            result[i] += 0x40; // restore bit 6 for letters
        }
    }
    return result;
}

// Needs: import java.util.stream.Collectors; import java.util.stream.IntStream;
static String printArr(byte[] arr) {
    return IntStream.range(0, arr.length)
                    .mapToObj(i -> "0x" + Integer.toHexString(arr[i] & 0xFF).toUpperCase())
                    .collect(Collectors.joining(" "));
}

Test:

String text = "ABCDEFGH12";
System.out.println(text);
byte[] bytesFromText = text.getBytes();

byte[] packed = packAscii(bytesFromText);
byte[] unpacked = unpackAscii(packed);

System.out.println(new String(unpacked));

Output (produced by the earlier subtract-32 version, which printed the packed array as signed bytes)

ABCDEFGH12
packed =[-95, 56, -110, -91, 121, -94, -111, 4]
unpacked size=10
ABCDEFGH12

update
Using test word RECIPE 1 for the new mode

RECIPE 1
packed =0x52 0x31 0x24 0x50 0x1 0xC6
unpacked size=8
RECIPE 1

update2
The following implementation supports the format mentioned in the question (however, it has not been tested for unpacked sizes other than 8).

private static byte[] packAscii(byte[] unpacked) {

    long tmp = 0;
    for (int i = 0, shift = (unpacked.length - 1) * 6; i < unpacked.length; i++, shift -= 6) {
        tmp += ((long)unpacked[i] & 0x3F) << shift;
    }
    
    int size = (int) Math.ceil(unpacked.length * 3.0 / 4.0);
    byte[] result = new byte[size];
    for (int i = size - 1; i >= 0; i--) {
        result[i] = (byte) (tmp & 0xFF);
        tmp >>= 8;
    }
    System.out.println("packed =" + printArr(result));
    return result;
}

private static byte[] unpackAscii(byte[] packed) {
    int size = (int) Math.floor(packed.length * 4.0 / 3.0);
    byte[] result = new byte[size];
    System.out.println("unpacked size=" + size);
    long tmp = 0;
    for (int i = 0; i < packed.length; i++) {
        tmp <<= 8;
        tmp |= ((int)packed[i]) & 0xFF;    
    }

    for (int i = size - 1; i >= 0; i--) {
        result[i] = (byte)(tmp & 0x3F);
        if (result[i] < 0x20)
            result[i] += 0x40;
        tmp >>= 6;
    }
 
    return result;
}

Test succeeds

String text = "RECIPE 1";
System.out.println(text);
byte[] bytesFromText = text.getBytes();

byte[] packed = packAscii(bytesFromText);
byte[] unpacked = unpackAscii(packed);

System.out.println(new String(unpacked));

output

RECIPE 1
packed =0x48 0x50 0xC9 0x40 0x58 0x31
unpacked size=8
RECIPE 1
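
Because the `long`-based version above accumulates everything in a single 64-bit value, it cannot hold more than 10 characters (60 bits). One possible way around that is to process 4 characters (24 bits) at a time. The sketch below assumes that the input length is a multiple of 4 and that characters are packed most-significant-first, as in the device data from the question. The class and method names are illustrative only.

```java
class PackedAsciiGroups {

    // Pack text whose length is a multiple of 4 into 3 bytes per 4 characters.
    static byte[] pack(String text) {
        if (text.length() % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        byte[] out = new byte[text.length() / 4 * 3];
        for (int i = 0, o = 0; i < text.length(); i += 4, o += 3) {
            int group = 0;
            for (int k = 0; k < 4; k++) {
                // First character ends up in the highest 6 bits of the group.
                group = (group << 6) | (text.charAt(i + k) & 0x3F);
            }
            out[o] = (byte) (group >> 16);
            out[o + 1] = (byte) (group >> 8);
            out[o + 2] = (byte) group;
        }
        return out;
    }

    // Unpack 3 bytes at a time into 4 characters, restoring bit 6.
    static String unpack(byte[] packed) {
        if (packed.length % 3 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 3");
        }
        StringBuilder sb = new StringBuilder(packed.length / 3 * 4);
        for (int i = 0; i < packed.length; i += 3) {
            int group = ((packed[i] & 0xFF) << 16)
                      | ((packed[i + 1] & 0xFF) << 8)
                      | (packed[i + 2] & 0xFF);
            for (int shift = 18; shift >= 0; shift -= 6) {
                int code = (group >> shift) & 0x3F;
                if (code < 0x20) {
                    code += 0x40; // bit 6 = complement of bit 5
                }
                sb.append((char) code);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] fromDevice = {72, 80, -55, 64, 88, 49};
        System.out.println(unpack(fromDevice));
        System.out.println(unpack(pack("ABCDEFGH1234")));
    }
}
```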

After some testing with the actual result from the device and your solution's unpack method, the result I get is "(!5R A5,". I forgot to add to the original info that the application is for Android devices. I'm not sure what that changes, if anything. — , Oct 06 '20 at 19:54
@GeeS, why would you test my _unpack_ method with data prepared by _another_ packing method, and why would you expect a correct result? I thought I made it clear why subtraction of 32 is needed: the letter chars starting from `0x41` do not fit into 6 bits. If you just remove the 2 higher bits, you lose significant information, and you need a special rule to resolve the ambiguity of whether to set bit 6 (as for letters) or not (as for digits and space) after unpacking. — Nowhere Man, Oct 06 '20 at 20:28
