
As described above, I'm having an issue with UTF-16LE in Java. When I run my code, it usually encrypts, decrypts, and prints the message correctly. Sometimes, however, it loses two of the encrypted characters, and other times it prints a non-English character (Greek, for example).

This is for a class, and it stumped my professor. Any help with the error would be greatly appreciated.

/*This program was written by me for my professor's class.

The purpose of this program is to generate a codebook, encrypt a message by XORing it against that codebook, and decrypt it the same way.*/

import java.security.*;

public class Crypto 
{
    public static void main(String[] args) throws Exception
    {           
        Crypto c = new Crypto(); 

        byte[]codebook = null;
        String message = "";   

        System.out.println("Generating codebook");
        codebook = c.makeCodebook(14);
        System.out.println();

        System.out.println("Now Encrypting");
        System.out.println("This is the contents of the Encrypted Message: ");
        message = c.crypt("1234567", codebook);
        System.out.println();

        System.out.println("Now Decrypting");
        System.out.println("This is the contents of the Decrypted Message");
        message = c.crypt(message, codebook);

        System.out.println();
        System.out.println("Your message is: ");      
        System.out.println(message);

    }//public static void main(String [] args)

    //Encrypts or decrypts message against the codebook assuming the appropriate length
    public String crypt (String message, byte [] codebook) throws Exception
    {
        //Take the message and put it into an array of bytes.
        //This array of bytes is what will be XORed against the codebook
        byte[] cryptedMessage = message.getBytes("UTF-16LE");
        byte[] result = new byte[14];
        message = "";
        System.out.println(message.length());
        System.out.println(cryptedMessage.length);
        System.out.println(result.length);
        //Let's see the contents of encryptedMessage
        for(int i = 0; i< cryptedMessage.length; i++)
        {
            System.out.print(cryptedMessage[i]+" ");
        }//for(int i = 0; i< encryptedMessage.length; i++)
        System.out.println();

        //XOR codebook and encryptedMessage
        System.out.println("This is the message using XOR:");
        for(int i = 0; i<result.length; i++)
        {
            //since XOR has return type of an int, we cast it to a byte
            result[i] = (byte)(((byte)(codebook[i])) ^ ((byte)(cryptedMessage[i])));
            System.out.print(result[i]+" ");
        }//for(int i = 0; i<result.length; i++)
        //output
        System.out.println();
        //output

        System.out.println(message.length());
        System.out.println(cryptedMessage.length);
        System.out.println(result.length);
        return new String(result, "UTF-16LE");
    }//public String crypt (String message, byte [] codebook) throws Exception

    //Creates truly random numbers and makes a byte array using those truly random numbers
    public byte [] makeCodebook (int length) throws Exception
    {
        SecureRandom random = new SecureRandom();//instance of SecureRandom named random
        byte[] codebook = null;

        codebook = new byte[length];
        random.nextBytes(codebook);//fill the byte[] codebook with random bytes

        //output
        System.out.println("This is the contents of the codebook: ");
        for(int i = 0; i < codebook.length;i++)
        {
            System.out.print(codebook[i]+" ");
        }//for(int i = 0; i < codebook.length;i++)
        //output
        System.out.println();
        return codebook;
    }//public byte [] MakeCodebook (int length) throws Exception

}//Public class Crypto
  • What do you mean by "sometimes"? Do you mean different inputs, or many runs of the program? – shareef May 14 '12 at 16:32
  • Even though you use UTF-16LE, it doesn't mean all 65536 values represent a character. When printing or storing it as a string, things go weird. Furthermore, you use a codebook solely consisting of bytes (values -128 to 127) and your message apparently consists of only printable characters ("1234567"), so why do you use an encoding at all instead of byte arrays? – Mark Jeronimus May 14 '12 at 16:36
  • @shareef I mean that I can run the code above 20 times exactly as is, main method and everything, and a few of those times I will get different results. For example, an ArrayIndexOutOfBoundsException will be thrown, or the output will come out as 12345ↀ7 instead of 1234567 – Aaron Davis May 14 '12 at 16:41
  • @Zom-B The codebook is XORed against the message bytes. The string is used for storage in main, not for the encryption process itself. Does that adequately answer your question? I'm not 100% sure I understood it. – Aaron Davis May 14 '12 at 16:44

2 Answers


The problem is probably because the XORing against random data occasionally produces an output that does not express a valid set of characters when interpreted as a UTF-16LE byte sequence.
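To make that concrete, here is a minimal sketch (the class and method names are hypothetical, not from the question) that decodes a byte array as UTF-16LE and encodes it back. It assumes the `String(byte[], Charset)` constructor, which is documented to replace malformed input with the charset's replacement string. The bytes `00 D8` form a lone high surrogate, which is not valid UTF-16 on its own, so the round trip should not return the original bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripDemo
{
    // Decodes the bytes as UTF-16LE, then encodes the resulting String back to bytes.
    public static byte[] roundTrip(byte[] bytes)
    {
        String s = new String(bytes, StandardCharsets.UTF_16LE);
        return s.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args)
    {
        // Ordinary text survives the round trip unchanged.
        byte[] valid = "1234567".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(Arrays.equals(valid, roundTrip(valid)));

        // 0x00 0xD8 is the code unit U+D800: a high surrogate with no partner.
        // Random XOR output can contain such a pair; decoding substitutes a
        // replacement character, so the original bytes are lost.
        byte[] loneSurrogate = { 0x00, (byte) 0xD8 };
        System.out.println(Arrays.equals(loneSurrogate, roundTrip(loneSurrogate)));
    }
}
```

Once a ciphertext byte pair has been replaced like this, XORing it against the codebook again yields a different character than the one that was encrypted, which would be consistent with the occasional wrong character in the output.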

Instead of trying to interpret the ciphertext as a UTF-16LE string, consider just base64-encoding the ciphertext bytes once you've produced them and returning the resulting base64-encoded string. Then when decrypting, base64-decode the input string to get back the ciphertext bytes, do the XOR to get your plaintext bytes, and then create the plaintext string from the plaintext bytes by interpreting the plaintext bytes as a UTF-16LE sequence.
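A minimal sketch of that flow, assuming Java 8 or later so that `java.util.Base64` is available (on older versions a library such as commons-codec would fill the same role). The class name `Base64Xor` and the split into separate `encrypt` and `decrypt` methods are illustrative choices, not part of the original program.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

public class Base64Xor
{
    // XORs data against the codebook; the codebook must be at least as long as data.
    public static byte[] xor(byte[] data, byte[] codebook)
    {
        if (codebook.length < data.length)
        {
            throw new IllegalArgumentException("codebook is shorter than the data");
        }
        byte[] result = new byte[data.length];
        for (int i = 0; i < data.length; i++)
        {
            result[i] = (byte) (data[i] ^ codebook[i]);
        }
        return result;
    }

    // Plaintext -> UTF-16LE bytes -> XOR -> base64 text that is safe to store in a String.
    public static String encrypt(String plaintext, byte[] codebook)
    {
        byte[] cipher = xor(plaintext.getBytes(StandardCharsets.UTF_16LE), codebook);
        return Base64.getEncoder().encodeToString(cipher);
    }

    // Base64 text -> ciphertext bytes -> XOR -> plaintext decoded as UTF-16LE.
    public static String decrypt(String base64Ciphertext, byte[] codebook)
    {
        byte[] cipher = Base64.getDecoder().decode(base64Ciphertext);
        return new String(xor(cipher, codebook), StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args)
    {
        byte[] codebook = new byte[14];
        new SecureRandom().nextBytes(codebook);

        String encrypted = encrypt("1234567", codebook);
        System.out.println(encrypted);
        System.out.println(decrypt(encrypted, codebook));
    }
}
```

The ciphertext bytes are never interpreted as characters here; only the decrypted bytes, which are valid UTF-16LE again, are turned back into a `String`.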


Updated to reply to a comment below without having to worry about running out of space.

As I said, you could base64-encode the ciphertext bytes. That's certainly doable in Java. The commons-codec library has methods to do it and a search will find others. If you're not allowed to use outside libraries, then roll your own method of converting arbitrary bytes to bytes guaranteed to be a valid encoding of something.

For example, you could split each byte in your ciphertext into its high 4 bits and low 4 bits. Thus each byte would produce a pair of values that each range from 0 to 15. Then you could produce a 2-byte sequence from each byte by adding the ASCII code for 'A' to those numbers. This is not very efficient, but it does the job.
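The roll-your-own scheme just described could look roughly like the sketch below. The class name `NibbleCodec` and its method names are hypothetical, and it returns a `String` of letters 'A' through 'P' rather than a raw byte array, which is an assumption made for convenience.

```java
public class NibbleCodec
{
    // Turns each byte into two letters in the range 'A'..'P' (one per 4-bit half).
    public static String encode(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes)
        {
            sb.append((char) ('A' + ((b >> 4) & 0x0F))); // high 4 bits
            sb.append((char) ('A' + (b & 0x0F)));        // low 4 bits
        }
        return sb.toString();
    }

    // Reverses encode(): every pair of letters becomes one byte again.
    public static byte[] decode(String text)
    {
        if (text.length() % 2 != 0)
        {
            throw new IllegalArgumentException("expected an even number of characters");
        }
        byte[] bytes = new byte[text.length() / 2];
        for (int i = 0; i < bytes.length; i++)
        {
            int high = text.charAt(2 * i) - 'A';
            int low = text.charAt(2 * i + 1) - 'A';
            bytes[i] = (byte) ((high << 4) | low);
        }
        return bytes;
    }

    public static void main(String[] args)
    {
        byte[] sample = { 0x00, (byte) 0xD8, 0x7F, (byte) 0xFF };
        System.out.println(encode(sample)); // prints "AANIHPPP"
    }
}
```

The output is twice the size of the input, but every character is plain ASCII, so it can be stored in a `String` and decoded later without loss.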

QuantumMechanic
  • You're saying to use base64 instead of using UTF-16LE because you believe the random data is messing with UTF-16LE? – Aaron Davis May 14 '12 at 16:30
  • How will using a different encoding affect how the data comes out? – Aaron Davis May 14 '12 at 16:36
  • Yes, that's what I'm saying. And different encodings will likely make it worse. In the extreme, consider what would happen if in your code you used `US-ASCII` in the encrypt and decrypt routines. Almost all the time the encrypt/decrypt round trip would give you garbage. – QuantumMechanic May 14 '12 at 16:54
  • Again, what I am saying is that random data is not guaranteed to represent a valid `UTF-16LE` encoding. Put another way, there will be cases where there is no string of characters whose `UTF-16LE` encoding is the sequence of bytes your encryption routine produced. – QuantumMechanic May 14 '12 at 16:58
  • @QuantumMechanic I see what you're saying. Is there a way to encrypt/decrypt with Java that won't give me garbage? Or is the only solution to just have the program rerun? – Aaron Davis May 14 '12 at 17:00
  • base64-encode the ciphertext bytes as I suggested. That can certainly be done "in java". If you're not allowed to use a base-64 library (like commons-codec, and you can find others) then roll your own way of converting the random bytes to a sequence of bytes that is guaranteed to be a valid sequence of bytes in some encoding. – QuantumMechanic May 14 '12 at 17:07

You should guard your code against the ArrayIndexOutOfBoundsException; I added an if statement as a check, shown below. Also, when I try non-English characters such as Arabic, the exception is always thrown.

        //XOR codebook and encryptedMessage
        System.out.println("This is the message using XOR:");
        for(int i = 0; i<result.length; i++)
        {
            //since XOR has return type of an int, we cast it to a byte
            if(result.length<=cryptedMessage.length){
                result[i] = (byte)(((byte)(codebook[i])) ^ ((byte)(cryptedMessage[i])));
                System.out.print(result[i]+" ");
            }
        }
shareef
  • That did fix my out-of-bounds exception; however, I can't seem to fix the odd characters being displayed. I just got the output 123456✌ – Aaron Davis May 14 '12 at 16:58