1

I'm sending a UDP datagram with a single string as its content, and I'm creating the packet like this:

String content = ALIVE + "," + clusterName + "," + nodeName + "," + location;
byte[] data = content.getBytes();
packet = new DatagramPacket(data, data.length);

The problem is that when it arrives, it has some weird binary data at the end that can't be displayed as characters (Sublime Text just shows several NUL symbols). On the receiving side I read it like this:

String data = new String(packet.getData());

I extract the individual fields with a string tokenizer (splitting on ","), and I have worked around the problem for now by appending another "," to the end before sending. Still, I would like to know: where does this data come from?

Lasse Meyer
  • Maybe it is related to this? http://stackoverflow.com/questions/8229064/how-to-get-rid-of-the-empty-remaining-of-the-buffer – Ravi Apr 20 '15 at 19:19

4 Answers

2

Heed carefully the answers advising you to specify character encoding explicitly on both ends. Their advice is excellent.

However, if the character data is received accurately but for the addition of some junk at the end, then your issue is unlikely to arise from a character encoding mismatch. More likely it arises from incorrect use of DatagramPacket by the receiver.

DatagramPacket provides a fixed-length buffer for messages, and the getData() method returns that buffer. If it is longer than the message most recently received in it, then the tail end will contain data unrelated to that message. After receiving a message, you must use the packet's getLength() method to determine how many of the bytes in the buffer correspond to the message.
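As a minimal sketch of that advice, the following decodes only the bytes that belong to the message. The class name `PacketDecoding`, the helper `decode`, the 64-byte buffer size, and the choice of UTF-8 are assumptions for illustration, not part of the original code. To stay runnable without a network, the receive is simulated with `setLength()`; in real code `DatagramSocket.receive(packet)` sets the length for you.

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

class PacketDecoding {

    // Decode only the bytes that belong to the received message,
    // not the whole backing buffer returned by getData().
    static String decode(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(),
                packet.getLength(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] message = "ALIVE,cluster,node,location".getBytes(StandardCharsets.UTF_8);

        // Simulated receive: a 64-byte buffer holding a shorter message.
        byte[] buffer = new byte[64];
        System.arraycopy(message, 0, buffer, 0, message.length);
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        packet.setLength(message.length); // socket.receive(packet) would set this

        // Decoding the whole buffer keeps the unused tail (NUL bytes here).
        String whole = new String(packet.getData(), StandardCharsets.UTF_8);
        String exact = decode(packet);

        System.out.println(whole.length()); // 64: message plus trailing NULs
        System.out.println(exact);          // ALIVE,cluster,node,location
    }
}
```

If the same packet object is reused for several receives, its length should be reset to the full buffer size before each `receive()` call, since a shorter earlier message would otherwise cap the size of later ones.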

John Bollinger
1

Never, ever, call the no-argument String.getBytes() or the String constructor that takes only a byte[].

Always pass an explicit character set on both sides.

As your code is currently written, the sender can generate bytes of one encoding, and the reader can (mis)interpret them as some other encoding, producing trash of all flavors.
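To illustrate the kind of mismatch described here, this small sketch encodes a string with one charset and decodes it with another. The class name `CharsetMismatch` and the helper `roundTrip` are hypothetical, and the two charsets are chosen only as an example of a sender and receiver that disagree.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatch {

    // Encode text with one charset, then decode the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "\u00e4"; // a-umlaut, written as an escape to avoid source-encoding issues

        // Same charset on both sides: the text survives unchanged.
        String matched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // Different charsets: the two UTF-8 bytes are read as two Latin-1 characters.
        String mismatched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(matched.equals(original)); // true
        System.out.println(mismatched.length());      // 2
    }
}
```

Pure ASCII content such as the comma-separated message in the question encodes identically in both of these charsets, so a mismatch like this one only shows up once non-ASCII characters are sent.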

You might have other problems, as well.

bmargulies
1

You're converting from characters to bytes at one end, and from bytes to characters at the other. All well and good, but you're not specifying the character encodings in use, and if those are mismatched, the byte/character conversion will not work properly.

You have two options:

  1. specify the conversions with the appropriate character set
  2. enforce the default encoding used by the JVM using the confusingly named -Dfile.encoding JVM parameter.

I would prefer the first option, since you may not have control over how or where your code is executed (e.g. if your code is lifted into a library for use elsewhere).

Brian Agnew
1

You can specify the character set explicitly like this:

byte[] data = content.getBytes(StandardCharsets.UTF_8);
Bhavin Panchani