java.lang.Error: Invalid UTF-8 Encoding

Question

I get this error when my trading application processes trades while communicating over FIX.

     java.lang.Error: Invalid UTF-8 Encoding
     at javolution.io.Struct$UTF8String.get(Struct.java:1105)

I use UTF-8 encoding, and it is specified in each POM file of my application. The error occurs when a trade comes into the application. Has anyone come across this error before?

It seems related to the presence of a byte-order marker in the file. The closest report I could quickly find is this one: http://docs.oracle.com/cd/A97329_03/web.902/a88894/adx19paj.htm#1006476 – Frank van Puffelen, Nov 13 '12 at 02:17
It is using Hazelcast SerialisationHelper for encoding the data. Could there possibly be a problem in encoding the object that has been sent to the SerializationHelper of Hazelcast? – MindBrain, Nov 13 '12 at 03:29
java.lang.Error: Invalid UTF-8 Encoding at javolution.io.Struct$UTF8String.get(Struct.java:1105) – MindBrain, Nov 13 '12 at 05:32

score 1 · Answer 1 · edited May 23 '17 at 12:04

UPDATE:

Looks like you've run into a couple of existing bugs: "Fault in handling of UTF8Strings within the Struct class" and "XMLStreamReaderImpl ignoring xml encoding attribute?"

Passing certain Strings to the UTF8String set method results in the field boundary of the memory block which the UTF8String is mapped to in the backing ByteBuffer being exceeded. This appears to result from certain UTF-8 multi-byte characters expanding the string.
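To make the "expanding" part concrete, here is a minimal sketch in plain Java (it does not use javolution). The class name, the helper names, and the 5-byte field size are made up for illustration. It only shows that a string's UTF-8 byte length can exceed its character count, which is how a fixed-size field could overflow:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of javolution.
final class Utf8Length {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the encoded string fits in a fixed-size field of fieldBytes bytes.
    static boolean fitsInField(String s, int fieldBytes) {
        return utf8ByteLength(s) <= fieldBytes;
    }

    public static void main(String[] args) {
        String ascii = "TRADE";          // 5 chars, 1 byte each
        String accented = "TRAD\u00C9";  // 5 chars, but U+00C9 needs 2 bytes
        System.out.println(utf8ByteLength(ascii));     // prints 5
        System.out.println(utf8ByteLength(accented));  // prints 6
        System.out.println(fitsInField(accented, 5));  // prints false
    }
}
```

A check like this before writing into a fixed-length field would at least tell you which values are too long once encoded.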

It seems there's either a problem with the data you're processing or a bug in the library. Take a look at the source code of UTF8ByteBufferReader. The exception is being thrown from the following method:

private int read2(byte b) throws IOException

Towards the bottom of that method you'll see:

throw new CharConversionException("Invalid UTF-8 Encoding");

I would double-check that the data you're receiving is in fact UTF-8, because that library doesn't seem to think it is...
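One way to do that double-check with the JDK alone is a strict decoder that reports malformed input instead of silently replacing it. This is only a sketch: the class and method names are made up, and you would feed it the raw bytes of an incoming message, however you capture them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for checking captured bytes.
final class Utf8Validator {
    // Returns true only if the whole array is well-formed UTF-8.
    static boolean isValidUtf8(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // Thrown for malformed or truncated byte sequences.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] good = "caf\u00E9".getBytes(StandardCharsets.UTF_8);
        // The same text encoded as ISO-8859-1 is not valid UTF-8.
        byte[] bad = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(isValidUtf8(good));
        System.out.println(isValidUtf8(bad));
    }
}
```

If this returns false for a message that javolution also rejects, the data is the problem rather than the library.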

If all you're doing is trying to decode a UTF-8 stream, you can use plain Java for that. There are a lot of UTF-8 examples online. You might also need Apache Commons IO's BOMInputStream to skip a leading byte order mark.

You can also just read in the bytes and periodically decode them with StandardCharsets.UTF_8 via Charset#decode.
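A rough sketch of that approach, with made-up class and method names: it decodes a byte array with Charset#decode and drops a leading byte order mark by hand, which is roughly what BOMInputStream would otherwise do for you. Note that Charset#decode replaces malformed input with U+FFFD instead of throwing, so it will not tell you whether the data was valid.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper using only the JDK.
final class Utf8Decode {
    // Decodes the bytes as UTF-8 and strips a leading BOM (U+FEFF) if present.
    static String decodeUtf8(byte[] data) {
        String text = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(data)).toString();
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 encoding of the byte order mark.
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x34, 0x39};
        System.out.println(decodeUtf8(withBom)); // prints 49
    }
}
```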

edited May 23 '17 at 12:04

Community

answered Nov 13 '12 at 06:01

Andrey

Yes you are right. There must be a problem in the data. However, I do not have an instance of the trade data that's causing the problem. That is why I was searching whether there is a bug in the code. – MindBrain Nov 13 '12 at 06:42
just use your program to record the data (into a file) instead of processing it. then, look at what you've recorded. as someone already mentioned, one possible culprit is BOM (byte order mark), although i don't think it plays a crucial role in UTF-8 processing, it's just an indicator that the data that comes after it will be encoded in UTF-8. – Andrey Nov 13 '12 at 15:37
there seem to be bugs in javolution. i updated my answer with some links – Andrey Nov 13 '12 at 16:00
I have version 5.5.1 of javolution in my project. Isn't it the latest? If it is, I would expect it to include this fix. – MindBrain Nov 13 '12 at 21:10
i don't think so. the status of both bugs is "open" – Andrey Nov 14 '12 at 00:18
Oh, thanks for letting me know. So what solution would you recommend? It is a critical issue. :( – MindBrain Nov 14 '12 at 00:41
well, i haven't heard about javolution until i answered this question. but if all you need to do is to decode a UTF8 stream, you can use regular Java for that. i've updated my answer with some links. – Andrey Nov 14 '12 at 01:01
let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/19510/discussion-between-abcd-and-andrey) – MindBrain Nov 14 '12 at 05:02

score 1 · Answer 2 · answered Feb 01 '13 at 15:25

The FIX standard uses ASCII, not UTF-8.
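If that is the cause, a quick way to confirm it is to look for bytes outside 7-bit ASCII in the raw message. The sketch below uses made-up names and a made-up message fragment (SOH, 0x01, is the FIX field delimiter). It only locates the first non-ASCII byte and does not parse FIX:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; does not parse FIX, only scans bytes.
final class AsciiCheck {
    // Returns the index of the first byte outside 7-bit ASCII, or -1 if there is none.
    static int firstNonAscii(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] < 0) { // bytes 0x80..0xFF are negative in Java
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Illustrative fragment only, not a complete FIX message.
        byte[] plain = "8=FIX.4.2\u000135=D\u0001".getBytes(StandardCharsets.US_ASCII);
        byte[] accented = "58=caf\u00E9\u0001".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstNonAscii(plain));    // prints -1
        System.out.println(firstNonAscii(accented)); // prints 6
    }
}
```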

answered Feb 01 '13 at 15:25

noahlz

