I'm using the Jackson framework for marshalling and unmarshalling data between JSON and Java. Everything works well, as long as the input doesn't contain any characters like:

  • ö
  • ä
  • ü
  • Ö
  • Ä
  • Ü
  • ß

For input data I tried:

String jsonData = "{\"id\":1,\"street\":\"Straße\",\"number\":\"1c\",\"zipCode\":1111,\"city\":\"MyCity\"}";

as well as:

String jsonData = "{\"id\":1,\"street\":\"Stra\u00DFe\",\"number\":\"1c\",\"zipCode\":1111,\"city\":\"MyCity\"}";

In both cases I get the same exception.

The mapping from JSON data to the Java entity object is done via:

/*
 * Convert stream to data entity
 */
ObjectMapper m = new ObjectMapper();
T entity = (T) m.readValue(stringToStream(jsonData), readableClass);

I also validate the JSON data against a schema, which works as expected, even with the characters above.

How should such data be handled?

UPDATE: These are the important parts of the MessageBodyReader class:

@Override
public T readFrom(Class<T> type, Type genericType,
        Annotation[] annotations, MediaType mediaType,
        MultivaluedMap<String, String> httpHeaders, InputStream entityStream)
        throws IOException, WebApplicationException {

    final String jsonData = getStringFromInputStream(entityStream);
    System.out.println(jsonData);

    InputStream isSchema = new FileInputStream(jsonSchemaFile);
    String jsonSchema = getStringFromInputStream(isSchema);

    /*
     * Perform JSON data validation against schema
     */
    validateJsonData(jsonSchema, jsonData);

    /*
     * Convert stream to data entity
     */
    ObjectMapper m = new ObjectMapper();
    T entity = (T) m.readValue(stringToStream(jsonData), readableClass);

    return entity;
}

/**
 * Validate the given JSON data against the given JSON schema
 * 
 * @param jsonSchema
 *            as String
 * @param jsonData
 *            as String
 * @throws MessageBodyReaderValidationException
 *             in case of an error during validation process
 */
private void validateJsonData(final String jsonSchema, final String jsonData)
        throws MessageBodyReaderValidationException {
    try {
        final JsonNode d = JsonLoader.fromString(jsonData);
        final JsonNode s = JsonLoader.fromString(jsonSchema);

        final JsonSchemaFactory factory = JsonSchemaFactory.byDefault();
        JsonValidator v = factory.getValidator();

        ProcessingReport report = v.validate(s, d);
        System.out.println(report);
        if (!report.toString().contains("success")) {
            throw new MessageBodyReaderValidationException(
                    report.toString());
        }

    } catch (IOException e) {
        throw new MessageBodyReaderValidationException(
                "Failed to validate json data", e);
    } catch (ProcessingException e) {
        throw new MessageBodyReaderValidationException(
                "Failed to validate json data", e);
    }
}

/**
 * Taken from <a href=
 * "http://www.mkyong.com/java/how-to-convert-inputstream-to-string-in-java/"
 * >www.mkyong.com</a>
 * 
 * @param is
 *            {@link InputStream}
 * @return Stream content as String
 */
private String getStringFromInputStream(InputStream is) {
    BufferedReader br = null;
    StringBuilder sb = new StringBuilder();

    String line;
    try {

        br = new BufferedReader(new InputStreamReader(is));
        while ((line = br.readLine()) != null) {
            sb.append(line);
        }

    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (br != null) {
            try {
                br.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    return sb.toString();
}

private InputStream stringToStream(final String str) {
    return new ByteArrayInputStream(str.getBytes());
}
123456789
  • Could you please also provide us with stringToStream code? – Jk1 Aug 11 '13 at 19:41
  • Possible duplicate of [jackson JsonParseException: Invalid UTF-8 middle byte](https://stackoverflow.com/questions/6352861/jackson-jsonparseexception-invalid-utf-8-middle-byte) – Raedwald Aug 23 '18 at 08:10

2 Answers

The JSON specification states that the only valid encodings are UTF-8, UTF-16 and UTF-32. No other encodings (such as Latin-1) can be used. Your stringToStream implementation does not set the encoding explicitly, so the system default is used. That is how you ended up with a non-UTF stream. In the next step, Jackson tries to parse the stream using one of the UTF encodings (it has a built-in detection algorithm) and fails. Try setting an explicit encoding:

new ByteArrayInputStream(str.getBytes("UTF-8"));
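To illustrate why the platform default matters, here is a minimal sketch using only the JDK. The class name `ExplicitCharsetDemo` and the helper `toUtf8` are made up for this example; `stringToStream` mirrors the method from the question. It assumes the default charset on the failing machine was a single-byte one such as Latin-1, which the question does not confirm.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetDemo {

    // Hypothetical helper: always encode as UTF-8, never the platform default.
    static byte[] toUtf8(String str) {
        return str.getBytes(StandardCharsets.UTF_8);
    }

    // Same role as stringToStream in the question, with the charset pinned.
    static InputStream stringToStream(String str) {
        return new ByteArrayInputStream(toUtf8(str));
    }

    public static void main(String[] args) {
        // The street name from the question, sharp s written as a Unicode escape.
        String street = "Stra\u00DFe";

        byte[] latin1 = street.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = toUtf8(street);

        // Latin-1 uses one byte (0xDF) for the sharp s, UTF-8 uses two (0xC3 0x9F).
        System.out.println(latin1.length);                          // prints 6
        System.out.println(utf8.length);                            // prints 7
        System.out.println(Integer.toHexString(latin1[4] & 0xFF));  // prints df
    }
}
```

A lone 0xDF byte looks like the start of a two-byte UTF-8 sequence, so a UTF-8 parser expects a continuation byte next and rejects the plain `e` that follows. `StandardCharsets.UTF_8` (Java 7+) also avoids the checked `UnsupportedEncodingException` that `getBytes("UTF-8")` declares.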
Jk1

You already got an answer, but one obvious question here is this: why are you converting from a String to a stream? That is an unnecessary and wasteful thing to do, so just pass the String as-is. This will also remove the problem; Strings do not have an encoding per se (that is: there is just a single in-memory representation and no conversions are needed).

StaxMan
  • Oh, thank you! You are referring to the unmarshalling call, which can be simplified to: `T entity = (T) m.readValue(jsonData, readableClass);` Are there further improvements? – 123456789 Aug 16 '13 at 14:59
  • When reading a file as a String, it is also better to use a basic `InputStreamReader` and append to a `StringBuilder`, instead of reading line by line. Or, if the JSON Schema Validator can read from a `Reader` or `InputStream`, pass those; it may well be using Jackson under the hood as well. – StaxMan Aug 16 '13 at 18:00
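
The last comment's suggestion could look roughly like the sketch below. It is only an illustration: the class name `StreamToString` and the method `readUtf8` are invented here, the stream is assumed to contain UTF-8, and rethrowing as `UncheckedIOException` is one possible choice rather than what the original code does (it prints the stack trace and returns whatever was read so far).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StreamToString {

    // Reads the whole stream as UTF-8 text in fixed-size chunks.
    // Unlike a readLine() loop, this keeps line breaks intact.
    static String readUtf8(InputStream is) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        // try-with-resources closes the reader (and the wrapped stream).
        try (Reader reader = new InputStreamReader(is, StandardCharsets.UTF_8)) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            // Surface the failure instead of returning partial content.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String json = "{\"street\":\"Stra\u00DFe\",\n\"number\":\"1c\"}";
        InputStream in = new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(in).equals(json)); // prints true
    }
}
```

Pinning the charset on the `InputStreamReader` matters for the same reason as in `stringToStream`: without it, the request body is decoded with the platform default before Jackson ever sees it.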