I have a JSON library on GitHub: https://github.com/jillesvangurp/jsonj

This library has a parser based on json-simple, which uses a handler class to do all the work of creating instances of the JsonObject, JsonArray, and JsonPrimitive classes that I have in my library.

I've seen people post various benchmarks suggesting that the Jackson parser is about as good as it gets in terms of performance and that json-simple is one of the slower options. So, to see if I could boost performance, I created an alternative parser that uses the Jackson streaming API and calls the same handler that I used for the original parser. This works fine from a functional perspective and was pretty straightforward.

You can find the relevant classes here (JsonHandler, JsonParser and JsonParserNg): https://github.com/jillesvangurp/jsonj/tree/master/src/main/java/com/github/jsonj/tools

However, I'm not seeing any improvement on the various tests I ran.

So, my question: should I be seeing any improvement at all and if so why? It seems to me that in streaming API mode at least, both libraries have similar performance.

I'd be very interested in other people's experience with this.

Jilles van Gurp
  • Not a solution, nor an answer to your actual question, but according to their own benchmarks, json-smart is the fastest of all the ones they tested, including json-simple and jackson. https://code.google.com/p/json-smart/ . Also, it uses exactly the same interface as json-simple, so the migration should be a snap. However, a lot of comments on the same site say that folks are getting different results, so YMMV. – Colselaw Apr 20 '13 at 12:54
  • Thanks, I might give that a try as well. – Jilles van Gurp Apr 20 '13 at 13:17
  • Json-smart is not faster than alternatives according to any publicly available benchmark. Benchmarks mentioned in its project page have flaws as comments suggest; including using weird data ({"value":123} and such -- really?). – StaxMan Apr 21 '13 at 19:33

1 Answer

I wrote "On proper performance testing of Java JSON processing" a while ago to enumerate common problems I have seen with performance benchmarking. There are lots of relatively simple ways to mess up a comparison. I am assuming you are not making any of the mistakes mentioned there, but it is worth checking. Note especially the part about using raw input: there are very few cases where real JSON data arrives as a String, so make sure to use InputStream / OutputStream (or byte arrays).

The second thing to note is that if you use a tree model (like JsonObject), you are already adding lots of potentially avoidable overhead: you are building Map/List structures that use 3x the memory that POJOs would use and are slower to operate on. In this case, actual parsing/generation overhead is typically a minority component anyway. Sometimes tree-style processing makes sense, and this is acceptable overhead.

So if performance matters a lot, one typically either:

  1. Uses the streaming API to build one's own objects (not an in-memory tree), or
  2. Uses data-binding to/from POJOs. This can be close to the speed of (1).

Both of these will be faster than building trees (and, to some degree, serializing them). For some reason many developers assume that dealing with tree representations is as efficient a way to deal with data as any. This is not the case, as seen in benchmarks like https://github.com/eishay/jvm-serializers
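
To illustrate option (2), here is a minimal sketch of data-binding straight from bytes to a POJO. It assumes Jackson 2.x (`jackson-databind`) is on the classpath; the `Point` class and the sample payload are made up for illustration and have nothing to do with jsonj.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class PojoBindingSketch {
    // Hypothetical POJO for illustration; public fields keep the sketch short.
    public static class Point {
        public int x;
        public int y;
    }

    // One shared mapper: it is thread-safe once configured and caches deserializers.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Binds raw bytes directly to a Point, without building an intermediate tree.
    public static Point parse(byte[] json) throws IOException {
        return MAPPER.readValue(json, Point.class);
    }

    public static void main(String[] args) throws IOException {
        byte[] json = "{\"x\":3,\"y\":4}".getBytes(StandardCharsets.UTF_8);
        Point p = parse(json);
        System.out.println(p.x + "," + p.y); // prints 3,4
    }
}
```

Whether this beats a tree model in a given application still has to be measured; the sketch only shows the shape of the approach.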

I did not see Jackson-related code via the link, so I am assuming it works as expected. The main things to look for (with respect to performance problems) really are to:

  1. Always close JsonParser and JsonGenerator (needed for some of the buffer recycling),
  2. Reuse JsonFactory and/or ObjectMapper instances: they are thread-safe, and reuse of some components (symbol tables, serializers) occurs through these objects, and
  3. As mentioned earlier, always use the rawest input sources/output destinations possible (InputStream, OutputStream).
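
The three points above can be sketched together as follows. This assumes Jackson 2.2 or later (`jackson-core`), where the factory method is `createParser`; older versions used `createJsonParser`. The `sumInts` helper is a made-up example, not code from jsonj.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StreamingSketch {
    // Point 2: one shared, thread-safe factory for the whole application.
    private static final JsonFactory FACTORY = new JsonFactory();

    // Hypothetical helper: sums every integer value found anywhere in the document.
    public static long sumInts(InputStream in) throws IOException {
        long sum = 0;
        // Point 3: read from an InputStream rather than a String.
        // Point 1: try-with-resources guarantees the parser is closed.
        try (JsonParser parser = FACTORY.createParser(in)) {
            JsonToken token;
            while ((token = parser.nextToken()) != null) {
                if (token == JsonToken.VALUE_NUMBER_INT) {
                    sum += parser.getLongValue();
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        byte[] json = "{\"a\":1,\"b\":[2,3]}".getBytes(StandardCharsets.UTF_8);
        System.out.println(sumInts(new ByteArrayInputStream(json))); // prints 6
    }
}
```
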
StaxMan
  • Thanks, I suspected as much. My library by design is all about tree structures, so I can't do much about that. I actually didn't realize that I needed to close the parser, so I just fixed that. – Jilles van Gurp Apr 23 '13 at 09:16
  • Ok. Yes, I figured there probably was a reason to use more flexible model. I hope closing helps to some degree, but differences may not be astronomic. – StaxMan Apr 23 '13 at 20:13
  • @StaxMan I noticed your expertise in serialization. If you have time please have a look at http://stackoverflow.com/questions/21781540/test-code-to-compare-jackson-stream-vs-map-is-this-working-right . KR, apologies for disturbance. – Menelaos Feb 14 '14 at 14:20