I am working on large-scale data processing. Each row of a large dataframe holds data stored as an object, and we have a function that expects a JSON object and runs some evaluations on it. Currently we serialize each object into a string and then deserialize that string into a JSON object for processing.

At scale, serialization and deserialization are very costly, so I am wondering whether there is a way to convert a Java/Scala object directly to a JSON object, and whether I am even correct in assuming that this would bypass deserialization and improve performance.

Thanks!

  • 1) It is not possible to convert *any* Java or Scala object to JSON. For example, a `Thread` or `Process` object will not be representable as JSON. 2) It is not possible to bypass serialization. Conversion to JSON is serialization. And you can't convert to JSON without converting to JSON. – Stephen C Aug 24 '22 at 02:46
  • 3) The cost of serializing using `ObjectOutputStream` will be comparable to serializing to JSON using a POJO <-> JSON mapping ... such as Jackson. – Stephen C Aug 24 '22 at 02:49
  • Having said that, when you say *"currently we are serializing our object into a string and then deserializing it into a json object for processing"* it is not clear what you are actually saying. Do you actually mean that you are deserializing to a `JSONObject` instance ... for some specific `JSONObject` class? ("JSON object" typically means the serialized form.) Perhaps it would help if you showed us some relevant code. – Stephen C Aug 24 '22 at 02:53
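
To make the distinction in the comments concrete, here is a minimal sketch, in plain Java with no JSON library, of the difference between an in-memory JSON-like tree and serialized JSON text. Every name in it (`JsonTreeSketch`, `Row`, `toTree`, `toJsonText`) is hypothetical and not taken from the question, and nested `Map`/`List` values merely stand in for whatever `JSONObject`-style class the evaluation function really expects. It assumes the function can work on a tree; if it needs JSON text, the text still has to be produced.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical names throughout: Row, toTree, and toJsonText are illustrative only.
final class JsonTreeSketch {

    // Stand-in for the object stored in one dataframe row.
    static final class Row {
        final String id;
        final double score;
        final List<String> tags;

        Row(String id, double score, List<String> tags) {
            this.id = id;
            this.score = score;
            this.tags = tags;
        }
    }

    // Direct conversion: copy fields into an in-memory tree.
    // No JSON text is written or parsed on this path.
    static Map<String, Object> toTree(Row row) {
        Map<String, Object> node = new LinkedHashMap<>();
        node.put("id", row.id);
        node.put("score", row.score);
        node.put("tags", row.tags);
        return node;
    }

    // Serialization proper: render a tree as JSON text.
    // Handles only null, strings, numbers, booleans, lists, and maps;
    // NaN and infinite doubles are not checked for.
    static String toJsonText(Object value) {
        if (value == null) return "null";
        if (value instanceof String) return quote((String) value);
        if (value instanceof Number || value instanceof Boolean) return value.toString();
        if (value instanceof List) {
            return ((List<?>) value).stream()
                    .map(JsonTreeSketch::toJsonText)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> quote(String.valueOf(e.getKey())) + ":" + toJsonText(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        // A Thread or Process, for example, ends up here.
        throw new IllegalArgumentException("Not representable as JSON: " + value.getClass());
    }

    // Minimal escaping: backslash and double quote only; control characters are not handled.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Calling `JsonTreeSketch.toTree(row)` yields a tree that can be inspected without any text round trip, while `JsonTreeSketch.toJsonText(tree)` is the step that actually serializes. If the project happens to use Jackson (the question does not say), `ObjectMapper.valueToTree` is a comparable object-to-tree conversion; whether it is faster than the current string round trip would have to be measured.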

0 Answers