27

Suppose I use the standard Java object serialization to write/read small (< 1K) Java objects to/from a memory buffer. The most critical part is the deserialization, i.e. reading Java objects from a memory buffer (a byte array).

Is there any faster alternative to the standard Java serialization for this case?

Michael
  • Why not try something like Protocol Buffers? In most cases it is faster than native Java serialization, at least for what I wanted to use it for. I started using it for a really simple use case, but it has slowly grown into a major part of a project I'm involved in, mainly for creating contracts. – opensourcegeek Dec 23 '12 at 14:01
  • I stumbled upon this question because I asked myself the same thing. If there are metrics to compare (I'm sure somebody has done this), then this question can be answered with facts, not opinions. That is not a problem with the question but with the answers. Correct me if I'm wrong; otherwise I will edit the question so it meets the criteria for reopening. – Gewure Jan 03 '17 at 15:45

5 Answers

43

You might also want to have a look at FST.

It also provides tools for off-heap reading/writing.

nico.ruti
R.Moeller
  • 4
    Really liked FST. Thank you! – Art Jul 03 '15 at 15:50
  • 4
    Wow, for my use case, I just cut the deserialization time on my big objects from more than 30s to 4s (also, as a bonus, the serialized form is smaller), just by using their [Plain ObjectOutputStream Replacement](https://github.com/RuedigerMoeller/fast-serialization/wiki/Serialization#plain-objectoutputstream-replacement). This lib is absolutely a must try if you rely on the JDK serialization in your project! – Pierre-David Belanger May 12 '16 at 03:00
35

Have a look at Kryo. It's much faster than the built-in serialization mechanism (which writes out a lot of strings and relies heavily on reflection), but a bit harder to use.
Edit: R.Moeller suggested FST, which I had never heard of until now but which looks to be both faster than Kryo and compatible with Java's built-in serialization (which should make it even easier to use), so I'd look at that first.

nico.ruti
radai
  • 1
    Thanks. Do you know _why_ kryo is faster? What do they do different? – Michael Dec 23 '12 at 14:40
  • 5
    The Java built-in serialization mechanism writes the fully qualified class name (FQCN) at the beginning of every serialized instance; otherwise it won't know what it's looking at when deserializing. Kryo can (optionally) register() known classes (if you know in advance the types you'll be reading/writing) and use a small int header instead of an FQCN, so the output is smaller. You can also register Serializers for classes, avoiding reflection lookups (Serializable interface, serialization annotations etc.), which makes I/O faster. Kryo doesn't handle object versions by default (vs. serialVersionUID), which is faster again. – radai Dec 23 '12 at 15:42
  • 1
    Also, when deserializing, the built-in mechanism needs a Class.forName() call with the FQCN, which is slower than a lookup in a map of registered handlers. – radai Dec 23 '12 at 15:44
7

Try Google protobuf or Thrift.

SpyBot
  • 2
    Protobuf is not a serializer by itself. It's a message format for creating objects of protobuf classes. For creating those objects, it uses the SerDes of the language it runs on by default, e.g. if you use C++ then it uses the C++ SerDes by default. If you use Java then at the backend it uses the Java SerDes to create those messages. Kryo on the other hand is a SerDes and is very fast. It is sometimes used in Spark as a SerDes for creating objects of proto/avro types. – behold Apr 04 '19 at 16:14
3

The standard serialization adds a lot of type information which is then verified when the object is deserialized. When you know the type of the object you are deserializing, this is usually not necessary.
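To make the type-information overhead concrete, here is a minimal sketch that serializes a tiny object with the standard mechanism and inspects the resulting bytes. The `Point` class and the `javaSerialize` helper are hypothetical names invented for this illustration; the exact byte count depends on the class and package names, so the code prints it rather than assuming a value.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SerializationOverhead {

    // Hypothetical payload: two ints, i.e. 8 bytes of actual field data.
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serializes any Serializable object into a byte array using the standard mechanism.
    static byte[] javaSerialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = javaSerialize(new Point(3, 4));
        // ISO-8859-1 maps every byte to one char, so the ASCII class name stays searchable.
        String raw = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println("serialized size: " + bytes.length + " bytes for 8 bytes of field data");
        System.out.println("stream contains class name: " + raw.contains(Point.class.getName()));
    }
}
```

The stream carries the class name, field names and type codes in addition to the two `int` values, which is the extra information this answer refers to.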

What you could do is create your own serialization method for each class, which just writes all the values of the object to a byte buffer, and a constructor (or factory method, if you swing that way) which takes such a byte buffer and reads all the variables from it.
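The per-class approach described above might look roughly like the following sketch. The `Trade` class, its fields, and the `writeTo`/`readFrom` names are hypothetical, chosen only for illustration; the layout (fixed-width fields followed by a length-prefixed UTF-8 string) is one possible choice, not a prescribed format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class Trade {
    final long id;
    final double price;
    final boolean active;
    final String symbol;

    Trade(long id, double price, boolean active, String symbol) {
        this.id = id;
        this.price = price;
        this.active = active;
        this.symbol = symbol;
    }

    // Writes only the field values, with no class name or field metadata.
    // Assumes the UTF-8 encoded symbol fits in an unsigned 16-bit length prefix.
    void writeTo(ByteBuffer buf) {
        byte[] sym = symbol.getBytes(StandardCharsets.UTF_8);
        buf.putLong(id)
           .putDouble(price)
           .put((byte) (active ? 1 : 0))
           .putShort((short) sym.length)
           .put(sym);
    }

    // Factory method: reads the fields in exactly the order writeTo wrote them.
    static Trade readFrom(ByteBuffer buf) {
        long id = buf.getLong();
        double price = buf.getDouble();
        boolean active = buf.get() != 0;
        byte[] sym = new byte[buf.getShort() & 0xFFFF];
        buf.get(sym);
        return new Trade(id, price, active, new String(sym, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        new Trade(42L, 101.5, true, "ACME").writeTo(buf);
        System.out.println("bytes written: " + buf.position());
        buf.flip(); // switch the buffer from writing to reading
        Trade copy = Trade.readFrom(buf);
        System.out.println(copy.id + " " + copy.price + " " + copy.active + " " + copy.symbol);
    }
}
```

The reader has to know the type in advance, which is exactly the trade-off this answer describes: no type information is stored, so none can be verified.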

But just like AlexR I wonder if you really need that. Serialization is usually only needed when the data leaves the program (like getting stored on disk or sent over the network to another program).

Philipp
  • The serialized object would still contain the fully qualified class name string, for example, and serialization would still be (relatively) slow because it has to check whether you're Serializable, or even Externalizable in the case you're suggesting – radai Dec 23 '12 at 14:11
  • I wasn't suggesting to override the standard serialize method - I was suggesting to create an entirely new serialization mechanism which does not use the standard. – Philipp Dec 23 '12 at 14:30
  • Oh, sorry then. In that case I'd have to say the wheel has already been invented (Kryo, among others) – radai Dec 23 '12 at 15:45
1

Java's standard serialisation is known to be slow and to use a huge amount of bytes on disk. It is very simple to write your own custom serialisation.
Java's standard serialisation is nice for demo projects but, for the reasons above, not well suited to professional projects. Furthermore, versioning is not well under your control.

Java provides all you need for custom serialisation; see the demo code in my post at

Java partial (de)serialization of objects

With that approach you can even specify the binary file format, so that it can also be read from C or C#. Another advantage: custom serialized objects need less space than in main memory (a boolean can take up to 4 bytes in main memory, depending on the JVM and where it is stored, but only 1 byte when custom serialized as a byte).
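As a rough illustration of a self-defined binary format, the sketch below uses `DataOutputStream` and `DataInputStream` from the JDK, which write fixed-width big-endian values that a C or C# reader could decode. The `SensorRecord` class, its fields, and the leading version byte are assumptions made for this example and are not taken from the linked post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class SensorRecord {
    // Hypothetical format version, written first so readers can reject unknown layouts.
    static final int FORMAT_VERSION = 1;

    final int sensorId;
    final boolean enabled;
    final float value;

    SensorRecord(int sensorId, boolean enabled, float value) {
        this.sensorId = sensorId;
        this.enabled = enabled;
        this.value = value;
    }

    // Layout: version (1 byte), sensorId (4 bytes, big-endian),
    // enabled (1 byte), value (4 bytes, IEEE 754 big-endian) = 10 bytes.
    byte[] toBytes() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeByte(FORMAT_VERSION);
            out.writeInt(sensorId);
            out.writeBoolean(enabled);
            out.writeFloat(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static SensorRecord fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readUnsignedByte();
            if (version != FORMAT_VERSION) {
                throw new IllegalArgumentException("unsupported format version: " + version);
            }
            // Arguments are evaluated left to right, matching the write order.
            return new SensorRecord(in.readInt(), in.readBoolean(), in.readFloat());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = new SensorRecord(7, true, 21.5f).toBytes();
        SensorRecord copy = SensorRecord.fromBytes(bytes);
        System.out.println(bytes.length + " bytes: " + copy.sensorId + " " + copy.enabled + " " + copy.value);
    }
}
```

Writing an explicit version byte is one way to keep versioning under your own control: a reader can branch on it or reject data it does not understand.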

If different project partners have to read your serialized data, Google's Protobuf is an alternative to look at.

Community
AlexWien