I'm working on a server-side, Spring-based application. We use JAXB, SOAP and Axiom (wrapped in Spring WS), which marshals/unmarshals XML messages with Woodstox, but our application has problems with garbage collection. We send only a 165 MB message, but the Marshaller produces about 920 MB of garbage. Does anyone know why the amount of collected garbage is so large, and how I can reduce it?
-
If you don't suffer from XML entity expansion, I wouldn't call it that much garbage, since it's only about 5.5 times as big as the input. Depending on the specific message, a lot of instances might be created, each of which has its own overhead. Additionally, equal strings might get instantiated more than once from the same part of the XML file. – SpaceTrucker Jan 15 '15 at 15:59
-
The best thing would be to take a heap dump to see what the culprits are and what type of objects they are. – Ludger Jan 16 '15 at 06:05
-
You should also add more context to your question. Is this on the client side or server side? Is the large message the request or response? Do you use a data binding such as JAXB2? Etc. – Andreas Veithen Jan 16 '15 at 20:42
1 Answer
Woodstox itself does not really produce all that much garbage, since it retains only a small amount of state to support streaming access. The main objects it produces are string values, and even those only when they are accessed.
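To illustrate what streaming access looks like, here is a minimal sketch using the standard StAX API (`javax.xml.stream`), which Woodstox implements. The class name `StreamingCount` and the sample document are made up for illustration, and the code runs against whichever StAX implementation is on the classpath, not necessarily Woodstox. It walks the document event by event and never asks for a text value, so no tree is built.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the application in the question.
public class StreamingCount {

    // Counts start elements by pulling events one at a time.
    // Text content is skipped without calling getText().
    public static int countElements(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Prints 3: one <order> element and two <item> elements.
        System.out.println(countElements("<order><item>a</item><item>b</item></order>"));
    }
}
```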
But the data binding on top of that, provided by Axiom, has to keep much more extensive state and construct an object model to expose. So I would expect it to produce a significant number of short-lived objects. It also typically accesses each and every value of an XML document, materializing all the strings. Given this, I agree with @SpaceTrucker that this isn't necessarily an unreasonable amount of garbage to generate. And short-lived garbage is often not that problematic, compared to long-lived objects that end up in the old generation.
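If you want a rough number for how much a given step allocates, one option on HotSpot-based JVMs is the `com.sun.management.ThreadMXBean` extension. The sketch below is only an assumption about how you might measure it: `AllocationProbe` is a made-up name, the string-building loop merely stands in for your real marshalling call, and the cast fails on JVMs that do not provide this extension.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the application in the question.
public class AllocationProbe {

    // Returns the number of bytes the current thread allocated while 'work' ran.
    public static long measureAllocatedBytes(Runnable work) {
        // HotSpot-specific extension of the standard ThreadMXBean.
        com.sun.management.ThreadMXBean threads =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long id = Thread.currentThread().getId();
        long before = threads.getThreadAllocatedBytes(id);
        work.run();
        return threads.getThreadAllocatedBytes(id) - before;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the marshalling call you want to measure.
        long bytes = measureAllocatedBytes(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append("<item>").append(i).append("</item>");
            }
            sb.toString();
        });
        System.out.println("Allocated roughly " + bytes + " bytes");
    }
}
```

Comparing that figure for the marshalling step alone against the total should show how much of the garbage comes from the object model rather than from the parser.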
Have you tried taking a heap dump to see what kinds of objects are being produced?
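In case it helps, a heap dump can also be triggered from inside the application on HotSpot-based JVMs through `HotSpotDiagnosticMXBean`. This is a minimal sketch; the class name `HeapDumper` and the temp-directory location are assumptions for illustration. Passing `false` for the second argument of `dumpHeap` includes unreachable objects that have not been collected yet, which is probably what you want when hunting garbage.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the application in the question.
public class HeapDumper {

    // Writes an .hprof file; liveOnly = true restricts the dump to reachable objects.
    public static Path dump(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap fails if the file already exists, so remove any old dump first.
        Files.deleteIfExists(target);
        bean.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("heapdump").resolve("heap.hprof");
        // false: keep not-yet-collected garbage in the dump.
        System.out.println("Heap dump written to " + dump(out, false));
    }
}
```

The resulting file can be opened in a heap analyzer such as Eclipse MAT or VisualVM to see which object types dominate.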
