I've been working within my Hadoop job and was able to quickly get Dozer mapping between object hierarchies. I have some complex objects that I need to convert to Thrift types. When I enable this mapping, it adds nearly 5 minutes of execution to my 5-minute processing run, roughly doubling it to 10 minutes. I am printing out the Dozer statistics, and they clearly show that it is spending those 5 minutes doing mappings. Is there any way to speed this up? Is there any advantage to defining the mapping explicitly vs. letting Dozer auto-map via reflection?
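For reference, this is the kind of explicit mapping file I would be defining instead of relying on auto-mapping (the class and field names here are placeholders, not my real types):

    <mappings xmlns="http://dozer.sourceforge.net">
      <!-- wildcard="false" turns off reflection-based mapping of
           fields not listed below, so only these fields are copied -->
      <mapping wildcard="false">
        <class-a>com.example.OrderRecord</class-a>
        <class-b>com.example.thrift.TOrder</class-b>
        <field>
          <a>orderId</a>
          <b>orderId</b>
        </field>
      </mapping>
    </mappings>

My understanding is that this mainly trades flexibility for predictability; I'm unsure how much of the per-field reflection cost it actually removes, which is part of my question.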
This is how I am printing out the stats:
    GlobalStatistics stats = GlobalStatistics.getInstance();
    for (Statistic stat : stats.getStatsMgr().getStatistics()) {
        for (StatisticEntry entry : stat.getEntries()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
And this is the output:
SUPER_TYPE_CHECK: 970293
SUPER TYPE CHECK: 6
MAPPER_INSTANCES_COUNT: 1
MAPPING_SUCCESS_COUNT: 10093
FIELD_MAPPING_SUCCESS_COUNT: 25883488
MAPPING_TIME: 478486
One thing I'm curious about is that the generated Thrift classes have public fields as well as getter/setter methods. I need to go back and regenerate them in the bean style with private fields. Would Dozer be doing extra work on these public fields? How can I speed up this mapping? I know there is a tradeoff between development time and execution time, and I can always drop down to implementing my own conversion, but I am surprised at how large the performance penalty is for using Dozer. Am I doing something wrong?
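By "implementing my own conversion" I mean a plain hand-rolled converter along these lines (SourceOrder and TOrder are simplified stand-ins for my domain and generated Thrift classes, not the real ones):

```java
// Stand-in for one of my domain objects.
class SourceOrder {
    public String id;
    public long amount;
}

// Stand-in for a bean-style generated Thrift class.
class TOrder {
    private String id;
    private long amount;
    public void setId(String id) { this.id = id; }
    public String getId() { return id; }
    public void setAmount(long amount) { this.amount = amount; }
    public long getAmount() { return amount; }
}

final class OrderConverter {
    // Direct field-by-field copy: no reflection and no per-field
    // type checks, unlike a generic mapping framework.
    static TOrder toThrift(SourceOrder src) {
        TOrder t = new TOrder();
        t.setId(src.id);
        t.setAmount(src.amount);
        return t;
    }
}
```

This is obviously more code to maintain per type, which is exactly the development-time vs. execution-time tradeoff I'd like to avoid if Dozer can be made fast enough.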
Also, why don't I see any statistics about the Dozer cache hit rates? This page indicates there should be more statistics available: http://dozer.sourceforge.net/documentation/configuration/statistics.html