I apologize for opening another question about this general issue, but none of the questions I've found on SO seem to relate closely to my issue.
I've got an existing, working dataflow pipeline that accepts objects of KV<Long, Iterable<TableRow>>
and outputs TableRow
objects. This code runs in our production environment without issue. I am now trying to write a unit test for this pipeline using the DirectRunner, but the test fails when it hits the line
LinkedHashMap<String, Object> evt = (LinkedHashMap<String, Object>) row.get(Schema.EVT);
in the pipeline, throwing the error message:
java.lang.ClassCastException: com.google.gson.internal.LinkedTreeMap cannot be cast to java.util.LinkedHashMap
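For what it's worth, I can reproduce the behavior with Gson alone, outside the pipeline. This is a minimal sketch (the JSON literal is a made-up stand-in for my event data): Gson's untyped deserialization represents every nested JSON object as its internal LinkedTreeMap, which does not extend java.util.LinkedHashMap, so the cast fails.

```java
import com.google.gson.Gson;
import java.util.LinkedHashMap;
import java.util.Map;

public class GsonRepro {
    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        Gson gson = new Gson();
        // Deserialize a row containing a nested object, as my EVENTS strings do
        Map<String, Object> row = gson.fromJson("{\"evt\":{\"type\":\"click\"}}", Map.class);
        Object evt = row.get("evt");
        // Gson materializes untyped nested objects as com.google.gson.internal.LinkedTreeMap
        System.out.println(evt.getClass().getName());
        // LinkedTreeMap extends AbstractMap, not LinkedHashMap, so this prints false
        System.out.println(evt instanceof LinkedHashMap);
    }
}
```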
A simplified version of the existing dataflow code looks like this:
public static class Process extends DoFn<KV<Long, Iterable<TableRow>>, TableRow> {
    /* private variables */
    /* constructor */
    /* private functions */

    @ProcessElement
    public void processElement(ProcessContext c) throws InterruptedException, ParseException {
        EventProcessor eventProc = new EventProcessor();
        Processor.WorkItem workItem = new Processor.WorkItem();
        Iterator<TableRow> it = c.element().getValue().iterator();

        // process all TableRows having the same id
        while (it.hasNext()) {
            TableRow item = it.next();
            if (item.containsKey(Schema.EVT)) {
                eventProc.process(item, workItem);
            } else {
                /* process by a different Processor subclass */
            }
        }
        /* do additional logic */
        /* c.output() is somewhere far below */
    }
}
public class EventProcessor extends Processor {
    // Extract data from an event into the WorkItem
    @SuppressWarnings("unchecked")
    @Override
    public void process(TableRow row, WorkItem item) {
        try {
            LinkedHashMap<String, Object> evt = (LinkedHashMap<String, Object>) row.get(Schema.EVT);
            LinkedHashMap<String, Object> profile = (LinkedHashMap<String, Object>) row.get(Schema.PROFILE);
            /* if no exception, process further business logic */
            /* business logic */
        } catch (ParseException e) {
            System.err.println("Bad row");
        }
    }
}
The relevant portion of the unit test, which prepares the main input to the Process() DoFn
, looks like this:
Map<Long, List<TableRow>> groups = new HashMap<>();
List<KV<Long, Iterable<TableRow>>> collections = new ArrayList<>();
Gson gson = new Gson();

// populate the map with events grouped by id
for (int i = 0; i < EVENTS.length; i++) {
    TableRow row = gson.fromJson(EVENTS[i], TableRow.class);
    Long id = EVENT_IDS[i];
    if (groups.containsKey(id)) {
        groups.get(id).add(row);
    } else {
        groups.put(id, new ArrayList<>(Arrays.asList(row)));
    }
}

// prepare the main input for the pipeline
for (Long key : groups.keySet()) {
    collections.add(KV.of(key, groups.get(key)));
}
The line causing the issue is gson.fromJson(EVENTS[i], TableRow.class): Gson deserializes every nested JSON object inside the TableRow as its internal com.google.gson.internal.LinkedTreeMap rather than the java.util.LinkedHashMap the pipeline expects, so the cast in EventProcessor.process() fails. Is there a way to construct the TableRow in my unit test so that its nested values are java.util.LinkedHashMap, letting the test pass without making any changes to the existing dataflow code that already works in production?
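For reference, one workaround I've been experimenting with but haven't committed to (a sketch, with a made-up JSON literal standing in for my test data): Beam's TableRowJsonCoder parses rows with Jackson's ObjectMapper rather than Gson, and Jackson's default representation for untyped nested JSON objects is java.util.LinkedHashMap, which would satisfy the cast in the pipeline.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.api.services.bigquery.model.TableRow;
import java.util.LinkedHashMap;

public class JacksonRows {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        String json = "{\"evt\":{\"type\":\"click\"},\"profile\":{\"id\":42}}";
        // TableRow implements Map<String, Object>, so Jackson can populate it directly;
        // nested JSON objects become java.util.LinkedHashMap instances
        TableRow row = mapper.readValue(json, TableRow.class);
        System.out.println(row.get("evt") instanceof LinkedHashMap);
    }
}
```

I'd prefer an answer that keeps Gson in the test if one exists, since the rest of our test fixtures already use it.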