
I got the following error in Google Cloud Dataflow:

java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.coders.CoderException: cannot encode a null String
    at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:162)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:287)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext.output(DoFnRunnerBase.java:449)
    at reports.transforms.JsonToObject.processElement(JsonToObject.java:35)

Caused by: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.cloud.dataflow.sdk.coders.CoderException: cannot encode a null String
    at com.google.cloud.dataflow.sdk.util.UserCodeException.wrap(UserCodeException.java:35)
    at com.google.cloud.dataflow.sdk.util.UserCodeException.wrapIf(UserCodeException.java:40)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.wrapUserCodeException(DoFnRunnerBase.java:368)
    at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:51)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
    at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:190)
    at com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
    at com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
    at com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
    at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:160)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:287)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext.output(DoFnRunnerBase.java:449)
    at reports.transforms.JsonToObject.processElement(JsonToObject.java:35)
    at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
    at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)

In my class (JsonToObject) I do the following:

if (obj != null) { processContext.output(obj); }

And that is where the exception is thrown.

Any idea why this happens?

Ran
  • It looks like your coder is likely a composite coder, and your object has a null string somewhere in it. How are you setting the coder? Also, if you are building up the coder yourself, NullableCoder may be useful – danielm Jun 16 '16 at 18:23
  • I use the default coder; I don't set it myself. But yes, my object has a String member that is nullable. Why is that a problem? – Ran Jun 19 '16 at 08:07
  • Can NullableCoder not be set as the default coder? I get the following error: java.lang.IllegalArgumentException: Class com.google.cloud.dataflow.sdk.coders.NullableCoder is missing required static method of(TypeDescriptor). – Ran Jun 22 '16 at 12:34

1 Answer


NullableCoder is a composite coder, which means it must be specified in terms of another coder. @DefaultCoder is incompatible with composite coders (KvCoder, IterableCoder, ...) because they have to be parameterized by another coder. One way to solve your problem is to manually set the coder on each PCollection that may contain nullable types. For example:

PCollection<String> pc = pipeline.apply(... transform that produces nulls ...);
pc.setCoder(NullableCoder.of(StringUtf8Coder.of()));
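To see why wrapping the coder helps, here is a minimal sketch of the general idea behind a nullable wrapper. It is written with plain java.io and is not the Dataflow SDK's implementation. The class NullableStringCodecSketch and its methods are hypothetical names made up for illustration, and the one-byte presence marker is an assumption about how such a wrapper can work, not a statement about the SDK's wire format. The strict encoder rejects null the way the error above describes, while the wrapper records whether a value is present before delegating.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Illustrative only: a hypothetical stand-in, not the Dataflow SDK's NullableCoder. */
public class NullableStringCodecSketch {

    // Mimics a strict string coder that refuses to encode null.
    static void encodeStrict(String value, DataOutputStream out) throws IOException {
        if (value == null) {
            throw new IOException("cannot encode a null String");
        }
        out.writeUTF(value);
    }

    // Nullable wrapper: write a one-byte presence marker, then delegate
    // to the strict encoder only when a value is actually present.
    static void encodeNullable(String value, DataOutputStream out) throws IOException {
        if (value == null) {
            out.writeByte(0);
        } else {
            out.writeByte(1);
            encodeStrict(value, out);
        }
    }

    // Reads the presence marker first and only decodes a string if one was written.
    static String decodeNullable(DataInputStream in) throws IOException {
        return in.readByte() == 0 ? null : in.readUTF();
    }

    // Encodes a value to bytes and decodes it again.
    static String roundTrip(String value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        encodeNullable(value, new DataOutputStream(bytes));
        return decodeNullable(
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello")); // prints "hello"
        System.out.println(roundTrip(null));    // prints "null"
    }
}
```

Calling encodeStrict with null fails in the same way as the stack trace in the question, whereas roundTrip(null) succeeds because the null never reaches the strict encoder.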
Lukasz Cwik