I'm using Flink StateFun 3.1.0, and want to call a remote Python function from an embedded Java function. In StateFun 2.2, I could create a ProtoBuf Any from an instance of my (generated via Protobuf) Java class via Any.pack(msg), and that worked. Now I get

Caused by: java.lang.ClassCastException: class com.google.protobuf.Any cannot be cast to class org.apache.flink.statefun.sdk.reqreply.generated.TypedValue (com.google.protobuf.Any and org.apache.flink.statefun.sdk.reqreply.generated.TypedValue are in unnamed module of loader 'app')

There are plenty of examples of calling between embedded Java functions, and remote Java functions, and remote Python functions, but I haven't found an example of calling remote Python from embedded Java.

In the remote Java examples, the function is called with a Context that has a send(Message) method, which I assume would work for calling a remote Python function as well, but my embedded Java function is passed a different type of Context that does not support this method.

kkrugler

1 Answer

Background

Indeed, one of the explicit goals of the 3.x release was to make remote functions easier and more ergonomic to use; you can read more about the motivation for that here.

To help achieve this goal, we removed Protobuf from the SDK surface (we still use it internally) and came up with the Message and types abstractions.

Since the embedded SDK is considered to be more of a power-user SDK, you still use Protobuf there for communicating with remote functions. This is something we would like to simplify as well, but there are some technical issues.

TypedValue

This is a Protobuf message that is used internally, whenever a remote function invokes another remote function.

message TypedValue {
    string typename = 1;
    // has_value is set to differentiate a zero length value bytes explicitly set,
    // or a non existing value.
    bool has_value = 2;
    bytes value = 3;
}
  • The typename field is a string of the form <namespace>/<name> that is defined by the user, for example "com.kkrugler.types/MyEvent". StateFun passes that string as-is and treats it as an opaque type tag, to be interpreted by the user on the remote SDK side.

  • has_value must always be set to true in this context (it is considered an error to send a message without a payload).

  • value holds the serialised bytes of a value of the type typename; StateFun treats them as opaque.
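
To make the three fields concrete, here is a minimal Python sketch that mirrors the shape of TypedValue with a plain dataclass. TypedValueSketch and make_typed_value are hypothetical names used only for illustration; they are not part of any StateFun SDK, and the format check is an assumption about what a sensible typename looks like, not the SDK's own validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedValueSketch:
    """Hypothetical stand-in mirroring the three TypedValue fields."""
    typename: str
    has_value: bool
    value: bytes


def make_typed_value(typename: str, payload: bytes) -> TypedValueSketch:
    """Build a value the way the bullets above describe.

    Assumption: the typename looks like "<namespace>/<name>" with both
    parts non-empty. This check is illustrative only.
    """
    namespace, sep, name = typename.rpartition("/")
    if not sep or not namespace or not name:
        raise ValueError(f"typename must be '<namespace>/<name>', got {typename!r}")
    # has_value is always True when sending: an empty payload (b"") is still
    # a value, which is what the flag distinguishes from "no value at all".
    return TypedValueSketch(typename=typename, has_value=True, value=payload)


event = make_typed_value("com.kkrugler.types/MyEvent", b"\x08\x01")
empty = make_typed_value("com.kkrugler.types/MyEvent", b"")
```

Note that the sketch never inspects the payload bytes: like StateFun itself, it only carries them alongside the type tag.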

Embedded Functions

An embedded function that needs to invoke a remote function must invoke it with a TypedValue instance. The remote SDKs know how to convert a TypedValue into a Message class.

Drop in replacement for Any

Here is a useful method for wrapping a Protobuf message with a TypedValue:

// Message here is com.google.protobuf.Message.
// TypedValue must be the unshaded class (see the comments below).
public static <M extends Message> TypedValue pack(M message) {
  return TypedValue.newBuilder()
      .setTypename("type.googleapis.com/" + message.getDescriptorForType().getFullName())
      .setHasValue(true)
      .setValue(message.toByteString())
      .build();
}

Then use it like this:

MyProtobufEvent myProtobufEvent = MyProtobufEvent.newBuilder() ... build();

context.send(.., pack(myProtobufEvent));

On the remote SDK side (assuming Python, since you've mentioned that):

from statefun import make_protobuf_type

MyProtobufEventType = make_protobuf_type(MyProtobufEvent)

...
myProtobufEvent = message.as_type(MyProtobufEventType)
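
Because StateFun treats the typename as an opaque tag, the receiving side can only decode a payload whose tag matches a type it knows about. The sketch below imitates that check using only the standard library. DECODERS, decode, the full name example.MyProtobufEvent, and the UTF-8 payload are all hypothetical stand-ins; in the real Python SDK the type object created by make_protobuf_type plays this role.

```python
from typing import Callable, Dict

# Hypothetical registry: typename tag -> function that decodes the raw bytes.
DECODERS: Dict[str, Callable[[bytes], object]] = {
    "type.googleapis.com/example.MyProtobufEvent": lambda raw: raw.decode("utf-8"),
}


def decode(typename: str, raw: bytes) -> object:
    """Decode raw bytes only if the sender's typename tag is registered."""
    try:
        decoder = DECODERS[typename]
    except KeyError:
        raise TypeError(f"no decoder registered for typename {typename!r}") from None
    return decoder(raw)


decoded = decode("type.googleapis.com/example.MyProtobufEvent", b"hello")
```

The practical consequence is that the string built on the Java side in pack has to match, character for character, the typename under which the Python type is declared.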
Igal
  • Hi Igal - thanks for the detailed response. I've run into one issue, where the TypedValue.setValue() method is expecting an argument of type `org.apache.flink.statefun.sdk.shaded.com.google.protobuf.ByteString`, versus the `com.google.protobuf.ByteString` I'm providing. Wondering if this is a bug in the API? – kkrugler Dec 17 '21 at 01:14
  • I worked around the above by creating a shaded ByteString and using that with `setValue()`, but then I get a run-time error: ` java.lang.IncompatibleClassChangeError: Class org.apache.flink.statefun.sdk.reqreply.generated.ToFunction does not implement the requested interface com.google.protobuf.Message` Wondering if it's implementing the shaded Message interface. – kkrugler Dec 17 '21 at 01:25
  • @kkrugler You need to use the unshaded, regular version of TypedValue. You can either build it yourself from the protobuf definition (you can find the protos in the repo) or you can add a dependency on statefun-flink-core (in scope provided). Let me know if this works, I can update my answer accordingly. – Igal Dec 17 '21 at 15:04
  • The problem is that the statefun-sdk-java jar also has a TypedValue, in the same package. But that TypedValue is using the shaded classes, and it's on the classpath before the TypedValue in the statefun-flink-core jar. Maybe that's the bug? I'll look at building TypedValue myself. – kkrugler Dec 17 '21 at 22:15
  • Hi @Igal - using my own generated TypedValue worked, I was able to call the remote Python function from my embedded Java function, thanks! I think maybe the internal-usage TypedValue (with shaded Protobuf refs) could be renamed (InternalTypedValue?) to avoid collision with the external-usage TypedValue. – kkrugler Dec 17 '21 at 23:28
  • This is a total nightmare. I had to regenerate and replace (and also exclude from the shaded jar) around 20 class files from "statefun-sdk-java". It's not only about `.sdk.reqreply.TypedValue`, it's the whole of `.sdk.reqreply.*` + most of `.sdk.java.slice.*` + some classes in `.sdk.java.*` and `.sdk.java.message.*`. The latest release is `3.2.0` and the problem is still there – mangusta Jan 19 '23 at 20:03