I am trying to use FinishBundle() to batch requests in Apache Beam (Go SDK) running on Dataflow. The requests fetch information and emit it for further processing downstream in the pipeline, along these lines:
type BatchRpcFn struct {
	client        RpcClient
	bufferRequest *RpcRequest
}
func (f *BatchRpcFn) Setup(ctx context.Context) {
	// setup client
}
// ProcessElement buffers each ID and flushes once the buffer exceeds bufferLimit.
func (f *BatchRpcFn) ProcessElement(ctx context.Context, id string, emit func(string, bool)) error {
	f.bufferRequest.Ids = append(f.bufferRequest.Ids, id)
	if len(f.bufferRequest.Ids) > bufferLimit {
		return f.performRequestAndEmit(ctx, emit)
	}
	return nil
}
// FinishBundle flushes whatever is still buffered when the bundle ends.
func (f *BatchRpcFn) FinishBundle(ctx context.Context, emit func(string, bool)) error {
	return f.performRequestAndEmit(ctx, emit)
}
In unit tests this function works as expected. When running on Dataflow, however, I get this error:
panic: interface conversion: typex.Window is window.GlobalWindow, not window.IntervalWindow
//...
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*intervalWindowEncoder).EncodeSingle()
The documentation on FinishBundle() is a little sparse, so it wasn't clear to me whether this is possible. Most of the examples I see of FinishBundle() flush data to some sink rather than adding to the resulting PCollection.
Is this a bug, or am I using FinishBundle incorrectly here?