I'd like to test some streams I've created using execute_and_collect instead of a JDBC sink. The JDBC sink successfully converts each Row and inserts the data into the DB, but execute_and_collect fails with:

AttributeError: 'bytearray' object has no attribute 'timestamp'

The error is raised in pyflink.datastream.utils:pickled_bytes_to_python_converter, reached via execute_and_collect -> CloseableIterator -> next -> convert_to_python_obj, and it is indeed caused by the unpickled object being a bytearray rather than a datetime object with a .timestamp() method. However, as you'll see in the MWE below, I'm creating datetime objects in the source (which in the real application is a proper stream in a larger graph). Two quick variants I'd use to narrow this down are sketched after the MWE.

Before assuming this is a bug, I'd like to know if I'm doing something wrong. I'm quite new to Flink in general, but this seems basic. Here's the MWE:

from datetime import datetime

from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()
env.set_parallelism(1)

field_names = ("created_at",)
collection = [(datetime.now(),)]
field_types = [Types.SQL_TIMESTAMP()]

# Row type with a single SQL_TIMESTAMP field
types = Types.ROW_NAMED(field_names=field_names, field_types=field_types)
stream = env.from_collection(collection=collection, type_info=types)

items = stream.execute_and_collect()
print(list(items))  # Failure here
items.close()
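
As a sanity check, I'd expect a variant of the MWE that drops the type_info (so from_collection falls back to pickled Python objects) to collect cleanly, since nothing would need converting on the collect side. Sketch only, I haven't verified this on every version:

# Variant of the MWE above: no ROW_NAMED/SQL_TIMESTAMP type info, so the
# datetime is carried as a pickled Python object and should come back as-is.
plain_stream = env.from_collection(collection)

plain_items = plain_stream.execute_and_collect()
print(list(plain_items))
plain_items.close()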
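
And if the SQL_TIMESTAMP conversion on the collect side really is the problem, the debugging workaround I'd reach for is stringifying the field in a map before collecting, so that only plain strings hit the converter. Again just a sketch, assuming the datetime reaches the map function intact (debug_stream is only an illustrative name):

# Hypothetical workaround: turn the timestamp into a string before collecting.
debug_stream = stream.map(lambda row: str(row[0]), output_type=Types.STRING())

debug_items = debug_stream.execute_and_collect()
print(list(debug_items))
debug_items.close()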