I'd like to test some streams I've created by using execute_and_collect
instead of a JDBC sink. The sink successfully converts each Row and
inserts the data into a DB, but execute_and_collect
fails with:
AttributeError: 'bytearray' object has no attribute 'timestamp'
The error is raised in pyflink.datastream.utils.pickled_bytes_to_python_converter,
reached via execute_and_collect -> CloseableIterator -> next -> convert_to_python_obj.
It is indeed caused by the unpickled object being a bytearray rather than a datetime object with a .timestamp() method.
However, as the MWE below shows, I am creating datetime objects in the source (which, in the real application, is a proper stream in a larger graph).
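The AttributeError itself is easy to reproduce in plain Python. This is only an illustration of what the converter effectively ends up doing, not PyFlink's actual code:

```python
# Illustration only: calling .timestamp() on a bytearray raises the exact
# AttributeError seen above, because the pickled bytes were never
# deserialized into a datetime before the conversion step.
raw = bytearray(b"pickled-timestamp-bytes")
try:
    raw.timestamp()
except AttributeError as err:
    print(err)  # 'bytearray' object has no attribute 'timestamp'
```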
Before assuming this is a bug, I'd like to know whether I'm doing something wrong. I'm quite new to Flink in general, but this seems basic. Here's the MWE:
from datetime import datetime

from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()
env.set_parallelism(1)

# A single-row, single-field source holding a datetime object.
field_names = ["created_at"]
collection = [(datetime.now(),)]
field_types = [Types.SQL_TIMESTAMP()]
types = Types.ROW_NAMED(field_names=field_names, field_types=field_types)
stream = env.from_collection(collection=collection, type_info=types)

items = stream.execute_and_collect()
print(list(items))  # Failure here
items.close()