When I call a Python UDF inside MATCH_RECOGNIZE in a SQL query, I get the error "Python Function can not be used in MATCH_RECOGNIZE for now."
For example, the following is not supported:
SELECT T.aa AS ta
FROM MyTable
MATCH_RECOGNIZE (
    ORDER BY proctime
    MEASURES
        A.a AS aa,
        pyFunc(1,2) AS bb
    PATTERN (A B)
    DEFINE
        A AS a = 1,
        B AS b = 'b'
) AS T
This raises a few questions:
What would it take for the Blink planner to support Python functions in MATCH_RECOGNIZE?
Where in the documentation could I find this kind of limitation? The docs for this feature don't mention Python. Am I expected to dig through the validation tests to discover it?
(main question) Is a user-defined table aggregate function written in Python the best alternative to MATCH_RECOGNIZE? I only want to find two events in sequence, within a one-hour window. I know I can do this with a self-join, but I'd like to know whether there is a more efficient or cleaner option.
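To make the requirement concrete, here is a minimal plain-Python sketch of the matching logic I have in mind. It is not Flink code. The function name, the `ts` field and the sample data are made up for illustration, and I am assuming that "in sequence" means two adjacent rows, as in PATTERN (A B) above.

```python
from datetime import datetime, timedelta


def find_sequential_pairs(events, max_gap=timedelta(hours=1)):
    """Return (first, second) pairs of adjacent events where the first
    satisfies A (a == 1), the second satisfies B (b == 'b'), and the two
    are at most `max_gap` apart. Hypothetical helper, not a Flink API."""
    # Order by timestamp, mirroring ORDER BY in MATCH_RECOGNIZE.
    ordered = sorted(events, key=lambda e: e["ts"])
    matches = []
    # Compare each event with the one immediately after it.
    for first, second in zip(ordered, ordered[1:]):
        is_a = first["a"] == 1
        is_b = second["b"] == "b"
        if is_a and is_b and second["ts"] - first["ts"] <= max_gap:
            matches.append((first, second))
    return matches


# Made-up sample data: the first pair is 30 minutes apart, the second 2 hours.
t0 = datetime(2021, 1, 1, 10, 0)
events = [
    {"ts": t0, "a": 1, "b": "x"},
    {"ts": t0 + timedelta(minutes=30), "a": 2, "b": "b"},
    {"ts": t0 + timedelta(hours=3), "a": 1, "b": "x"},
    {"ts": t0 + timedelta(hours=5), "a": 2, "b": "b"},
]
matches = find_sequential_pairs(events)
```

With this sample data only the first two events should form a match, since the second candidate pair is more than an hour apart. This is the behaviour I would like to reproduce on a stream, whether through a table aggregate function or something else.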