The goal of this query is to delete the records that have the same keys of the new dataframe.
This is the query that I'm executing:
val op=spark.sql(s"""
DELETE
FROM TABLE1 AS t
WHERE EXISTS (
SELECT 1
FROM TABLE2 AS s
WHERE t.DAY=s.DAY
AND t.DATA_STREAM=s.DATA_STREAM
)""")
The error that I'm getting is:
AnalysisException: nondeterministic expressions are only allowed in
Project, Filter, Aggregate, Window, or Generate, but found:
exists(t.DAY, t.DATA_STREAM)
in operator DeleteCommandEdge com.databricks.sql.transaction.tahoe.DeltaLog@49e1c8fd, exists#107963 [DAY#108138 && DATA_STREAM#108193]
;
DeleteCommandEdge com.databricks.sql.transaction.tahoe.DeltaLog@49e1c8fd, exists#107963 [DAY#108138 && DATA_STREAM#108193]```
As far as I know exists
is deterministic. Is there another way to achieve the same result with another query?
The query is running on a Databricks cluster with Spark 3.2.1