
I am doing the following in my Databricks code. The code tries to delete some rows from my table under conditions which seem simple to me.

from datetime import datetime, timedelta

today = datetime.today()
yesterday = today - timedelta(days = int(look_back_days))
yesterday_str = yesterday.strftime('%Y/%m/%d')

query = f"""DELETE FROM table_name a WHERE  AWB_ID  IN (SELECT * FROM a WHERE a.DW_CRT_RUN_DT >= '{yesterday_str}' or a.dw_upd_run_dt >= '{yesterday_str}' and a.awb_Stus_cd = 'VOID' and a.dw_upd_pgm_id = 'DSPD0580')"""

spark.sql(query)

However, I got the following error.

AnalysisException: Multi-column In predicates are not supported in the DELETE condition

I also tried what @kombajn-zbożowy suggested below, but it still throws the same error.


Ali Saberi

2 Answers


The issue is in SELECT *: the IN predicate expects a single column of AWB_ID values, but the subquery returns all columns. Use:

DELETE FROM table_name a WHERE AWB_ID IN (SELECT AWB_ID FROM a WHERE ...)
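Put together with the original code, the full statement might look like the sketch below. The table and column names come from the question, `look_back_days` is assumed to be defined elsewhere in the notebook, and the parentheses around the OR are an assumption about the intended logic, since AND binds tighter than OR in SQL.

```python
from datetime import datetime, timedelta

look_back_days = 1  # hypothetical value; defined elsewhere in the notebook

yesterday = datetime.today() - timedelta(days=int(look_back_days))
yesterday_str = yesterday.strftime('%Y/%m/%d')

# The subquery returns only AWB_ID, and reads from the table itself
# rather than the outer alias. The parentheses group the two date
# checks, on the assumption that either date being recent should
# qualify a row.
query = f"""
DELETE FROM table_name
WHERE AWB_ID IN (
    SELECT AWB_ID FROM table_name
    WHERE (DW_CRT_RUN_DT >= '{yesterday_str}' OR dw_upd_run_dt >= '{yesterday_str}')
      AND awb_Stus_cd = 'VOID'
      AND dw_upd_pgm_id = 'DSPD0580'
)
"""

# spark.sql(query)  # execute in Databricks
```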
Kombajn zbożowy

DELETE FROM abc WHERE field IN (SELECT field FROM def);

If you continue to get exceptions, run the subquery on its own. You may get something like:

Error in SQL statement: AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name 'field' cannot be resolved. Did you mean one of the following?

If so, you are likely referencing a column that does not exist in the table.
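To see the DELETE ... IN pattern end to end outside Databricks, here is a minimal local stand-in using Python's built-in sqlite3 (table and column names are made up for illustration). It shows that IN works when the subquery yields exactly the one column being compared:

```python
import sqlite3

# In-memory database with two toy tables (names are illustrative only)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE abc (field INTEGER, payload TEXT)")
conn.execute("CREATE TABLE def_tbl (field INTEGER)")

conn.executemany("INSERT INTO abc VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])
conn.executemany("INSERT INTO def_tbl VALUES (?)", [(2,), (3,)])

# Single-column subquery: deletes rows 2 and 3 from abc
conn.execute("DELETE FROM abc WHERE field IN (SELECT field FROM def_tbl)")

remaining = [row[0] for row in conn.execute("SELECT field FROM abc")]
print(remaining)  # [1]
```

Databricks SQL enforces the same shape for DELETE: per the error in the question, the subquery feeding IN must produce a single column, not multiple.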

Kyle Winkelman