
I am running a delete query with < (less than) and > (greater than) conditions on a timestamp field, but I am not getting the desired results.

First, I ran the select query below on Databricks:

Query: select * from edl.defectTesting_2_0_0 where time > '2022-03-31T08:44:00.000+0000' and time < '2022-03-31T08:47:00.000+0000';

Result: I got 2 records in the response [Expected]

But when I ran the delete query using Spark, I did not get the desired result.

Query: spark.sql("delete from edl.defectTesting_2_0_0 where time > '2022-03-31T08:44:00.000+0000' and time < '2022-03-31T08:47:00.000+0000'")

Result: No records deleted [NOT EXPECTED]

Ideally 2 records should have been deleted but no records were deleted.

Does anyone know whether Databricks supports less-than and greater-than conditions on timestamps? And can someone please explain why I am facing the above issue?

I am using the Spark library below:

Library: import org.apache.spark.sql.{DataFrame, SparkSession}

spark = SparkSession.builder.appName("dataDeletionProcessor").getOrCreate()

Query: spark.sql("delete from delta.edl.defectTesting_2_0_0 where time > '2022-03-31T08:44:00.000+0000' and time < '2022-03-31T08:47:00.000+0000'")
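
The queries above name the table in two different ways, so one way to rule out a mismatch is to build the check and the delete from a single table name and a single predicate. The following is only a minimal sketch in Python; the helper name `build_statements` is hypothetical, and it only builds the SQL strings, so it does not need a Spark session to run.

```python
from typing import Tuple


def build_statements(table: str, start: str, end: str) -> Tuple[str, str]:
    """Return (count_sql, delete_sql) that share one table name and one predicate.

    build_statements is a hypothetical helper, not part of any Spark API.
    """
    predicate = f"time > '{start}' and time < '{end}'"
    count_sql = f"select count(*) from {table} where {predicate}"
    delete_sql = f"delete from {table} where {predicate}"
    return count_sql, delete_sql


count_sql, delete_sql = build_statements(
    "edl.defectTesting_2_0_0",
    "2022-03-31T08:44:00.000+0000",
    "2022-03-31T08:47:00.000+0000",
)
print(count_sql)
print(delete_sql)
```

With a live session, both strings could be passed to spark.sql from the same job, the count first and the delete second, so that the two results are directly comparable.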

  • Are you sure there is data in that time range (you didn't by chance delete it manually, so that it is no longer there)? Or are these different tables (the names are different)? – Mr R May 13 '22 at 22:59
  • I have checked and the data is there. I first ran the select query on Databricks and records were present in that time range; I then ran the delete query via the delete job, which uses Spark, and it didn't delete anything. The issue seems to happen when, within the range, the date part is the same and the time part is different. I guess it's not considering the time part. Can you please try reproducing this and check? – AISHWARYA GUPTA May 14 '22 at 07:07
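
The guess in the last comment can be illustrated without Spark. The sketch below uses only Python's standard library and is not a statement about what Spark or Delta actually does: it shows that if only the date part of each value were compared, a strict range whose two bounds fall on the same day could never match, which would be consistent with zero rows being deleted. The sample row value is hypothetical.

```python
from datetime import datetime

# Format of the literals used in the question, e.g. 2022-03-31T08:44:00.000+0000
FMT = "%Y-%m-%dT%H:%M:%S.%f%z"

lower = datetime.strptime("2022-03-31T08:44:00.000+0000", FMT)
upper = datetime.strptime("2022-03-31T08:47:00.000+0000", FMT)

# Hypothetical row timestamp that lies inside the range.
row = datetime.strptime("2022-03-31T08:45:30.000+0000", FMT)


def in_range_full(ts: datetime) -> bool:
    """Strict range check on the full timestamp (date and time)."""
    return lower < ts < upper


def in_range_date_only(ts: datetime) -> bool:
    """Strict range check if only the date part were compared."""
    return lower.date() < ts.date() < upper.date()


print(in_range_full(row))       # full comparison
print(in_range_date_only(row))  # date-only comparison
```

Here the full comparison matches the row while the date-only comparison does not, because both bounds reduce to 2022-03-31 and no date is strictly between a day and itself.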
