How can I check whether a query against an Iceberg table is pruning partitions correctly? Is there a way to get an explain plan for an Iceberg table that shows which partitions are accessed?
Example: I have created an Iceberg table partitioned by month(tpep_pickup_datetime).
The query I'm running from Spark is:
df = spark.sql("select * from iceberg.nyc_yellowtaxi_tripdata_v2 where tpep_pickup_datetime = '2022-01-01 00:35:40' ")
I just want to confirm that partitioning is working: which partition was accessed, and whether a full table scan occurred.
I have tried running df.explain(), but it does not show any partition information for the filters I added.
Spark output:
== Physical Plan ==
*(1) Filter (isnotnull(tpep_pickup_datetime#217) AND (tpep_pickup_datetime#217 = 2022-01-01 00:35:40))
+- *(1) ColumnarToRow
   +- BatchScan[vendorid#216L, tpep_pickup_datetime#217, tpep_dropoff_datetime#218, passenger_count#219, trip_distance#220, ratecodeid#221, store_and_fwd_flag#222, pulocationid#223L, dolocationid#224L, payment_type#225L, fare_amount#226, extra#227, mta_tax#228, tip_amount#229, tolls_amount#230, improvement_surcharge#231, total_amount#232, congestion_surcharge#233, airport_fee#234] iceberg.nyc_yellowtaxi_tripdata_v2 [filters=tpep_pickup_datetime IS NOT NULL, tpep_pickup_datetime = 1640997340000000] RuntimeFilters: []
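For reference, the literal in the pushed-down filter above (1640997340000000) looks like microseconds since the Unix epoch. The sketch below is plain Python, not Spark or Iceberg code. It decodes that literal and works out which month partition the row should fall into, assuming the Iceberg month transform counts months since 1970-01 and that the session time zone is UTC. The helper names micros_to_utc and iceberg_month_ordinal are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime (UTC is an assumption about the session time zone).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_utc(micros: int) -> datetime:
    """Convert microseconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


def iceberg_month_ordinal(ts: datetime) -> int:
    """Months elapsed since 1970-01, which is how the month transform is assumed to number partitions."""
    return (ts.year - 1970) * 12 + (ts.month - 1)


# Literal copied from the [filters=...] section of the physical plan.
pushed_literal = 1640997340000000

ts = micros_to_utc(pushed_literal)
month_ordinal = iceberg_month_ordinal(ts)

print(ts.isoformat())         # decoded timestamp of the filter literal
print(month_ordinal)          # month partition ordinal (months since 1970-01)
print(ts.strftime("%Y-%m"))   # human-readable month the row belongs to
```

If the decoded timestamp matches the value in the WHERE clause, the predicate did reach the Iceberg scan, and only files in the corresponding month partition should need to be read. To see which files actually exist per partition, Iceberg exposes metadata tables such as files and partitions. Whether they can be queried under this exact table name depends on the catalog setup, which is not shown here.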