How can I check whether a query against an Iceberg table is pruning partitions correctly? Is there a way to get an explain plan for an Iceberg table that shows which partitions are read?

Example: I created an Iceberg table partitioned by month(tpep_pickup_datetime).

The query I'm running from Spark is:

df = spark.sql("select * from iceberg.nyc_yellowtaxi_tripdata_v2 where tpep_pickup_datetime = '2022-01-01 00:35:40'")

I want to confirm that partitioning is working: which partition was accessed, and whether a full table scan happened. I tried running df.explain(), but the output does not show any partition information for the filter I added.

    Spark Running
== Physical Plan ==
*(1) Filter (isnotnull(tpep_pickup_datetime#217) AND (tpep_pickup_datetime#217 = 2022-01-01 00:35:40))
+- *(1) ColumnarToRow
   +- BatchScan[vendorid#216L, tpep_pickup_datetime#217, tpep_dropoff_datetime#218, passenger_count#219, trip_distance#220, ratecodeid#221, store_and_fwd_flag#222, pulocationid#223L, dolocationid#224L, payment_type#225L, fare_amount#226, extra#227, mta_tax#228, tip_amount#229, tolls_amount#230, improvement_surcharge#231, total_amount#232, congestion_surcharge#233, airport_fee#234] iceberg.nyc_yellowtaxi_tripdata_v2 [filters=tpep_pickup_datetime IS NOT NULL, tpep_pickup_datetime = 1640997340000000] RuntimeFilters: []
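For reference, the literal in the `filters=` part of the BatchScan line above (1640997340000000) is the timestamp expressed as microseconds since the Unix epoch. The sketch below decodes it and computes the value that Iceberg's month transform (months counted from 1970-01) would assign to it, which is the partition one would expect the scan to touch. The helper names are hypothetical, and it assumes the Spark session time zone is UTC; it does not inspect the table itself.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch in UTC; the pushed-down literal counts microseconds from here.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_utc(micros: int) -> datetime:
    """Convert microseconds since the epoch to a UTC datetime (hypothetical helper)."""
    return EPOCH + timedelta(microseconds=micros)


def iceberg_month_ordinal(ts: datetime) -> int:
    """Months elapsed since 1970-01, matching the Iceberg month transform
    for timestamps on or after the epoch (hypothetical helper)."""
    return (ts.year - 1970) * 12 + (ts.month - 1)


# Literal copied from the BatchScan filters in the plan above.
pushed_literal = 1640997340000000

ts = micros_to_utc(pushed_literal)
print(ts.isoformat())             # 2022-01-01T00:35:40+00:00
print(iceberg_month_ordinal(ts))  # 624, i.e. the 2022-01 partition
```

If the table's metadata tables are reachable from the session (an assumption about how the catalog is configured), this value could be compared with the `partition` column of `iceberg.nyc_yellowtaxi_tripdata_v2.files` to see how many data files belong to that month.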