I am trying to add a Koalas DataFrame to a Featuretools EntitySet. Here are the DataFrame's schema and the code I am using:

subset_kdf_fp_eta_gt_prd.spark.print_schema()
root
 |-- booking_code: string (nullable = true)
 |-- order_id: string (nullable = true)
 |-- restaurant_id: string (nullable = true)
 |-- country_id: long (nullable = true)
 |-- inferred_prep_time: long (nullable = true)
 |-- inferred_wait_time: long (nullable = true)
 |-- is_integrated_model: integer (nullable = true)
 |-- sub_total: double (nullable = true)
 |-- total_quantity: integer (nullable = true)
 |-- dish_name: string (nullable = true)
 |-- sub_total_in_sgd: double (nullable = true)
 |-- city_id: long (nullable = true)
 |-- hour: integer (nullable = true)
 |-- weekday: integer (nullable = true)
 |-- request_time_epoch_utc: timestamp (nullable = true)
 |-- year: string (nullable = true)
 |-- month: string (nullable = true)
 |-- day: string (nullable = true)
 |-- is_takeaway: string (nullable = false)
 |-- is_scheduled: string (nullable = false)

import featuretools as ft
from woodwork.logical_types import Categorical, Double, Integer, NaturalLanguage, Datetime, Boolean

es = ft.EntitySet(id="koalas_es")

es.add_dataframe(dataframe_name="fp_eta_gt_prd",
                              dataframe=subset_kdf_fp_eta_gt_prd,
                              index="order_id",
                              time_index="request_time_epoch_utc",
                              already_sorted=False,
                              logical_types={
                                  "booking_code": Categorical,
                                  "order_id": Categorical,
                                  "restaurant_id": Categorical,
                                  "country_id": Double,
                                  "inferred_prep_time": Double,
                                  "inferred_wait_time": Double,
                                  "is_integrated_model": Categorical,
                                  "sub_total": Double,
                                  "total_quantity": Integer,
                                  "dish_name": NaturalLanguage,
                                  "sub_total_in_sgd": Double,
                                  "city_id": Categorical,
                                  "hour": Categorical,
                                  "weekday": Categorical,
                                  "request_time_epoch_utc": Datetime,
                                  "year": Categorical,
                                  "month": Categorical,
                                  "day": Categorical,
                                  "is_takeaway": Categorical,
                                  "is_scheduled": Categorical,
                              })

When I run this, I get the error "Index names must be exactly matched currently." I have double-checked the field names, the uniqueness of the index, and so on, but I am not sure what is causing the error.
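
For reference, here is a minimal sketch of the kind of name and uniqueness checks I mean, written in plain Python so it runs without Spark. The helper check_names_and_index and the sample values are hypothetical and are not part of Featuretools or Koalas; on the real data it would presumably be fed list(subset_kdf_fp_eta_gt_prd.columns) and the collected order_id values.

```python
from collections import Counter


def check_names_and_index(columns, logical_types, index, index_values):
    """Report name mismatches and index problems (hypothetical helper).

    columns       -- column names of the DataFrame
    logical_types -- the dict passed as logical_types=
    index         -- the column name passed as index=
    index_values  -- the values of the index column (None means null)
    """
    columns = list(columns)
    values = list(index_values)

    # Keys in logical_types that do not exist as columns (typos, stale names).
    typed_but_missing = sorted(set(logical_types) - set(columns))
    # Columns that were given no logical type (not an error, but worth knowing).
    columns_without_type = sorted(set(columns) - set(logical_types))

    # An index column should contain no nulls and no repeated values.
    null_index_count = sum(1 for v in values if v is None)
    counts = Counter(v for v in values if v is not None)
    duplicate_index_values = sorted((v for v, n in counts.items() if n > 1), key=str)

    return {
        "index_in_columns": index in columns,
        "typed_but_missing": typed_but_missing,
        "columns_without_type": columns_without_type,
        "null_index_count": null_index_count,
        "duplicate_index_values": duplicate_index_values,
    }


# Hypothetical sample standing in for the real data.
report = check_names_and_index(
    columns=["order_id", "booking_code", "sub_total"],
    logical_types={
        "order_id": "Categorical",
        "booking_code": "Categorical",
        "sub_total": "Double",
    },
    index="order_id",
    index_values=["a1", "a2", "a3"],
)
print(report)
```

A report with index_in_columns set to True, empty lists, and a null count of zero is what I mean by the names and the index having been checked; it says nothing about how Koalas aligns indexes internally.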

Mohit Jain