I have a Koalas DataFrame that is the result of merging two other DataFrames. Four of its columns are overwritten with the maximum value of their group on the specified key. It also gets a new column whose value is 1 if another column is null and 0 otherwise.
t0 = time.time()
NauticalData = ShipDatekeyTs_Calendar.merge(
    derfact_nautical_ts5, on=["ShipId", "DateKey", "ts180"], how="left"
)
NauticalData = NauticalData.assign(
    SOG=(NauticalData.groupby(["key_x"], as_index=False)["SOG"].max())["SOG"],
    latitude=(NauticalData.groupby(["key_x"], as_index=False)["longitude"].max())[
        "longitude"
    ],
    longitude=(NauticalData.groupby(["key_x"], as_index=False)["longitude"].max())[
        "longitude"
    ],
    Heading=(NauticalData.groupby(["key_x"], as_index=False)["Heading"].max())[
        "Heading"
    ],
)
NauticalData = NauticalData.assign(
    SOG_IsNull=np.where((NauticalData["SOG"].to_numpy()).isnull(), 1, 0)
)
t1 = time.time()
print(str(t1 - t0) + " CREATE TABLE #NauticalData")
But it gives me this error:
AnalysisException: Resolved attribute(s) SOG#34059,longitude#34109,longitude#34159,Heading#34209 missing from
__index_level_0__#33970L,ShipId#33937,DateKey#33938,ts180#33939,ts180_date#33940,
minTs180#33941,maxTs180#33942,key_x#33943,SOG#33944,latitude#33945,longitude#33946,
Heading#33947,EUPortDetails#33948,ts5_seconds#33949L,ts5_minute#33950L,ts180_str#33951,
key_y#33952,__natural_order__#33989L in operator !Project [__index_level_0__#33970L,
ShipId#33937, DateKey#33938, ts180#33939, ts180_date#33940, minTs180#33941, maxTs180#33942,
key_x#33943, SOG#34059 AS SOG#34226, longitude#34109 AS latitude#34228,
longitude#34159 AS longitude#34230, Heading#34209 AS Heading#34232,
EUPortDetails#33948, ts5_seconds#33949L, ts5_minute#33950L, ts180_str#33951, key_y#33952].
Attribute(s) with the same name appear in the operation: SOG,longitude,longitude,Heading.
Please check if the right attribute(s) are used.;
The error is raised on this line:
NauticalData = NauticalData.assign(SOG_IsNull = np.where((NauticalData['SOG'].to_numpy()).isnull(), 1, 0))
It is also raised on every other line that uses NauticalData afterwards, even display(NauticalData).
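To make the intended result concrete, here is a minimal sketch in plain pandas rather than Koalas. The four-row frame and its values are made up for illustration; only the column names (key_x, SOG, Heading) come from the code above. It uses groupby(...).transform("max") instead of assigning columns from an aggregated frame, and Series.isnull() instead of np.where. Whether the same pattern avoids the AnalysisException in Koalas is an assumption that has not been verified here.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the merged frame; only the column names are from the question.
df = pd.DataFrame(
    {
        "key_x": ["a", "a", "b", "b"],
        "SOG": [1.0, 3.0, np.nan, np.nan],
        "Heading": [10.0, 20.0, 5.0, np.nan],
    }
)

# transform("max") returns one value per input row: the max of that row's group,
# so the result lines up with the original rows and can be assigned directly.
group_max = df.groupby("key_x")[["SOG", "Heading"]].transform("max")
result = df.assign(SOG=group_max["SOG"], Heading=group_max["Heading"])

# Flag column: 1 where the (rewritten) SOG is null, 0 otherwise.
result = result.assign(SOG_IsNull=result["SOG"].isnull().astype(int))

print(result)
```

In this sketch every row keeps its place and simply receives its group's maximum, which is the shape the assign call above is trying to produce; a group whose SOG values are all null stays null and is flagged with 1.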