I have seen posts discussing the usage of window functions in Spark, but I have some questions.
- Since window functions can only be used with a HiveContext, how can I switch between SQLContext and HiveContext, given that I am already using a SQLContext?
- How can I run HiveQL with a window function here? I tried

```python
df.registerTempTable("data")
from pyspark.sql import functions as F
from pyspark.sql import Window
```

```
%%hive
SELECT col1, col2, F.rank() OVER (Window.partitionBy("col1").orderBy("col3")) FROM data
```
and native Hive SQL:

```sql
SELECT col1, col2, RANK() OVER (PARTITION BY col1 ORDER BY col3) FROM data
```
but neither of them works.