0

i have 2 statements which are to my knowledge exactly alike, but select() works fine, but selectExpr() generates following results.

+-----------------------+----------------------+
|first(StockCode, false)|last(StockCode, false)|
+-----------------------+----------------------+
|                 85123A|                 22138|
+-----------------------+----------------------+

+-----------+----------+
|first_value|last_value|
+-----------+----------+
|  StockCode| StockCode|
+-----------+----------+

following is implementation.

df.select(first(col("StockCode")), last(col("StockCode"))).show()
df.selectExpr("""first('StockCode') as first_value""", """last('StockCode') as last_value""").show()

Can any 1 explain the behaviour?

noobie-php
  • 6,817
  • 15
  • 54
  • 101

2 Answers2

2

selectExpr takes everything as select clause in sql. Hence if you write anything in single quote', it will act as string in sql. if you wanted to pass the column to selectExpr use backtique (`) as below-

df.selectExpr("""first(`StockCode`) as first_value""", """last(`StockCode`) as last_value""").show()

backtique will help you to escape space in the column.

you can use without backtique also if your column name is not starting with number like 12col or it doesn't have spaces in between like column name

df.selectExpr("""first(StockCode) as first_value""", """last(StockCode) as last_value""").show()
Som
  • 6,193
  • 1
  • 11
  • 22
1

You should pass like below

df_b = df_b.selectExpr('first(count) as first', 'last(count) as last')
df_b.show(truncate = False)

+-----+----+
|first|last|
+-----+----+
|2527 |13  |
+-----+----+
dsk
  • 1,863
  • 2
  • 10
  • 13