
I am doing SUM on multiple columns, and I want to include those columns in the SELECT list.

Below is my work:

val df = df0
  .join(df1, df1("Col1") <=> df0("Col1"))
  .filter(df1("Colum") === "00")
  .groupBy(df1("Col1"), df1("Col2"))
  .agg(sum(df1("Amount").alias("Amount1")), sum(df1("Amount2").alias("Amount2")))
  .select(
    df1("Col1").alias("Col1"),
    df1("Col2").alias("Col2"),
    Amount1, Amount2 // getting error here
  )

How do I include the aliased columns in the SELECT list?

sks

1 Answer


Use the col function or Scala's symbol shorthand ('). Note that the alias must be applied to the result of sum, not to the source column:

import org.apache.spark.sql.functions._
import spark.implicits._
val df = df0
  .join(df1, df1("Col1") <=> df0("Col1"))
  .filter(df1("Colum") === "00")
  .groupBy(df1("Col1"), df1("Col2"))
  .agg(sum(df1("Amount")).alias("Amount1"), sum(df1("Amount2")).alias("Amount2"))
  .select(
    df1("Col1").alias("Col1"),
    df1("Col2").alias("Col2"),
    col("Amount1"), 'Amount2
  )
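For reference, a minimal self-contained sketch of the same pattern, using made-up data and omitting the join and filter from the question. After agg, the DataFrame's columns are the grouping columns plus the aliases (Col1, Col2, Amount1, Amount2), so the aliased aggregates can be referenced by name:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("alias-demo")
  .getOrCreate()
import spark.implicits._

// Hypothetical data; column names mirror the question.
val df1 = Seq(
  ("a", "x", 10.0, 1.0),
  ("a", "x", 5.0, 2.0),
  ("b", "y", 3.0, 4.0)
).toDF("Col1", "Col2", "Amount", "Amount2")

val result = df1
  .groupBy($"Col1", $"Col2")
  // Alias the aggregate itself, not the input column.
  .agg(sum($"Amount").alias("Amount1"), sum($"Amount2").alias("Amount2"))
  // The aliases are now real columns and resolve by name.
  .select($"Col1", $"Col2", col("Amount1"), 'Amount2)

result.show()
```

Running this requires a Spark runtime; col("Amount1"), 'Amount2, and $"Amount2" are interchangeable ways to reference the column here.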
T. Gawęda
  • I tried but getting error "User class threw exception: org.apache.spark.sql.AnalysisException: cannot resolve '`Amount1`' given input columns" – sks Mar 13 '17 at 14:04
  • @sks - I've corrected my answer. Order of alias was wrong, it must be done on sum, not on source column – T. Gawęda Mar 13 '17 at 14:07
  • I am using Alias column not source column, but still the same error. Cannot resolve Amount1. – sks Mar 13 '17 at 14:26
  • Yes,I copied your answer. – sks Mar 13 '17 at 14:35
  • @sks Strange, I've tested it and for me it works. Could you please post what's in the cut part of the message? There should be a list of visible columns – T. Gawęda Mar 13 '17 at 14:44
  • Sorry for the confusion, it is working fine, problem with brackets. – sks Mar 13 '17 at 15:04