
I have Spark version 2.4.0 and Scala version 2.11.12. I can successfully load a DataFrame with the following code.

val df = spark.read.format("csv").option("header","true").option("delimiter","|").option("mode","DROPMALFORMED").option("maxColumns",60000).load("MAR18.csv")

However, when I attempt the following groupBy, I get an error.

df.groupby("S0102_gender").agg(sum("Respondent.Serial")).show()

The error message is:

error: value groupby is not a member of org.apache.spark.sql.DataFrame

What am I missing? I'm a complete Scala and Spark newbie.

user204548

2 Answers


You have a typo. Change

    groupby

to

    groupBy
Suhas NM

Instead of groupby it should be groupBy, as below; it's clearly a typo.

df.groupBy("S0102_gender").agg(sum("Respondent.Serial")).show()
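Two further points worth checking (assumptions, since the CSV header isn't shown): `sum` must be imported from `org.apache.spark.sql.functions`, and if `Respondent.Serial` is a single column whose name literally contains a dot, Spark will otherwise parse the dot as a struct-field accessor and fail to resolve the column. Backticks escape it:

```scala
import org.apache.spark.sql.functions.sum

// Backticks make Spark treat "Respondent.Serial" as one column name
// rather than field "Serial" inside a struct column "Respondent".
df.groupBy("S0102_gender")
  .agg(sum("`Respondent.Serial`"))
  .show()
```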
Ram Ghadiyaram