
Am I going crazy?

from pyspark.sql.functions import *
sum([2,3,2])

Gives:

py4j.Py4JException: Method sum([class java.util.ArrayList]) does not exist

How can I just get a simple sum?

What is happening behind the scenes with spark to make things so difficult?

Elliot Huebler

1 Answer


With from pyspark.sql.functions import *, you are shadowing Python's built-in sum function with the sum function from the pyspark.sql.functions module, which expects a column rather than a Python list. To avoid the shadowing, you can import the module under a name:

import pyspark.sql.functions as f

and refer to the two sum functions as f.sum and sum.

Alternatively, give the pyspark sum function an alias:

from pyspark.sql.functions import sum as fsum

Either way, the two sum functions no longer collide in the same scope.
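If the star import has already run (for example, earlier in a notebook session), the built-in is still reachable through the builtins module. Here is a minimal sketch of that. To keep it runnable without Spark, it uses a hypothetical stand-in function named sum in place of pyspark.sql.functions.sum; the stand-in and its TypeError are assumptions for illustration, not what pyspark actually raises.

```python
import builtins

# Hypothetical stand-in for pyspark.sql.functions.sum, defined here only so the
# sketch runs without a Spark installation. It rejects plain Python values,
# loosely mimicking a function that expects a column.
def sum(col):
    raise TypeError("stand-in column sum called with a plain Python value")

# The module-level name now hides the built-in, much as it would after
# `from pyspark.sql.functions import *`.
try:
    sum([2, 3, 2])
    shadowed = False
except TypeError:
    shadowed = True

# The built-in was never removed; it is still available via the builtins module.
total = builtins.sum([2, 3, 2])

print(shadowed, total)  # True 7
```

This works because Python looks up a name in the module's globals before falling back to the built-ins, so the import hides the original sum rather than replacing it.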

TheNeil
Psidom