
I would like to know how to specify a custom profiler class in PySpark for Spark version 2+. Under 1.6, I know I can do so like this:

sc = SparkContext('local', 'test', profiler_cls=MyProfiler)

but when I create the SparkSession in 2.0 I don't explicitly have access to the SparkContext. Can someone please advise how to do this for Spark 2.0+?

femibyte

1 Answer


SparkSession can be initialized with an existing SparkContext, for example:

from pyspark import SparkContext
from pyspark.sql import SparkSession
from pyspark.profiler import BasicProfiler

spark = SparkSession(SparkContext('local', 'test', profiler_cls=BasicProfiler))
zero323