
I assign the value as sc = pyspark.SparkContext(). In my Jupyter notebook it runs but doesn't respond for a long time: the asterisk (busy) indicator stays next to the cell, and no error or output ever appears.

I tried sc = SparkContext()

import pyspark
import os
from pyspark import SparkContext, SparkConf
sc = pyspark.SparkContext()  # At this point it doesn't respond
from pyspark.sql import SQLContext
sqlc = SQLContext(sc)

It should continue past this line without hanging.

2 Answers


For Python:

from pyspark import SparkContext
sc = SparkContext(appName="test")

But since you're working on PySpark version 2+, you don't need to initialize a SparkContext. You can create a SparkSession and work with it directly.

From Spark 2.0.0 onwards, SparkSession provides a single point of entry to interact with the underlying Spark functionality and allows programming Spark with the DataFrame and Dataset APIs. All the functionality available through SparkContext is also available through SparkSession.

To use the SQL, Hive, and Streaming APIs, there is no need to create separate contexts, as SparkSession includes them all.

To create a Spark session:

from pyspark.sql import SparkSession
session = SparkSession.builder.getOrCreate()
mythic

Try the following import: from pyspark import *. After that you can use it like so:

from pyspark import *

sc = SparkContext()
Ballo Adam