35

There are a lot of Scala/Spark kernels for IPython/Jupyter:

  1. IScala
  2. ISpark
  3. Jupyter Scala
  4. Apache Toree (previously Spark Kernel)

Does anybody know which of them is the most compatible with IPython/Jupyter and the most comfortable to use with:

  1. Scala
  2. Spark (Scala)
smci
Lunigorn
  • The IPython wiki has a list of many kernels (including languages other than Scala). Thought I would add it here: https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages – Luciano Jan 01 '17 at 16:24
  • It would be useful to say whether these come as source, binary, or both, and how easy they are to install on Win10/Linux/macOS. Also, how do they compare to each other on CPU and memory performance, security, patches, and magic commands? – smci Oct 14 '17 at 17:27

3 Answers

15

I can't speak for all of them, but I use Spark Kernel and it works very well for using both Scala and Spark.

I found IScala and Jupyter Scala less stable and less polished. Jupyter Scala always prints every variable value after I execute a cell; I don't want to see this 99% of the time.

Spark Kernel is my favourite, both for Spark and for plain old Scala.

Al M
5

Spark Kernel has been accepted into the Apache Incubator, and all development has moved to Apache Toree.

artyomboyko
  • Are you recommending it or just commenting? How does it compare on CPU and memory performance, install size, ease of install, etc? – smci Oct 14 '17 at 17:24
4

I have been using spark-kernel (your option #4) and am quite satisfied with it.

You can find a nice installation how-to (CDH 5.5 on CentOS 7) at the link below. I used it myself to install the kernel on a single node in pseudo-distributed mode.

http://www.davidgreco.me/blog/2015/12/24/how-to-use-jupyter-with-spark-kernel-and-cloudera-hadoop-slash-spark/
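
Whichever kernel you install, it can help to confirm afterwards that Jupyter actually sees it. The following is only a minimal sketch, assuming the `jupyter` command is on your PATH and that the kernel declares its language as `scala` in its kernelspec. The helper names (`installed_kernelspecs`, `scala_kernels`) and the sample data are made up for illustration and are not part of any of the kernels above.

```python
import json
import subprocess


def installed_kernelspecs():
    """Return the kernelspecs Jupyter knows about, keyed by kernel name.

    Assumption: the `jupyter` command is on PATH.
    """
    out = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)["kernelspecs"]


def scala_kernels(kernelspecs):
    """Return the sorted names of kernels whose declared language is Scala."""
    return sorted(
        name
        for name, info in kernelspecs.items()
        if info.get("spec", {}).get("language", "").lower() == "scala"
    )


# Illustrative data in the shape that `jupyter kernelspec list --json` reports.
# The kernel names and paths here are hypothetical.
sample = {
    "python3": {
        "resource_dir": "/usr/share/jupyter/kernels/python3",
        "spec": {"display_name": "Python 3", "language": "python"},
    },
    "apache_toree_scala": {
        "resource_dir": "/usr/local/share/jupyter/kernels/apache_toree_scala",
        "spec": {"display_name": "Apache Toree - Scala", "language": "scala"},
    },
}

print(scala_kernels(sample))
```

On a real machine you would call `scala_kernels(installed_kernelspecs())` instead of passing the sample. An empty list would suggest the kernel was not registered, or that it declares a different language string.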

Antoni