2

I want to know how to store a Spark RDD on an SSD.

By default, Spark keeps RDDs in memory.

I want the RDD to be stored on an SSD instead.

Phantômaxx
June Choi

1 Answer

5

Check this link

See the RDD Persistence section of the Spark programming guide and select DISK_ONLY as the storage level.

Also recommended to check this

Sandesh Deshmane
  • Thank you for your comment. I have one more question: if I select DISK_ONLY as the storage level, will the RDD be stored on the SSD? – June Choi Apr 22 '15 at 10:12
  • 1
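    When you persist an RDD, each node stores the partitions it computes and reuses them in other actions on that dataset (or datasets derived from it). So if you call rdd.persist(StorageLevel.DISK_ONLY), the partitions are written to each node's local disk, which is the SSD as long as Spark's local directory (spark.local.dir) is on it. Note that rdd.cache() always uses the default MEMORY_ONLY level, so it will not write to disk. – Sandesh Deshmane Apr 22 '15 at 10:44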