
I am totally new to Spark and I want to create a JavaRDD from labeled points programmatically, without reading input from a file. Say I create a few LabeledPoints as follows:

 LabeledPoint pos1 = new LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 3.0));
 LabeledPoint pos2 = new LabeledPoint(1.0, Vectors.dense(1.0, 5.0, 3.0));
 LabeledPoint pos3 = new LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 3.0));
 LabeledPoint pos4 = new LabeledPoint(1.0, Vectors.dense(1.0, 7.0, 3.0));

Then I want to create a JavaRDD from these points. How can I do that?

user1097675

1 Answer


Check this section of the Apache Spark documentation. You can use the `parallelize` function to create an RDD from an in-memory collection:

List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
JavaRDD<Integer> distData = sc.parallelize(data);
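Applied to the LabeledPoints from the question, a minimal sketch might look like this (assuming an already-created `JavaSparkContext` named `sc`; the variable names are illustrative):

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;

// Collect the labeled points into an in-memory list...
List<LabeledPoint> points = Arrays.asList(
    new LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 3.0)),
    new LabeledPoint(1.0, Vectors.dense(1.0, 5.0, 3.0)),
    new LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 3.0)),
    new LabeledPoint(1.0, Vectors.dense(1.0, 7.0, 3.0)));

// ...then distribute the list across the cluster as a JavaRDD
JavaRDD<LabeledPoint> rdd = sc.parallelize(points);
```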
Milad Khajavi
  • I checked that. But unfortunately I am working on part of a big project, so the JavaSparkContext is already defined somewhere else. It can be instantiated only once, right? And it is not possible to access the already defined JavaSparkContext from my class either. Or is there a way to access the currently active JavaSparkContext? – user1097675 Feb 21 '16 at 05:07
  • You must have access to the SparkContext in your project; it might be exposed through some static variable. Otherwise you won't be able to use any functionality of Spark. – Pankaj Arora Feb 21 '16 at 05:11
  • You can use the `SparkContext.getOrCreate()` method to get the current context – Sebastian Piu Feb 21 '16 at 14:48
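Combining the two comments above, a sketch of retrieving the already-active context from another class (assuming the project has already started a context elsewhere) might be:

```java
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaSparkContext;

// getOrCreate() returns the active SparkContext if one exists,
// so no second context is instantiated
SparkContext sc = SparkContext.getOrCreate();

// wrap it to get the Java-friendly API with parallelize(List<T>)
JavaSparkContext jsc = JavaSparkContext.fromSparkContext(sc);
```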