This is a newbie question.

In the code below:

// Keep only self-loop edges and pair each source vertex id with the edge attribute
val var_A = graphA.edges
  .filter { currEdge => currEdge.srcId == currEdge.dstId }
  .map { currEdge => (currEdge.srcId, currEdge.attr) }

var_A has type RDD[(graphx.VertexId, Double)].

I want to create a Map using graphx.VertexId as the key, but since I may have a large dataset, I was trying to use "lookup" instead.

But when I use lookup on "var_A", it shows a return type of Seq[Double] and an input/key type of graphx.VertexId.

I am troubled by the return type, which I expected to be Double rather than Seq[Double]. Please point out the error.

Then I made var_B:

val var_B = var_A.zipWithUniqueId

Now when I use lookup on "var_B", I get a return type of Seq[Long] and an input/key type of (graphx.VertexId, Double).

I basically want to look up with graphx.VertexId as the input/key type and get Double as the return type.
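For reference, a minimal sketch of the behaviour described above. On a pair RDD, lookup(key) returns every value stored under that key, which is why the result is a Seq rather than a single value; taking headOption is one way to get at most one Double back. zipWithUniqueId turns each (VertexId, Double) pair into the key of a new pair RDD, so it moves the lookup key away from VertexId. The toy graph, the object name LookupSketch and the helper selfLoopAttr are made up for illustration, and the sketch assumes Spark and GraphX are on the classpath with a local master.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object LookupSketch {
  // Hypothetical helper: the first value stored under `id`, or None if the key is absent.
  // lookup returns Seq[Double] because a pair RDD may hold several values per key.
  def selfLoopAttr(pairs: RDD[(VertexId, Double)], id: VertexId): Option[Double] =
    pairs.lookup(id).headOption

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("lookup-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up toy graph: vertices 1 and 2 each have one self-loop
      val edges = sc.parallelize(Seq(Edge(1L, 1L, 0.5), Edge(1L, 2L, 0.1), Edge(2L, 2L, 0.7)))
      val graphA = Graph.fromEdges(edges, defaultValue = 0)

      val var_A = graphA.edges
        .filter(e => e.srcId == e.dstId)
        .map(e => (e.srcId, e.attr))

      val all: Seq[Double] = var_A.lookup(1L)            // every value under key 1
      val one: Option[Double] = selfLoopAttr(var_A, 1L)  // at most one value
      println(s"all=$all one=$one")
    } finally {
      sc.stop()
    }
  }
}
```

Note that each lookup call runs a Spark job, so calling it many times in a loop can be slow compared with a local Map.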

Also, I am doing this because a Map might not fit in the driver's memory, possibly causing OOM exceptions. But the maximum number of tuples I would actually have is on the order of 10^4. So is that really too big to be stored in memory and broadcast across the cluster?
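For scale, 10^4 pairs of a Long and a Double amount to roughly 160 KB of raw data, before JVM object overhead. Below is a minimal sketch of the Map-plus-broadcast alternative mentioned above, using collectAsMap and SparkContext.broadcast. The object name BroadcastMapSketch, the helper broadcastAsMap and the sample data are hypothetical, and the sketch assumes Spark is on the classpath with a local master.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.graphx.VertexId
import org.apache.spark.rdd.RDD

object BroadcastMapSketch {
  // Hypothetical helper: collect the pairs to the driver as a Map and broadcast it.
  // collectAsMap keeps only one value per key if a key occurs more than once.
  def broadcastAsMap(
      sc: SparkContext,
      pairs: RDD[(VertexId, Double)]): Broadcast[scala.collection.Map[VertexId, Double]] =
    sc.broadcast(pairs.collectAsMap())

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("broadcast-map-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up stand-in for var_A
      val var_A = sc.parallelize(Seq((1L, 0.5), (2L, 0.7)))
      val loops = broadcastAsMap(sc, var_A)

      // Inside a transformation, read the broadcast value on the executors.
      val ids = sc.parallelize(Seq(1L, 2L, 3L))
      val attrs = ids.map(id => (id, loops.value.get(id))).collect()
      attrs.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Here loops.value.get(id) returns an Option[Double], so a missing vertex id shows up as None instead of an empty Seq.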

Jarvis
ayush gupta