This is a newbie question.
In the code below,
val var_A = graphA.edges
  .filter { currEdge => currEdge.srcId == currEdge.dstId }
  .map { currEdge => (currEdge.srcId, currEdge.attr) }
var_A has type RDD[(graphx.VertexId, Double)].
I want to create a Map with graphx.VertexId as the key, but I may have a large dataset, so I was trying to use "lookup" instead.
But when I use lookup on "var_A", it shows a return type of Seq[Double] and an input/key type of graphx.VertexId. I am troubled by the return type, which I expected to be Double rather than Seq[Double]. Please point out the error.
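To make the observation concrete, here is a minimal sketch of the lookup call. It assumes a local SparkSession and uses made-up sample pairs in place of the self-loop edges of graphA; the object name LookupSketch and the helper firstValue are hypothetical. lookup on a pair RDD returns every value stored under the key, because a pair RDD does not enforce unique keys, so the result is a Seq even when only one value matches.

```scala
import org.apache.spark.graphx.VertexId
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object LookupSketch {
  // Hypothetical helper: take the first value stored under the key, if any.
  // Each call to lookup runs a Spark job, so this is not a cheap operation.
  def firstValue(pairs: RDD[(VertexId, Double)], key: VertexId): Option[Double] =
    pairs.lookup(key).headOption

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only for illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("lookup-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Made-up stand-in for var_A: (srcId, attr) pairs; key 2 appears twice.
    val var_A: RDD[(VertexId, Double)] =
      sc.parallelize(Seq((1L, 0.5), (2L, 1.5), (2L, 2.5)))

    // All values stored under key 2 come back as a Seq[Double].
    val allForTwo: Seq[Double] = var_A.lookup(2L)

    // A single value, or None when the key is absent.
    val one: Option[Double] = firstValue(var_A, 1L)
    val missing: Option[Double] = firstValue(var_A, 99L)

    println(s"allForTwo=$allForTwo one=$one missing=$missing")
    spark.stop()
  }
}
```

If each VertexId is known to appear at most once, taking headOption (or head) of the returned Seq yields the single Double.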
Then I made var_B:
val var_B = var_A.zipWithUniqueId()
Now when I use lookup on "var_B", I get a return type of Seq[Long] and an input/key type of (graphx.VertexId, Double).
I basically want to look up with graphx.VertexId as the input/key type and get Double as the return type.
I am also avoiding a Map because it might not fit in the driver's memory and could cause OOM exceptions. In practice, though, the maximum number of tuples I will have is on the order of 10^4. Is that actually too big to be stored in memory and broadcast across the cluster?
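For comparison, the Map-based alternative could be sketched as below. This is only an illustration under assumptions: a local SparkSession, synthetic data of 10^4 pairs standing in for var_A, and the hypothetical names BroadcastMapSketch and toLocalMap. It uses collectAsMap to bring the pairs to the driver and SparkContext.broadcast to share the resulting map with the executors.

```scala
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.graphx.VertexId
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import scala.collection.Map

object BroadcastMapSketch {
  // Collect the pairs to the driver as a map.
  // If a key repeats, only one of its values is kept.
  def toLocalMap(pairs: RDD[(VertexId, Double)]): Map[VertexId, Double] =
    pairs.collectAsMap()

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only for illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("broadcast-map-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Synthetic stand-in for var_A with 10^4 entries.
    val var_A: RDD[(VertexId, Double)] =
      sc.parallelize((1L to 10000L).map(id => (id, id * 0.1)))

    val localMap: Map[VertexId, Double] = toLocalMap(var_A)

    // Ship one read-only copy of the map to each executor.
    val shared: Broadcast[Map[VertexId, Double]] = sc.broadcast(localMap)

    // Inside a task, read the map through .value; a plain map lookup
    // takes a VertexId and returns an Option[Double].
    val found = sc.parallelize(Seq(1L, 2L, 20000L))
      .map(id => (id, shared.value.get(id)))
      .collect()

    found.foreach { case (id, value) => println(s"$id -> $value") }
    spark.stop()
  }
}
```

At 10^4 entries the raw key and value data is about 160 KB (8 bytes each for a Long and a Double), before JVM object overhead.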