Please suggest the best way to write an inline function at the place where func_1 is called. It should also do what func_1 is trying to do (I am aware that a function cannot return two things in Scala).
I am reading lines from a file (args(0)), where each line consists of numbers separated by commas. On each line the first number is the nodeId and the remaining numbers are its neighbours. For the first 5 lines the first number is also the clusterId. graph holds one tuple per node: (Long nodeId, Long clusterId, List[Long] neighbours).
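For illustration, a line such as the following (made-up values) means node 7 has neighbours 3, 5 and 12; on the first 5 lines of the file the leading number additionally serves as the clusterId:

    7,3,5,12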
I am trying to write map-reduce style functionality where func_1 acts as a mapper: it emits (nodeId, clusterId, neighbours), and then for every element in neighbours, if clusterId > -1, it also emits (neighbour, clusterId). In short, the tuple (nodeId, clusterId, neighbours) has to be emitted unconditionally.
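To pin down the semantics, here is a plain-Scala sketch of that mapper (a local List stands in for the RDD, and the values are made up): one call returns a collection, with Either distinguishing the two record kinds.

    val nodes = List((1L, 1L, List(2L, 3L)), (2L, -1L, List(1L)))
    val emitted = nodes.flatMap { case (nodeId, clusterId, neighbours) =>
      // the node record itself, emitted unconditionally ...
      Left((nodeId, clusterId, neighbours)) ::
        // ... plus one (neighbour, clusterId) pair per neighbour, when known
        (if (clusterId > -1) neighbours.map(n => Right((n, clusterId))) else Nil)
    }
    // emitted == List(Left((1,1,List(2, 3))), Right((2,1)), Right((3,1)), Left((2,-1,List(1))))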
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import scala.collection.mutable.ListBuffer
object Partition {
  val depth = 6
  // Mapper: always emit the node record; when the cluster id is already known,
  // additionally emit a (neighbour, clusterId) pair for every neighbour.
  // Returning a List lets one call emit several records, and Either
  // distinguishes the two record shapes.
  def func_1(nodeId: Long, clusterId: Long,
             neighbours: List[Long]): List[Either[(Long, Long, List[Long]), (Long, Long)]] = {
    val node: Either[(Long, Long, List[Long]), (Long, Long)] =
      Left((nodeId, clusterId, neighbours))
    val pairs = if (clusterId > -1) neighbours.map(x => Right((x, clusterId))) else Nil
    node :: pairs
  }
  def func_2() {
  }
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Partition")
    val sc = new SparkContext(conf)
    // zipWithIndex tags every line with its position in the file; a driver-side
    // counter variable would not behave correctly inside a distributed map closure.
    var graph = sc.textFile(args(0)).zipWithIndex.map { case (line, index) =>
      val nums = line.split(",")
      val nodeId: Long = nums(0).toLong
      // the first 5 lines are cluster seeds (clusterId = nodeId);
      // every other node starts out unassigned (-1)
      val clusterId: Long = if (index < 5) nodeId else -1L
      val neighbours = new ListBuffer[Long]()
      for (i <- 1 until nums.length)
        neighbours += nums(i).toLong
      (nodeId, clusterId, neighbours.toList)
    }
    graph.collect().foreach(println)
    for (i <- 1 to depth)
      graph = graph.flatMap{ func_1 }.groupByKey.map{ /* (2) */ }
    /* finally, print partition sizes */
  }
}
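For what it is worth, here is a self-contained sketch (runnable in spark-shell, where sc is provided) of how I imagine the call site looking once inlined, with every emitted record keyed by a node id so that groupByKey can gather, per node, its own record and any candidate cluster ids. The names and sample data are made up, and the reducer marked (2) above is still the part I need to work out:

    import org.apache.spark.rdd.RDD

    // `nodes` stands in for the parsed graph RDD of (nodeId, clusterId, neighbours)
    val nodes: RDD[(Long, Long, List[Long])] = sc.parallelize(Seq(
      (1L, 1L, List(2L, 3L)),  // seed node: clusterId == nodeId
      (2L, -1L, List(1L)),     // unassigned
      (3L, -1L, List(1L))))

    val mapped = nodes.flatMap { case (nodeId, clusterId, neighbours) =>
      // always emit the node itself, keyed by its own id ...
      (nodeId, Left((clusterId, neighbours))) ::
        // ... and, when its cluster is known, propose it to every neighbour
        (if (clusterId > -1) neighbours.map(n => (n, Right(clusterId))) else Nil)
    }

    // groups, per node id, the node's own record (Left) together with all
    // cluster ids proposed by neighbours (Right)
    mapped.groupByKey.collect().foreach(println)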