please suggest best way to write an anonymous function here

Question

Please suggest best way to write an inline function at the place where func_1 is being called. Also it should do something what func_1 is trying to do (I am aware that a function cannot return two things in scala)

I am reading lines from a file(args(0)), where each line consists of numbers separated by comma. For each line first number nodeId and other numbers are its neighbours For first 5 lines first number itself is cluseterId. graph contains each node with Long:nodeId,Long:clusterId and List[Long]:neighbours

I am trying to write a map reduce kind of functionality where this function "func_1" is like a mapper which emits (nodeId,clusterId,neighbours) and then checks for every element in neighbours and if clusterId > -1 then emits (nodeId,clusterId). In short tuple (nodeId,clusterId,neighbours) has to be emitted unconditionally

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import scala.collection.mutable.ListBuffer

object Partition {

  val depth = 6

  def func_1(nodeId:Long,clusterId:Long,neightbours:List[Long]):Either[(Long,Long,List[Long]),(Long,Long)]={
    Left(nodeId,clusterId,neightbours)
    for(x <- neightbours){
      if(clusterId > -1){
       Right(x,clusterId)
      }
    }
  }
  def func_2(){

  }
  def main ( args: Array[ String ] ) {
    val conf=new SparkConf().setAppName("Partition")
    val sc=new SparkContext(conf)
    var count : Int = 0

    var graph=sc.textFile(args(0)).map(line =>{
                                         var nodeId:Long=line(0).toLong
                                         var clusterId:Long=1
                                         var neighbours=new ListBuffer[Long]()
                                         if(count < 5){
                                           clusterId=line(0).toLong
                                         }else{
                                           clusterId= -1 * clusterId
                                         }
                                         val nums=line.split(",")
                                         for(i <- 1 to line.length()-1){
                                           neighbours.+=(nums(i).toLong)
                                         }
                                         (nodeId,clusterId,neighbours.toList)
                                         }).collect()
    graph.foreach(println)
    for (i <- 1 to depth)
      graph = graph.flatMap{ func_1 }.groupByKey.map{ /* (2) */ }
    /* finally, print partition sizes */

  }
}

What are you trying to do? Is this "if clusterId is > -1, then return just the clusterId, otherwise return the clusterId and the neighbours". You have to return "either" something of `(Long,List[Long])` or something of the type `Long`. Sounds like you want something more like `def func_1.... = if(clusterId > -1) Right(clusterId) else Left(clusterId, neightbours)` — Sidd Singal, Nov 08 '18 at 20:13
@siddsingal: now i am trying to return tuple of either (Long,Long,List[Long]) or (Long,Long) still gettting the same error — nikhil kekan, Nov 08 '18 at 20:21
Why do you need the for loop? The clusterId isn't going to change for each neighbor — Sidd Singal, Nov 08 '18 at 20:24
Also, the reason you are getting an error is because a scala "for" loop isn't an expression. It doesn't return anything. That entire statement will evaluate to nothing (or `Unit`). You can assign the output to a variable and then add the variable at the end to return t. — Sidd Singal, Nov 08 '18 at 20:26
I am newbie to functional programming and scala, I couldnt figure out a better way — nikhil kekan, Nov 08 '18 at 20:27
Just describe in human words what the function should do. If the clusterId is greater than -1, do you want to return the entire list of neighbors, the first neighbor, the last neighbor? — Sidd Singal, Nov 08 '18 at 20:28
@siddsingal : I am trying to write a map reduce kind of functionality where this function is like a mapper which emits (nodeId,clusterId,neighbours) and then checks for every element in neighbours and if clusterId > -1 then emits (nodeId,clusterId). In short tuple (nodeId,clusterId,neighbours) has to be emitted unconditionally — nikhil kekan, Nov 08 '18 at 20:36

score 1 · Answer 1 · edited Nov 08 '18 at 22:06

1

It's really hard to figure out what you want because your code makes absolutely no sense.

I'm going to take a wild guess that you might be looking for something like this.

def func_1(nodeId      :Long
          ,clusterId   :Long
          ,neightbours :List[Long]
          ) :Either[(Long,Long,List[Long]),List[(Long,Long)]] =

  if (clusterId > -1) Right(neightbours.map(_ -> clusterId))
  else                 Left(nodeId, clusterId, neightbours)

At least this compiles, and that's a place to start.

def func_1( ... ) :Either[ ... ] = { //a method that returns an Either

  Left(nodeId,clusterId,neightbours) //create a Left expression of the Either
                                     //don't do anything with it, throw it away
  for(x <- neightbours){             //grab all the neightbours (spelling?)
    if(clusterId > -1){              //if clusterId is positive
      Right(x,clusterId)             //create a Right expression of the Either
    }                                //don't do anything with it, throw it away
  }
}                                    //I'm done, return nothing

edited Nov 08 '18 at 22:06

Brad Larson

170,088
45
397
571

answered Nov 08 '18 at 20:30

jwvh

50,871
7
38
64

I am trying to write a map reduce kind of functionality where this function is like a mapper which emits (nodeId,clusterId,neighbours) and then checks for every element in neighbours and if clusterId > -1 then emits (nodeId,clusterId). In short tuple (nodeId,clusterId,neighbours) has to be emitted unconditionally – nikhil kekan Nov 08 '18 at 20:35
Not only does your code make no sense, your explanation makes no sense. Type `Either[]` is of type `Right` **or** type `Left`. It can't be both. And a method can't "emit" something and then do something more. To emit means to return. When a routine returns something then it is finished and doesn't exist anymore. – jwvh Nov 08 '18 at 20:43
I am sorry for not describing my problem in context of scala. updating my question again – nikhil kekan Nov 08 '18 at 20:47

please suggest best way to write an anonymous function here

1 Answers1