def trainBestSeller(events: RDD[BuyEvent], n: Int, itemStringIntMap: BiMap[String, Int]): Map[String, Array[(Int, Int)]] = {
    val itemTemp = events
        // map each item from its string ID to its integer index
        .flatMap {
            case BuyEvent(user, item, category, count) if itemStringIntMap.contains(item) =>
                Some(((itemStringIntMap(item), category), count))
            case _ => None
        }
        // cache, since the result is reused below
        .cache()

    // top sellers within each category
    val bestSeller_Category: Map[String, Array[(Int, Int)]] = itemTemp.reduceByKey(_ + _)
        .map(row => (row._1._2, (row._1._1, row._2)))
        .groupByKey
        .map { case (c, itemCounts) =>
            (c, itemCounts.toArray.sortBy(_._2)(Ordering.Int.reverse).take(n))
        }
        .collectAsMap.toMap

    // top sellers across all categories => category "ALL"
    val bestSeller_All: Map[String, Array[(Int, Int)]] = itemTemp.reduceByKey(_ + _)
        .map(row => ("ALL", (row._1._1, row._2)))
        .groupByKey
        .map { case (c, itemCounts) =>
            (c, itemCounts.toArray.sortBy(_._2)(Ordering.Int.reverse).take(n))
        }
        .collectAsMap.toMap

    // merge the two maps bestSeller_Category and bestSeller_All
    val bestSeller = bestSeller_Category ++ bestSeller_All
    bestSeller
}
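
For readers without a Spark setup, the aggregation above can be sketched on plain Scala collections. This is only an illustration under assumptions, not the code from the question: the `BuyEvent` field types are guessed, a plain `Map` stands in for the `BiMap`, a `Seq` stands in for the `RDD`, and `bestSellers` and `topN` are hypothetical names.

```scala
// Hypothetical stand-in for BuyEvent; the field types are assumed.
case class BuyEvent(user: String, item: String, category: String, count: Int)

// Plain-collections sketch of the same steps: Seq instead of RDD, Map instead of BiMap.
def bestSellers(events: Seq[BuyEvent], n: Int, itemIndex: Map[String, Int]): Map[String, Array[(Int, Int)]] = {
  // Sum the counts per (item index, category), skipping items that have no index.
  val totals: Map[(Int, String), Int] = events
    .collect { case BuyEvent(_, item, category, count) if itemIndex.contains(item) =>
      ((itemIndex(item), category), count)
    }
    .groupBy(_._1)
    .map { case (key, rows) => key -> rows.map(_._2).sum }

  // Keep the n (item, total) pairs with the highest totals.
  def topN(rows: Iterable[(Int, Int)]): Array[(Int, Int)] =
    rows.toArray.sortBy(_._2)(Ordering.Int.reverse).take(n)

  // Top sellers within each category.
  val perCategory: Map[String, Array[(Int, Int)]] = totals.toSeq
    .map { case ((item, category), total) => (category, (item, total)) }
    .groupBy(_._1)
    .map { case (category, rows) => category -> topN(rows.map(_._2)) }

  // Top sellers across all categories, stored under the key "ALL".
  val all = "ALL" -> topN(totals.toSeq.map { case ((item, _), total) => (item, total) })

  perCategory + all
}
```

As in the function above, the "ALL" entry ranks the per-(item, category) totals, so an item sold in two categories can appear twice in it.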
pferrel
Le Kim Trang
  • I see you also posted this to the scala-user list ([here](https://groups.google.com/d/msg/scala-user/-TLmGs9g0Mc/WdbLsSHQBwAJ)) and the scala-language list. It is an appropriate question for scala-user, but not for scala-language. – Seth Tisue Oct 08 '15 at 03:04
  • How are you initializing your RDD? Also, note that you cannot index into your RDD the way you access tuples (e.g. `._1`, `._2`). You must apply a transformation to do this. – Rohan Aletty Oct 08 '15 at 04:39
  • Dear Rohan, yes, I think I have to apply a transformation, can you please help me with that? Thank you very much. – Le Kim Trang Oct 08 '15 at 07:52

1 Answer

List processing

Your list processing looks fine. I ran a quick check:

def main(args: Array[String]): Unit = {

  // Local stand-ins for the real types, just to check the list logic
  case class JString(x: Int)
  case class CompactBuffer(x: Int, y: Int)

  val tuple: (List[JString], CompactBuffer) = (List(JString(2435), JString(3464)), CompactBuffer(1, 4))

  // Pair every list element with the shared CompactBuffer
  val result: List[(JString, CompactBuffer)] = tuple._1.map((_, tuple._2))

  // The same thing, written out step by step
  val result2: List[(JString, CompactBuffer)] = {
    val l = tuple._1
    val cb = tuple._2
    l.map(x => (x, cb))
  }

  println(result)
  println(result2)
}

Both println calls give the expected result:

List((JString(2435),CompactBuffer(1,4)), (JString(3464),CompactBuffer(1,4)))

Further analysis

If that does not solve your problem, more information is needed:

  • Where do the types JString (from org.json4s.JsonAST?) and CompactBuffer (Spark, I suppose) come from?
  • What exactly does the code that creates pair look like? What exactly are you doing? Please provide code excerpts!
Martin Senne
  • Dear Martin, yes, the issue is still there. To answer your questions: yes, they are from org.json4s.JsonAST and Spark. I think I have to do a transformation; currently "List( JString(2435), JString(3464))" is a String, not a List, I suppose. Can you please help me with that? Thank you very much. – Le Kim Trang Oct 08 '15 at 07:51
  • That said, please post all the code required to reproduce your problem. (Edit your posted question!) – Martin Senne Oct 08 '15 at 07:54
  • Dear Martin, I added all of my code. Please take a look, thank you! – Le Kim Trang Oct 08 '15 at 07:59
  • Please edit your original question and put the code (including imports etc.) there! – Martin Senne Oct 08 '15 at 07:59
  • I got this error: found : org.apache.spark.rdd.RDD[String], required List[JString]. – Le Kim Trang Oct 08 '15 at 08:00
  • See my edit of your original question. Please add your error message! – Martin Senne Oct 08 '15 at 08:20
  • And on which line does the error occur? Please always post complete code (ideally a minimal working example) so others can reproduce your problem easily. Please go ahead and modify your post accordingly! – Martin Senne Oct 08 '15 at 08:30
  • Hello Martin, I have already posted a new question here: http://stackoverflow.com/questions/33013037/json-scala-parse-list. Can you please help me with this? Thank you. – Le Kim Trang Oct 08 '15 at 10:26