
input file

(userid,movie,rating)

1,250,3.0

1,20,3.4

1,90,2

2,30,3.5

2,500,2.3

2,20,3.3

I am supposed to get the highest-rated movie that each user rated. I am completely lost. I had the program running on Hadoop, but I am brand new to Scala. The file is comma-delimited.

  • So far I have gotten here, but I can't parse the line correctly.

    val inputfile = sc.textFile("/home/input/input.txt") 
    
    val keyval = inputfile.map(x=>(x(0),x(1)))
    
    .reduceByKey{case (x, y) => (x._1+y._1, math.max(x._2,y._2))}
    
    keyval.maxBy { case (key, value) => value }
    
    keyval.saveAsTextFile("/home/out/word")
    
  • I get these errors:

    <console>:26: error: value _1 is not a member of Char
    
    keyval.reduceByKey{case (x, y) => (x._1+y._1,math.max(x._2,y._2))}
                                        ^
    <console>:26: error: value _1 is not a member of Char
    keyval.reduceByKey{case (x, y) => (x._1+y._1,math.max(x._2,y._2))}
                                             ^
    <console>:26: error: value _2 is not a member of Char
    keyval.reduceByKey{case (x, y) => (x._1+y._1,math.max(x._2,y._2))}
                                                            ^
    <console>:26: error: value _2 is not a member of Char
    keyval.reduceByKey{case (x, y) => (x._1+y._1,math.max(x._2,y._2))}
                                                                 ^
    <console>:26: error: value maxBy is not a member of org.apache.spark.rdd.RDD[(Char, Char)]
    keyval.maxBy { case (key, value) => value }
    
troy

1 Answer


sc.textFile reads the file line by line into an RDD[String], so in inputfile.map(x=>(x(0),x(1))) each x is a whole line, and x(0) and x(1) are just the first and second characters of that line. The result is an RDD[(Char, Char)]. reduceByKey groups by the first element of each tuple and passes only the values to the reduce function, so inside reduceByKey both x and y are plain Chars, not tuples. A Char has no ._1 or ._2, which is why you get the errors

error: value _1 is not a member of Char

and

error: value _2 is not a member of Char

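The last error has a different cause:

error: value maxBy is not a member of

maxBy is a method of Scala collections, not of RDD, so it cannot be called on keyval at all. It is also not needed here, because reduceByKey can already keep the best entry per user.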
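To make the Char problem concrete, here is a minimal plain-Scala sketch (no Spark needed) that contrasts indexing into a line with splitting it. The object name `ParseDemo` and the helper `parse` are hypothetical names chosen for illustration; they are not part of the question or of Spark.

```scala
object ParseDemo {
  // Hypothetical helper: split one "userid,movie,rating" line into typed fields.
  def parse(line: String): (String, (String, Double)) = {
    val fields = line.split(",").map(_.trim)
    (fields(0), (fields(1), fields(2).toDouble))
  }

  def main(args: Array[String]): Unit = {
    val line = "1,250,3.0"

    // Indexing a String yields single characters, not comma-separated fields.
    val first: Char = line(0)  // '1'
    val second: Char = line(1) // ','
    println((first, second))

    // Splitting on the comma gives the three fields the reduce step needs.
    println(parse(line))
  }
}
```

The same `parse` function could be passed to `inputfile.map(...)`, which would give each value the `(movie, rating)` shape that a reduce function can take apart with `._1` and `._2`.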

Here's a complete working solution for you

val inputfile = sc.textFile("/home/mortaza/input/input.txt")

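// Convert the rating to Double before comparing; as strings, "10.0" would sort before "9.0".
val keyval = inputfile.map(x => x.split(",")).map(x => (x(0), (x(1), x(2).toDouble))).reduceByKey { case (x, y) => if (x._2 <= y._2) y else x }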

keyval.map(x => Seq(x._1, x._2._1, x._2._2).mkString(",")).saveAsTextFile("/home/mortaza/out/wordfreq")

This should write CSV output like the following (using the input given in the question; the order of the lines and how they are spread across part files may vary)

2,30,3.5
1,20,3.4
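If you want to check the logic without a cluster, the same idea can be sketched on a local Seq. This is only an illustration under assumptions: `TopRated` and `bestPerUser` are hypothetical names, and groupBy followed by maxBy stands in for what reduceByKey does on the RDD.

```scala
object TopRated {
  // Hypothetical local equivalent of the Spark pipeline:
  // parse each line, group by user, keep the (movie, rating) pair with the largest rating.
  def bestPerUser(lines: Seq[String]): Map[String, (String, Double)] =
    lines
      .map(_.split(","))
      .map(f => (f(0), (f(1), f(2).toDouble)))
      .groupBy(_._1)
      .map { case (user, rows) => user -> rows.map(_._2).maxBy(_._2) }

  def main(args: Array[String]): Unit = {
    // Input lines copied from the question.
    val input = Seq("1,250,3.0", "1,20,3.4", "1,90,2", "2,30,3.5", "2,500,2.3", "2,20,3.3")
    bestPerUser(input).foreach { case (user, (movie, rating)) =>
      println(s"$user,$movie,$rating")
    }
  }
}
```

Ratings are compared as Double here because string comparison is lexicographic, so a rating such as "10.0" would be treated as smaller than "9.0".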

I hope the answer is helpful

Ramesh Maharjan
  • Thank you so much! And if I want to extract just the highest-rated movie, how would I do that? Meaning only the second field would be shown, in this example 30. – troy Jul 29 '18 at 16:48
  • It's `x._2._1` in keyval – Ramesh Maharjan Jul 29 '18 at 16:52
  • What I am saying is: for the same example, this time I would like to get the highest-rated movie and print out just the movie. How would I do that? – troy Jul 29 '18 at 16:57
  • That's what I answered. Please analyse the short answer and try to understand it. That's the exact answer. – Ramesh Maharjan Jul 29 '18 at 23:59
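The hint in the comments can be spelled out with a small plain-Scala sketch. It assumes the elements have the shape `(userid, (movie, rating))` with the rating as a Double; `MovieOnly`, `movies` and `topMovie` are hypothetical names used only for illustration.

```scala
object MovieOnly {
  type Row = (String, (String, Double))

  // Same shape as the elements of keyval: (userid, (movie, rating)).
  val best: Seq[Row] = Seq(("1", ("20", 3.4)), ("2", ("30", 3.5)))

  // x._2 is the (movie, rating) pair, so x._2._1 is the movie id.
  def movies(rows: Seq[Row]): Seq[String] = rows.map(x => x._2._1)

  // The single movie with the highest rating across all users.
  def topMovie(rows: Seq[Row]): String = rows.maxBy(_._2._2)._2._1

  def main(args: Array[String]): Unit = {
    println(movies(best))
    println(topMovie(best))
  }
}
```

On the RDD itself, the first step would be `keyval.map(x => x._2._1)`. For the single best movie, RDD has no maxBy, but its `max` method accepts an Ordering, which could be built over the rating field.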