-2

What is the difference between

val df = List(("amit","porwal")) 

and

val df = List("amit","porwal")

My question is how the extra pair of parentheses makes a difference, because on doing

scala> val df = List(("amit","porwal")).toDF("fname","lname")

it works, but on doing

scala> val df = List("amit","porwal").toDF("fname","lname")

Scala throws an error:

java.lang.IllegalArgumentException: requirement failed: The number of columns doesn't match.
Old column names (1): value
New column names (2): fname,lname
  at scala.Predef$.require(Predef.scala:224)
  at org.apache.spark.sql.Dataset.toDF(Dataset.scala:393)
  at org.apache.spark.sql.DatasetHolder.toDF(DatasetHolder.scala:44)
  ... 48 elided

Ramesh Maharjan
Amit Porwal
  • Possible duplicate of [Difference between Tuple and List\[Any\] in Scala?](https://stackoverflow.com/questions/40904505/difference-between-tuple-and-listany-in-scala) – philantrovert Apr 20 '18 at 08:03

2 Answers

1

Yes, they are different. The Scala compiler treats the inner parentheses as a tuple. Since there are two string values inside the inner parentheses of your first example, that element is a Tuple2[String, String], also written (String, String). In the second example, the string values inside the List are separate elements of type String.

The first one, val df = List(("amit","porwal")), is a List[(String, String)]. There is only one element in df, and to get porwal you have to do df(0)._2

And,

The second one, val df = List("amit","porwal"), is a List[String]. There are two elements in df, and to get porwal you have to do df(1)
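The difference can be checked in plain Scala without Spark. The following is a minimal sketch; the object name `TupleVsList` and the value names `pairs` and `names` are made up for illustration and do not appear in the question.

```scala
object TupleVsList {
  // One element: a single Tuple2 that holds both strings.
  val pairs: List[(String, String)] = List(("amit", "porwal"))

  // Two elements: each string is its own element of the list.
  val names: List[String] = List("amit", "porwal")

  def main(args: Array[String]): Unit = {
    println(pairs.size)   // 1
    println(pairs(0)._2)  // porwal
    println(names.size)   // 2
    println(names(1))     // porwal
  }
}
```

The explicit type annotations are only there to make the inferred types visible; the compiler infers the same types without them.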

Ramesh Maharjan
  • On doing `scala> val df = List(("amit","porwal")).toDF("fname","lname")` it works, but on doing `scala> val df = List("amit","porwal").toDF("fname","lname")` Scala throws an error: `java.lang.IllegalArgumentException: requirement failed: The number of columns doesn't match. Old column names (1): value New column names (2): fname, lname` – Amit Porwal Apr 20 '18 at 08:30
  • The answer clearly says that they are different, and the error message `The number of columns doesn't match. Old column names (1): value New column names (2): fname, lname` is clear enough, I guess. – Ramesh Maharjan Apr 20 '18 at 08:38
1

The difference itself is not related to Spark; it is plain Scala.

val df = List(("amit","porwal")) 

Here df is a list of Tuple2, i.e. List[(String, String)]. To get the value "amit" you should use df(0)._1, and for "porwal" df(0)._2.

val df = List("amit","porwal")

Here df is simply a list of String, i.e. List[String]. With a List[String] you can get the values as df(0) and df(1).
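This also explains the column count in the error message: a tuple has two fields, while a plain String is a single value (the message reports one old column, `value`). Below is a rough sketch in plain Scala of counting the fields and of regrouping the flat list into one two-field row. The object `ColumnCount` and the helper `toRow` are hypothetical names introduced here, not part of Spark or of the question.

```scala
object ColumnCount {
  val row: (String, String) = ("amit", "porwal")

  // A tuple is a Product; productArity is its number of fields.
  val fieldsPerRow: Int = row.productArity

  val flat: List[String] = List("amit", "porwal")

  // Hypothetical helper: regroup exactly two strings into one (first, last) row.
  def toRow(xs: List[String]): Option[(String, String)] = xs match {
    case List(first, last) => Some((first, last))
    case _                 => None // any other length cannot form one pair
  }

  def main(args: Array[String]): Unit = {
    println(fieldsPerRow)       // 2
    println(List(toRow(flat)))  // List(Some((amit,porwal)))
  }
}
```

Returning an Option keeps the helper total: a list of any other length yields None instead of throwing a MatchError.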

Hope this helps!

jwvh
koiralo
  • On doing `scala> val df = List(("amit","porwal")).toDF("fname","lname")` it works, but on doing `scala> val df = List("amit","porwal").toDF("fname","lname")` Scala throws an error: `java.lang.IllegalArgumentException: requirement failed: The number of columns doesn't match. Old column names (1): value New column names (2): fname, lname` – Amit Porwal Apr 20 '18 at 08:31
  • The error says clearly `The number of columns doesn't match`. `List("amit","porwal")` is a list of strings, so it gives you only one column. `List("amit","porwal").toDF("fname")` is what should be used in your case. – koiralo Apr 20 '18 at 08:35