
What is the difference between a Spark Row and a Scala List? Both provide a way to access items by index. When should I use which one?

The only difference I can see in Row is that it has some schema.

scala> val a=Row(1,"hi",2,"hello")
a: org.apache.spark.sql.Row = [1,hi,2,hello]

scala> a(0)
res61: Any = 1

scala> a(2)
res62: Any = 2

scala> a(3)
res63: Any = hello

scala> val b=List(1, "hi", 2,"hello")
b: List[Any] = List(1, hi, 2, hello)

scala> b(1)
res64: Any = hi

scala> b(2)
res65: Any = 2

scala> b(3)
res66: Any = hello

Please help me understand why Row came into the picture.

Harshal Parekh

1 Answer


Re:

Both provide a way to access items by index. When to use which one?

Index access is just one aspect, I believe. If you compare the methods supported by Row with those of List, you will see that List offers many more operations than Row. Looking at the source code, it seems that a Row is backed by an Array, whereas a List is a linked-list data structure, so the two differ in how they store and access their elements. Also, if you are not working with Spark, you should use the best-suited collection from the Scala standard library rather than a type from the Spark library.
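To make the API difference concrete, here is a minimal sketch that can be pasted into spark-shell. It reuses the values from the question; the variable names are mine, and it only illustrates a few methods from each side rather than giving a full comparison.

```scala
import org.apache.spark.sql.Row

val row  = Row(1, "hi", 2, "hello")
val list = List(1, "hi", 2, "hello")

// Both offer untyped positional access that returns Any.
val fromRow: Any  = row(1)
val fromList: Any = list(1)

// Row adds typed getters; with a List[Any] you would have to cast
// or pattern match yourself.
val first: Int     = row.getInt(0)
val second: String = row.getString(1)
val last: String   = row.getAs[String](3)

// List has the full collections API. To use it on a Row's values,
// convert the Row to a Seq first.
val strings: Seq[String] = row.toSeq.collect { case s: String => s }

// A Row can also be built from an existing Scala sequence.
val back: Row = Row.fromSeq(list)
```

In practice, a Row is what Spark hands you when you collect or map over a DataFrame, while a List is a general-purpose collection for ordinary Scala code.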

Re:

The only difference I can see in Row is that it has some schema.

As per my understanding, a Row can be constructed with or without a schema.
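A rough sketch of both cases follows, assuming a local SparkSession. The column names `id` and `greeting` and the app name are made up for illustration. A Row built with `Row(...)` carries no schema, while a Row obtained from a DataFrame does, which is what enables access by field name.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// In spark-shell, getOrCreate() returns the session that already exists.
val session = SparkSession.builder()
  .master("local[*]")
  .appName("row-schema-sketch") // hypothetical name
  .getOrCreate()

// A Row created directly has no schema attached.
val plain = Row(1, "hi")
val plainSchema = plain.schema // null for a schema-less Row

// Hypothetical schema used only for this example.
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("greeting", StringType, nullable = true)
))

// Rows that come back from a DataFrame carry the DataFrame's schema.
val df = session.createDataFrame(session.sparkContext.parallelize(Seq(plain)), schema)
val withSchema = df.head()

// With a schema, fields can be looked up by name as well as by index.
val greeting: String = withSchema.getAs[String]("greeting")
val idPosition: Int  = withSchema.fieldIndex("id")
```

Calling `fieldIndex` or the name-based `getAs` on the schema-less `plain` row would throw instead, since there are no field names to resolve.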

Amit