Calling `split` with a limit of `-1` is what you need. Compare the two scenarios below, with and without it.
import ss.implicits._
val rd = sc.textFile("path to your file")
.map(x => x.split("[|]",-1)).map(x => (x(0), x(1), x(2), x(3), x(4), x(5), x(6), x(7), x(8), x(9), x(10), x(11), x(12), x(13), x(14), x(15), x(16))) // `split` function with `-1`
rd.foreach(println)
Output:
(A,B,,,,,,,,,,C,D,,,,)
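Without the `-1` limit, `split` discards trailing empty strings, so the last 4 empty columns are dropped. The resulting array has only 13 elements (indices 0 to 12), and accessing `x(13)` throws an error.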
import ss.implicits._
val rd = sc.textFile("path to your file")
.map(x => x.split("[|]")).map(x => (x(0), x(1), x(2), x(3), x(4), x(5), x(6), x(7), x(8), x(9), x(10), x(11), x(12), x(13), x(14), x(15), x(16))) // `split` function without `-1`
rd.foreach(println)
java.lang.ArrayIndexOutOfBoundsException: 13
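
The same behaviour can be reproduced without Spark, since `split` here is just the JVM's `String.split`. The snippet below is a minimal sketch: the `SplitDemo` object and the sample line are made up for illustration (17 pipe-separated fields, the last 4 empty, matching the shape of the output above), not taken from your actual file.

```scala
object SplitDemo {
  // Hypothetical sample row: A, B, 9 empty fields, C, D, 4 empty fields (17 in total).
  val fields: Seq[String] =
    Seq("A", "B") ++ Seq.fill(9)("") ++ Seq("C", "D") ++ Seq.fill(4)("")
  val line: String = fields.mkString("|")

  // A negative limit keeps trailing empty strings.
  val withLimit: Array[String] = line.split("[|]", -1)

  // No limit (equivalent to limit 0) removes trailing empty strings.
  val withoutLimit: Array[String] = line.split("[|]")

  def main(args: Array[String]): Unit = {
    println(withLimit.length)    // 17
    println(withoutLimit.length) // 13
  }
}
```

With only 13 elements in the array, any index from 13 upward is out of bounds, which is why the second scenario fails at exactly `13`.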