I know I can do random splitting with the randomSplit method:
val splitData: Array[Dataset[Row]] =
  preparedData.randomSplit(Array(0.5, 0.3, 0.2))
Is there some 'nonRandomSplit' method that instead splits the data into consecutive parts?
I'm using Apache Spark 2.0.1. Thanks in advance.
UPD: The data order is important; I'm going to train my model on the rows with smaller IDs and test it on the rows with larger IDs, so I want to split the data into consecutive parts without shuffling.
For example:
my dataset      = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
desired weights = (0.8, 0.2)
resulting split = (0, 1, 2, 3, 4, 5, 6, 7), (8, 9)
The only solution I can think of is to use count and limit, but there is probably a better one.
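For reference, here is a minimal sketch of that count/limit idea, assuming spark is the active SparkSession and preparedData is a Dataset[Row] already sorted by the ID column. Spark 2.0.1 has no offset counterpart to limit, so the sketch numbers the rows with RDD.zipWithIndex to take the remainder:

val weights = Array(0.8, 0.2)
val cut = (preparedData.count() * weights(0)).toLong

// limit takes the first `cut` rows in the Dataset's current order
// (assumes the row count fits in an Int)
val train = preparedData.limit(cut.toInt)

// there is no offset, so number the rows and drop the head;
// zipWithIndex assigns consecutive indices in the RDD's order
val indexed = preparedData.rdd.zipWithIndex()
val testRdd = indexed.filter { case (_, i) => i >= cut }.map(_._1)
val test    = spark.createDataFrame(testRdd, preparedData.schema)

If the IDs are numeric, another option might be to pick a threshold with DataFrameStatFunctions.approxQuantile (available since Spark 2.0) and filter on the ID column directly, though that splits by ID value rather than by an exact row count.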