I have a text file (1.txt) in which the columns of each row are separated by a space.

Its content looks like this (the real file has many more lines):

1 2 3 4
a b c d
1 1 1 1
1 1 2 2

Expected output (only the rows whose four columns are all distinct):

1 2 3 4
a b c d

I can already get the correct output with the following code:

val lines = io.Source.fromFile("/root/1.txt").getLines()
lines.filterNot(_.split("\\s+").distinct.length < 4).foreach(println)

Is there a more efficient way to do this?

For example, using multithreading or Akka. Thanks.

mop

1 Answer

If you want to process the file concurrently, one option is to use parallel collections, so that a file with a large number of records is processed in parallel by multiple threads.

Note: make sure the heap is sized to hold a very large file, because all lines are loaded into memory before the parallel processing starts. One could also use the parallel iterators provided by a third-party library, as explained in this answer.

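// On Scala 2.13+, .par needs the scala-parallel-collections module and: import scala.collection.parallel.CollectionConverters._
val lines = scala.io.Source.fromFile("/root/1.txt").getLines().toVector.par
// Filter in parallel, then print sequentially so the rows keep their file order
lines.filterNot(_.split("\\s+").distinct.length < 4).seq.foreach(println)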
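
If adding the parallel collections module is not an option, a similar effect can be sketched with Futures from the standard library alone. This is only a rough sketch under a few assumptions: the names DistinctRows, keepRow, filterRows, filterFile and the chunkSize default are hypothetical, the rule is the same "at least 4 distinct columns" check used in the question, and the whole result is still held in memory. The lines are split into chunks, each chunk is filtered on the global thread pool, and the chunks are joined back in their original order.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.io.Source

object DistinctRows {
  // Same rule as in the question: keep a row only if it has
  // at least minDistinct different column values.
  def keepRow(line: String, minDistinct: Int = 4): Boolean =
    line.trim.split("\\s+").distinct.length >= minDistinct

  // Filter chunks of lines concurrently. chunkSize is an arbitrary
  // illustrative default, not a tuned value.
  def filterRows(lines: Iterator[String], chunkSize: Int = 10000): Vector[String] = {
    // toVector forces the lazy iterator, so one Future is started per chunk.
    val futures = lines
      .grouped(chunkSize)
      .map(chunk => Future(chunk.filter(l => keepRow(l))))
      .toVector
    // Future.sequence keeps the chunks in their original order.
    Await.result(Future.sequence(futures), Duration.Inf).flatten
  }

  // Read a file, filter it, and always close the underlying source.
  def filterFile(path: String): Vector[String] = {
    val source = Source.fromFile(path)
    try filterRows(source.getLines())
    finally source.close()
  }
}
```

Calling DistinctRows.filterFile("/root/1.txt").foreach(println) would then print the surviving rows in file order. Whether this is actually faster than the single-threaded version depends on the file: for a check this cheap, reading the file is likely to dominate, so it is worth measuring before committing to either approach.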
rogue-one