I have read some documents with textFile
, and did a flatMap
of the single words, adding some extra information for each word:
val col = sc.textFile(args.getOrElse("input","documents/*"))
.flatMap(_.split("\\s+").filter(_.nonEmpty))
val mapped = col.map(t => t + ": " + extraInformation())
I am currently saving this to text easily
mapped.saveAsTextFile(args.getOrElse("output", "results"))
But I cannot figure out how to save the map to a BigQuery schema. All examples I have seen create the initial Scollection from BigQuery and then save it to another table, so the initial collection is [TableRow]
instead of [String]
.
What is the correct approach here? Should I investigate how to convert my data to a kind of collection Big Query will accept? Or should I try to investigate further how to push this plain text straight into a table?