I have a Crunch DoFn that generates a PCollection. Currently I am writing the PCollection to a single Avro file, but I want to write it to multiple files.


PCollection<String> generatedResults = results.parallelDo(new AvroGeneratorDofn(count), Avros.specifics(String.class));
//generatedResults.write(To.avroFile(outputPath));
pipeline.write(generatedResults, new AvroFileTarget(outputPath), Target.WriteMode.APPEND);
Sneha

1 Answer

The same PCollection can be written to any number of targets by calling write once per target:

generatedResults.write(To.avroFile(outputPath));
generatedResults.write(new AvroFileTarget(outputPath), Target.WriteMode.APPEND);

See Apache Crunch - Getting Started:

Just as a single Pipeline instance can read data from multiple Sources, a Pipeline may also write multiple outputs for each PCollection.
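
For separate outputs, each write would normally point at its own path. The following is a minimal sketch of that idea, not a drop-in replacement for the code in the question: the class name MultiTargetExample, the TrimFn function (a hypothetical stand-in for AvroGeneratorDofn), and the three path arguments are assumptions made for illustration, and it uses Avros.strings() as the PType for a PCollection<String>.

```java
import org.apache.crunch.MapFn;
import org.apache.crunch.PCollection;
import org.apache.crunch.Pipeline;
import org.apache.crunch.PipelineResult;
import org.apache.crunch.Target;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.io.To;
import org.apache.crunch.types.avro.Avros;
import org.apache.hadoop.conf.Configuration;

public class MultiTargetExample {

    // Hypothetical stand-in for AvroGeneratorDofn: trims each input line.
    static class TrimFn extends MapFn<String, String> {
        @Override
        public String map(String line) {
            return line.trim();
        }
    }

    // Builds the pipeline, registers two Avro targets for the same
    // PCollection, runs it, and reports whether the run succeeded.
    static boolean run(String inputPath, String firstOutput, String secondOutput) {
        Pipeline pipeline = new MRPipeline(MultiTargetExample.class, new Configuration());

        PCollection<String> lines = pipeline.readTextFile(inputPath);
        PCollection<String> generated = lines.parallelDo(new TrimFn(), Avros.strings());

        // One write call per target; each target gets its own (assumed) path.
        generated.write(To.avroFile(firstOutput));
        generated.write(To.avroFile(secondOutput), Target.WriteMode.APPEND);

        // done() executes any outstanding writes and cleans up.
        PipelineResult result = pipeline.done();
        return result.succeeded();
    }

    public static void main(String[] args) {
        boolean ok = run(args[0], args[1], args[2]);
        System.exit(ok ? 0 : 1);
    }
}
```

Each target path ends up as a directory of part files, so the number of files inside a single target depends on how many tasks write to it, not on the number of write calls.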

sudeep