I am trying to split my pipeline into several smaller pipelines so that they execute faster. I am partitioning a PCollection of Google Cloud Storage blobs (PCollection<Blob>) so that I get a
PCollectionList<Blob> collectionList
From there, I would love to be able to do something like:
Pipeline p2 = Pipeline.create(collectionList.get(0))
    .apply(stuff)
    .apply(stuff);
Pipeline p3 = Pipeline.create(collectionList.get(1))
    .apply(stuff)
    .apply(stuff);
However, I haven't found any documentation about creating a new pipeline's initial PCollection from an already existing PCollection. I'd be very grateful if anyone could point me in the right direction. Thanks!