I have a schedule which runs my flow twice a day - at 0910 and 1520 BST.
This is spawning a massive number of DataFlow jobs - so far today just the second schedule (1520) has spawned 80 jobs:
$ gcloud dataflow jobs list
JOB_ID NAME TYPE CREATION_TIME STATE REGION
2018-07-29_12_17_06-14876588186269022154 project-name-513008-by-username Batch 2018-07-29 19:17:07 Running us-central1
2018-07-29_12_14_54-6436458673562317581 project-name-512986-by-username Batch 2018-07-29 19:14:55 Cancelled us-central1
2018-07-29_12_13_55-6167618802124600084 project-name-512985-by-username Batch 2018-07-29 19:13:57 Cancelled us-central1
...
(see PasteBin for the full list)
In the days after the DataPrep update last week, I had trouble accessing the run settings url for the flow. I suspect that there's a process as part of the run settings which walks back through the flow (I have 12 flows chained by reference datasets) and sanity checks it - it seems that my flow was just on the cusp of being complex enough to cause the page load to time out, and I had to cut out a couple of steps just to get to the run settings.
I wonder if each time this timed out, it somehow duplicated the schedule or something else in the process - but then again, the number of duplicated jobs is inconsistent.
I recently rebuilt this project after seeing some issues with sampling errors (in that the sample was corrupt, so I couldn't load the transformation UI, but also couldn't build a new sample). After a hefty attempt at resolving the issue, I took the chance to rebuild as a dedicated GCP project with structure improvements, etc. I didn't see this scheduling error before the rebuild.