0

I'm working on a project with many Beam pipelines written in Java that needs to be packaged as a jar file for execution from our job scheduler. I've attempted to use build profiles for creating a jar for each main but this seems messy and I've had issues with dependency conflicts (with beam-sdks-java-io-amazon-web-services when its not used its still looking for required region options). I'm also just looking for overall sustainable project structure advice for a growing Beam code base.

What are the best practices for packaging pipelines to be executed on a schedule? Should I package multiple pipelines together so that I can execute each pipeline using the pipeline name and pipeline options parameters, if so, how? (potentially using some sort of master runner main that executes pipelines based on input parameters) Or should each pipeline be its own Maven project (this requires many jars)? Thoughts?

pnadolny
  • 113
  • 7

1 Answers1

0

I don't think there's a recommended way of solving this. Each way has benefits and downsides (e.g. consider the effort of updating the pipelines).

I think the common jar solution is fine if it works for you. E.g., there are multiple Beam example pipelines in the same package, and you run them by specifying the main class. It is similar to what you are trying to achieve.

Whether you need a master main also depends on specifics of your project and environment. It may be sufficient to just use java -cp mainclass and get around without extra management code.

Anton
  • 2,431
  • 10
  • 20