If your job is a finite unit of work, then you should be using Livy's Batch abstraction and not Session. Sessions are for interactive work (e.g., Jupyter Notebook or Apache Zeppelin), where users submit some queries, evaluate the results, and submit some more. A Batch, on the other hand, most closely resembles what you'd generally submit using the spark-submit executable; it ends on its own when the job's tasks are done and cleans up after itself, so there's no need for a connection pool. That said, a connection pool makes little sense for a Session either, as each session has state (variables defined by past statements run in that session), and that state is not (and shouldn't be) shared.
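
To make the Batch workflow concrete, here is a minimal sketch of submitting a batch over Livy's REST API and polling it until it finishes. It assumes the `POST /batches` and `GET /batches/{id}/state` endpoints as I understand them from the Livy docs, a server on the default port 8998, and no authentication or CSRF header requirement. `LIVY_URL`, the helper names, and the jar path and class name in the usage note are all placeholders made up for illustration.

```python
import json
import time
import urllib.request

# Assumption: Livy is reachable here; 8998 is its default port.
LIVY_URL = "http://localhost:8998"

# States after which a batch will not change any further.
TERMINAL_STATES = {"success", "dead", "killed", "error"}


def build_batch_payload(file, class_name=None, args=None):
    """Build the JSON body for POST /batches (roughly what spark-submit takes)."""
    payload = {"file": file}
    if class_name:
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)
    return payload


def _request(method, path, body=None):
    """Send a JSON request to Livy and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        LIVY_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submit_batch(payload):
    """Create a batch and return its id."""
    return _request("POST", "/batches", payload)["id"]


def get_batch_state(batch_id):
    """Return the current state string of a batch."""
    return _request("GET", f"/batches/{batch_id}/state")["state"]


def wait_for_batch(batch_id, fetch_state=get_batch_state, poll_seconds=5.0):
    """Poll until the batch reaches a terminal state, then return that state."""
    while True:
        state = fetch_state(batch_id)
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
```

Usage would look something like `wait_for_batch(submit_batch(build_batch_payload("hdfs:///jobs/my-job.jar", "com.example.MyJob")))`. Note there is nothing to pool or keep alive here: each call is a plain HTTP request, and the batch goes away on its own once it reaches a terminal state.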