The bottleneck is always data travelling from and to the database.
So bring the functionality to where the data is rather than bringing the data to the functionality - always.
Have you actually explored all the functionality you have at your disposal with a DBMS actually complying with the ANSI 99 or even the ANSI 2003 standard of SQL?
You'll actually be surprised how little is left that is too complicated to do in SQL.
I have long since come to the point where I say:
If you can't do it in SQL, do it in C (or C++). If you can't do it in C, it's not worth doing. Search the net for "Real Programmers Don't Use Pascal" to see what I mean.
First, check the DBMS's documentation. Especially Vertica's, which is really detailled.
Then, create SQL functions. And - only as a last resort, code a User Defined Extension.
The best architecture you can build is simple.
The best code is the one you don't have to develop.
Ansering to your comment you posted in the meantime:
- Explore the Machine Learning Function library and documentation in Vertica - there is a ton of functionality that you don't have to develop and can just use:
https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/SQLReferenceManual/Functions/MachineLearning/_MLFunctions.htm?tocpath=SQL%20Reference%20Manual%7CSQL%20Functions%7CMachine%20Learning%20Functions%7C_____0
And for whatever might be missing- try User Defined Extensions - in Python, in your case:
https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/ExtendingVertica/UDx/DevelopingUDxs.htm?tocpath=Extending%20Vertica%7CDeveloping%20User-Defined%20Extensions%20(UDxs)%7C_____0
and:
https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/ExtendingVertica/Python/IntroductionPythonSDK.htm?tocpath=Extending%20Vertica%7CDeveloping%20User-Defined%20Extensions%20(UDxs)%7CDeveloping%20with%20the%20Python%20SDK%7C_____0