To add to the title: I now have a working workflow consisting of two steps.
1) I extract the HTML search result pages for every keyword given in an input.txt file, e.g.:
SAP;
Business Intelligence;
Talend saves those results and writes them as HTML to keywords_SAP.txt and keywords_Business Intelligence.txt. Attached is an image of the Talend job.
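To make the naming rule concrete, here is a minimal sketch in plain Java of how the keyword list maps to the output file names. It is not part of the Talend job; the class name KeywordFiles and the method fileNames are made up for illustration, and it assumes one keyword per line, each ending in a semicolon.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: it only mirrors the naming rule described above,
// it is not code taken from the Talend job.
class KeywordFiles {

    // Turns the content of input.txt (assumed: one keyword per line, each
    // ending in ';') into the file names the job writes.
    static List<String> fileNames(String inputTxt) {
        List<String> names = new ArrayList<>();
        for (String line : inputTxt.split("\\R")) {
            String keyword = line.trim();
            // Drop the trailing semicolon that terminates each keyword.
            if (keyword.endsWith(";")) {
                keyword = keyword.substring(0, keyword.length() - 1).trim();
            }
            // Skip blank lines.
            if (!keyword.isEmpty()) {
                names.add("keywords_" + keyword + ".txt");
            }
        }
        return names;
    }

    public static void main(String[] args) {
        fileNames("SAP;\nBusiness Intelligence;\n").forEach(System.out::println);
    }
}
```

Note that a keyword containing a space produces a file name containing a space, so anything that reads these files later has to cope with that.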
2) I use Java code to import these files one by one and parse the data out of the DOM structure using the jsoup library. The data is then written straight into a MySQL database.
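For reference, the "one by one" part of step 2 can be sketched with the standard library alone. This is a simplified illustration, not my actual code: the class name KeywordFileLoader and the method loadAll are hypothetical, the files are assumed to be UTF-8, and the jsoup parsing and the MySQL insert are only marked by a comment.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: collects every keywords_*.txt file in a directory
// and maps the keyword (taken from the file name) to the saved HTML.
class KeywordFileLoader {

    static Map<String, String> loadAll(Path dir) throws IOException {
        Map<String, String> htmlByKeyword = new TreeMap<>();
        // The glob is matched against the file name only.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "keywords_*.txt")) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                // Strip the "keywords_" prefix and the ".txt" suffix.
                String keyword = name.substring("keywords_".length(),
                        name.length() - ".txt".length());
                // Assumption: the HTML was saved as UTF-8.
                String html = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                htmlByKeyword.put(keyword, html);
            }
        }
        return htmlByKeyword;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, String> entry : loadAll(dir).entrySet()) {
            // This is where the jsoup parsing and the MySQL insert would happen.
            System.out.println(entry.getKey() + ": "
                    + entry.getValue().length() + " characters of HTML");
        }
    }
}
```

Having a single entry point like this would let the whole second step run from one command, which is what the server setup would need.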
Here is my problem: It all works fine for now, but the requirement is to completely automate the process in the future, so it can run on a server periodically.
Therefore I thought I would include my Java code in Talend, which is where I got stuck, because I wasn't able to import the MySQL connector and the jsoup.jar.
Where I need your help: either advise me how to connect my Java code to my existing Talend workflow, or maybe you can think of an easier solution that I'm just not seeing right now.
I have to add, I'm quite new to coding, and it was a big leap to come this far with parsing and writing into a DB. With your help throughout the process, I got more comfortable though. I hope you can help me solve this problem. Thank you in advance for your time spent.