
I want to write two scheduled jobs for my Ubuntu 14.04.4 server. The jobs need to be sequential.

The first job should unzip a .gz file (SQL Dump) and then import the table "myTable" into MySQL Database (localhost).

The second job (written with the Pentaho Data Integration tool) extracts data from the table "myTable", transforms it, and loads it into a new database.

I could have accomplished the first task in Pentaho PDI (Spoon), but it doesn't provide any function to unzip a .gz file. After some research I came across these posts:

http://forums.pentaho.com/showthread.php?82566-How-to-use-the-content-of-a-tar-gz-file-in-Kettle

How to uncompress and import a .tar.gz file in kettle?

From these I gathered that I should write the first job by hand, i.e. unzip the .gz file and then import the table "myTable" into the MySQL database.

My question is: how do I create a cron job that executes the two sequentially, i.e. the second job starts only after the first one has completed?

If there is a better alternative approach, please suggest it.

Danish Bin Sofwan

1 Answer


You can use the "Shell" entry in a PDI job. Put the unzip-and-import part of your work in the Shell entry and follow it sequentially with your transformation. A sample job looks like this:

[Image: sample PDI job with a Shell entry followed by a transformation]
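As a rough sketch of what the Shell entry could run: the script below decompresses the dump and pipes it straight into the `mysql` client. The function name `import_dump`, the `MYSQL_BIN` override, and the paths in the comments are placeholders I made up for illustration. It also assumes the MySQL credentials come from an option file such as `~/.my.cnf` rather than the command line.

```shell
#!/usr/bin/env bash
# Sketch only: names and paths are placeholders, not part of the original setup.
set -euo pipefail

# import_dump DUMP_GZ DATABASE
# Streams the decompressed SQL dump into the mysql client,
# so no intermediate .sql file is written to disk.
import_dump() {
    local dump_gz="$1" database="$2"
    # MYSQL_BIN (hypothetical override) selects the client binary; defaults to "mysql".
    # With pipefail set, a gunzip failure makes the whole pipeline fail.
    gunzip -c "$dump_gz" | "${MYSQL_BIN:-mysql}" "$database"
}

# Run only when called with exactly two arguments, e.g.:
#   ./import_dump.sh /path/to/dump.sql.gz mydb
if [ "$#" -eq 2 ]; then
    import_dump "$1" "$2"
fi
```

Because the script exits non-zero when either `gunzip` or `mysql` fails, the job can route a failed import away from the transformation instead of running it against stale data.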

Now you can schedule this complete job in cron or any other scheduler. There is no need for separate scripts.
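For the scheduling part, one possible wrapper is sketched below. It runs the job through PDI's `kitchen.sh` and appends the output to a log file. The function name `run_job`, the script name, and every path in the comments are hypothetical; adjust them to your installation.

```shell
#!/usr/bin/env bash
# Sketch only: all paths and names here are assumptions for illustration.
set -euo pipefail

# run_job KITCHEN JOB_FILE LOG_FILE
# Runs a PDI job file with Kitchen and appends stdout/stderr to LOG_FILE.
run_job() {
    local kitchen="$1" job_file="$2" log_file="$3"
    # Kitchen returns a non-zero exit status when the job fails,
    # and that status becomes the return value of this function.
    "$kitchen" -file="$job_file" -level=Basic >> "$log_file" 2>&1
}

# Run only when called with exactly three arguments.
if [ "$#" -eq 3 ]; then
    run_job "$1" "$2" "$3"
fi

# Example crontab entry (edit with `crontab -e`), daily at 02:00:
# 0 2 * * * /opt/etl/run_job.sh /opt/pentaho/data-integration/kitchen.sh /opt/etl/load_mytable.kjb /var/log/etl/load_mytable.log
```

Cron runs commands with a minimal environment, so absolute paths for the script, `kitchen.sh`, and the job file are the safer choice.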

Note: This works only in a Linux environment, which I assume you are using.

Hope this helps :)

Rishu Shrivastava