My question up front: are there any standard or common methods for implementing a software package that maintains and updates a MySQL database?
I'm an undergraduate research assistant and I've been tasked with creating a cron job that updates one of our university's in-house bioinformatics databases.
Instead of building one monolithic binary that does every aspect of the work, I've divided the problem into subtasks and written a few Python/C++ modules to handle them, as listed in the pipeline below:
- Query the remote database for a list of files updated within a given time interval (monthly / weekly / daily)
  - Module implemented in Python; the URLs of the updated files are written to stdout
- Read in the relative URLs of the updated files and download them to a local directory
  - Module implemented in Python
- Unzip each archive of new files
  - Implemented as a bash script
- Parse the files into CSV format
  - Module implemented in C++
- Run a MySQL query to insert the CSV files into the database
  - Just a bash script
I'm not sure how to go about combining these modules into one package that can be easily moved to another machine, for example if our current servers run out of space and the DB needs to be copied to another file system (this has already happened once).
My first thought is to create a bash script that pipes all of these modules together, given that they all operate on stdin/stdout anyway, but this seems like an odd way of doing things.
Alternatively, I could write my C++ code as a Python extension, package all of these scripts together, and write a single Python file that drives the whole update.
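To make the single-driver idea concrete, here is a minimal sketch of what I have in mind, assuming every stage stays a separate executable that reads stdin and writes stdout. The `run_pipeline` name and the stand-in stage commands are placeholders I made up for illustration; they are not my actual modules.

```python
import subprocess
import sys


def run_pipeline(stages, first_input=""):
    """Run each stage in order, feeding one stage's stdout to the next stage's stdin."""
    data = first_input
    for cmd in stages:
        # check=True raises CalledProcessError as soon as a stage exits non-zero,
        # so a failed download never reaches the database step.
        result = subprocess.run(
            cmd, input=data, capture_output=True, text=True, check=True
        )
        data = result.stdout
    return data


if __name__ == "__main__":
    # Stand-in stages so the sketch runs anywhere; the real list would name
    # the query, download, unzip, parse, and insert modules instead.
    py = sys.executable
    demo_stages = [
        [py, "-c", "print('a.zip'); print('b.zip')"],
        [py, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    ]
    print(run_pipeline(demo_stages), end="")
```

Unlike a shell pipe, this version runs the stages one after another and buffers each stage's full output in memory, which is probably fine for lists of URLs but may not be for large parsed files.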
Should I be using a package manager so that my code is easily installed on different machines? Or would a simple zip archive of the entire updater, with an included Makefile, suffice?
I'm extremely new to database management, and I don't have a ton of experience with distributing software, but I want to do a good job with this project. Thanks for the help in advance.