13

I have been looking for hours for a way to check a database into source control. My first idea was a program to calculate database diffs and ask all the developers to implement their changes as new diff scripts. Now I find that if I can dump a database into a file, I could check it in and use it as just another type of file.

The main conditions are:

  • Works for Oracle 9R2
  • Human-readable, so we can use diff to see the differences (.dmp files don't seem readable).
  • All tables in a batch. We have more than 200 tables.
  • It stores BOTH STRUCTURE AND DATA
  • It supports CLOB and RAW types.
  • It stores procedures, packages and their bodies, functions, tables, views, indexes, constraints, sequences and synonyms.
  • It can be turned into an executable script to rebuild the database into a clean machine.
  • Not limited to really small databases (supports at least 200,000 rows).

It is not easy. I have downloaded a lot of demos that fail in one way or another.

EDIT: I wouldn't mind alternative approaches, provided they allow us to check a working system against our release DATABASE STRUCTURE AND OBJECTS + DATA in batch mode.

By the way, our project has been developed for years. Some approaches are easy to implement when you make a fresh start but seem hard at this point.

EDIT: To understand the problem better, let's say that some users can sometimes change the config data in the production environment, or developers might create a new field or alter a view without notice in the release branch. I need to be aware of these changes or it will be complicated to merge them into production.

borjab
  • 11,149
  • 6
  • 71
  • 98
  • 3
    You want to include that actual data in the tables in your source control? If so, it seems like a bad idea to me... – Erich Kitzmueller Nov 11 '09 at 18:10
  • Agreed - in the general instance I'd've thought you fairly explicitly don't want to include data in the version controlled schema – Murph Nov 11 '09 at 18:26
  • 2
    It depends. There is often default data which is essential to getting a blank database usable. Things like a default user account or configuration information would likely be included. – Adam Hawkes Nov 11 '09 at 18:51
  • 1
    In the case of "seed" data, a script with a bunch of inserts would be preferable (for an SCM), rather than a dump of the data, IMO. – Ryan Emerle Nov 11 '09 at 18:53
  • 1
    If 200k rows is a large database to you, Access will work fine. Then write a batch script to copy the mdb every 10 minutes and zip it and write it to a large array of disks. – Stephanie Page Nov 11 '09 at 20:28
  • Human-readable and RAW datatypes? Look into DBMS_METADATA.GET_DDL to pull out all the schema objects (a sketch follows these comments). – Gary Myers Nov 11 '09 at 21:56
  • AmmoQ. I am not going to store all the data. I just need to keep the configuration data in source control so that I can compare a system config and check to see what the user might have changed since the last release. – borjab Nov 12 '09 at 12:10
  • Stephanie Page - 200k is not large, but config tables tend to be smaller. In fact, 10M-row tables are quite usual, but not for config data. I will edit the question to make it clearer. – borjab Nov 12 '09 at 12:13
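
To illustrate the DBMS_METADATA.GET_DDL suggestion from the comments, here is a minimal SQL*Plus sketch that spools human-readable DDL for the current schema. It is not a complete extraction script; the spool file name and the object types listed are just examples.

    SET LONG 100000
    SET PAGESIZE 0
    SET LINESIZE 200
    SET TRIMSPOOL ON
    -- GET_DDL returns the CREATE statement for each object as a CLOB
    SPOOL schema_ddl.sql
    SELECT DBMS_METADATA.GET_DDL(object_type, object_name)
      FROM user_objects
     WHERE object_type IN ('TABLE', 'SEQUENCE', 'VIEW');
    SPOOL OFF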

9 Answers

14

So many people try to do this sort of thing (diff schemas). My opinion is:

  • Source code goes into a version control tool (Subversion, CVS, Git, Perforce ...). Treat it as if it were Java or C code; it's really no different. You should have an install process that checks it out and applies it to the database.
  • DDL IS SOURCE CODE. It goes into the version control tool too.
  • Data is a grey area - lookup tables perhaps should be in a version control tool; application-generated data certainly should not.

The way I do things these days is to create migration scripts similar to Ruby on Rails migrations. Put your DDL into scripts and run them to move the database between versions. Group changes for a release into a single file or set of files. Then you have a script that moves your application from version x to version y.
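
A hypothetical example of what one of those migration scripts might look like (the table, view and schema_version bookkeeping names are invented for illustration):

    -- migrations/0042_add_customer_status.sql (hypothetical file name)
    ALTER TABLE customer ADD (status VARCHAR2(20) DEFAULT 'ACTIVE');

    -- Dependent code is replaced as part of the same migration
    CREATE OR REPLACE VIEW active_customer_v AS
        SELECT customer_id, name
          FROM customer
         WHERE status = 'ACTIVE';

    -- Record that this migration has been applied
    INSERT INTO schema_version (version, applied_on) VALUES (42, SYSDATE);
    COMMIT;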

One thing I never ever do anymore (and I used to do it until I learned better) is use any GUI tools to create database objects in my development environment. Write the DDL scripts from day 1 - you will need them anyway to promote the code to test, production etc. I have seen so many people use the GUIs to create all the objects, and come release time there is a scramble to produce scripts to create/migrate the schema correctly - scripts that are often untested and fail!

Everyone will have their own preference for how to do this, but I have seen a lot of it done badly over the years, which is what formed the opinions above.

Stephen ODonnell
  • 4,441
  • 17
  • 19
  • 2
    +1 ... similar to my experience, and I'll add that GRANTS should be part of source control. Grants are often added/changed in testing environments and never captured for the production migration. – dpbradley Nov 12 '09 at 12:33
  • +1 for scripting everything. I prefer a separate file per object. I admit that might get a bit too gnarly if the database has hundreds of tables, but the mechanics of managing (say) one file per schema are even gnarlier. – APC Nov 12 '09 at 15:34
  • I like the scripting option. It is just that I am looking for a tool to generate the incremental script. – borjab Nov 12 '09 at 16:36
  • 1
    If I were you, I would draw a line in the sand now. Generate one big dirty script that can create your DB as it is now, and then put a new process in place for moving forward. I had to do a similar thing a while back, but now > 40 developers are managing migration files, but not all of them are disciplined enough to create the DDL scripts up front ... they will learn, someday ;-) – Stephen ODonnell Nov 12 '09 at 22:40
2

Oracle SQL Developer has a "Database Export" function. It can produce a single file which contains all DDL and data.

Adam Hawkes
  • 7,218
  • 30
  • 57
2

I use PL/SQL Developer with a VCS plug-in that integrates with Team Foundation Server, but it only supports database objects, not the data itself, which is usually left out of source control anyway.

Here is the link: http://www.allroundautomations.com/bodyplsqldev.html

Hector Minaya
  • 1,695
  • 3
  • 25
  • 45
1

It may not be as slick as detecting diffs, but we use a simple Ant build file. In our current CVS branch, we have the "base" database code broken out into DDL for tables, triggers and the like. We also have a delta folder, broken out in the same manner. Starting from scratch, you can run "base" + "delta" and get the current state of the database. When you go to production, you simply run the "delta" build and you are done. This model doesn't work terribly well if you have a huge schema that you are changing rapidly (at least for database objects like tables and indexes; for packages, procedures, functions and triggers it works well). Here is a sample Ant task:

    <target name="buildTables" description="Build Tables with primary keys and sequences">
        <sql driver="${conn.jdbc.driver}" password="${conn.user.password}"
             url="${conn.jdbc.url}" userid="${conn.user.name}"
             classpath="${app.base}/lib/${jdbc.jar.name}">
            <fileset dir="${db.dir}/ddl">
                <include name="*.sql"/>
            </fileset>
        </sql>
    </target>
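
For what it's worth, a file dropped into that delta folder might look something like this (the table and column names are made up for the example; PL/SQL objects would normally go into a separate fileset run with the <sql> task's delimiter="/" and delimitertype="row" settings so the slash terminator is honoured):

    -- delta/2009_11_add_last_login.sql (hypothetical file name)
    ALTER TABLE app_user ADD (last_login DATE);

    CREATE INDEX app_user_last_login_ix ON app_user (last_login);

    COMMENT ON COLUMN app_user.last_login IS 'Set by the application at logon';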
Nick
  • 2,524
  • 17
  • 25
1

I think this is a case of,

  • You're trying to solve a problem
  • You've come up with a solution
  • You don't know how to implement the solution
  • so now you're asking for help on how to implement the solution

The better way to get help,

  • Tell us what the problem is
  • ask for ideas for solving the problem
  • pick the best solution

I can't tell what problem you're trying to solve. Sometimes it's obvious from the question; this one certainly isn't. But I can tell you that this 'solution' will turn into its own maintenance nightmare. If you think developing the database and the app that uses it is hard now, this idea of versioning the entire database in a human-readable form is nothing short of insane.

Stephanie Page
  • 3,875
  • 1
  • 18
  • 22
  • But it seems that most people here agree that scripts are the best way of keeping track of changes. It does not need to be just one big monster file. There are some quite decent tools that can automatically extract the structure and objects, and they even generate the queries to create them or diff against another database. I'm just having problems with the data. Anyway, a binary format (like the .dmp) does not seem better. Not having the database under source control has led to hard problems. – borjab Nov 13 '09 at 08:21
1

Have you tried Oracle's Workspace Manager? Not that I have any experience with it in a production database, but I found some toy experiments with it promising.
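
For anyone curious, version-enabling a table with Workspace Manager looks roughly like the sketch below (the table and workspace names are invented; this only shows the general shape of the API):

    -- Version-enable a config table; Oracle keeps the history behind a view
    EXEC DBMS_WM.EnableVersioning('APP_CONFIG', 'VIEW_WO_OVERWRITE');

    -- Make changes in a child workspace, then merge them back into LIVE
    EXEC DBMS_WM.CreateWorkspace('RELEASE_2_CONFIG');
    EXEC DBMS_WM.GotoWorkspace('RELEASE_2_CONFIG');
    -- ... edit APP_CONFIG here ...
    EXEC DBMS_WM.GotoWorkspace('LIVE');
    EXEC DBMS_WM.MergeWorkspace('RELEASE_2_CONFIG');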

René Nyffenegger
  • 39,402
  • 33
  • 158
  • 293
  • That's great for versioning data in an existing database. Not so much for the DDL changes, and not so much for recreating a fresh DB. – Stephanie Page Nov 11 '09 at 21:22
1

Don't try to diff the data. Just write a trigger to store whatever-you-want-to-get when the data is changed.
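
A minimal sketch of such a trigger, assuming a made-up CONFIG table and a CONFIG_AUDIT table created alongside it:

    CREATE OR REPLACE TRIGGER config_audit_trg
        AFTER INSERT OR UPDATE OR DELETE ON config
        FOR EACH ROW
    DECLARE
        v_action CHAR(1);
    BEGIN
        -- Record who changed what and when; extend with old/new column values as needed
        IF INSERTING THEN
            v_action := 'I';
        ELSIF UPDATING THEN
            v_action := 'U';
        ELSE
            v_action := 'D';
        END IF;
        INSERT INTO config_audit (changed_on, changed_by, action, config_key)
        VALUES (SYSDATE, USER, v_action, NVL(:NEW.config_key, :OLD.config_key));
    END;
    /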

J-16 SDiZ
  • 26,473
  • 4
  • 65
  • 84
1

Expensive though it may be, a tool like TOAD for Oracle can be ideal for solving this sort of problem.

That said, my preferred solution is to start with all of the DDL (including Stored Procedure definitions) as text, managed under version control, and write scripts that will create a functioning database from source. If someone wants to modify the schema, they must, must, must commit those changes to the repository, not just modify the database directly. No exceptions! That way, if you need to build scripts that reflect updates between versions, it's a matter of taking all of the committed changes, and then adding whatever DML you need to massage any existing data to meet the changes (adding default values for new columns for existing rows, etc.) With all of the DDL (and prepopulated data) as text, collecting differences is as simple as diffing two source trees.
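
For example, the "massage any existing data" step for a new mandatory column might look like this in an upgrade script (the table and column names are illustrative only):

    -- upgrade_1_4_to_1_5.sql (hypothetical file name)
    ALTER TABLE orders ADD (currency_code VARCHAR2(3));

    -- Backfill existing rows before tightening the constraint
    UPDATE orders SET currency_code = 'EUR' WHERE currency_code IS NULL;
    COMMIT;

    ALTER TABLE orders MODIFY (currency_code NOT NULL);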

At my last job, I had NAnt scripts that would restore test databases, run all of the upgrade scripts that were needed, based upon the version of the database, and then dump the end result to DDL and DML. I would do the same for an empty database (to create one from scratch) and then compare the results. If the two were significantly different (the dump program wasn't perfect) I could tell immediately what changes needed to be made to the update / creation DDL and DML. While I did use database comparison tools like TOAD, they weren't as useful as hand-written SQL when I needed to produce general scripts for massaging data. (Machine-generated code can be remarkably brittle.)

Craig Trader
  • 15,507
  • 6
  • 37
  • 55
  • I had downloaded the latest TOAD demo. I am afraid that it does not allow you to diff data for all the tables in one command; 200 tables means repeating the operation manually 200 times. Thank you for the answer anyway. – borjab Nov 20 '09 at 08:14
1

Try RedGate's Source Control for Oracle. I've never tried the Oracle version, but the MSSQL version is really great.

Aram Paronikyan
  • 1,598
  • 17
  • 22