13

I have been looking for hours for a way to check a database into source control. My first idea was a program to calculate database diffs and ask all the developers to implement their changes as new diff scripts. Now I find that if I can dump a database into a file, I could check it in and use it as just another type of file.

The main conditions are:

  • Works for Oracle 9R2
  • Human-readable, so we can use diff to see the differences (.dmp files don't seem readable).
  • All tables in a batch. We have more than 200 tables.
  • It stores BOTH STRUCTURE AND DATA
  • It supports CLOB and RAW types.
  • It stores procedures, packages and their bodies, functions, tables, views, indexes, constraints, sequences and synonyms.
  • It can be turned into an executable script to rebuild the database into a clean machine.
  • Not limited to really small databases (supports at least 200,000 rows).

It is not easy. I have downloaded a lot of demos that fail in one way or another.

EDIT: I wouldn't mind alternative approaches, provided they allow us to check a working system against our release DATABASE STRUCTURE AND OBJECTS + DATA in batch mode.

By the way, our project has been developed for years. Some approaches are easy to implement when you make a fresh start but seem hard at this point.

EDIT: To understand the problem better, let's say that some users can sometimes change the config data in the production environment, or developers might create a new field or alter a view without notice in the release branch. I need to be aware of these changes or it will be complicated to merge them into production.

borjab
  • 11,149
  • 6
  • 71
  • 98
  • 3
    You want to include that actual data in the tables in your source control? If so, it seems like a bad idea to me... – Erich Kitzmueller Nov 11 '09 at 18:10
  • Agreed - in the general instance I'd've thought you fairly explicitly don't want to include data in the version controlled schema – Murph Nov 11 '09 at 18:26
  • 2
    It depends. There is often default data which is essential to getting a blank database usable. Things like a default user account or configuration information would likely be included. – Adam Hawkes Nov 11 '09 at 18:51
  • 1
    In the case of "seed" data, a script with a bunch of inserts would be preferable (for an SCM), rather than a dump of the data, IMO. – Ryan Emerle Nov 11 '09 at 18:53
  • 1
    If 200k rows is a large database to you, Access will work fine. Then write a batch script to copy the mdb every 10 minutes and zip it and write it to a large array of disks. – Stephanie Page Nov 11 '09 at 20:28
  • Human-readable and RAW datatypes? Look into DBMS_METADATA.GET_DDL to pull out all the schema objects (a sketch follows these comments). – Gary Myers Nov 11 '09 at 21:56
  • AmmoQ. I am not going to store all the data. I just need to keep the configuration data in source control so that I can compare a system config and check to see what the user might have changed since the last release. – borjab Nov 12 '09 at 12:10
  • Stephanie Page - 200k is not large, but config tables tend to be smaller. In fact, 10M-row tables are quite usual, but not for config data. I will edit the question to make it clearer. – borjab Nov 12 '09 at 12:13
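
To illustrate the DBMS_METADATA.GET_DDL suggestion from the comments, here is a minimal SQL*Plus sketch that spools human-readable DDL for the current schema. It is not a complete extraction script; the spool file name and the object types listed are just examples.

    SET LONG 100000
    SET PAGESIZE 0
    SET LINESIZE 200
    SET TRIMSPOOL ON
    -- GET_DDL returns the CREATE statement for each object as a CLOB
    SPOOL schema_ddl.sql
    SELECT DBMS_METADATA.GET_DDL(object_type, object_name)
      FROM user_objects
     WHERE object_type IN ('TABLE', 'SEQUENCE', 'VIEW');
    SPOOL OFF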

9 Answers

14

So many people try to do this sort of thing (diff schemas). My opinion is:

  • Source code goes into a version control tool (Subversion, CVS, Git, Perforce ...). Treat it as if it were Java or C code; it's really no different. You should have an install process that checks it out and applies it to the database.
  • DDL IS SOURCE CODE. It goes into the version control tool too.
  • Data is a grey area - lookup tables perhaps should be in a version control tool; application-generated data certainly should not.

The way I do things these days is to create migration scripts similar to Ruby on Rails migrations. Put your DDL into scripts and run them to move the database between versions. Group changes for a release into a single file or set of files. Then you have a script that moves your application from version x to version y.
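
A hypothetical example of what one of those migration scripts might look like (the table, view and schema_version bookkeeping names are invented for illustration):

    -- migrations/0042_add_customer_status.sql (hypothetical file name)
    ALTER TABLE customer ADD (status VARCHAR2(20) DEFAULT 'ACTIVE');

    -- Dependent code is replaced as part of the same migration
    CREATE OR REPLACE VIEW active_customer_v AS
        SELECT customer_id, name
          FROM customer
         WHERE status = 'ACTIVE';

    -- Record that this migration has been applied
    INSERT INTO schema_version (version, applied_on) VALUES (42, SYSDATE);
    COMMIT;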

One thing I never ever do anymore (and I used to do it until I learned better) is use any GUI tools to create database objects in my development environment. Write the DDL scripts from day 1 - you will need them anyway to promote the code to test, production etc. I have seen so many people use the GUIs to create all the objects, and come release time there is a scramble to produce scripts to create/migrate the schema correctly - scripts that are often untested and fail!

Everyone will have their own preference for how to do this, but I have seen a lot of it done badly over the years, which is what formed the opinions above.

Stephen ODonnell
  • 4,441
  • 17
  • 19
  • 2
    +1 ... similar to my experience, and I'll add that GRANTS should be part of source control. Grants are often added/changed in testing environments and never captured for the production migration. – dpbradley Nov 12 '09 at 12:33
  • +1 for scripting everything. I prefer a separate file per object. I admit that might get a bit too gnarly if the database has hundreds of tables, but the mechanics of managing (say) one file per schema are even gnarlier. – APC Nov 12 '09 at 15:34
  • I like the scripting option. It is just that I am looking for a tool to generate the incremental script. – borjab Nov 12 '09 at 16:36
  • 1
    If I were you, I would draw a line in the sand now. Generate one big dirty script that can create your DB as it is now, and then put a new process in place for moving forward. I had to do a similar thing a while back, but now > 40 developers are managing migration files, but not all of them are disciplined enough to create the DDL scripts up front ... they will learn, someday ;-) – Stephen ODonnell Nov 12 '09 at 22:40
2

Oracle SQL Developer has a "Database Export" function. It can produce a single file which contains all DDL and data.

Adam Hawkes
  • 7,218
  • 30
  • 57
2

I use PL/SQL Developer with a VCS plug-in that integrates with Team Foundation Server, but it only supports database objects, not the data itself, which is usually left out of source control anyway.

Here is the link: http://www.allroundautomations.com/bodyplsqldev.html

Hector Minaya
  • 1,695
  • 3
  • 25
  • 45
1

It may not be as slick as detecting diffs, but we use a simple Ant build file. In our current CVS branch, we have the "base" database code broken out into DDL for tables, triggers and the like. We also have a delta folder, broken out in the same manner. Starting from scratch, you can run "base" + "delta" and get the current state of the database. When you go to production, you simply run the "delta" build and you are done. This model doesn't work terribly well if you have a huge schema that you are changing rapidly (at least for database objects like tables and indexes; for packages, procedures, functions and triggers it works well). Here is a sample Ant task:

    <target name="buildTables" description="Build Tables with primary keys and sequences">
        <sql driver="${conn.jdbc.driver}" password="${conn.user.password}"
             url="${conn.jdbc.url}" userid="${conn.user.name}"
             classpath="${app.base}/lib/${jdbc.jar.name}">
            <fileset dir="${db.dir}/ddl">
                <include name="*.sql"/>
            </fileset>
        </sql>
    </target>
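
For what it's worth, a file dropped into that delta folder might look something like this (the table and column names are made up for the example; PL/SQL objects would normally go into a separate fileset run with the <sql> task's delimiter="/" and delimitertype="row" settings so the slash terminator is honoured):

    -- delta/2009_11_add_last_login.sql (hypothetical file name)
    ALTER TABLE app_user ADD (last_login DATE);

    CREATE INDEX app_user_last_login_ix ON app_user (last_login);

    COMMENT ON COLUMN app_user.last_login IS 'Set by the application at logon';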
Nick
  • 2,524
  • 17
  • 25
1

I think this is a case of,

  • You're trying to solve a problem
  • You've come up with a solution
  • You don't know how to implement the solution
  • so now you're asking for help on how to implement the solution

The better way to get help,

  • Tell us what the problem is
  • ask for ideas for solving the problem
  • pick the best solution

I can't tell what problem you're trying to solve. Sometimes it's obvious from the question; this one certainly isn't. But I can tell you that this 'solution' will turn into its own maintenance nightmare. If you think developing the database and the app that uses it is hard now, this idea of versioning the entire database in a human-readable form is nothing short of insane.

Stephanie Page
  • 3,875
  • 1
  • 18
  • 22
  • But it seems that most people here agree that scripts are the best way of keeping track of changes. It does not need to be just one big monster file. There are some quite decent tools that can automatically extract the structure and objects, and they even generate the queries to create them or diff against another database. I'm just having problems with the data. Anyway, a binary format (like the .dmp) does not seem better. Not having the database under source control has led to hard problems. – borjab Nov 13 '09 at 08:21
1

Have you tried Oracle's Workspace Manager? Not that I have any experience with it in a production database, but I found some toy experiments with it promising.
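
For anyone curious, version-enabling a table with Workspace Manager looks roughly like the sketch below (the table and workspace names are invented; this only shows the general shape of the API):

    -- Version-enable a config table; Oracle keeps the history behind a view
    EXEC DBMS_WM.EnableVersioning('APP_CONFIG', 'VIEW_WO_OVERWRITE');

    -- Make changes in a child workspace, then merge them back into LIVE
    EXEC DBMS_WM.CreateWorkspace('RELEASE_2_CONFIG');
    EXEC DBMS_WM.GotoWorkspace('RELEASE_2_CONFIG');
    -- ... edit APP_CONFIG here ...
    EXEC DBMS_WM.GotoWorkspace('LIVE');
    EXEC DBMS_WM.MergeWorkspace('RELEASE_2_CONFIG');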

René Nyffenegger
  • 39,402
  • 33
  • 158
  • 293
  • That's great for versioning data in an existing database. Not so much for the DDL changes, and not so much for recreating a fresh DB. – Stephanie Page Nov 11 '09 at 21:22
1

Don't try to diff the data. Just write a trigger to store whatever-you-want-to-get when the data is changed.
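
A minimal sketch of such a trigger, assuming a made-up CONFIG table and a CONFIG_AUDIT table created alongside it:

    CREATE OR REPLACE TRIGGER config_audit_trg
        AFTER INSERT OR UPDATE OR DELETE ON config
        FOR EACH ROW
    DECLARE
        v_action CHAR(1);
    BEGIN
        -- Record who changed what and when; extend with old/new column values as needed
        IF INSERTING THEN
            v_action := 'I';
        ELSIF UPDATING THEN
            v_action := 'U';
        ELSE
            v_action := 'D';
        END IF;
        INSERT INTO config_audit (changed_on, changed_by, action, config_key)
        VALUES (SYSDATE, USER, v_action, NVL(:NEW.config_key, :OLD.config_key));
    END;
    /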

J-16 SDiZ
  • 26,473
  • 4
  • 65
  • 84
1

Expensive though it may be, a tool like TOAD for Oracle can be ideal for solving this sort of problem.

That said, my preferred solution is to start with all of the DDL (including Stored Procedure definitions) as text, managed under version control, and write scripts that will create a functioning database from source. If someone wants to modify the schema, they must, must, must commit those changes to the repository, not just modify the database directly. No exceptions! That way, if you need to build scripts that reflect updates between versions, it's a matter of taking all of the committed changes, and then adding whatever DML you need to massage any existing data to meet the changes (adding default values for new columns for existing rows, etc.) With all of the DDL (and prepopulated data) as text, collecting differences is as simple as diffing two source trees.
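
For example, the "massage any existing data" step for a new mandatory column might look like this in an upgrade script (the table and column names are illustrative only):

    -- upgrade_1_4_to_1_5.sql (hypothetical file name)
    ALTER TABLE orders ADD (currency_code VARCHAR2(3));

    -- Backfill existing rows before tightening the constraint
    UPDATE orders SET currency_code = 'EUR' WHERE currency_code IS NULL;
    COMMIT;

    ALTER TABLE orders MODIFY (currency_code NOT NULL);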

At my last job, I had NAnt scripts that would restore test databases, run all of the upgrade scripts that were needed, based upon the version of the database, and then dump the end result to DDL and DML. I would do the same for an empty database (to create one from scratch) and then compare the results. If the two were significantly different (the dump program wasn't perfect) I could tell immediately what changes needed to be made to the update / creation DDL and DML. While I did use database comparison tools like TOAD, they weren't as useful as hand-written SQL when I needed to produce general scripts for massaging data. (Machine-generated code can be remarkably brittle.)

Craig Trader
  • 15,507
  • 6
  • 37
  • 55
  • I had downloaded the latest TOAD demo. I am afraid that it does not allow you to diff data for all the tables in one command; 200 tables means repeating the operation manually 200 times. Thank you for the answer anyway. – borjab Nov 20 '09 at 08:14
1

Try RedGate's Source Control for Oracle. I've never tried the Oracle version, but the MSSQL version is really great.

Aram Paronikyan
  • 1,598
  • 17
  • 22