
I'm trying to cull the data in a list of tables (~30) based on a primary key.

My approach has been to:

1. Create an intermediate table & load it with the desired data for each table
2. Truncate the original table
3. Insert the data from the intermediate table back into the original table

Here's the code I'm using thus far:

declare @table nvarchar(max)

-- declaration of the tab cursor (over the list of table names) omitted here

open tab
fetch next from tab into @table
while (@@FETCH_STATUS = 0)
begin
    print @table
    exec ('select * into ' + @table + '_intermediate from ' + @table + ' where P_ID in (select P_ID from pc_table)')
    exec ('truncate table ' + @table)
    exec ('insert into ' + @table + ' select * from ' + @table + '_intermediate')
    exec ('drop table ' + @table + '_intermediate')
    fetch next from tab into @table
end
close tab
deallocate tab

I'm running into an error:

Cannot insert an explicit value into a timestamp column. 
Use INSERT with a column list to exclude the timestamp column, 
or insert a DEFAULT into the timestamp column.

So, that error is telling me that I can't insert anything into a timestamp column.

To avoid inserting into the timestamp column, I need to avoid selecting it, which means I can't use select *.

Is there a simple way of selecting all columns except those of type timestamp, or will I need to go into INFORMATION_SCHEMA and build a dynamic select statement for each table?

(or the implicit question, is there a better way of doing what I'm trying to do?)

Thanks

Rolan

3 Answers


The short answer is that you need to put null in the select list wherever a timestamp column appears.

I made this little script to build the list of columns, so that I can put that list into the DML statement:

         declare @sel_statement nvarchar(max) = ''
         declare @col nvarchar(100) = ''
         declare @dat_type nvarchar(30)

         -- walk the columns of the current table (@table fetched from the tab cursor)
         declare cols cursor for
         select COLUMN_NAME, DATA_TYPE
         from INFORMATION_SCHEMA.COLUMNS
         where TABLE_NAME = @table

         open cols

         fetch next from cols into @col, @dat_type
         while (@@FETCH_STATUS = 0)
         begin
               if @dat_type = 'timestamp'
                     set @sel_statement += 'null'          -- placeholder for the timestamp column
               else
                     set @sel_statement += quotename(@col) -- quote the name in case of odd characters
               fetch next from cols into @col, @dat_type
               if @@FETCH_STATUS = 0
                     set @sel_statement += ','             -- comma between columns, none after the last
         end
         close cols
         deallocate cols

It's not the prettiest thing ever, but it worked.

Hopefully this can give someone else a hand if they run into this problem.
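For completeness, here's a sketch of how the generated list would plug into the re-insert step in the original loop (assuming @table still holds the table name from the outer cursor):

```sql
-- null in the select list makes SQL Server generate a fresh timestamp value
exec ('insert into ' + @table +
      ' select ' + @sel_statement +
      ' from ' + @table + '_intermediate')
```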

Rolan

If it is millions of rows, not billions, then a simple

DELETE from TABLE where P_ID not in (select P_ID from pc_table)

(maybe in batches) may be acceptable. Drop all indexes (except the primary key on P_ID) and constraints first, delete the rows, then recreate the indexes. Even better: don't drop the indexes, but disable them and enable them again afterwards with ALTER INDEX ... REBUILD.
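A batched delete could be sketched like this (the batch size of 10000 and the table name MyTable are placeholders):

```sql
-- delete unwanted rows in chunks so each transaction (and its log usage) stays small
while 1 = 1
begin
    delete top (10000) from MyTable
    where P_ID not in (select P_ID from pc_table)

    if @@ROWCOUNT = 0 break  -- nothing left to delete
end
```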

One more thing to consider: if you do use an intermediate table, then after the re-INSERT all values in a timestamp column will be different. If you don't care about preserving the values in this column, you can simply drop the column before the processing and add it back after all is done.
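Dropping and re-adding the column might look like this (MyTable and RowVer are hypothetical names; rowversion is the current name for the timestamp type):

```sql
alter table MyTable drop column RowVer      -- hypothetical timestamp/rowversion column
-- ... cull and reload the table here ...
alter table MyTable add RowVer rowversion   -- new values are generated automatically
```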

If performance matters, you should disable constraints and indexes on the target tables in any method you choose.
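For example (IX_MyTable_Name and MyTable are placeholder names; note that disabling a clustered index makes the whole table inaccessible, so do this for nonclustered indexes only):

```sql
alter index IX_MyTable_Name on MyTable disable   -- index is unusable until rebuilt
-- ... bulk delete / reload here ...
alter index IX_MyTable_Name on MyTable rebuild   -- rebuilding re-enables the index
```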

Which brings us to another method:

SELECT * INTO intermediate_table ...

works with timestamp columns fine.

It is

INSERT INTO final_table SELECT * FROM intermediate_table ...

that doesn't work with the timestamp columns.

So, instead of TRUNCATE on final_table, you can DROP final_table and do SELECT * INTO final_table ... a second time as well.

Thus you would preserve the values of timestamp column as well. Of course, you would have to recreate all constraints and indexes of the original table if you completely DROP it.
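Inside the original cursor loop, that variant would look roughly like this (a sketch; recreating the constraints and indexes is left out):

```sql
exec ('select * into ' + @table + '_intermediate from ' + @table +
      ' where P_ID in (select P_ID from pc_table)')
exec ('drop table ' + @table)
-- SELECT INTO recreates the table and preserves the timestamp values
exec ('select * into ' + @table + ' from ' + @table + '_intermediate')
exec ('drop table ' + @table + '_intermediate')
-- recreate constraints and indexes on the rebuilt table here
```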

Vladimir Baranov

How about

delete from TABLE where P_ID not in( select P_ID from pc_table )

?

a2800276
  • That would work, but when dealing with millions of rows (at least from what I've read) it's faster to truncate the table, and it doesn't fill the log files with each deletion... But it is probably still faster than copying the data and then reinserting it. – Rolan Jan 26 '15 at 18:51
  • If it's really millions of rows and the performance matters, I'm afraid you're stuck with typing out the column names. There's no select *-except :) – a2800276 Jan 26 '15 at 18:54
  • That's correct. PRINT these statements, build one huge script, replace * with a column list where applicable, and run that script. I'd suggest using a transaction for each batch to keep your data safe in case of an error. – Alsin Jan 26 '15 at 19:02