26

I have a legacy table with about 100 columns (90% nullable). In those 90 columns I want to remove all empty strings and set them to null. I know I can:

update table set column = NULL where column = '';
update table set column2 = NULL where column2 = '';

But that is tedious and error prone. There has to be a way to do this on the whole table?

Ja͢ck
  • 170,779
  • 38
  • 263
  • 309
Kyle West
  • 8,934
  • 13
  • 65
  • 97
  • 1
    Is this a one-shot affair or will this need to be kept up? – Brian Hooper Jul 13 '10 at 14:57
  • one-shot, would prob. run it some other time too though. – Kyle West Jul 13 '10 at 15:40
  • I was going to suggest that, if there is any danger of zero length strings finding their way into the table again, that you could create a view in which the zero length strings are converted to NULL values. That way your applications would never have to worry about them. – Brian Hooper Jul 13 '10 at 16:06

7 Answers7

49
UPDATE
    TableName
SET
    column01 = CASE column01 WHEN '' THEN NULL ELSE column01 END,
    column02 = CASE column02 WHEN '' THEN NULL ELSE column02 END,
    column03 = CASE column03 WHEN '' THEN NULL ELSE column03 END,
    ...,
    column99 = CASE column99 WHEN '' THEN NULL ELSE column99 END

This is still doing it manually, but is slightly less painful than what you have because it doesn't require you to send a query for each and every column. Unless you want to go to the trouble of scripting it, you will have to put up with a certain amount of pain when doing something like this.

Edit: Added the ENDs

Hammerite
  • 21,755
  • 6
  • 70
  • 91
  • @OMG: why? Presumably, you are referring to the ELSE columnXX END column name, but why bother with TRIM? – Jonathan Leffler Jul 13 '10 at 16:16
  • @Jonathan Leffler: This is what I meant: `CASE TRIM(column01) WHEN ''...` I should have been more clear that I meant the comparison, sorry. – OMG Ponies Jul 13 '10 at 16:18
  • @OMG: OK - I now know what you meant, but I reiterate my question: why? What is the possible benefit of trimming compared to simple comparison with a blank string? The DBMS is likely to end up doing blank padding of the trimmed string because the trimmed value might be as long as the untrimmed value, so you are just making it do vacuous work which it may or may not spot is vacuous. – Jonathan Leffler Jul 13 '10 at 22:55
  • 1
    @Jonathan Leffler: Because "\s\s\s" != "" unless you run TRIM/etc over it. Collation can affect that comparison. – OMG Ponies Jul 13 '10 at 22:59
  • @OMG: Where does "\s\s\s" == ""? In anything resembling standard SQL, a string containing backslashes cannot match the empty string. I suppose the question is tagged MySQL - but I find the concept that backslash-ess repeated 3 times is an empty string ... strange. Very strange. – Jonathan Leffler Jul 13 '10 at 23:32
  • @Jonathan Leffler: This comment system considers pressing the space bar 15 times to be a zero length string... It's a very common interpretation. – OMG Ponies Jul 13 '10 at 23:37
  • @OMG: In [MySQL](http://dev.mysql.com/doc/refman/5.5/en/string-syntax.html), "\s" is interpreted the same as "s". In SQL, a CHAR(40) value, say, is blank padded to full length; two strings `" abc "` and `" abc "` compare equal if their types are CHAR(40). If the columns in the question are CHAR, then I don't see TRIM as being necessary - certainly in the DBMS I use, it would be unnecessary. If the string literal is interpreted as a VARCHAR(n) and if two VARCHAR(n) values are not compared with blank padding - and some SQL collations would not - then you may need to TRIM. [...continued...] – Jonathan Leffler Jul 14 '10 at 05:04
11

One possible script:

for col in $(echo "select column_name from information_schema.columns
where table_name='$TABLE'"|mysql --skip-column-names $DB)
do
echo update $TABLE set $col = NULL where $col = \'\'\;
done|mysql $DB
Matthew Flaschen
  • 278,309
  • 50
  • 514
  • 539
7

For newbies, you may still need more work after seeing the above answers. And it's not realistic to type thousands lines. So here I provide a complete working code to let you avoid syntax errors etc.

DROP PROCEDURE IF EXISTS processallcolumns;

DELIMITER $$

CREATE PROCEDURE processallcolumns ()
BEGIN

  DECLARE i,num_rows INT ;
  DECLARE col_name char(250);

  DECLARE col_names CURSOR FOR
  SELECT column_name
  FROM INFORMATION_SCHEMA.COLUMNS
  WHERE table_name = 'PROCESSINGTABLE'
  ORDER BY ordinal_position;

  OPEN col_names ;
  select FOUND_ROWS() into num_rows;

  SET i = 1;
  the_loop: LOOP

     IF i > num_rows THEN
          CLOSE col_names;
          LEAVE the_loop;
      END IF;


      FETCH col_names 
      INTO col_name;     


      SET @command_text = CONCAT('UPDATE `PROCESSINGTABLE` SET ', col_name, '= IF(LENGTH(', col_name, ')=0, NULL,', col_name, ') WHERE 1 ;' ) ;

--      UPDATE `PROCESSINGTABLE` SET col_name=IF(LENGTH(col_name)=0,NULL,col_name) WHERE 1;
--      This won't work, because MySQL doesn't take varibles as column name.

      PREPARE stmt FROM @command_text ;
      EXECUTE stmt ;

      SET i = i + 1;  
  END LOOP the_loop ;



END$$
DELIMITER ;

call processallcolumns ();
DROP PROCEDURE processallcolumns;
tomriddle_1234
  • 3,145
  • 6
  • 41
  • 71
2

Hammerite's answer is good but you can also replace CASE statements with IFs.

UPDATE
    TableName
SET
    column01 = IF(column01 = '', NULL, column01),
    column02 = IF(column02 = '', NULL, column02),
    column03 = IF(column03 = '', NULL, column03),
    ...,
    column99 = IF(column99 = '', NULL, column99)
1

There isn't a standard way - but you can interrogate the system catalog to get the relevant column names for the relevant table and generate the SQL to do it. You can also probably use a CASE expression to handle all the columns in a single pass - a bigger SQL statement.

UPDATE Table
   SET Column1 = CASE Column1 = ' ' THEN NULL ELSE Column1 END,
       ...

Note that once you've generated the big UPDATE statement, all the work is done down in the server. This is much more efficient than selecting data to the client application, changing it there, and writing the result back to the database.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
0

I think you'll need to pull each row into a language like C#, php, etc.

Something like:

rows = get-data()
foreach row in rows
    foreach col in row.cols
        if col == ''
            col = null
        end if
    next
next
save-data()
Nate
  • 30,286
  • 23
  • 113
  • 184
  • This will work, but it's doable in T-SQL using a loop (e.g. using cursors) over the list of columns. For a one-time task, I suppose it really doesn't matter how you do it, though. – Brian Jul 13 '10 at 14:53
  • I tend to gravitate away from SQL for this type of maintenance, since in most cases I end up needing to add more functionality over time, and I end up pulling it into a full OO language anyway. But thats a personal bias... ;) – Nate Jul 13 '10 at 15:09
0

You could write a simple function and pass your columns to it:

Usage:

SELECT
  fn_nullify_if_empty(PotentiallyEmptyString)
FROM
  table_name
;

Implementation:

DELIMITER $$
CREATE FUNCTION fn_nullify_if_empty(in_string VARCHAR(255))
  RETURNS VARCHAR(255)
  BEGIN
    IF in_string = ''
      THEN RETURN NULL;
      ELSE RETURN in_string;
    END IF;
  END $$
DELIMITER ;
Nae
  • 14,209
  • 7
  • 52
  • 79