2

I have run cat -v on a file, see ^K where I was expecting to see some kind of whitespace, but don't know how to translate that code into something I can use to find and replace the character in MySQL.

Context: I have a mysql field that appears to have a newline at the end, as when I query it, the output wraps around.

    select t_unid_contact
    ,      concat('"',t_phone,'"')
    ,      t_phone 
    from contacts_june_2013 
    where T_UNID_Contact = 'CN726181';

    +----------------+-------------------------+-------------+
    | t_unid_contact | concat('"',t_phone,'"') | t_phone     |
    +----------------+-------------------------+-------------+
    | CN726181       | "4155551212
                                  "           | 4155551212
                                                           |
    +----------------+-------------------------+-------------+
    1 row in set (0.00 sec)

I tried these WHERE clauses, to try to identify rows with the hidden character, but got no results:

    where t_phone like '%\n%'
    where t_phone like '%\r%'
    where t_phone like '%\r\n%'
    where t_phone like '%\k%'
    where t_phone regexp '[\r]'
    where t_phone regexp '[\k]'
    where t_phone regexp '[\n]'
    where t_phone regexp '[\r\n]'

I output a problem row into a csv and ran this to look at it; here is the output:

    cat -v testwhitespace.csv 
    "CN726181","4155551212^K" 

Does anyone know how I can identify that whitespace character in my table, in such a way that I can use replace() to get rid of it?

Thanks

user2616155
  • 330
  • 2
  • 9

2 Answers2

1

^K is the control code for a vertical/line tabulation character, U+000B.

REPLACE(my_column, _utf8 x'0b', '')
eggyal
  • 122,705
  • 18
  • 212
  • 237
  • Thank you! This works. I'm still trying to figure out how to use that in a WHERE clause, though, to find rows containing that record. Not that that is as important as doing the actual replace. But I should get the syntax figured out. – user2616155 Jul 24 '13 at 20:58
  • @user2616155: One way would be `WHERE my_column LIKE CONCAT('%', _utf8 x'0b', '%')`, but it won't be very efficient! – eggyal Jul 24 '13 at 21:25
  • That works very nicely, thank you. I'm working to clean up data, not writing production code, so this is perfect. – user2616155 Jul 25 '13 at 03:31
0

To answer the question, ^K is a vertical tab, as previously stated. As ^k == 11th char == b in hexdec

You state that you are looking at a file.

sed -e 's/\x0b/ /g' file

should strip them for a space.

If you want to use SQL, try \x0b on the where clause.

Owen Beresford
  • 712
  • 4
  • 10
  • I haven't figured out how to use utf8 codes in the WHERE clause yet, but I used this kludgey workaround: update contacts_june_2013 set t_phone = REPLACE(t_phone, _utf8 x'0b', '') where t_phone <> REPLACE(t_phone, _utf8 x'0b', ''); – user2616155 Jul 24 '13 at 21:06