3

I've been trying to wrap my head around a possible Hidden Value.

Whenever I run this query

SELECT * FROM LIST WHERE `GROUP_2` = "Yes";

I get 0 results.

Whenever I run this:

SELECT * FROM LIST WHERE `GROUP_2` LIKE 'Yes%';

I am able to pull result.

I have tried Many different export and import parameter using the MySQL load data for csv import but nothing works. Both GROUP_1 and GROUP_2 are set to varchar(55)

Any queries i can run to identify the cause of this or detect hidden breaklines?

ID  name    GROUP_1 GROUP_2
1   Bob     No      Yes
2   Jim     Yes     No
fthiella
  • 48,073
  • 15
  • 90
  • 106
renfley
  • 146
  • 1
  • 8

3 Answers3

6

To identify what the problem is, you could use HEX function:

SELECT group_1, HEX(group_1)
FROM table
WHERE group_1 LIKE 'Yes%' AND group_1!='Yes'

yes should be coded as 596573 (depending on the codeset used it might be different) but you will see some other characters after it.

Edit

I would suggest you to find the root cause and fix the text file or the import procedure. But if you want to fix the existing data without doing a new import there are multiple options.

You can follow xQbert link (thanks for the link!):

update table SET group_2 = TRIM(TRAILING '\n' FROM group_2);

or fix your own query:

UPDATE TABLE set GROUP_2 = replace(replace(GROUP_2,'\n',''), '\r', '');

or if you only have yes/no values:

UPDATE TABLE set GROUP_2='Yes' WHERE group_2 LIKE 'Yes%' AND group_2!='Yes';
UPDATE TABLE set GROUP_2='No' WHERE group_2 LIKE 'No%' AND group_2!='No';
fthiella
  • 48,073
  • 15
  • 90
  • 106
2

I've never tried it myself, but think this will identify any rows with 'invisible' characters.

SELECT * 
FROM theTable 
WHERE field REGEXP '[^[:graph:] ]+' --This allows spaces, even leading/trailing ones
OR field <> TRIM(field) --This catches those spaces
;

It doesn't identify or fix the offending characters though, just the rows that have them in that field.

Uueerdo
  • 15,723
  • 1
  • 16
  • 21
  • you just helped confirm every single entry from GROUP_2 have something hidden – renfley Sep 16 '15 at 19:06
  • It may be tedious, depending on the variety of the data in that field, but you can begin identifying them using mysql string functions such as ASCII() AND SUBSTR(). When identified, you can add minimal `REPLACE(REPLACE(field, CHAR(x), ''), CHAR(y), '') AS field ` expressions to fix the data. https://dev.mysql.com/doc/refman/5.1/en/string-functions.html – Uueerdo Sep 16 '15 at 19:10
2

My original issue was cause by the source data as well as the method I used to import the file.

Whenever I queried directly using the equal, I wouldn't return a value, when using the LIKE "Yes%" This would.

By using the following:

SELECT GROUP_2, HEX(GROUP_2)
FROM QUEUE
WHERE GROUP_2 LIKE 'Yes%';

This helped identify the Added 0D hex which is the carriage return which cause the queries to fail.

The quick fix for this was to update the field using the folowing..

UPDATE QUEUE set GROUP_2 = replace(GROUP_2,'\r','');

But since this would have been needed every time the data is updated, I found out where the carriage can be identify in the original LOAD DATA. The original syntax used was:

LINES TERMINATED BY '\n' 

Which i then changed to

LINES TERMINATED BY '\r' 

This allows the source data to remain untouched.

peterh
  • 11,875
  • 18
  • 85
  • 108
renfley
  • 146
  • 1
  • 8