0

From MySQL 5.7 I am executing a LEFT JOIN, and the WHERE clause calls a user-defined function of mine. It fails to find a matching row which it should find.

[Originally I simplified my actual code a bit for the purpose of this post. However in view of a user's proposed response, I post the actual code as it may be relevant.]

My user function is:

CREATE FUNCTION `jfn_rent_valid_email`(
    rent_mail_to varchar(1),
    agent_email varchar(45),
    contact_email varchar(60)
)
RETURNS varchar(60)
BEGIN
    IF rent_mail_to = 'A' AND agent_email LIKE '%@%' THEN
        RETURN agent_email;
    ELSEIF contact_email LIKE '%@%' THEN
        RETURN contact_email;
    ELSE
        RETURN NULL;
    END IF
END

My query is:

SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
   jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) 
AS ValidEmail
FROM rents r
LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
WHERE  r.RentCode = 'ZAKC17' -- this produces one match
AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL)

This produces no rows.

However. When a.AgentEmail IS NULL if I only change from

AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL)

to

AND (jfn_rent_valid_email(r.MailTo, NULL, co.Email) IS NOT NULL)

it does correctly produce a matching row:

RentCode, MailTo, AgentEmail, Email,      ValidEmail
ZAKC17,   N,      <NULL>,     name@email, name@email

So, when a.AgentEmail is NULL (from non-matching LEFT JOINed row), why in the world does passing it to the function as a.AgentEmail act differently from passing it as a literal NULL?

[BTW: I believe I have used this kind of construct under MS SQL server in the past and it has worked as I would expect. Also, I can reverse the test of AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NOT NULL) to AND (jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) IS NULL) yet I still get no match. It's as though any reference to a.... as a parameter to the function causes no matching row...]

JonBrave
  • 4,045
  • 3
  • 38
  • 115

2 Answers2

1

Most likely this is an issue with optimizer turning the LEFT JOIN into a INNER JOIN. The optimizer may do this when it believes that the WHERE-condition is always false for the generated NULL row (which it in this case is not).

You can take a look at the query plan with the EXPLAIN command, you will likely see different table order depending on the query variation.

If the actual logic of the function is to check all emails with one function call, you may have better luck with using a function that takes just one email address as parameter and use that for each email-column.

You can try without the function:

SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
   jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) 
AS ValidEmail
FROM rents r
LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
WHERE  r.RentCode = 'ZAKC17' -- this produces one match
AND ((r.MailTo='A' AND a.AgentEmail LIKE '%@%') OR co.Email LIKE '%@%' )

Or wrap the function in a subquery:

SELECT q.RentCode, q.MailTo, q.AgentEmail, q.Email, q.ValidEmail
FROM (
  SELECT r.RentCode, r.MailTo, a.AgentEmail, co.Email,
   jfn_rent_valid_email(r.MailTo, a.AgentEmail, co.Email) AS ValidEmail
  FROM rents r
    LEFT JOIN contacts co ON r.RentCode = co.RentCode -- this produces one match
    LEFT JOIN link l ON r.RentCode = l.RentCode -- there will be no match in `link` on this
    LEFT JOIN agents a ON l.AgentCode = a.AgentCode -- there will be no match in `agents` on this
  WHERE  r.RentCode = 'ZAKC17' -- this produces one match
) as q
WHERE q.ValidEmail IS NOT NULL
slaakso
  • 8,331
  • 2
  • 16
  • 27
  • Thanks for posting. I have edited to show the *actual* code used. Would you take another look? You can see the function needs to access all its parameters, why it must be called separately for each row, not just once for all. I have never used an optimizer which is allowed to change the results and actually produce incorrect results. Calling a function in the condition is used all over. From what you are saying one can never dare use such a function in case it produces this kind of aberration? What can I do about it now? Meanwhile, I am off to investigate the `EXPLAIN`.... – JonBrave Sep 27 '18 at 13:27
  • Using `EXPLAIN`: When I have literal `NULL` as parameter to function I get a nice, reasonable diagram. When I pass the desired `a.AgentEmail` all I get is `query_block #1 - no matching row in const table`. Which "explains behaviour", but it's wrong. Now what? Is there any documentation on behaviour, I now don't know the rules for when/which functions are safe to use in a `LEFT JOIN ... WHERE` under MySQL. – JonBrave Sep 27 '18 at 14:04
  • I believe you have edited your answer to "You can try without the function" and put the function code in-line. Naturally I have always known I can do this! The point is the query is generated in a client app, and the body of `jfn_rent_valid_email` could be huge, or change at any point. That's precisely why I want to do this stuff via a function! So are you *basically* saying: don't use a function in a MySQL WHERE with LEFT JOIN like this because you can never know when it will cause you to get incorrect results back? MySQL docs never mention that..... – JonBrave Sep 27 '18 at 15:47
  • 1
    Yes. In this case the optimizer is making a false assumption and it can be considered a bug in MySQL (at least in that MySQL version). Added an other sample how you could continue to use the function. – slaakso Sep 27 '18 at 16:03
  • 1
    Checked the behavior in MySQL 8 and the code works ok there, so it looks like the bug was fixed in newer version (or how the optimizer works has been changed) – slaakso Sep 27 '18 at 16:14
0

Changing the call to the function in the WHERE clause to read

jfn_rent_valid_email(r.MailTo, IFNULL(a.AgentEmail, NULL), IFNULL(co.Email, NULL)) IS NOT NULL

solves the issue.

It appears that the optimizer feels it can incorrectly guess that the function will return NULL in the non-match LEFT JOIN case if a plain reference to a.AgentEmail is passed as any parameter. But if the column reference is inside any kind of expression the optimizer ducks out. Wrapping it inside a "dummy", seemingly pointless IFNULL(column, NULL) is thus enough to restore correct behaviour.

I am marking this as the accepted solution because it is by far the simplest workaround, requiring the least code change/complete query rewrite.

However, full credit is due to @slaakso's post here in this topic for analysing the problem. Note that he states that the behaviour has been fixed/altered in MySQL 8 such that this workaround is unnecessary, so it may only be necessary in MySQL 5.7 or earlier.

JonBrave
  • 4,045
  • 3
  • 38
  • 115