3

DBMS is !

Motivation: I need to do a left join on a table so I can get the same list of message types every time, even if the result is zero or null. Unfortunately, this is a large table so including a SELECT DISTINCT() is prohibitively slow. These should never change, so I thought I'd get the list once and just join them statically.

Based on another SO question, here is what I have to replace the SELECT DISTINCT():

    SELECT 'HS.MESSAGE.GATEWAYREGISTRATIONREQUEST' as MessageBodyClassName
    UNION SELECT 'HS.MESSAGE.MERGEPATIENTREQUEST'
    UNION SELECT 'HS.MESSAGE.PATIENTSEARCHREQUEST'

This returns results that look exactly as expected, identical to the Distinct query. However, when I plug this into my JOIN statement, all the counts come back as zero.

Failing Query

SELECT mh.MessageBodyClassName, count(l.MessageBodyClassName) as MessageCount FROM 
    (
        SELECT 'HS.MESSAGE.GATEWAYREGISTRATIONREQUEST' as MessageBodyClassName
        UNION SELECT 'HS.MESSAGE.MERGEPATIENTREQUEST'
        UNION SELECT 'HS.MESSAGE.PATIENTSEARCHREQUEST'
    ) mh LEFT JOIN
    (
        SELECT messageBodyClassName FROM ens.messageheader WHERE TimeCreated > DATEADD(hh, -1, GETUTCDATE())
    ) l ON mh.MessageBodyClassName = l.MessageBodyClassName
GROUP BY mh.MessageBodyClassName

Failed results

MessageBodyClassName                  MessageCount 
------------------------------------- ------------ 
HS.MESSAGE.GATEWAYREGISTRATIONREQUEST 0            
HS.MESSAGE.MERGEPATIENTREQUEST        0            
HS.MESSAGE.PATIENTSEARCHREQUEST       0

Working Query

SELECT mh.MessageBodyClassName, count(l.MessageBodyClassName) as MessageCount FROM 
    (
        SELECT DISTINCT(MessageBodyClassName) FROM ens.messageheader
    ) mh LEFT JOIN
    (
        SELECT messageBodyClassName FROM ens.messageheader WHERE TimeCreated > DATEADD(hh, -1, GETUTCDATE())
    ) l ON mh.MessageBodyClassName = l.MessageBodyClassName
GROUP BY mh.MessageBodyClassName

Working and expected results

MessageBodyClassName                  MessageCount 
------------------------------------- ------------ 
HS.MESSAGE.GATEWAYREGISTRATIONREQUEST 0            
HS.MESSAGE.MERGEPATIENTREQUEST        0            
HS.MESSAGE.PATIENTSEARCHREQUEST       54

For VKP: Why are the results different? How can I adjust the first query with literals to get the proper (same) results?

  • I do not know your dbms. Whatever, the first thing i would check is the datatype of your column. If it's a char (and not a varcahr) you might have here the solutino of your problem – Nemeros Oct 21 '16 at 15:01
  • Is the type on the `messageBodyClassName` in `mh` a `varchar` or perhaps `char`? Curious if it's an enumerated type and there for you're comparing 2 different types. Your working query would preserve that. – RC. Oct 21 '16 at 15:01
  • It is a `VARCHAR`, according to DBVis and the ISC ODBC driver. –  Oct 21 '16 at 15:02
  • 2
    There's likely a blank or special (non display) charcters in ens.messageheader.messageBodyClassName not in your hardcoded values. To prove this out try to `SELECT count(*) from ens.messageheader where messageBodyClassName in ('HS.MESSAGE.GATEWAYREGISTRATIONREQUEST', 'HS.MESSAGE.MERGEPATIENTREQUEST', 'HS.MESSAGE.PATIENTSEARCHREQUEST')` if your count is zero then the values in the table don't match your hardcoded ones. and you need to figure out why. Try len to count characters or use ASCII to convert to asci character code to find out what is unexepected – xQbert Oct 21 '16 at 15:12
  • `Aggregate_1: 117058` –  Oct 21 '16 at 15:13
  • 1
    @scott If both are identical. That doesnt make sense. The other thing is maybe you run the query in different time and that is why you get different results? – Juan Carlos Oropeza Oct 21 '16 at 15:26
  • @xQbert he post a picture in my answer and they look identical. – Juan Carlos Oropeza Oct 21 '16 at 15:27
  • @Scott I can't tell why you get different result, but if you say `DISTINCT` is slow, can you try create one `INDEX` for `MessageBodyClassName` ? – Juan Carlos Oropeza Oct 21 '16 at 15:28
  • @JuancarlosOropeza I agree that screenshot seems to prove that the text in the two is identical. Therefore the only other option is that it's different environment, different days or data get altered underneath him? or uncommitted changes from something else. – xQbert Oct 21 '16 at 15:31
  • Can we union the results of both queries and run them together (add a column called source with hardcoded values of QRYWORKS and QRYFAIL to appropriate queries and display the results.) I want to eliminate environmental issues or uncommitted data. the picture in Juan's answer doesn't have the limiting data criteria which could be causing 0 record counts... What may have worked yesterday, may not work today due to changing dates on GETUTCDATE – xQbert Oct 21 '16 at 15:37
  • Thanks for all the help everyone, even if it makes zero sense. At least it doesn't seem to be me just [doing something stupid this time](http://stackoverflow.com/questions/40139867/complex-left-join-not-working-as-expected):) –  Oct 21 '16 at 15:46

2 Answers2

2

The last thing I can think of is to run your DISTINCT query once into a permanent table in your database. That way the inner SELECT in your query will only have to process those three lines. The inner query would lose DISTINCT, like

SELECT MessageBodyClassName FROM ens.messageheader_permvals

EDIT: The below answer did not work

This may be a longshot, but if it doesn't work it might help you diagnose the problem. Instead of the UNION try

SELECT MessageBodyClassName FROM ens.messageheader
    WHERE MessageBodyClassName in (
    'HS.MESSAGE.GATEWAYREGISTRATIONREQUEST',        
    'HS.MESSAGE.MERGEPATIENTREQUEST',          
    'HS.MESSAGE.PATIENTSEARCHREQUEST')

That should return records only if those values actually exist in the table and are compatible with the format of MessageBodyClassName, which we know works using the DISTINCT version. I don't know if the performance will be better this way, but hopefully it will shed some light on the issue.

EDIT: the below answer does not apply, as the OP is was actually trying to select the literal quoted values

You don't have a FROM statements in your UNION query. Try

SELECT 'HS.MESSAGE.GATEWAYREGISTRATIONREQUEST' as MessageBodyClassName
            FROM ens.messageheader
        UNION SELECT 'HS.MESSAGE.MERGEPATIENTREQUEST'
            FROM ens.messageheader
        UNION SELECT 'HS.MESSAGE.PATIENTSEARCHREQUEST'
            FROM ens.messageheader

The rest of the query looks right.

Rominus
  • 1,181
  • 2
  • 14
  • 29
  • You miss the question. and you dont need `FROM` – Juan Carlos Oropeza Oct 21 '16 at 15:05
  • Without the `FROM` it took 27 seconds to execute. With the `FROM` statements added, It took 90 and returned the same results. :( –  Oct 21 '16 at 15:06
  • If it helps I was really rooting for this answer! :) –  Oct 21 '16 at 15:09
  • @Scott so are you trying to select the strings in quotes instead of columns with those names? I assumed the quotes were a difference of intersystem-cache from the more general SQL I'm familiar with – Rominus Oct 21 '16 at 15:11
  • Nope, I actually want those to be *values* of the column named `MessageBodyClassName` –  Oct 21 '16 at 15:12
  • @Scott so your UNION should return a little table with just three rows, one for each quoted value? – Rominus Oct 21 '16 at 15:13
  • That's correct, and it does if I run it separately. –  Oct 21 '16 at 15:14
  • @Scott added another option to my answer – Rominus Oct 21 '16 at 15:23
  • The query runs as expected and returns results... sadly it doesn't appear to be hidden characters or a type issue. –  Oct 21 '16 at 15:25
  • @Scott ok, one last idea added to answer – Rominus Oct 21 '16 at 15:29
  • Unfortunately this is a hosted environment and that is not an option :( –  Oct 21 '16 at 15:44
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/126349/discussion-between-rominus-and-scott). – Rominus Oct 21 '16 at 15:49
0

I agree with xQbert, problem is the hard codes values

Try

  SELECT T1.MessageBodyClassName, T2.MessageBodyClassName
  FROM (
          SELECT 'HS.MESSAGE.GATEWAYREGISTRATIONREQUEST' as MessageBodyClassName
    UNION SELECT 'HS.MESSAGE.MERGEPATIENTREQUEST'
    UNION SELECT 'HS.MESSAGE.PATIENTSEARCHREQUEST'
       ) as T1
 LEFT JOIN (
         SELECT DISTINCT(MessageBodyClassName) as MessageBodyClassName
         FROM ens.messageheader
       ) as T2
  ON T1.MessageBodyClassName = T2.MessageBodyClassName

Possible solution: Create a temporal table

 CREATE TABLE className as 
         SELECT DISTINCT(MessageBodyClassName) as MessageBodyClassName
         FROM ens.messageheader
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118