0

I'm working on a Coldfusion8/MySQL query in which I'm scanning a database for A-B pairs, for example:

 S=2, M=2, L=2, XL=2

I'm trying to improve the script that originally handled this, which limited entries to 4 pairs and because I'm wondering, why it first selects all records with one matching pair and then uses HAVING to only select the records, with all pairs matching.

Here is the original query, afterwards my current version:

<!--- placeholders --->
<cfparam name="s01" default="">
<cfparam name="s02" default="">
<cfparam name="s03" default="">
<cfparam name="s04" default="">
<cfparam name="q01" default="">
<cfparam name="q02" default="">
<cfparam name="q03" default="">
<cfparam name="q04" default="">
<!--- check length of user inputs --->
<cfset sizes = ListLen(s_lot_groesse,",")>
<cfset qtys   = ListLen(s_lot_menge,",")>
<!--- populate placeholders --->
<cfif sizes gt 0><cfset s01 = trim(ListGetAt(s_lot_groesse, 1,","))></cfif>
<cfif sizes gt 1><cfset s02 = trim(ListGetAt(s_lot_groesse, 2,","))></cfif>
<cfif sizes gt 2><cfset s03 = trim(ListGetAt(s_lot_groesse, 3,","))></cfif>
<cfif sizes gt 3><cfset s4 = trim(ListGetAt(s_lot_groesse, 4,","))></cfif>
<cfif qtys gt 0><cfset q01 = trim(ListGetAt(s_lot_menge, 1,","))></cfif>
<cfif qtys gt 1><cfset q02 = trim(ListGetAt(s_lot_menge, 2,","))></cfif>
<cfif qtys gt 2><cfset q03 = trim(ListGetAt(s_lot_menge, 3,","))></cfif>
<cfif qtys gt 3><cfset q04 = trim(ListGetAt(s_lot_menge, 4,","))></cfif>
<!--- query --->
<cfquery datasource="db" name="lotsuche">
SELECT styleno, count(*) as total_styles
FROM styles
WHERE 1 = 1
AND (
    <cfif s01 neq "" AND q01 neq "">(groesse = "#s01#" AND bestand >= "#q01#")</cfif>
    <cfif s02 neq "" AND q02 neq "">OR (groesse = "#s02#" AND bestand >= "#q02#")</cfif>
    <cfif s03 neq "" AND q03 neq "">OR (groesse = "#s03#" AND bestand >= "#q03#")</cfif>
    <cfif s04 neq "" AND q04 neq "">OR (groesse = "#s04#" AND bestand >= "#q04#")</cfif>
    )
GROUP BY styleno
HAVING total_styles= "#sizes#"
</cfquery>

New version:

<!--- build a 2D array --->
<cfscript>
    variables.lotArray = ArrayNew(2);
    variables.sizeCounter = 1;
    variables.qtyCounter = 1;
</cfscript>
<cfloop list="#LOCAL.Search.s_lot_groesse#" delimiters=", " item="size">
    <cfscript>
        variables.lotArray[variables.lotCounter][1] = size;
        variables.lotCounter = variables.lotCounter + 1;
    </cfscript>
</cfloop>
<cfloop list="#LOCAL.Search.s_lot_menge#" delimiters=", " item="qty">
    <cfscript>
        variables.lotArray[variables.qtyCounter][2] = qty;
        variables.qtyCounter = variables.qtyCounter + 1;
    </cfscript>
</cfloop>
<!--- get array length --->
<cfset variables.lotArrayLen = arrayLen(variables.lotArray)>

<!--- query --->
<cfquery datasource="ds" name="lotsuche">
SELECT art.styleNo, count(*) as total_styles
FROM styles AS art
WHERE 1 = 1
<cfloop from="1" to="#variables.lotArrayLen#" index="i">
    AND ( art.groesse = <cfqueryparam cfsqltype="cfsql_varchar" value="#variables.lotArray[i][1]#"> 
    AND art.bestand >= <cfqueryparam cfsqltype="cfsql_varchar" value="#variables.lotArray[i][2]#">
</cfloop>
GROUP BY art.styleno
<!--- HAVING anzahl = "#variables.lotArrayLen#" --->
</cfquery>

Question:
I don't understand why OR and HAVING are being used in the original query, because doesn't this select all records with one matching pair vs just picking the records with all pairs matching? Which approach is better/faster if you have a few million entries to scan? Also, does it make sense to re-factor the whole thing like I did?

Thanks for help!

frequent
  • 27,643
  • 59
  • 181
  • 333

1 Answers1

1

The HAVING clause is used with aggregate functions and OR is used to test for multiple conditions in a query. I'm not sure of the purpose of the having clause in the old query. Have you thoroughly tested your new query to ensure that your getting expected results?

After looking over the two queries, your second query won't work the same. The two records may have multiple conditions that they need to meet and the query could return an empty recordset.

I think your looking for something more like this.

AND
 <cfloop from="1" to="#variables.lotArrayLen#" index="i">
 (  OR (art.groesse = <cfqueryparam cfsqltype="cfsql_varchar" value="#variables.lotArray[i][1]#"> 
AND art.bestand >= <cfqueryparam cfsqltype="cfsql_varchar" value="#variables.lotArray[i][2]#"> ) )

sweyrick
  • 140
  • 5
  • ok. I will check this. I'm not done with the page, so I have not started testing. This was just something that I stumbled upon. Thanks for the info so far. I will post back what comes out. – frequent May 21 '12 at 18:21
  • regarding HAVING in the original query: I think, if the user entered say 4 sizes, this should equate to a size-length of 4, so only records with 4 matching pairs should be selected. This is why I was asking why the query used OR, because this would select all articles with a single match (S=2 OR M=2 OR L=2 OR XL=2). Then by HAVING = 4, only the records with 4 matches would get selected. Therefore I was wondering whether I could just use AND (S=2 AND M=2 AND L=2 and XL=2) to only select the correct records right away. Do you understand what I mean? – frequent May 21 '12 at 18:29
  • 1
    @frequent - The thing to remember is the WHERE clause applies to a single record. Using AND would create a statement that is never true, for example `WHERE ( groesse = 'A' AND groesse = 'B' AND groesse = 'C')`. The groesse has a single value. So it cannot equal A, B and C at the same time. That is why you must use `OR`. – Leigh May 21 '12 at 19:09
  • @Leigh - makes senes as the records contain styleNo|size|quantity, there won't be any records matching multiple size. Then I will need the HAVING, will I? – frequent May 21 '12 at 19:14
  • @user1286330 - Your snippet will end up AND ( OR (groesse=S and qty=1... ). Shoudn't I change to something like this: AND (2=2 OR (size=x AND qty=y) ) – frequent May 21 '12 at 19:22
  • @frequent - Yep, you need the HAVING clause. – Leigh May 21 '12 at 20:51