
I have started getting this error recently:

The request has exceeded the allowable time limit Tag: CFQUERY

It seems to happen whenever the Bing bot visits my site. I get 7 to 10 system notifications about the error in a row, within a span of 5 minutes. The errors all point to the same query:

    <CFQUERY DATASOURCE="#datasource#" USERNAME="#username#" PASSWORD="#password#" NAME="queryname" CACHEDWITHIN="#CreateTimeSpan(0,0,2,0)#">
    SELECT products.field1, products.field2, products.field3, products.field4, products.field5, products.field6, products.field7, products.field8, products.field9, products.field10, products.field11, company.field1, company.field2, company.field3, company.field4, company.field5, company.field6, company.field7
    FROM products JOIN company
    ON products.field1 = company.field1
        WHERE products.field12 = <cfqueryparam value = "#catd#" cfsqltype = "cf_sql_integer" maxLength = "2">
        ORDER by products.field13 DESC
    </CFQUERY>

It basically selects all the products in a category, gets the associated company information, and paginates them. The results returned for each category range from 2K to 30K records, and there are up to 30 categories.

The errors that come in look like this:

The request has exceeded the allowable time limit Tag: CFQUERY. The error occurred on line 152.
Query: cat=18&page=1803
The request has exceeded the allowable time limit Tag: CFQUERY. The error occurred on line 152.
Query: cat=2&page=211
The request has exceeded the allowable time limit Tag: CFQUERY. The error occurred on line 152.
Query: cat=7&page=691
The request has exceeded the allowable time limit Tag: CFQUERY. The error occurred on line 152.
Query: cat=6&page=451
The request has exceeded the allowable time limit Tag: CFQUERY. The error occurred on line 152.
Query: cat=14&page=417

This indicates that the Bing bot is hitting many sections of the site at the same time on different categories and pages.

Two weeks ago I got

The request has exceeded the allowable time limit Tag: CFQUERY   

and

GC overhead limit exceeded null

So I increased the JVM memory from 1024 MB to 2048 MB, but that may have only temporarily fixed the problem. Now, two weeks later, the "exceeded the allowable time limit" error has come back.
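For reference, the heap size corresponds to the Maximum JVM Heap Size field on the Java and JVM page of the ColdFusion Administrator, or equivalently the -Xmx argument in jvm.config. A sketch of the relevant line only (the path and the other flags on the line vary by install):

    # {cf_root}/bin/jvm.config -- other flags on the java.args line elided
    java.args=-server -Xmx2048m ...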

Could this problem be caused by the Bing bot, or is it something to do with my code, server memory, or CF setup?

Thanks in advance.

Jack
  • How long does it normally take to execute? – Dan Bracuk Aug 26 '17 at 22:04
  • This does not necessarily need to be a problem with the query at all. In CFADMIN you can set how long a request/page is allowed to run. When this limit is hit during the execution of the SQL, the query is reported, but the reason might be some code that took too long and was executed before this SQL. Why the `username` in the `cfquery`? Why the `maxlength` in the `cfqueryparam`? – Bernhard Döbler Aug 26 '17 at 22:11
  • There is a password parameter that goes with it. The maxlength is to make sure the "cat" value is no longer than 2 characters. There is no other SQL code before this. – Jack Aug 26 '17 at 23:27
  • I think there are a couple of things you can do here. One, you can modify the query and introduce some kind of limit to reduce the number of records returned - something like this - (https://stackoverflow.com/questions/971964/limit-10-20-in-sql-server) - this will allow you to return only a certain number of records in each page. This should also help with the pagination of records. Another thing you can try is to restrict Bing bot from crawling your pages (assuming you don't need those pages crawled). For help, look here - (http://www.robotstxt.org/robotstxt.html) – ultimoTG Aug 27 '17 at 00:04
  • A 'robots.txt' file in the web root with the following content should stop Bing from crawling those pages (a crawl-delay variant is sketched after these comments). `User-agent: bingbot Disallow: /~johndoe/test.html [[ path to the page]]` – ultimoTG Aug 27 '17 at 00:07
  • @tsg I want Bing to crawl all pages, so limiting it won't work. The current pagination method retrieves all records, paginates them, and leaves them cached for 2 minutes. It is possible to retrieve just the records for that page, e.g. 20, without caching. But won't this cause more DB access and server load when Bing goes at it multiple times? – Jack Aug 27 '17 at 00:35
  • Yes, it's true this strategy will create more query calls to the database, but at least all of those queries will have a better chance of finishing before the set timeout is reached, and that is exactly your issue here. – ultimoTG Aug 27 '17 at 02:10
  • @DanBracuk Usually it only takes seconds to execute. At most it's 30k records returned, so it doesn't take that long. – Jack Aug 29 '17 at 04:11
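A footnote to the robots.txt discussion above: since the goal is to keep Bing crawling but stop it hammering many categories at once, a Crawl-delay directive (which Bing documents support for) may fit better than a Disallow. A sketch, assuming the file lives in the web root:

    # /robots.txt -- throttle bingbot without blocking it
    User-agent: bingbot
    Crawl-delay: 10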

2 Answers


If you're only showing 20 records per page, then your query should only ever return 20 per request. You would rather have a bot index 200 pages with 20 records each than trigger timeouts on a single page, which has a performance impact on the rest of your application and on live clients all day long.

The call to FOUND_ROWS() will return the total number of records that match your query. You can use this to create page links.

<cfparam name="url.p" type="numeric" default="1">
<cfset queryLimit = 20>
<!--- Page 1 starts at offset 0, page 2 at offset 20, and so on --->
<cfset queryOffset = (url.p - 1) * queryLimit>

<CFQUERY NAME="queryname" ...>
    SELECT 
        products.field1
        , products.field2
        , products.field3
        , products.field4
        , products.field5
        -- etc.
        , FOUND_ROWS() as total_rows
    FROM 
        products 
    INNER JOIN 
        company ON products.field1 = company.field1
    WHERE 
        products.field12 = <cfqueryparam value = "#catd#" cfsqltype = "cf_sql_integer" maxLength = "2">
    ORDER BY 
        products.field13 DESC
    LIMIT #queryLimit# 
    OFFSET #queryOffset#
</CFQUERY>

Update 2017-11-07

After a closer reading of the documentation for FOUND_ROWS(), my assumption that it worked like the comparable function in SQL Server was incorrect. You'll need to do something like the following. You may be able to run two SELECT statements in a single <cfquery>, but I'm not sure whether that will return two query objects for the single query name variable.

<CFQUERY NAME="queryname" ...>
    SELECT 
        SQL_CALC_FOUND_ROWS -- a SELECT modifier, not a column: no comma before the first field
        products.field1
        , ...
    LIMIT #queryLimit# 
    OFFSET #queryOffset#
</CFQUERY>
<CFQUERY NAME="foundRows" ...>
    SELECT FOUND_ROWS() as totalCount;
</CFQUERY>
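To turn that count into page links, something like the following could sit after the two queries above. This is a sketch: the file name products.cfm and the URL parameters are placeholders, while queryLimit and catd come from the code earlier in this answer and in the question.

    <!--- Sketch: build page links from the FOUND_ROWS() count; names are placeholders --->
    <cfset totalPages = ceiling(foundRows.totalCount / queryLimit)>
    <cfoutput>
        <cfloop from="1" to="#totalPages#" index="pg">
            <a href="products.cfm?cat=#catd#&p=#pg#">#pg#</a>
        </cfloop>
    </cfoutput>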
Adrian J. Moreno
  • I am planning to try out the pagination as you have suggested. However, if I do the FOUND_ROWS() in the statement, does that mean that every time the next page is clicked it will need to do the totaling again? Would it be better if I had another CFQUERY just before this with SELECT COUNT(*) AS total_rows and cached it for 3 minutes? It would be 2 CFQUERY statements instead of the one you have suggested. – Jack Nov 05 '17 at 20:27
  • The function FOUND_ROWS() returns an internal count of the total number of records in the recordset. LIMIT and OFFSET just filter down the subset of records to return. No need for a 2nd query, you'd just be doubling the query execution processing. – Adrian J. Moreno Nov 06 '17 at 04:43
  • When I displayed the #queryname.total_rows# value it returned 0, so it was not able to do any pagination. But it did display the first 20 records. – Jack Nov 06 '17 at 06:53

This is not exactly an answer, but it is too long for a comment.

I am looking at this issue and it just screams to me that you are pulling back more data than you expect. Run something like:

<CFQUERY DATASOURCE="#datasource#" USERNAME="#username#" PASSWORD="#password#" NAME="queryname">
SELECT COUNT(products.field1) AS productCount
FROM products JOIN company
ON products.field1 = company.field1
    WHERE products.field12 = <cfqueryparam value = "#catd#" cfsqltype = "cf_sql_integer" maxLength = "2">
</CFQUERY>

See if you are getting too many rows. I try to keep the row count to below 1000, although I have done up to 3000 on a page.
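To eyeball the count, a quick output after the query above should do (a sketch; catd comes from the question):

    <cfoutput>Category #catd# matched #queryname.productCount# products.</cfoutput>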

Second Idea

Run a missing-index query on the DB. Make sure you are properly indexed.
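On MySQL, EXPLAIN on the slow query will show whether the WHERE and ORDER BY columns are hitting an index; a composite index covering both is one likely candidate. A sketch with a sample category value and a hypothetical index name:

    -- Check the plan: look for "Using filesort" or a full table scan on products
    EXPLAIN
    SELECT products.field1, products.field13
    FROM products JOIN company ON products.field1 = company.field1
    WHERE products.field12 = 18
    ORDER BY products.field13 DESC;

    -- If the filter + sort are not covered, a composite index may help
    CREATE INDEX idx_products_cat_sort ON products (field12, field13);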

Third Idea

When I look at

products.field1, products.field2, products.field3, products.field4, products.field5, products.field6, products.field7, products.field8, products.field9, products.field10, products.field11, company.field1, company.field2, company.field3, company.field4, company.field5, company.field6, company.field7

I don't like all these columns. If you really have to return this in an HTML table, do the string building on the MySQL side. It's usually bad for SQL to build HTML, but maybe this is one of those times.
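A sketch of what that string building might look like in MySQL, using CONCAT to emit one table row per record (the column choice is illustrative and 18 is a sample category):

    SELECT CONCAT('<tr><td>', products.field2, '</td><td>', company.field2, '</td></tr>') AS row_html
    FROM products JOIN company ON products.field1 = company.field1
    WHERE products.field12 = 18
    ORDER BY products.field13 DESC;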

James A Mohler
  • I did mention that it returns between 2K and 30K records for each category, but each page only displays 20 records. It's used for pagination. All fields are indexed properly. Those fields are retrieved so that they can be displayed on the page. – Jack Aug 27 '17 at 03:49
  • Let me see if I understand, you are pulling 2k to 30k records, but not showing all of them? – James A Mohler Aug 27 '17 at 04:11
  • That is correct, but they are just used for pagination. Visitors can jump to any page if they want. – Jack Aug 27 '17 at 17:31
  • Don't do that. Do the filtering within the SQL statement. You are putting a huge load on the MySQL engine and the JVM that ColdFusion is using. – James A Mohler Aug 28 '17 at 03:10
  • If I only load 20 records at a time, wouldn't it create a load issue for the server when SE robots are hitting every page? – Jack Aug 28 '17 at 04:28
  • Robots index the content that is on the page. If you are pulling 2k records and only showing 20, the robots will only be evaluating the 20 that are shown. Besides, it would be better to have 100 pages with relevant content rather than 10 pages of a lot of stuff. Furthermore, you want search engines to send users to pages that 1) work and 2) are fast. – James A Mohler Aug 28 '17 at 05:52