MySQL Version: 5.5.37-0ubuntu0.14.04.1
I am currently writing a python script which makes use of numerous MySQL tables and queries to obtain results from an inverted index stored in tables.
I have noticed that choosing the right type of Cursor in the MySQLdb python module when executing a query has a really big effect on performance and was wondering if someone could explain or provide a reliable resource explaining which cursor to use when.
As an example, executing this query 40 times with SSCursor takes 7 seconds:
SELECT Pages.PageID,
Pages.PageName,
Counter AS TermFreq,
Pages.Length,
(Counter / LOG(Length)) AS Weight
FROM Pages
INNER JOIN TermOccurrences ON TermOccurrences.PageID = Pages.PageID
INNER JOIN Terms ON TermOccurrences.TermID = Terms.TermID
WHERE TermName = %s
ORDER BY Weight DESC
LIMIT 20;
Running the same query 40 times with the default Cursor takes 0.004 seconds.
Removing the calculation of weight (Counter/LOG(Length)) makes this query execute fast again using an SSCursor however.
I was using SSCursor because it was proving to have vastly superior performance on a number of other queries and then suddenly became very slow for this one. Changing back to the default Cursor surprised me when it executed so fast.
EDIT: Some more examples.
Running the following with the default cursor 40 times takes ~3 seconds:
SELECT COUNT(*)
FROM Pages
INNER JOIN TermOccurrences ON TermOccurrences.PageID = Pages.PageID
INNER JOIN Terms ON TermOccurrences.TermID = Terms.TermID
WHERE TermName = %s AND Counter > 2
Running it with SSCursor instead takes about 0.002 seconds.