-2
/*Usage example: This function takes S. O. L. I. D. as input and returns SOLID. And similarly removes single quotes, hyphens and slashes from input*/
CREATE DEFINER=`root`@`localhost` FUNCTION `SanitiseNameForSearch`(Name nvarchar(100)) RETURNS varchar(100) CHARSET utf8
BEGIN
    RETURN REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Name, ' ', ''), '.', ''), '''', ''), '-', ''), '/', '');
END

Using this function here in a procedure, applied the function on search input and on column. Works fine, but definitely not scalable.

CREATE DEFINER=`root`@`localhost` PROCEDURE `Search`(SearchFilter nvarchar(20))
BEGIN
    SET @SearchFilter = `SanitiseNameForSearch`(SearchFilter);

    SELECT t.TermId, t.Name
    FROM Terminology AS t
    WHERE `SanitiseNameForSearch`(Name) Like @SearchFilter
    ORDER BY length(Name) asc
    LIMIT 5;
END;

Is it ideal to implement this functionality via function or add a separate column/table that holds the column values after the function is applied i.e. hold precalculated value of SanitiseNameForSearch(Name) so that it can be indexed?

Null Head
  • 2,877
  • 13
  • 61
  • 83
  • You want [Code Review](http://codereview.stackexchange.com/), not Stack Overflow for this. – AStopher Oct 11 '15 at 20:47
  • 1
    This is too fine line for me to judge where this question should go. And I have a million questions about performance optimization here on SO. I hope you did not down vote for that reason. – Null Head Oct 11 '15 at 20:51
  • WOW! 3 down votes, but no one cares to explain why! – Null Head Oct 11 '15 at 21:00
  • Going to go out on a limb here and say that there is not enough information to actually help you (not a MySQL expert). Did you look at the query plan? Did you actually verify that your function is the problem? My suspicion is that it is the use of `LIKE` that is causing performance problems, not your function. – cimmanon Oct 11 '15 at 21:03
  • We need more information to help you. What lead you to believe the scalar function was the primary problem with this query? What have you done about other candidate problems (improper indexing/index fragmentation, table partitioning, the use of the `LIKE` operator, examination of the query plan for this procedure, physical io issues)? Does the procedure work just fine without that scalar function? – Dan Field Oct 11 '15 at 21:17
  • Possibly related: http://stackoverflow.com/questions/10595037/is-it-possible-to-have-function-based-index-in-mysql or http://stackoverflow.com/questions/18053385/indexing-column-with-replace-function-in-mysql – cimmanon Oct 11 '15 at 21:35
  • 1
    I don't downvote, but I do vote to close. The question is too broad. "[The] function could be extended further to include more functionality", please optimize. That's too broad. – Gordon Linoff Oct 11 '15 at 22:18
  • Well the only trouble with that SanitiseNameForSearch is that it will make it hard to use any indexes on the name column. At any rate LIKE %something% queries are slow. You might want to switch to full text searching. – e4c5 Oct 12 '15 at 00:00

1 Answers1

0

LIKE '%...' does not optimize at all. It will do a table scan. Therefore, this is the main part of being "non scalable". The functions, etc are insignificant in comparison.

So, what to do? Look at FULLTEXT to see if you can use it. It will probably require changing expectations of the users -- it looks for whole words, not arbitrary substrings. But it is much faster and scalable.

Using FULLTEXT would obviate the need for your Sanitize function.

(So, guys, what's all this crap about downvoting and closing the question? Haven't I answered his question, or at least the implied question?)

Rick James
  • 135,179
  • 13
  • 127
  • 222