-1

We have a large database system that has a table of static values that are shared with external applications through a simple function call. The purpose is to have exact synchronization matching between the database system and the back-end system, so both systems can reference the same values based on custom names.

The implementation is fairly simple. The table just stores the text name of the system and field (VARCHAR64) along with a value (SMALLINT). This is essentially a lookup table for hardcoded values we share with the back-end JavaScript team, so the data could be:

CUSTOMER_RECORD_STATE, ACTIVE, 1
CUSTOMER_RECORD_STATE, INACTIVE, 2
CUSTOMER_RECORD_STATE, DELETED, 3

The function is used to look these values up as seen below.

SELECT tab.name_first FROM our_customers ourc WHERE ourc.record_state = get_gcs('CUSTOMER_RECORD_STATE','ACTIVE');

Here's the create code for the function:

SET NAMES 'utf8mb4';
DELIMITER $$
CREATE
DEFINER = 'root'@'localhost'
FUNCTION get_gcs (in_system varchar(128), in_field varchar(128), in_value varchar(128))
RETURNS smallint(6)
DETERMINISTIC
READS SQL DATA
BEGIN
    DECLARE var_value smallint DEFAULT -1;

    DECLARE out_result_value smallint;

    DECLARE debug_definition json DEFAULT JSON_OBJECT('function', 'get_gcs');
    DECLARE debug_details json DEFAULT JSON_OBJECT('source', debug_definition, 'parameters',
    JSON_OBJECT('in_system', in_system, 'in_field', in_field, 'in_value', in_value));

    SELECT
        custom_value INTO var_value
    FROM global_custom_setting
    WHERE in_system = name_system
    AND in_field = name_field
    AND in_value = name_value;

    RETURN var_value;
END
$$

DELIMITER ;

character_set_client: utf8mb4

collation_connection: utf8mb4_0900_ai_ci

Database Collation: utf8mb4_unicode_ci

The system is simple and it works well; the back-end team constantly calls this function to retrieve these "static" values defined in the database.

The problem is event though the function is DETERMINISTIC, it appears to dramatically slow down complex SQL calls that have the function embedded in the call. I was surprised by this as I was under the impression that DETERMINISTIC functions would be handled differently; as if they were a static value (in a SQL call).

So questions:

  1. Is this the best approach to sharing static values between two platforms (database and backend)?
  2. Why is the MySQL engine not treating the calls as DETERMINISTIC and only resolving the value once rather than through what appears to be with each iterative call?

Thanks!

Floobinator
  • 388
  • 2
  • 11
  • How many rows in `var_value` and in `global_custom_setting`? – Rick James Nov 30 '21 at 04:05
  • Please provide `SHOW CREATE FUNCTION get_gcs`; I'm interested in the collation assigned at the time of the creation. Also what collation is in effect during the call? – Rick James Nov 30 '21 at 04:06
  • Thanks @RickJames. There are only 745 lines in the table. I updated the OP to SHOW CREATE FUNCTION. Regardless of these settings, is a function that provides access to table data the best way to cross-share static "values" with an external application (like a Node.js Backend)? – Floobinator Nov 30 '21 at 15:56
  • There are three things to list, such as: character_set_client: utf8 collation_connection: utf8_general_ci Database Collation: utf8mb4_unicode_520_ci – Rick James Nov 30 '21 at 17:09
  • @RickJames updated. – Floobinator Nov 30 '21 at 17:40
  • What client is connecting? What are the parameters on the "connection"? Example: `node.js var connection = mysql.createConnection({ ... , charset : 'utf8mb4'})` – Rick James Nov 30 '21 at 22:44
  • I use dbForge. It detects the character set of the database (UTF8MB4). I have determined that get_gcs() calls in queries slows them down substantially, so they aren't being applied to indexes... Even being defined as DETERMINISTIC. – Floobinator Nov 30 '21 at 23:49

1 Answers1

1

One thing that should (and can) be fixed -- UCASE(in_system) = UCASE(name_system)

If both columns are in the same charset and collation, UCASE is unnecessary. Also, the collation must be ..._ci to be "Case Insensitive".

Any function call hides the columns from use in any index. Hence the WHERE clause in the function cannot use an index.

global_custom_setting needs INDEX(name_system, name_field, name_value). (The columns may be in any order.)

Apparently, the FUNCTION is using collation_connection: utf8mb4_0900_ai_ci, which may be different than the connection's collation.

Suggest trying this after removing the UPPERs and adding the INDEX:

SET NAMES utf8mb4, COLLATE utf8mb4_0900_ai_ci;
DROP FUNCTION get_gcs;
DELIMITER $$
((recreate the function))
DELIMITER ;
((similarly for get_gss))

These seem to apply to strings in a Function:

  • Arguments to a function: collation_database
  • Literal strings in function: collation_connection
  • Table columns: collation of the columns

Functions can never be sargable (that I know of). Some expressions involve "implicit" functions, such as datatype conversion and collation mismatch.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thanks Rick! That code was written nearly 2 years ago, so I completely forgot about the unnecessary use of UCASE(). Removing it did speed things up, but my understanding is the core problem is MySQL is calling the function against every row in a query vs. a single call registering a static value. For example, if I replace the function call with a hardcoded number or even a variable, it's 5-10x faster. Is there any way - at all - to have MySQL use a function result as a singular call rather than recursive? I thought DETERMINISTIC would do it, but apparently not... – Floobinator Dec 01 '21 at 16:15
  • @Floobinator - I have not figured out your `DETERMINISTIC` problem. As for function calls -- Those that are obviously constant are evaluated _before_ running the query; eg, `UPPER("abcd")`. But those that cannot be evaluated upfront (eg, `UPPER(column)`) encounter the "sargable" problem (cf Wikipedia). – Rick James Dec 01 '21 at 18:49
  • what about functions that just select a value from a table? If the UCASE() encapsulates are removed, would the function then be Sargable, or non-sargable due to it being a lookup? – Floobinator Dec 01 '21 at 21:30
  • Inconsistent collation is effectively a function call. That is what I was talking about in the body of my Answer. – Rick James Dec 01 '21 at 22:05
  • Understood; but assume the collation is OK; can a DETERMINISTIC function that fetches a value from a table based on passed parameters be treated as Sargable by the compiler? I think that's the root of the question. I would think the SELECT presence in the function may nullify any Sargable state of the function... my goal is to simply have the function call treated as a 1-time CONST reference; nothing more. – Floobinator Dec 01 '21 at 22:58
  • so I confirmed with SHOW FUNCTION STATUS that the character_set_client is utf8mb4 and collation_connection is utf8mb4_unicode_ci for get_gss and get_gcs. However SELECT USER(), charset(USER()),COLLATION(USER()); is showing utf8mb3 and utf8_general_ci. This is insane. I'm using dbForge and I select use the DB Charset, etc. which is utf8mb4 so something is very crazy with the connection details. Any ideas? Also, if I am the same charset and collation as the function, should that function be sargable? – Floobinator Dec 02 '21 at 19:51
  • I _think_ the default collation for utf8mb4 is utf8mb4_0900_ai_ci. Yes, it is insane. A function is never sargable; that is `UPPER(col)` is always a problem. As for what seems to be making the function not `DETERMINISTIC`, I do not know. I am _guessing_ that an inconsistent collation is the problem. – Rick James Dec 03 '21 at 00:38
  • Actually the single statement "A function is never sargable" answers everything. Now I'm trying to figure out the best way to code our lookup system into hundreds of queries to properly use indexes (and be sargable). It looks like variable declaration is the best option. – Floobinator Dec 03 '21 at 16:54
  • Rick James you've been so helpful; can you adjust your answer to say "functions can never be sargable" and recommend either Variable or CTE referencing to solve the issue? Note that solves the issue :) – Floobinator Dec 03 '21 at 16:57
  • @Floobinator - @ variables can add another dimension of problems. CTEs are won't change any of the issues. – Rick James Dec 03 '21 at 17:37
  • Well all I know is switching to variables for WHERE X = Y (with Y being the variable) resolves the issue. We must be able to use global shared values from a table that the back-end system reads in our queries in a sargable fashion. It seems variables are the only choice for this case. – Floobinator Dec 03 '21 at 18:55