2

I noticed the following: I'm creating a view for a table where some columns contain csv values. I want to split those csv files on the fly and make them individual columns in the view.

As a start, I simply created a function that returns the input value itself:

CREATE FUNCTION `split`(`csv` TEXT, `pos` INT)
RETURNS varchar(255) CHARSET utf8
return csv

Now the interesting part: as soon as I make use of the split function in my CREATE VIEW AS SELECT split(mytable.csv, 3) as split FROM mytable table, the execution time of SELECT queries is doubled (even though I simply return the value itself so far!).

Isn't that strange? Or is this supposed to work like that, as mysql cannot optimized queries when using functions?

Lateron I want to replace the function as follows. But that does not matter for the main question part:

CREATE FUNCTION split(csv TEXT, pos INT)
RETURNS varchar(255) CHARSET utf8
return substring_index(substring_index(csv, ',', pos), ',', -1)

Then a split('my,example,split,string', 2) would extract "example".

membersound
  • 81,582
  • 193
  • 585
  • 1,120
  • Maybe it has something to do with underlying data coming in, and your split function returning capacity of 255. If doing for many columns, how many. Post the code of the function, what is IT doing. Also suggest editing your current post and showing sample of the incoming data and what the splitting should result with. – DRapp May 04 '20 at 13:13
  • Well as written, even a function that does nothing beside returning the input value itself, that doubles the execution time. And that is mainly what concerns me. – membersound May 04 '20 at 13:21
  • Still unsure of actual data, but have you tried returning a smaller string just for grins? Say a varchar(20)? Does that make any change to it? – DRapp May 04 '20 at 13:32
  • Good idea, but unfortunately it does not make any difference. Also returning `TEXT` makes not difference. I assume the main problem is calling a function at all? – membersound May 04 '20 at 13:34
  • How about using SQL directly such as: cast( substring_index(substring_index(csv, ',', pos), ',', -1 as varchar(255) ) as YourColumn…. Just to see if calling the native substring function is doing something different than the extra layer of your function? – DRapp May 04 '20 at 13:36
  • But that's exactly my point: as soon as I'm using the `split(my.field, 2)` in my query (where the function simply returns the value itself without logic), then the query is slower. – membersound May 04 '20 at 13:38
  • This is just a shot in the dark (I don't even know if MySQL actually does something with it) but, does defining the function as `DETERMINISTIC` alter these figures? – Álvaro González May 04 '20 at 13:39
  • Indeed my function probably should be `deterministic`, sadly no difference.. – membersound May 04 '20 at 13:44
  • Is this exclusive to views? Is the query itself (when ran alone) unaffected? Perhaps you've just hit an optimisation loophole in MySQL internals (just like e.g. subqueries where terribly optimised in earlier versions). – Álvaro González May 04 '20 at 13:54

0 Answers0