FIND_IN_SET - Getting partial output due to doubled values

Question

Table "skus":

plant_species_ids
1,2,3,2,3,1

Table "plant_species"

id	name
1	Plant 1
2	Plant 2
3	Plant 3

The relevant part of my MySQL code:

...
GROUP_CONCAT(plant_species.name SEPARATOR ', ') as plant_species_names
...
LEFT JOIN plant_species 
       ON FIND_IN_SET(plant_species.id, skus.plant_species_ids) > 0

Output I get:

Plant 1,Plant 2,Plant 3

Result I need to get:

Plant 1,Plant 2,Plant 3, Plant 2,Plant 3,Plant 1

From what I can tell it's acting like I'm using DISTINCT when I'm not.

Any ideas how to do this?

Edit

Here is my final code for anyone else that has this same problem: ``

SELECT orders.id, orders.customer_id, orders.invoice_amount, orders.delivery_date, orders.delivery_time, GROUP_CONCAT(skus.name SEPARATOR ', ') as sku_name, orders.sku_ids as sku_ids, orders.sku_weights as sku_weights

FROM orders

JOIN json_table(concat('[',orders.sku_ids,']'), '$[*]' columns (id int path '$')) AS species_ids

LEFT JOIN skus on skus.id=species_ids.id
WHERE orders.id=10
GROUP BY orders.id;

``

much easier to answer if you provide create table and full select statements (even if you make up the other parts of it) — ysth, May 22 '23 at 23:39
what does `select version();` show? how many ids at most will there be in plant_species_ids? — ysth, May 22 '23 at 23:40

ysth · Accepted Answer · 2023-05-23T00:21:11.577

A simpler way is to wrap your ids column in brackets and expand it using json_table, but I'm not sure how to do that as a left join.

select skus.id, GROUP_CONCAT(plant_species.name SEPARATOR ', ') as plant_species_names
from skus
join json_table(concat('[',skus.plant_species_ids,']'), '$[*]' columns (id int path '$')) as species_ids
left join plant_species on plant_species.id=species_ids.id
group by skus.id

fiddle

The original select doesn't find the later duplicate ids because the join is only comparing skus rows to plant_species rows, not the individual ids in plant_species_ids, and find_in_set will find only the first instance of a plant_species id in plant_species_ids.

Using `JSON_TABLE` was very brilliant. – lemon May 23 '23 at 00:36 — lemon, May 23 '23 at 00:36

lemon · Answer 2 · 2023-05-22T23:48:09.957

If you're using MySQL 8.0, you may use a recursive query to unnest all your comma separated "plant_species_ids" data:

base step: compute the length of your string, by counting the commas
recursive step: extract your values among commas, one at each step, using SUBSTRING_INDEX and decreasing an integer index

Stop the recursion once the index has reached 0. Then you can join your unnested table with the "plant_species" table, and apply aggregation.

WITH RECURSIVE cte AS (
    SELECT plant_species_ids,
           '-1' AS element,
           CHAR_LENGTH(REGEXP_REPLACE(plant_species_ids, '[^,]', ''))+1 AS num_elements
    FROM skus
  
    UNION ALL
  
    SELECT plant_species_ids, 
           SUBSTRING_INDEX(SUBSTRING_INDEX(plant_species_ids, ',', num_elements), ',', -1),
           num_elements - 1
    FROM cte
    WHERE num_elements > 0
)
SELECT GROUP_CONCAT(ps.name ORDER BY cte.num_elements) AS plants
FROM       cte
INNER JOIN plant_species ps
        ON cte.element = ps.id
WHERE cte.element > 0

Output:

plants
Plant 1,Plant 2,Plant 3,Plant 2,Plant 3,Plant 1

Check the demo here.

Note: As a general consideration, it's better not to store data as comma-separated values, as it's very difficult to apply data manipulation on such kind of data, and typically need to employ quite inefficient and ugly queries to deal with it.

That does work, thank you! It is rather complicated. Is there any way to simplify it? — Scott Paterson, May 22 '23 at 23:48
It's the price of having comma-separated-valued fields, as stated in the note. You can't do much about this complexity unluckily. — lemon, May 22 '23 at 23:48

FIND_IN_SET - Getting partial output due to doubled values

2 Answers2