0

I have a table named 'Articles' that has some rows, inside each row has some image tag that I want to return all of those.

I used this query but it only returns one of those tags per row. And other lines are returned empty.

SELECT REGEXP_SUBSTR(body, '([0-9])+\.(jpg|png|gpeg)') from articles;

enter image description here

Article 1:
    <img src"54545343.png" />
    <img src"24352445.png" />
    <img src"24352435.png" />

article 2: 
<img src"24352435.png" />

article 3:
...

I want all of these images.

thank you for your help

UPDATE version 10.4.19-MariaDB

ali
  • 109
  • 9

1 Answers1

1

You could use a "dirty solution" like this, but it is horribly inefficient:

SELECT a.id, REGEXP_SUBSTR(body, '([0-9])+\.(jpg|png|gpeg)', 1, t.n) AS img
FROM articles a
CROSS JOIN (
    VALUES
        ROW(1), ROW(2), ROW(3), ROW(4), ROW(5),
        ROW(6), ROW(7), ROW(8), ROW(9), ROW(10)
) AS t(n)
HAVING img IS NOT NULL
ORDER BY id, img;

The fourth parameter to REGEXP_SUBSTR is:

occurrence: Which occurrence of a match to search for. If omitted, the default is 1.

The above query tries to retrieve occurrences 1 - 10.


This is probably a good example of

"just because you can, does not mean you should"

but you could use a stored function to extract the content of the image src attributes:

DELIMITER //
CREATE FUNCTION FIND_IMG_SRC(str text, occurrence int)
RETURNS VARCHAR(255)
DETERMINISTIC
BEGIN
    DECLARE loc INT DEFAULT 1;
    DECLARE src_loc INT;
    DECLARE i INT DEFAULT 0;
    DECLARE img_tag text DEFAULT '';
    WHILE(i < occurrence AND loc IS NOT NULL) DO
        SET loc = NULLIF(LOCATE('<img', str, loc + 1), 0);
        SET i = i + 1;
    END WHILE;
    
    IF loc IS NULL THEN
        RETURN NULL;
    ELSE
        # Get complete img tag
        SET img_tag = SUBSTR(str, loc, LOCATE('>', str, loc + 1) - loc + 1);

        # Check that img tag contains src attribute
        SET src_loc = LOCATE('src="', img_tag);
        IF src_loc = 0 THEN
            RETURN NULL;
        ELSE
            # Return content of src attribute
            RETURN SUBSTRING_INDEX(SUBSTR(img_tag, src_loc + 5), '"', 1);
        END IF;
    END IF;
END//
DELIMITER ;

And then the following query:

SELECT a.id, FIND_IMG_SRC(body, t.n) AS img
from articles a
CROSS JOIN (
    SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
    SELECT 6      UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10
) AS t
HAVING img IS NOT NULL
ORDER BY id, img;

And here's a db<>fiddle.

Note: the above stored function expects src="..." and will not work with single quotes or spaces either side of the =. It will also fail if there is a > anywhere inside the <img>.

user1191247
  • 10,808
  • 2
  • 22
  • 32