0

I have a table with a column "TAGS". Each row in this column has a bunch of dictionaries separated by commas. It looks like this:

Row 1:

{
   "id":"334",
   "name":"A"
},
{
   "id":"8999",
   "name":"B"
}

Row 2:

{
   "id":"334",
   "name":"C"
},
{
   "id":"8999",
   "name":"D"
}

I want to extract the "names" from the dictionaries and group them for each row. I want to create a new column "NAMES" that contains a list of all names. For example this:

NAMES
["A", "B"]
["C", "D"]
Select * from TAGS_TABLE

I was trying some thing like this:

SELECT regexp_substr(regexp_substr(regexp_substr(TAGS, '"name\\W+\\w+"'),'"\\w+',2),'\\w+') FROM TAGS_TABLE

SELECT listagg(regexp_substr(regexp_substr(regexp_substr(TAGS, '"name\\W+\\w+"'),'"\\w+',2),'\\w+'), ', ') FROM TAGS_TABLE

and although it works correctly in extracting the names, it only gives the first row's "name" values. so regexp_substr is being applied to just the first row, instead of all rows separately.

How can I ensure that the regexp function is applied to each row separately?

x89
  • 2,798
  • 5
  • 46
  • 110

1 Answers1

1

Instead of REGEXP_SUBSTR, you can use REGEXP_SUBSTR_ALL:

Instead of returning the first result, it returns "an ARRAY that contains all substrings that match a regular expression within a string".

Felipe Hoffa
  • 54,922
  • 16
  • 151
  • 325