0

I have a table with a column "TAGS". Each row in this column has a bunch of dictionaries separated by commas. It looks like this:

{
    "id": "334",
    "name": "A",
    "synonyms": "tul",
    "path": [
        "179",
        "1689",
    ]
},
{
    "id": "8999",
    "name": "B",
    "synonyms": "hh",
    "path": [
        "1098",
        "167",
    ]
}

I want to create a new column "NAMES" that contains a list of all names. For example this:

NAMES
["A", "B"]
Select * from TAGS_TABLE

How can I do this?

Simeon Pilgrim
  • 22,906
  • 3
  • 32
  • 45
x89
  • 2,798
  • 5
  • 46
  • 110
  • Does this answer your question? [apply regexp\_substr to each row separately](https://stackoverflow.com/questions/73853902/apply-regexp-substr-to-each-row-separately) – Felipe Hoffa Sep 26 '22 at 14:58

1 Answers1

0

Well your data is "almost" JSON, so if we convert it to json, we can then parse it, and flatten it:

with data as (
select parse_json('['||column1||']') as json from values
('{
    "id": "334",
    "name": "A",
    "synonyms": "tul",
    "path": [
        "179",
        "1689",
    ]
},
{
    "id": "8999",
    "name": "B",
    "synonyms": "hh",
    "path": [
        "1098",
        "167",
    ]
}'),
('{
    "id": "334",
    "name": "C",
    "synonyms": "tul",
    "path": [
        "179",
        "1689",
    ]
},
{
    "id": "8999",
    "name": "D",
    "synonyms": "hh",
    "path": [
        "1098",
        "167",
    ]
}')
)
select array_agg(f.value:name) within group (order by f.index) as output
from data d,
  lateral flatten(input=>d.json) f
group by f.seq
order by f.seq

gives:

OUTPUT
[ "A", "B" ]
[ "C", "D" ]

REGEXP_SUBSTR_ALL

As already given to you in your other question...

select regexp_substr_all(column1, '"name"\\s*:\\s*"([^"]*)"',1,1,'e') as answer
from data;
Simeon Pilgrim
  • 22,906
  • 3
  • 32
  • 45