1

Given some json documents like this:

{
   "_id":"00006073",
    "subscribersIds":[
      170968,
      225647
   ]
}
-----------------------------------
{
   "_id":"00006072",
   "subscribersIds":[
      170968
   ]
}
--------------------------------
{
   "_id":"00006074,
   "subscribersIds":[
      228195,
      225647
   ]
}

Do you know how can I get a list of the subscribersIds, without repetition? The result should be something like this 170968, 225647, 228195, because I will need to use the result of this query as a condition of another one.

For Couchebase there is the "UNNEST" command that does it, however I didn't find the properly way to do it in MySQL 8, as SELECT DISTINCT doc ->> '$.subscribersIds[*]' FROM customers will return [170968,225647],[170968],[228195,225647]

Thanks in advance!

Jeff Mira
  • 25
  • 5

1 Answers1

2
mysql> select * from mytable;
+----------+------------------+
| _id      | subscriberIds    |
+----------+------------------+
| 00006072 | [170968]         |
| 00006073 | [170968, 225647] |
| 00006074 | [228195, 225647] |
+----------+------------------+

mysql> select j.subscriberId from mytable, 
  json_table(mytable.subscriberIds, '$[*]' columns (subscriberId int path '$')) j;
+--------------+
| subscriberId |
+--------------+
|       170968 |
|       170968 |
|       225647 |
|       228195 |
|       225647 |
+--------------+

mysql> select distinct j.subscriberId from mytable,
  json_table(mytable.subscriberIds, '$[*]' columns (subscriberId int path '$')) j;
+--------------+
| subscriberId |
+--------------+
|       170968 |
|       225647 |
|       228195 |
+--------------+

This is a fairly complex query to write every time you want to get the set of distinct subscriberIds.

It would be much easier if you did not use JSON at all, but stored the id's in a normalized manner, one per row in a second table.

mysql> create table mySubscribers (_id char(8), subscriberId int, primary key (_id, subscriberId));

mysql> insert into mySubscribers (_id, subscriberId) select _id, subscriberId from mytable, json_table(subscriberIds, '$[*]' columns (subscriberId int path '$')) j;
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * from mySubscribers;
+----------+--------------+
| _id      | subscriberId |
+----------+--------------+
| 00006072 |       170968 |
| 00006073 |       170968 |
| 00006073 |       225647 |
| 00006074 |       225647 |
| 00006074 |       228195 |
+----------+--------------+

mysql> select distinct subscriberId from mySubscribers;
+--------------+
| subscriberId |
+--------------+
|       170968 |
|       225647 |
|       228195 |
+--------------+
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Thank you! Actually I'm working on the migration from Couchbase do MySQL 8 of a project, and unfortunately I cant reorganize the documents structure now (perhaps in the future), that is why I'm having to do this kind of query. Thanks a lot for your help! – Jeff Mira May 27 '21 at 21:40
  • do you prefer comma-join over explicit 'cross join' for json_table? I'm so used to thinking of it as code smell but I can see why saying 'cross join' doesn't quite fit something like json_table where there is a join criterion, just not in an explicit on clause. (so happy json_table is in mariadb 10.6 so I can get more experience with it) – ysth May 27 '21 at 21:59
  • If it were production code, I would use `cross join`, just to remain consistent in my habit of using SQL-92 syntax in all cases. I was using the SQL-89 style (comma-join) here just to be brief. – Bill Karwin May 27 '21 at 22:02