2

My data is dirty, and now I am trying to correct them.

The data is like:

mysql> select attrs from goods  limit 10;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| attrs                                                                                                                                                                                                                                                                    |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"logo": "1", "cover": "", "level": 0, "tag_6": 8, "summary": "L1", "showStatus": 0, "classAmount": 0, "productType": 0, "courseEndTime": 1532572800000, "courseStartTime": 1531706400000}  |
| {"logo": "1", "cover": "", "level": 0, "tag_6": 8, "summary": "L1", "showStatus": 0, "classAmount": 0, "productType": 0, "courseEndTime": 1533717600000, "courseStartTime": 1532851200000}  |
| {"logo": "1", "cover": "", "level": 0, "tag_6": 8, "summary": "L1", "showStatus": 0, "classAmount": 0, "productType": 0, "courseEndTime": 1534851600000, "courseStartTime": 1533985200000}  |
| {"logo": "1", "cover": "", "level": 0, "tag_6": 8, "summary": "L2", "showStatus": 0, "classAmount": 0, "productType": 0, "courseEndTime": 1532594400000, "courseStartTime": 1531728000000}  |
| {"logo": "1", "cover": "", "level": 0, "tag_147": 8, "tag_145": 2 "summary": "L2", "showStatus": 0, "classAmount": 0, "productType": 0, "courseEndTime": 1533728400000, "courseStartTime": 1532862000000}  |
| {"logo": "1", "cover": "", "level": 0, "tag_6": 8, "summary": "L2", "showStatus": 0, "classAmount": 0, "productType": 0, "courseEndTime": 1534819200000, "courseStartTime": 1533952800000}  |
| {"logo": "1", "cover": "", "level": 0, "tag_127": 8, "summary": "NGL", "showStatus": 0, "classAmount": 0, "productType": 0, "courseEndTime": 1532605200000, "courseStartTime": 1531738800000} |
| {"logo": "1", "cover": "", "level": 0, "tag_6": 8, "summary": "NGL", "showStatus": 0, "classAmount": 0, "productType": 0, "courseEndTime": 1533696000000, "courseStartTime": 1532829600000} |
|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

What I want to remove some (key, value) pairs in the column attrs if any of these pairs exists. For example tag_147 and tag_124 are dirty, then any hit pair would be removed.

{"tag_147": 1, "tag_124": 2}  ===>  {}
{"tag_147": 1 "tag_100":2}  ==>  {"tag_100": 2}

How can i achieve it? Thanks.

What I have tried is 1. find attrs that contains keys tag_147 and tag_124. 2. Then update attrs. My original Sql is like:

update goods
set json_remove(*XXX*) 
where (select json_keys(attrs) from goods ) contains  (*tag_147*, *tag_127*)

Here I was blocked by constructing the contains statement...

shijie xu
  • 1,975
  • 21
  • 52

1 Answers1

4

If you just want to delete those two tags from the attrs values that contain both of them, you can use JSON_REMOVE on any attrs values found by JSON_CONTAINS_PATH, using the all mode of JSON_CONTAINS_PATH to only find values which contain both tags:

UPDATE goods
SET attrs = JSON_REMOVE(attrs, '$.tag_147', '$.tag_124')
WHERE JSON_CONTAINS_PATH(attrs, 'all', '$.tag_147', '$.tag_124')

If you want to completely remove rows where the attrs values contains both tags, you can use the same JSON_CONTAINS_PATH condition in a DELETE query:

DELETE
FROM goods
WHERE JSON_CONTAINS_PATH(attrs, 'all', '$.tag_147', '$.tag_124')

Demo on dbfiddle

If you want to delete those tags from the attrs values that contain any of them, you can remove the WHERE clause in the UPDATE i.e.

UPDATE goods
SET attrs = JSON_REMOVE(attrs, '$.tag_147', '$.tag_145')

And to delete rows with attrs values containing any of the tags, change the all parameter to JSON_CONTAINS_PATH to one:

DELETE
FROM goods
WHERE JSON_CONTAINS_PATH(attrs, 'one', '$.tag_147', '$.tag_145')

Demo on dbfiddle

Nick
  • 138,499
  • 22
  • 57
  • 95
  • @Akina without the `WHERE` the `UPDATE` will remove those tags from values which only have one of them instead of both. See https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=79e8a3d66f34beb76065bc39a6f9eb84 – Nick Mar 05 '20 at 05:54
  • @Akina of course that might be what OP wants... I'll add as an edit – Nick Mar 05 '20 at 05:56
  • @Akina I was basing the answer on `then pairs with both keys should all be removed` but the question is unclear, which is why I have edited the answer to include the alternate interpretation – Nick Mar 05 '20 at 06:04
  • 1
    @shijiexu the second `UPDATE` query in my answer will do what you want. `JSON_REMOVE` will only remove the pairs if they actually exist, so you don't need the `WHERE` clause on the `UPDATE` query. See the second demo link... – Nick Mar 05 '20 at 07:29