0

Let me explain . I have a json data set with values something like this:

"clientRequest": {
        "uri": "/flow.php?id=FS-6097&utm_source=facebook&utm_medium=cpc&utm_term=cold",
        "body": null,
        "bytes": 608,
        "flags": 1,     
        "referer": "http://m.facebook.com/",        
        "httpMethod": "GET",
        "httpProtocol": "HTTP/1.1"
    },

Now i want to create a Virtual column which only fetch value "FS-6097" from "clientRequest.uri". So a new column which just contains "FS-6097", not just whole uri.

I am already creating a virtual column for whole uri like below, which is fine.

ALTER TABLE `table_xyz_json`
    ADD `url` TEXT
    GENERATED ALWAYS AS (JSON_UNQUOTE(
        JSON_EXTRACT(json_data, '$.clientRequest.uri')
    ))
    VIRTUAL NULL;

any help would be highly appreciated .

RashFlash
  • 992
  • 2
  • 20
  • 40

1 Answers1

1

One option is to use SUBSTRING_INDEX function:

ALTER TABLE `table_xyz_json`
  ADD `url` TEXT GENERATED ALWAYS
  AS (`json_data` ->> '$.clientRequest.uri') VIRTUAL NULL,
  ADD `id` VARCHAR(7) GENERATED ALWAYS
  AS (SUBSTRING_INDEX(
        SUBSTRING_INDEX(`url`,
        'id=', -1),
      '&', 1)
     ) VIRTUAL NULL;

See db-fiddle.

UPDATE

You should adjust the expression of the column generated according to all the business rules that apply. For example, if a rule is that id may not exist you can try something like:

ALTER TABLE `table_xyz_json`
  ADD `url` TEXT GENERATED ALWAYS
  AS (`json_data` ->> '$.clientRequest.uri') VIRTUAL NULL,
  ADD `id` VARCHAR(7) GENERATED ALWAYS
  AS (IF(`url` REGEXP 'id=',
         SUBSTRING_INDEX(
           SUBSTRING_INDEX(`url`,
           'id=', -1),
         '&', 1),
         NULL
        )
     ) VIRTUAL NULL;

See db-fiddle.

wchiquito
  • 16,177
  • 2
  • 34
  • 45
  • thanks this is great. What if i want some adjustment like. sometimes url contains "lp=XXXX" instead of "id=XXXX" . so i can make sure that column contains both of this – RashFlash Sep 13 '17 at 07:55
  • Hi, i applied it on my Dataset, and found that its not working as it should be. example for urls which don't have "id=" , like " /apple-touch-icon-precomposed.png " , it generates virtual col value as "apple" . – RashFlash Sep 14 '17 at 02:14