3

It's common to diagnose web app with HAR file especially when seeking technical support from another person. However, HAR file contains sensitive info like request cookies. For example, if following this guide, it could share cookie values to technical support if HAR file is not cleaned up.

So I prefer deleting request cookies before sharing the HAR file.

Is there any simple command to do it?

shawnzhu
  • 7,233
  • 4
  • 35
  • 51
  • 1
    I use `cat given.har | jq 'del(.log.entries[].request.headers[] | select(.name == "Cookie"))' | jq 'del(.log.entries[].request.cookies)'` but expecting better answer(s) – shawnzhu Jan 22 '21 at 22:35
  • 1
    Please follow the [mcve] guidelines as much as possible. In particular, it would be useful to see a relevant extract from `cat given.har` – peak Jan 22 '21 at 23:05
  • @shawnzhu The use-case seems legitimate and practical. Since you're asking for a *better* answer, how would a *better* answer improve on your existing oneliner? Thanks – wsdookadr Jan 24 '21 at 21:28
  • 4
    @shawnzhu For what it's worth, there is a [HAR sanitizer here](https://github.com/google/har-sanitizer) written by Google, which [was also dockerized](https://www.smcmaster.com/2020/03/har-sanitizer-in-dockerhub.html) and available on dockerhub. It's capable of stripping [cookies, headers and various query string parameters](https://github.com/google/har-sanitizer/blob/master/harsanitizer/harsanitizer.py#L620) . – wsdookadr Jan 24 '21 at 21:35
  • @wsdookadr my current method is just a quick idea. I would love to get feedback on anything else should be cleaned up besides cookies. I like the answer of the HAR sanitizer and would love to accept it as an answer. – shawnzhu Jan 27 '21 at 14:36
  • using jq seems a perfectly reasonable solution to me. Never heard about HAR sanitizer, but it seems another good option. An alternative would be to not record the HAR file in a browser, and use a tool like https://www.webpagetest.org/ instead. – jackdbd May 07 '21 at 11:15

1 Answers1

2

To indiscriminately delete all request.cookies:

jq 'del(.. | .request?.cookies?)' www.ibm.com.har

As an example, given this JSON file:

$ jq -M . /tmp/hartest.json
{
  "one": {
    "foo": {
      "bar": true,
      "baz": 23
    }
  },
  "two": {
    "foo": {
      "bar": "Hello",
      "baz": "World"
    }
  },
  "three": {
    "fu": {
      "bar": "gone",
      "baz": "forgotten"
    }
  }
}

I can get rid of all 'bar' keys with this command:

jq -M 'del( .. | .bar?)' /tmp/hartest.json

Running the before and after through diff:

$ diff <(jq -M . /tmp/hartest.json) <(jq -M 'del( .. | .bar?)' /tmp/hartest.json )
4d3
<       "bar": true,
10d8
<       "bar": "Hello",
16d13
<       "bar": "gone",

If I just wanted to get rid of '.foo.bar' only, I would use this:

jq -M 'del( .. | .foo?.bar?)' /tmp/hartest.json

which, again, running through diff gives us:

$ diff <(jq -M . /tmp/hartest.json) <(jq -M 'del( .. | .foo?.bar?)' /tmp/hartest.json )
4d3
<       "bar": true,
10d8
<       "bar": "Hello",
Joe Casadonte
  • 15,888
  • 11
  • 45
  • 57