5

I want to log the $request_body in the access logs.

But some of the requests have JSON fields that are sensitive, such as passwords.

Example:

[2019-03-28] 201 - POST /api/user/add HTTP/1.1 - {\x22email\x22:\x22test@test.com\x22,\x22password\x22:\x22myPassword\x22}

Is there a way to obfuscate the password value so the output would look something like this:

[2019-03-28] 201 - POST /api/user/add HTTP/1.1 - {\x22email\x22:\x22test@test.com\x22,\x22password\x22:\x22****\x22}
Raed
  • 4
    You really don't want to log the request bodies. Even if you figure out how to mask the passwords, you'll have to remember to update the masking code whenever you create a new API with sensitive information or modify an existing API. In my experience (25 years), no one ever remembers to. It also opens you up to other attack vectors--for example, what happens if someone crafts requests with a 1GB parameter (easy and fast enough with HTTP compression) and you're trying to log them all? – Alvin Thompson Apr 03 '19 at 19:03

2 Answers

8

Here are some regex patterns which can be used to obfuscate request body data in various formats.

Of course, the first thing you need to do is add the obfuscated data to the log line format with the log_format directive:

log_format custom '$remote_addr - $remote_user [$time_local] '
                    '"$request" "$obfuscated_request_body" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';
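For the format to take effect, it has to be referenced from an access_log directive. The following is only a sketch: the log path and the backend address are hypothetical placeholders. Keep in mind that, per the nginx documentation, $request_body is only available in locations processed by proxy_pass, fastcgi_pass, uwsgi_pass or scgi_pass, and only when the body was read into a memory buffer.

```nginx
# log_format and the map blocks are defined at the http level;
# this server block only selects the format.
server {
    listen 80;

    location /api/ {
        # Hypothetical log path; "custom" is the log_format name defined above.
        access_log /var/log/nginx/api_access.log custom;

        # Bodies larger than this buffer are written to a temporary file,
        # in which case $request_body is not available for logging.
        client_body_buffer_size 16k;

        # Hypothetical backend address.
        proxy_pass http://127.0.0.1:8080;
    }
}
```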

Let's look at the following post body data formats (assuming the field we need to obfuscate is password).

  • Request body is a JSON string (typical of REST API request)

JSON sample:

{"email":"test@test.com","password":"myPassword"}

Escaped JSON string:

{\x22email\x22:\x22test@test.com\x22,\x22password\x22:\x22myPassword\x22}

nginx map block:

map $request_body $obfuscated_request_body {
    "~(.*[{,]\\x22password\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
    default $request_body;
}
  • Request body is a JSON array of name and value pairs (returned by jQuery serializeArray() function)

JSON sample:

[{"name":"email","value":"test@test.com"},{"name":"password","value":"myPassword"}]

Escaped JSON string:

[{\x22name\x22:\x22email\x22,\x22value\x22:\x22test@test.com\x22},{\x22name\x22:\x22password\x22,\x22value\x22:\x22myPassword\x22}]

nginx map block:

map $request_body $obfuscated_request_body {
    "~(.*[\[,]{\\x22name\\x22:\\x22password\\x22,\\x22value\\x22:\\x22).*?(\\x22}[,\]].*)" $1********$2;
    default $request_body;
}
  • Request body is an urlencoded string (submitted by HTML form with enctype="application/x-www-form-urlencoded")

POST body sample:

login=test%40test.com&password=myPassword

nginx map block:

map $request_body $obfuscated_request_body {
    ~(^|.*&)(password=)[^&]*(&.*|$) $1$2********$3;
    default $request_body;
}

If you need to obfuscate more than one data field, you can chain several map transformations:

log_format custom '$remote_addr - $remote_user [$time_local] '
                  '"$request" "$obfuscated_request_body_2" $status $body_bytes_sent '
                  '"$http_referer" "$http_user_agent"';

map $request_body $obfuscated_request_body_1 {
    "~(.*[{,]\\x22password\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
    default $request_body;
}

map $obfuscated_request_body_1 $obfuscated_request_body_2 {
    "~(.*[{,]\\x22email\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
    default $obfuscated_request_body_1;
}

All of the regexes given above work only with the escape=default escaping mode of the log_format directive! If for some reason you need to change this mode to escape=json (available from nginx 1.11.8) or escape=none (available from nginx 1.13.10), I built regexes for these escaping modes too, but for some strange reason I couldn't get them to work with nginx until I specified the pcre_jit on; directive (although they pass other PCRE tests). For those who are interested, these regexes are:

  • for escape=json escaping mode:
map $request_body $obfuscated_request_body {
    "~(.*[{,]\\\"password\\\":\\\")(?:[^\\]|\\{3}\"|\\{2}[bfnrt]|\\{4})*(\\\"[,}].*)" $1********$2;
    default $request_body;
}

for JSON string, and

map $request_body $obfuscated_request_body {
    "~(.*[\[,]{\\\"name\\\":\\\"password\\\",\\\"value\\\":\\\")(?:[^\\]|\\{3}\"|\\{2}[bfnrt]|\\{4})*(\\\"}[,\]].*)" $1********$2;
    default $request_body;
}

for JSON array of name and value pairs.

  • for escape=none escaping mode:
map $request_body $obfuscated_request_body {
    "~(.*[{,]\"password\":\")(?:[^\\\"]|\\.)*(\"[,}].*)" $1********$2;
    default $request_body;
}

for JSON string, and

map $request_body $obfuscated_request_body {
    "~(.*[\[,]{\"name\":\"password\",\"value\":\")(?:[^\\\"]|\\.)*(\"}[,\]].*)" $1********$2;
    default $request_body;
}

for JSON array of name and value pairs.
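The pcre_jit directive mentioned above is only valid in the main (top-level) context of nginx.conf, not inside http or server, and it needs a PCRE library built with JIT support. A rough sketch of where it sits, together with a log_format that selects the escaping mode (the format name custom_json and its field layout are hypothetical):

```nginx
# Main context: enable PCRE JIT compilation for regexes known at configuration time.
pcre_jit on;

events {}

http {
    # One of the escape=json map blocks above goes here, at the http level,
    # so that $obfuscated_request_body is defined.

    # escape=json requires nginx 1.11.8 or later.
    log_format custom_json escape=json
        '{"request":"$request","body":"$obfuscated_request_body","status":"$status"}';
}
```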

Bonus - obfuscating GET request query parameters

Sometimes people also need to obfuscate data passed as GET request query parameters. To do this while preserving the original nginx access log format, let's look at the default access log format first:

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

The nginx built-in $request variable can be represented as the sequence of variables $request_method $request_uri $server_protocol:

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request_method $request_uri $server_protocol" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

We need to obfuscate part of $request_uri variable data:

log_format custom '$remote_addr - $remote_user [$time_local] '
                  '"$request_method $obfuscated_request_uri $server_protocol" $status $body_bytes_sent '
                  '"$http_referer" "$http_user_agent"';

map $request_uri $obfuscated_request_uri {
    ~(.+\?)(.*&)?(password=)[^&]*(&.*|$) $1$2$3********$4;
    default $request_uri;
}

To obfuscate several query parameters you can chain several map translations as shown above.
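Applied to query parameters, such a chain could look like the sketch below. It would replace the single map above, and token is a hypothetical second parameter name chosen for illustration.

```nginx
# Step 1: mask the password parameter.
map $request_uri $obfuscated_request_uri_1 {
    ~(.+\?)(.*&)?(password=)[^&]*(&.*|$) $1$2$3********$4;
    default $request_uri;
}

# Step 2: mask the (hypothetical) token parameter in the output of step 1.
map $obfuscated_request_uri_1 $obfuscated_request_uri {
    ~(.+\?)(.*&)?(token=)[^&]*(&.*|$) $1$2$3********$4;
    # Fall back to the result of step 1, not to the raw $request_uri,
    # otherwise the password would reappear whenever token is absent.
    default $obfuscated_request_uri_1;
}
```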

Update - safety considerations

Alvin Thompson commented on the OP's question, mentioning some attack vectors such as very large compressed requests. It is worth mentioning that nginx will log these requests "as-is" in their compressed form, so log files will not grow in an unpredictable way.

Assuming our log file has the following format:

log_format debug '$remote_addr - $remote_user [$time_local] '
                 '"$request" $request_length $content_length '
                 '"$request_body" $status $body_bytes_sent '
                 '"$http_referer" "$http_user_agent"';

a request with a gzipped body of 5,000 spaces will be logged as

127.0.0.1 - - [09/Feb/2020:05:27:41 +0200] "POST /dump.php HTTP/1.1" 193 41 "\x1F\x8B\x08\x00\x00\x00\x00\x00\x00\x0B\xED\xC11\x01\x00\x00\x00\xC2\xA0*\xEB\x9F\xD2\x14~@\x01\x00\x00\x00\x00o\x03`,\x0B\x87\x88\x13\x00\x00" 200 6881 "-" "curl/7.62.0"

As you can see, the $request_length and $content_length values (193 and 41) reflect the length of the incoming data from the client, not the byte count of the decompressed data stream.

In order to filter abnormally large uncompressed requests, you can additionally filter request bodies by their length:

map $content_length $processed_request_body {
    # Here are some regexes for log filtering by POST body maximum size
    # (only one should be used at a time)

    # Content length value is 4 digits or more ($content_length > 999)
    "~(.*\d{4})" "Too big (request length $1 bytes)";

    # Content length > 499
    "~^((?:[5-9]|\d{2,})\d{2})" "Too big (request length $1 bytes)";

    # Content length > 2999
    "~^((?:[3-9]|\d{2,})\d{3})" "Too big (request length $1 bytes)";

    default $request_body;
}

map $processed_request_body $obfuscated_request_body {
    ...
    default $processed_request_body;
}
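Put together, the two steps might look like the following sketch. It assumes the 4-digit size rule and the JSON password pattern from earlier in this answer; substitute whichever patterns fit your own request format.

```nginx
# Step 1: replace bodies of 1000 bytes or more with a short notice.
# A request sent without a Content-Length header falls through to default.
map $content_length $processed_request_body {
    "~(.*\d{4})" "Too big (request length $1 bytes)";
    default $request_body;
}

# Step 2: mask the password in whatever step 1 let through.
map $processed_request_body $obfuscated_request_body {
    "~(.*[{,]\\x22password\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
    default $processed_request_body;
}
```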
Ivan Shatsky
-1

Look at this blog post, which talks about masking user data in logs: https://www.nginx.com/blog/data-masking-user-privacy-nginscript/

Faisal Memon
  • 1
    From [How to answer](https://stackoverflow.com/help/how-to-answer): *Links to external resources are encouraged, but please add context around the link so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline.* – Don't Panic Feb 04 '20 at 09:43