Here are some regex patterns which can be used for obfuscating request body data in various formats.
Of course, the first thing you need to do is add the obfuscated data to the log line format with the log_format directive:
log_format custom '$remote_addr - $remote_user [$time_local] '
'"$request" "$obfuscated_request_body" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
Let's look at the following POST body data formats (assuming the field we need to obfuscate is password).
- Request body is a JSON string (typical of REST API request)
JSON sample:
{"email":"test@test.com","password":"myPassword"}
Escaped JSON string:
{\x22email\x22:\x22test@test.com\x22,\x22password\x22:\x22myPassword\x22}
nginx map block:
map $request_body $obfuscated_request_body {
"~(.*[{,]\\x22password\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
default $request_body;
}
- Request body is a JSON array of name and value pairs (returned by the jQuery serializeArray() function)
JSON sample:
[{"name":"email","value":"test@test.com"},{"name":"password","value":"myPassword"}]
Escaped JSON string:
[{\x22name\x22:\x22email\x22,\x22value\x22:\x22test@test.com\x22},{\x22name\x22:\x22password\x22,\x22value\x22:\x22myPassword\x22}]
nginx map block:
map $request_body $obfuscated_request_body {
"~(.*[\[,]{\\x22name\\x22:\\x22password\\x22,\\x22value\\x22:\\x22).*?(\\x22}[,\]].*)" $1********$2;
default $request_body;
}
- Request body is a urlencoded string (submitted by an HTML form with enctype="application/x-www-form-urlencoded")
POST body sample:
login=test%40test.com&password=myPassword
nginx map block:
map $request_body $obfuscated_request_body {
~(^|.*&)(password=)[^&]*(&.*|$) $1$2********$3;
default $request_body;
}
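If you want to check these three patterns outside nginx before touching the config, a rough Python sketch like the one below may help. It is not part of the nginx setup. The MAPS table and the obfuscate helper are names made up for this illustration. Python's re module is not PCRE, but these patterns only use syntax both engines share. I also assume that nginx turns \\x22 into \x22 while reading the config, so the regex engine sees the hex escape for a double quote and matches the raw body.

```python
import re

# (pattern, replacement) pairs mirroring the three map blocks above.
# nginx writes the result as $1********$2; Match.expand uses \g<1> instead.
# Assumption: the regex engine receives \x22 (hex escape for a double quote).
MAPS = {
    "json": (
        r'(.*[{,]\x22password\x22:\x22).*?(\x22[,}].*)',
        r'\g<1>********\g<2>',
    ),
    "json_array": (
        r'(.*[\[,]{\x22name\x22:\x22password\x22,\x22value\x22:\x22).*?(\x22}[,\]].*)',
        r'\g<1>********\g<2>',
    ),
    "urlencoded": (
        r'(^|.*&)(password=)[^&]*(&.*|$)',
        r'\g<1>\g<2>********\g<3>',
    ),
}


def obfuscate(kind: str, body: str) -> str:
    """Emulate one map block: masked body on a match, unchanged body otherwise."""
    pattern, replacement = MAPS[kind]
    m = re.search(pattern, body)
    # The else branch corresponds to "default $request_body;".
    return m.expand(replacement) if m else body


if __name__ == "__main__":
    print(obfuscate("json", '{"email":"test@test.com","password":"myPassword"}'))
    print(obfuscate("urlencoded", "login=test%40test.com&password=myPassword"))
```

A body without a password field falls through to the default branch and comes back unchanged, which is what the default line of each map block does.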
If you need to obfuscate more than one data field, you can chain several map transformations:
log_format custom '$remote_addr - $remote_user [$time_local] '
'"$request" "$obfuscated_request_body_2" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
map $request_body $obfuscated_request_body_1 {
"~(.*[{,]\\x22password\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
default $request_body;
}
map $obfuscated_request_body_1 $obfuscated_request_body_2 {
"~(.*[{,]\\x22email\\x22:\\x22).*?(\\x22[,}].*)" $1********$2;
default $obfuscated_request_body_1;
}
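To see how the two stages interact, here is a rough Python emulation of the chain. The helper names mask_json_field and obfuscate_chain are invented for this sketch, and I again assume the regex engine receives \x22 as the hex escape for a double quote. The point to notice is that the second stage always works on the output of the first one, including when its own regex does not match. Otherwise a body without an email field would be logged with the password still visible.

```python
import re


def mask_json_field(field: str, body: str) -> str:
    """One map stage: mask the string value of `field`, or pass the body through."""
    pattern = r'(.*[{,]\x22' + re.escape(field) + r'\x22:\x22).*?(\x22[,}].*)'
    m = re.search(pattern, body)
    return m.group(1) + "********" + m.group(2) if m else body


def obfuscate_chain(body: str) -> str:
    stage_1 = mask_json_field("password", body)   # plays the role of $obfuscated_request_body_1
    stage_2 = mask_json_field("email", stage_1)   # plays the role of $obfuscated_request_body_2
    return stage_2


if __name__ == "__main__":
    print(obfuscate_chain('{"email":"test@test.com","password":"myPassword"}'))
    print(obfuscate_chain('{"password":"myPassword"}'))
```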
All of the regexes given above work only with the escape=default escaping mode of the log_format nginx directive! If for some reason you need to change this mode to escape=json (available from nginx 1.11.8) or escape=none (available from nginx 1.13.10), I built regexes for these escaping modes too, but for some strange reason I couldn't get them to work with nginx until I specified the pcre_jit on; directive (although they pass other PCRE tests). For those who are interested, these regexes are:
- for escape=json escaping mode:
map $request_body $obfuscated_request_body {
"~(.*[{,]\\\"password\\\":\\\")(?:[^\\]|\\{3}\"|\\{2}[bfnrt]|\\{4})*(\\\"[,}].*)" $1********$2;
default $request_body;
}
for JSON string, and
map $request_body $obfuscated_request_body {
"~(.*[\[,]{\\\"name\\\":\\\"password\\\",\\\"value\\\":\\\")(?:[^\\]|\\{3}\"|\\{2}[bfnrt]|\\{4})*(\\\"}[,\]].*)" $1********$2;
default $request_body;
}
for JSON array of name and value pairs.
- for escape=none escaping mode:
map $request_body $obfuscated_request_body {
"~(.*[{,]\"password\":\")(?:[^\\\"]|\\.)*(\"[,}].*)" $1********$2;
default $request_body;
}
for JSON string, and
map $request_body $obfuscated_request_body {
"~(.*[\[,]{\"name\":\"password\",\"value\":\")(?:[^\\\"]|\\.)*(\"}[,\]].*)" $1********$2;
default $request_body;
}
for JSON array of name and value pairs.
Bonus - obfuscating GET request query parameters
Sometimes people also need to obfuscate data passed as GET request query parameters. To do this while preserving the original nginx access log format, let's look at the default access log format first:
log_format combined '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
The nginx built-in $request variable can be represented as the sequence of variables $request_method $request_uri $server_protocol:
log_format combined '$remote_addr - $remote_user [$time_local] '
'"$request_method $request_uri $server_protocol" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
We need to obfuscate part of the $request_uri variable data:
log_format custom '$remote_addr - $remote_user [$time_local] '
'"$request_method $obfuscated_request_uri $server_protocol" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
map $request_uri $obfuscated_request_uri {
~(.+\?)(.*&)?(password=)[^&]*(&.*|$) $1$2$3********$4;
default $request_uri;
}
To obfuscate several query parameters, you can chain several map translations as shown above.
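As a quick offline check of the query string pattern, here is a small Python sketch of the same idea. It is an emulation only: mask_query_param is a made-up helper, and the token parameter is a hypothetical second field used to illustrate chaining. Note that the optional group (.*&)? captures nothing when the parameter comes first, which nginx substitutes as an empty string and which the sketch handles explicitly.

```python
import re


def mask_query_param(name: str, uri: str) -> str:
    """Emulate one map stage over $request_uri for the query parameter `name`."""
    pattern = r'(.+\?)(.*&)?(' + re.escape(name) + r'=)[^&]*(&.*|$)'
    m = re.search(pattern, uri)
    if m is None:
        return uri  # corresponds to "default $request_uri;"
    before = m.group(2) or ""  # optional group is None when nothing precedes the parameter
    return m.group(1) + before + m.group(3) + "********" + m.group(4)


def obfuscate_uri(uri: str) -> str:
    # Two chained stages; each one takes the previous stage's output as input.
    return mask_query_param("token", mask_query_param("password", uri))


if __name__ == "__main__":
    print(obfuscate_uri("/login?user=bob&password=secret&token=abc123"))
```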
Update - safety considerations
Alvin Thompson commented on the OP's question, mentioning some attack vectors such as very large compressed requests.
It is worth mentioning that nginx will log these requests "as-is" in their compressed form, so log files will not grow in an unpredictable way.
Assuming our log file has the following format:
log_format debug '$remote_addr - $remote_user [$time_local] '
'"$request" $request_length $content_length '
'"$request_body" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
a request with a gzipped body of 5,000 spaces will be logged as
127.0.0.1 - - [09/Feb/2020:05:27:41 +0200] "POST /dump.php HTTP/1.1" 193 41 "\x1F\x8B\x08\x00\x00\x00\x00\x00\x00\x0B\xED\xC11\x01\x00\x00\x00\xC2\xA0*\xEB\x9F\xD2\x14~@\x01\x00\x00\x00\x00o\x03`,\x0B\x87\x88\x13\x00\x00" 200 6881 "-" "curl/7.62.0"
As you can see, the $request_length and $content_length values (193 and 41) reflect the length of the incoming data from the client and not the byte count of the decompressed data stream.
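To get a feeling for the size difference, the following Python snippet compresses the same kind of payload. It is only an illustration: the exact compressed size depends on the compressor and its settings, so it does not have to come out as the 41 bytes seen in the log line above.

```python
import gzip

body = b" " * 5000               # 5,000 spaces, as in the logged request
compressed = gzip.compress(body)

# The client sends (and nginx logs) the compressed bytes, so Content-Length
# is the small number, not 5000.
print("uncompressed:", len(body), "bytes")
print("compressed:  ", len(compressed), "bytes")
```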
In order to filter abnormally large uncompressed requests, you can additionally filter request bodies by their length:
map $content_length $processed_request_body {
# Here are some regexes for log filtering by POST body maximum size
# (only one should be used at a time)
# Content length value is 4 digits or more ($content_length > 999)
"~(.*\d{4})" "Too big (request length $1 bytes)";
# Content length > 499
"~^((?:[5-9]|\d{2,})\d{2})" "Too big (request length $1 bytes)";
# Content length > 2999
"~^((?:[3-9]|\d{2,})\d{3})" "Too big (request length $1 bytes)";
default $request_body;
}
map $processed_request_body $obfuscated_request_body {
...
default $processed_request_body;
}
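The three size regexes are easy to get subtly wrong, so it may be worth checking them against a range of numbers before relying on them. Below is a minimal Python sketch of such a check. The TOO_BIG table and the is_too_big helper are names invented for this example, and Python's re is used as a stand-in for PCRE, which should behave the same for these simple patterns.

```python
import re

# Limit -> regex taken from the map block above (one regex per limit).
TOO_BIG = {
    999: re.compile(r'(.*\d{4})'),
    499: re.compile(r'^((?:[5-9]|\d{2,})\d{2})'),
    2999: re.compile(r'^((?:[3-9]|\d{2,})\d{3})'),
}


def is_too_big(limit: int, content_length: str) -> bool:
    """True when the regex for `limit` matches the Content-Length header value."""
    return TOO_BIG[limit].search(content_length) is not None


if __name__ == "__main__":
    for limit in TOO_BIG:
        # Compare the regex verdict with plain integer comparison.
        mismatches = [n for n in range(20000) if is_too_big(limit, str(n)) != (n > limit)]
        print(limit, "mismatches:", mismatches)
```

An empty value (a request without a Content-Length header) matches none of the patterns, so such requests fall through to the default branch.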