If you come from App Engine Standard Python (2), the logging experience on Node.js is quite frustrating. The documentation on Google Cloud is a mess: scattered and often outdated. I found no example of best practices for high-performance logging on the Google App Engine Node.js standard environment. So how do you get decent, aggregated logging?
I think part of the problem is that for Python the Google infrastructure calls your code on a per-request basis (somewhat an early implementation of cloud functions), while with Node.js you have a long-running process which the infrastructure contacts via HTTP/TCP/IP on localhost. So the GAE infrastructure cannot easily tell which output belongs to which request.
Google suggests:
To write log entries, we recommend that you integrate the Bunyan or
Winston plugins with Cloud Logging. (source)
But using an in-process Cloud Logging client is generally a problematic approach for the async, single-process server model of Node.js / Express. My testing indicates that every logger call results in one outbound HTTP request to Stackdriver / Cloud Logging. Batching seems to be possible but would be a pain to implement correctly in failure situations.
Logging to stdout / stderr and transporting the logs from there to the Log Viewer in a separate process (provided by the App Engine infrastructure) gives much better decoupling. And it should be cheaper, because you seemingly don't pay for the cycles spent there (?). This should be possible, as stated by Google:
you can send simple text strings to stdout and stderr. The strings
will appear as messages in the Logs Viewer, […] If you want to filter
these strings in the Logs Viewer by severity level, you need to format
them as structured data. For more information, see Structured logging.
If you want to correlate entries in the app log with the request log,
your structured app log entries need to contain the request’s trace
identifier. […] You can use the same technique in your App Engine
apps. (source)
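Even without a logging library, a structured entry is just one JSON line on stdout. A minimal sketch (the helper name `toStructuredLine` is my own, not an official API):

```javascript
// Build one structured log line; Cloud Logging treats `severity`
// and `message` as special fields when parsing JSON from stdout.
function toStructuredLine (severity, message, fields = {}) {
  return JSON.stringify({ severity, message, ...fields })
}

// Writing the line to stdout is all that is needed on App Engine.
process.stdout.write(
  toStructuredLine('INFO', 'user created', { user: 'mike' }) + '\n'
)
```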
To get structured logging output (basically lines of JSON), something like Pino is the right approach. You need some setup to map Pino field names to the ones used by Stackdriver / Google Cloud Logging.
The relevant documentation on field names at Google is somewhat scattered, and some mapping has to be done. Setting up Pino for Google / Stackdriver looks like this:
// https://github.com/pinojs/pino/blob/master/docs/api.md#level-string
// https://cloud.google.com/logging/docs/agent/configuration#special-fields
function _levelToSeverity (level) {
  if (level < 30) { return 'DEBUG' }
  if (level <= 30) { return 'INFO' }
  if (level <= 39) { return 'NOTICE' }
  if (level <= 49) { return 'WARNING' }
  if (level <= 59) { return 'ERROR' }
  if (level <= 69) { return 'CRITICAL' } // also fatal
  if (level <= 79) { return 'ALERT' }
  if (level <= 99) { return 'EMERGENCY' }
  return 'DEFAULT'
}
const logger = pino({
  // Default level: `info`, or taken from the environment variable LOG_LEVEL.
  // Levels below this get suppressed, so set this to `debug` or `trace`
  // if you want to be able to reveal details in the App Engine logs.
  level: process.env.LOG_LEVEL || 'info',
  formatters: { // here the mapping happens
    level (label, number) {
      return { level: number, severity: _levelToSeverity(number) }
    }
  },
  messageKey: 'message', // App Engine / Stackdriver wants this key
  // Enable pretty printing on the development server.
  prettyPrint: (process.env.NODE_ENV || 'development') === 'development'
})
This enables us to write out logs in a format which can be digested by the Google Log Viewer and allows us to drill down in the logs.
The next thing is to enable logging per HTTP request. pino-http uses the logger defined above and can be wired into the Google machinery to use Google's request IDs:
const httpLogger = require('pino-http')({
  logger: logger,
  genReqId: function (req) {
    // Get a Google-provided request ID to correlate entries.
    return (
      req.headers['x-appengine-request-log-id'] ||
      req.headers['x-cloud-trace-context'] ||
      req.id
    )
  },
  useLevel: 'debug'
})
This adds a log instance to every Express request, which can be used in your route handlers as req.log.info('bar') etc. You can also add structured fields to your logging, like req.log.info({user: 'mike', action: 'create'}, 'foo'). These fields can be queried in the Log Viewer.
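In the Logs Viewer, such structured fields end up under jsonPayload, so a filter for the example entry above might look like this (a sketch; adjust field names and resource type to your setup):

```
resource.type="gae_app"
jsonPayload.user="mike"
jsonPayload.action="create"
```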
I usually do not use pino-http because it adds too much clutter. Instead I generate the request ID myself.
If you want to correlate entries in the app log with the request log,
your structured app log entries need to contain the request’s trace
identifier. You can extract the trace identifier from the
X-Cloud-Trace-Context request header. In your structured log entry,
write the ID to a field named logging.googleapis.com/trace. For more
information, see the documentation on the X-Cloud-Trace-Context
request header. (source)
This can be done with an Express middleware:
const gaeTraceMiddleware = (req, res, next) => {
  // Add log correlation to nest all log messages beneath the request log in the Log Viewer.
  const traceHeader = req.header('X-Cloud-Trace-Context')
  if (traceHeader && process.env.GOOGLE_CLOUD_PROJECT) {
    const [trace] = traceHeader.split('/')
    req.log = res.log = req.log.child({
      'logging.googleapis.com/trace': `projects/${process.env.GOOGLE_CLOUD_PROJECT}/traces/${trace}`,
      'appengine.googleapis.com/request_id': req.header('x-appengine-request-log-id')
    })
  }
  next()
}
Now an Express instance wired up with gaeTraceMiddleware can use res.log to produce output for the Google Log Viewer. There it will appear nearly as neat as Python logs:
const app = express()
app.use(httpLogger)
app.use(gaeTraceMiddleware)

// App Engine instance start
app.get('/_ah/warmup', (req, res) => {
  res.log.info('Warmup')
  res.send('HOT')
})
You will still be missing per-request aggregation, but the logging experience gets much better with this.