
I'm very new to CloudWatch Insights, and I've been trying to understand how to get it to work with Python logging. Currently I have an AWS Glue ETL job set up in PySpark/Python, and the script uses Python's standard logging package.

I've read the documentation, but I couldn't find any details on how to format the log messages so that they can be queried through CloudWatch Insights. Ideally, I would like to set up different fields in the log messages that I can query, and get their values, with Insights.

Here's an example of a logging message in the script:

import logging
import timeit

start = timeit.default_timer()

# ... run some code ...

stop = timeit.default_timer()

runtime = stop - start

logging.info('Runtime: {}'.format(runtime))

I would like to query a custom field such as @Runtime, so that the column shows the runtimes of the different runs. I would also like to see a simple Insights query example that I can build on.

Any help would be really appreciated!

mlenthusiast
  • Can you add the code you have so far? The logging is the same as the usual Python logging. What format are you looking for? Examples would be helpful. – Emerson Apr 05 '20 at 23:50
  • Added a code snippet, hope it makes sense! – mlenthusiast Apr 06 '20 at 00:00

1 Answer


It's the same as setting up a simple logger.

A simple sample is below:

import logging

MSG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'
logging.basicConfig(format=MSG_FORMAT)
logger = logging.getLogger('Something')
logger.setLevel(logging.INFO)

Then your code:

import timeit

start = timeit.default_timer()

# ... run some code ...

stop = timeit.default_timer()

runtime = stop - start

logger.info('Runtime: {}'.format(runtime))
Emerson
  • What would the query for CloudWatch Insights be for this? That's what I'm confused about, @Emerson – mlenthusiast Apr 06 '20 at 00:15
  • Have you tried `fields @timestamp, @message | filter @message like "Runtime"`? – Emerson Apr 06 '20 at 00:21
  • `fields @timestamp, @message | filter @message like "Something"` works, but not with "Runtime". What I am looking for is to have fields like @Runtime show the runtime value in this case. – mlenthusiast Apr 06 '20 at 00:34
  • `fields @timestamp,@message | parse @message " [*] * *: *" as level, loggername, exception | filter @exception like "RunTime"` – Emerson Apr 06 '20 at 00:49
  • `fields @timestamp,@message | parse @message " [*] *: *" as level, loggername, exception | filter @exception like "RunTime"` Sorry, I think I added an extra asterisk. – Emerson Apr 06 '20 at 00:57
  • Still doesn't work, I switched the @exception to "Something" and tried that as well – mlenthusiast Apr 06 '20 at 01:00
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/211030/discussion-between-emerson-and-codingenthusiast). – Emerson Apr 06 '20 at 01:04
  • Glue logs are a dump from its YARN/Spark executions, which means that the log line you want is part of an execution log line in CloudWatch. This makes it hard to parse and retrieve the information you want, especially since your log line has newline characters in it. – Emerson Apr 06 '20 at 02:12
  • A possible solution would be to send your logs as a stream to S3, then put those log messages into CloudWatch and run Insights on them. That way your log messages won't be mixed up with the rest of the Glue messages that get dumped into CloudWatch. – Emerson Apr 06 '20 at 02:49
  • Or better yet, try putting custom events directly from your Glue code using boto3, and then run Insights on that log group. – Emerson Apr 06 '20 at 20:33
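
The last suggestion could look roughly like the sketch below. This is not code from the thread: the helper names, the `job` field, and the log group and stream names are hypothetical, and it assumes the log group and stream already exist. The message is written as JSON because CloudWatch Logs Insights discovers the fields of JSON log events automatically, so `runtime` becomes directly queryable.

```python
import json
import time


def build_runtime_event(runtime, job_name="my-glue-job"):
    """Build one CloudWatch Logs event whose message is a JSON object.

    `job_name` and the field names are illustrative assumptions.
    """
    return {
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
        "message": json.dumps({"job": job_name, "runtime": runtime}),
    }


def put_runtime_event(client, log_group, log_stream, runtime):
    """Send a single runtime event through a boto3 CloudWatch Logs client.

    Assumes `log_group` and `log_stream` were created beforehand.
    """
    event = build_runtime_event(runtime)
    client.put_log_events(
        logGroupName=log_group,
        logStreamName=log_stream,
        logEvents=[event],
    )
    return event


# Possible usage inside the Glue script (names are placeholders):
#
#   import boto3
#   logs = boto3.client("logs")
#   put_runtime_event(logs, "/custom/glue-metrics", "run-1", runtime)
```

With events like these in their own log group, a query along the lines of `fields @timestamp, job, runtime | filter ispresent(runtime) | sort @timestamp desc` should list one runtime per run. If the message stays plain text such as `Runtime: 1.23`, the field would have to be extracted instead, for example with `parse @message "Runtime: *" as runtime`.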