Is there a way to add or enable timestamps in the Dask scheduler and worker console logs?

dask: 0.15.0-py35_0
distributed: 1.17.1-py35_0

With the versions above, timestamps are not enabled:

Scheduler -

distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO -   Scheduler at: tcp://192.168.200.23:8600
distributed.scheduler - INFO -       bokeh at:       192.168.200.23:8620
distributed.scheduler - INFO -        http at:       192.168.200.23:8610
distributed.scheduler - INFO - Local Directory: /jenkins_VegaFarm_edi-vf-3-4/workspace/AutoBATS/tmp/5896310/cbecb324-9b7c-46af-8ed7-5075aab3f225
distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO - Register tcp://192.168.200.23:33876
distributed.scheduler - INFO - Register tcp://192.168.200.23:43544
distributed.scheduler - INFO - Register tcp://192.168.200.23:43675
distributed.scheduler - INFO - Register tcp://192.168.200.23:39567
distributed.scheduler - INFO - Register tcp://192.168.200.23:33450
distributed.scheduler - INFO - Register tcp://192.168.200.23:42608
distributed.scheduler - INFO - Register tcp://192.168.200.23:36773
distributed.scheduler - INFO - Register tcp://192.168.200.23:43157
distributed.scheduler - INFO - Starting worker compute stream, tcp:/

Workers -

distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:33621'
distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:42826'
distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:37509'
distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:36526'
distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:46298'
distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:36025'
distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:46421'
distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:36029'
distributed.nanny - INFO -         Start Nanny at: 'tcp://192.168.200.23:41999'
distributed.worker - INFO -       Start worker at: tcp://192.168.200.23:33876
distributed.worker - INFO -              nanny at:       192.168.200.23:44329
distributed.worker - INFO -               http at:       192.168.200.23:34181
distributed.worker - INFO -              bokeh at:       192.168.200.23:8789
distributed.worker - INFO - Waiting to connect to:  tcp://192.168.200.23:8600
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -       Start worker at: tcp://192.168.200.23:33450
distributed.worker - INFO -              nanny at:       192.168.200.23:39203

Is there any way to prefix a timestamp to these log lines? This would help in debugging some internal crashes.

Kind regards, Jacob.

B Jacob

1 Answer

You can provide a log formatting string in your ~/.dask/config.yaml file. Here is the current default that results in your current logs:

distributed:
  admin:
    log-format: '%(name)s - %(levelname)s - %(message)s'

See the Python logging module documentation for more formatting options.
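To preview what a format string produces before putting it in the config file, here is a minimal sketch using only the standard library `logging` module. It compares the default format quoted above with the timestamped variant from the comments below. The `render` helper and the `demo.scheduler` logger name are made up for illustration and are not part of Dask; the sketch only shows how `logging.Formatter` expands the string and does not touch Dask's configuration.

```python
import logging

# Default format quoted in the answer, and the same format with a timestamp prepended.
DEFAULT_FORMAT = '%(name)s - %(levelname)s - %(message)s'
TIMESTAMPED_FORMAT = '%(asctime)s - ' + DEFAULT_FORMAT


def render(fmt, name, message):
    """Format a single INFO record with `fmt` (hypothetical helper, not a Dask API)."""
    # Build a record by hand so no global logger state is modified.
    record = logging.LogRecord(name, logging.INFO, "", 0, message, None, None)
    return logging.Formatter(fmt).format(record)


plain = render(DEFAULT_FORMAT, "demo.scheduler", "Register worker")
stamped = render(TIMESTAMPED_FORMAT, "demo.scheduler", "Register worker")

print(plain)    # name - level - message
print(stamped)  # same line, prefixed with the record's creation time
```

If the timestamped line looks right, the same string can be used as the value of `log-format` in the config file.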

MRocklin
  • I tried adding the line, but Dask doesn't seem to pick it up. log-format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s' – B Jacob Jan 25 '18 at 16:49
  • This feature seems to be hardcoded in function initialize_logging() from file distributed/config.py. I had to manually edit the format of the function to get it working - fmt = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'. Is this fixed in the latest version of Dask? – B Jacob Jan 25 '18 at 16:53
  • This definitely works in master. I haven't checked to see how far back it goes. – MRocklin Jan 26 '18 at 13:57
  • Thanks, tested on Distributed version 1.17.1 and works as suggested. – B Jacob Feb 01 '18 at 14:54
  • Just to clarify: The fully-qualified name for this configuration setting is `distributed.admin.log-format` – Stuart Berg Jan 06 '19 at 22:02