
For my Erlang application, I have used both the SASL logger and log4erl, and both give poor performance when the number of events sent to them is around 1000 per second. log4erl performed better, but after some time its mailbox starts filling up and bloating the VM.

Would disk_log be a better option (i.e. will it hold up under a load of 1000 events per second)?

I tried disk_log in the shell. In the example, the message to be logged is first converted to a binary (list_to_binary) and then written to file with the blog function.

Would doing it like this give me an efficient high-volume logger?

One more doubt: with disk_log:blog the size of the logged text was just 84 bytes, but with disk_log:log_terms it was 970 bytes. Why such a big difference?
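For reference, something like the shell session below is what I tried; the log names ext_log and int_log are just placeholders, not my real code. My understanding is that an internal-format log stores complete Erlang terms (external term format plus per-term framing), while blog writes only the raw bytes, which would explain the size gap:

    %% External format: blog/2 writes only the raw bytes you give it.
    {ok, ext_log} = disk_log:open([{name, ext_log}, {file, "ext.LOG"},
                                   {format, external}]).
    ok = disk_log:blog(ext_log, list_to_binary("some log message")).

    %% Internal format (the default): log_terms/2 stores whole Erlang
    %% terms, encoded and framed, hence the much larger file.
    {ok, int_log} = disk_log:open([{name, int_log}, {file, "int.LOG"}]).
    ok = disk_log:log_terms(int_log, [{event, os:timestamp(), "some log message"}]).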

Arunmu
  • Erlang's IO libraries can handle both lists and binaries (and mixes of both), so you don't need to convert anything that will eventually go to a port. However, a library API might require it for whatever reason. (A small illustration follows these comments.) – Adam Lindberg Mar 01 '11 at 12:59
  • The problem is perhaps that you are using the logger as a tracer. SASL especially has quite an overhead. – I GIVE CRAP ANSWERS Mar 01 '11 at 17:34
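To illustrate Adam's point about mixed lists and binaries (iodata), here is a tiny made-up example; the file name and message are arbitrary:

    %% Erlang IO functions accept iodata: arbitrarily nested lists of
    %% bytes and binaries, so nothing needs converting before it goes
    %% to a port.
    Msg = ["prefix: ", <<"a binary chunk">>, [$!, $\n]].
    ok = file:write_file("out.txt", Msg).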

2 Answers


Hack something of your own. A dedicated logger with in-memory storage and bulk dumps to disk is the fastest solution. If you cannot afford to lose any data (in case of a VM crash), do it on a remote node. I once used the remote 'solution', querying the target VM every 5 seconds, and didn't notice any impact on the system. A sketch of such a logger follows.
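Here is a minimal sketch of the in-memory/bulk-dump idea, assuming a single registered logger process; the module name, thresholds, and file handling are all mine, not a drop-in implementation:

    -module(bulk_logger).
    -behaviour(gen_server).

    -export([start_link/1, log/1]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    -define(FLUSH_INTERVAL, 5000).  % dump every 5 seconds...
    -define(MAX_BUFFERED, 1000).    % ...or as soon as 1000 events pile up

    -record(st, {fd, buf = [], n = 0}).

    start_link(File) ->
        gen_server:start_link({local, ?MODULE}, ?MODULE, File, []).

    %% Asynchronous, so callers never block on disk IO.
    log(Msg) ->
        gen_server:cast(?MODULE, {log, Msg}).

    init(File) ->
        {ok, Fd} = file:open(File, [append, raw]),
        erlang:send_after(?FLUSH_INTERVAL, self(), flush),
        {ok, #st{fd = Fd}}.

    handle_cast({log, Msg}, #st{buf = Buf, n = N} = S0) ->
        S = S0#st{buf = [Msg | Buf], n = N + 1},
        case S#st.n >= ?MAX_BUFFERED of
            true  -> {noreply, flush(S)};
            false -> {noreply, S}
        end.

    handle_info(flush, S) ->
        erlang:send_after(?FLUSH_INTERVAL, self(), flush),
        {noreply, flush(S)}.

    %% One file:write per batch instead of one per event.
    flush(#st{buf = []} = S) -> S;
    flush(#st{fd = Fd, buf = Buf} = S) ->
        ok = file:write(Fd, lists:reverse(Buf)),
        S#st{buf = [], n = 0}.

    handle_call(_Req, _From, S) -> {reply, ok, S}.
    terminate(_Reason, S) -> flush(S), file:close(S#st.fd), ok.
    code_change(_OldVsn, S, _Extra) -> {ok, S}.

Each message passed to log/1 should be iodata, e.g. bulk_logger:log(io_lib:format("~p event~n", [os:timestamp()])).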

user425720
  • @user425720: Right now I am going with disk_log in internal format, which dumps the logs in binary form that I will parse out later. contd... – Arunmu Mar 01 '11 at 14:42
  • I will be opening the disk_log in a process that keeps running, and I will pass the name of the log to all the processes that require logging. Is that the right approach, considering disk_log has the owner and anonymous user concepts? (A sketch of this follows these comments.) – Arunmu Mar 01 '11 at 14:44
  • @ArunMu To be honest I haven't seriously used disk_log. Your scenario sounds good; it is common to have a logger daemon and pass a reference to it around. I avoided disk_log because the binary format on disk was not handy at all. – user425720 Mar 01 '11 at 15:08
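For what it's worth, the owner-process approach from the comments above could look roughly like this (all names invented). The process that calls disk_log:open/1 becomes the log's owner; any other process on the node can then log against the registered name:

    start() ->
        spawn(fun() ->
                  {ok, my_log} = disk_log:open([{name, my_log},
                                                {file, "app.LOG"},
                                                {type, wrap},
                                                {size, {1024*1024, 10}}]),
                  owner_loop()
              end).

    %% Keep the owner alive; the log is closed when it is done.
    owner_loop() ->
        receive stop -> disk_log:close(my_log) end.

    %% Any process that knows the name can log:
    %% disk_log:log(my_log, {event, os:timestamp(), "something happened"}).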

For high-volume logging I prefer battle-tested solutions like Scribe or maybe Flume. Check out erl_scribe.

frail
  • True, but the list of dependencies to take care of is significant. – user425720 Mar 01 '11 at 15:48
  • Logging can seem an easy job, but with experience I can say it becomes a heavy burden, even a bottleneck, when you are handling high volumes. Digesting logs also becomes a problem in a distributed environment (when you throw in two Erlang nodes to get more concurrency). Scribe and Flume are both tuned for these kinds of problems. – frail Mar 01 '11 at 16:02
  • More dependencies with Thrift and all :( – Arunmu Mar 01 '11 at 16:17