2

We have a node project and use with supervise to supervise the node server. The command that we use is:

nohup supervise /usr/share/$PACKAGE/superviseRun/supervise$NODE_PORT >> /var/log/<company_name>/sp-sms/$PACKAGE/supervise$NODE_PORT-$(date +"%d-%m-%y").log 2>> /var/log/<company_name>/sp-sms/$PACKAGE/supervise$NODE_PORT-$(date +"%d-%m-%y").log &

This command has been there in production for 3 years. Now recently someone changed the above command to:

nohup supervise /usr/share/$PACKAGE/superviseRun/supervise$NODE_PORT >> /var/log/<company_name>/sp-sms/$PACKAGE/supervise$NODE_PORT-$(date +"%d-%m-%y").log 1>> /var/log/<company_name>/sp-sms/$PACKAGE/supervise$NODE_PORT-$(date +"%d-%m-%y").log &

Basically 2>> was changed to 1>> by mistake. Post this change, we started noticing slowness in our apis. Some apis which would ideally complete in less that 2 secs took 2-4 mins to complete. We used binary search to narrow down on the buggy commit and reverted it. After reverting back this change, things started working fine. While this buggy commit was in production we saw a lot of EPIPE errors:

Error: write EPIPE
    at exports._errnoException (util.js:856:11)
    at WriteWrap.afterWrite (net.js:767:14)

And after reverting back this change, there were no EPIPE errors. I'm sure that this EPIPE error is somehow connected to the above mentioned buggy commit. Can someone help me understand what's going on here.

PS: I know 2 & 1 are file descriptors for stderr and stdout respectively.

J. Chomel
  • 8,193
  • 15
  • 41
  • 69
neeraj080
  • 119
  • 6
  • It looks like concurrent writes to the log file where occurring. If you really want to be sure and have a real answer here, you should try to reproduce the error and post here the "how to". – J. Chomel Aug 31 '16 at 08:27
  • @J.Chomel Can you explain why you say that concurrent writes were taking place because of the buggy commit. – neeraj080 Aug 31 '16 at 09:14

1 Answers1

1

Following the "buggy commit" you're talking about, here is what you had:

nohup supervise /usr/share/$PACKAGE/superviseRun/supervise$NODE_PORT \
 >> /var/log/<company_name>/sp-sms/$PACKAGE/supervise$NODE_PORT-$(date +"%d-%m-%y").log \
1>> /var/log/<company_name>/sp-sms/$PACKAGE/supervise$NODE_PORT-$(date +"%d-%m-%y").log &

This means your program is writing twice to the same file at the same time. This is undefined behavior. In some case, it might work. In other case, your system might open twice the same file, then write from one hand (muti-threads) and close the file, then the other hand will try to write in a closed file. It also depends on how your program handle SIGPIPE.

Read more about this on https://unix.stackexchange.com/questions/84813/what-makes-a-unix-process-die-with-broken-pipe

Community
  • 1
  • 1
J. Chomel
  • 8,193
  • 15
  • 41
  • 69