According to the Solaris manual page for fcntl, upon successful completion the value returned for F_SETLKW is a "value other than -1". But the Apache httpd 1.3.41 source code (http_main.c) treats any negative return value as a failure:

int ret;

while ((ret = fcntl(lock_fd, F_SETLKW, &unlock_it)) < 0 && errno == EINTR) {
    /* nop */
}

if (ret < 0) {
    ap_log_error(APLOG_MARK, APLOG_EMERG, server_conf,
                "fcntl: F_SETLKW: Error getting accept lock, exiting!  "
                "Perhaps you need to use the LockFile directive to place "
                "your lock file on a local disk!");
    clean_child_exit(APEXIT_CHILDFATAL);
}

In very rare cases, Apache on one of our systems exits because this test fails. I suspect this is caused by fcntl returning a negative value less than -1.

So when will fcntl on Solaris return a value less than -1?

1 Answer

  1. In your code sample, a negative return value from fcntl (that is, -1) indicates an error only if errno is not EINTR. If errno == EINTR, the call was interrupted by a signal; that is not a real failure, and the loop simply retries.
  2. The manual's statement that, upon successful completion, the value returned for F_SETLKW is a "value other than -1" means that fcntl returns 0 or a positive value on success. A value >= 0 is a "value other than -1"; it does not mean a value less than -1, as you guessed.
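To make the two points above concrete, here is a minimal sketch of a retry loop that compares the result against -1 exactly and records errno when the call really fails. It is not Apache's code: the helper name lock_op_retry and the use of stderr for logging are assumptions made for illustration only.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of Apache): apply a whole-file lock
 * operation (F_WRLCK, F_RDLCK or F_UNLCK) with F_SETLKW, retrying when
 * the call is interrupted by a signal.
 * Returns 0 on success, or -1 with errno preserved on failure. */
static int lock_op_retry(int fd, short type)
{
    struct flock fl;
    int ret;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file" */

    /* EINTR is not a real failure: just retry. */
    do {
        ret = fcntl(fd, F_SETLKW, &fl);
    } while (ret == -1 && errno == EINTR);

    if (ret == -1) {
        /* Save errno first; fprintf may overwrite it. */
        int saved = errno;
        fprintf(stderr, "fcntl(F_SETLKW) returned %d, errno=%d (%s)\n",
                ret, saved, strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

Logging both the raw return value and errno in the failure branch should show whether the rare exit comes from an ordinary -1 with a specific errno (for example ENOLCK) or from some other negative value.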
Test
  • I understand the code snippet, but I am concerned about the condition used to decide whether fcntl failed. According to the Solaris manual, fcntl could return a value such as -2 on successful completion, and in that case this code would make Apache exit with an error message. Is that a problem? Also, it seems fcntl almost never returns a value less than -1 on success, so my question is when that would happen. –  Oct 13 '09 at 07:08
  • No, fcntl returns -1. The likely reason Apache exits is that fcntl can fail to acquire the lock when Solaris is under high load. In that case fcntl still returns -1, with errno set to ENOLCK, and the code above logs the same error message for it as for any other error. Reporting ENOLCK under high load is a known Solaris issue. – Test Oct 14 '09 at 07:27
  • So the proper action is to verify this by patching the Apache code to log the return value and errno. Then you'll find the root cause. – Test Oct 14 '09 at 07:45
  • Yes, that's one option. But the last time this happened was a month ago. Even if I reproduce it, I still won't know what makes fcntl behave like this. –  Oct 14 '09 at 13:18