I am running a simple R job as root and as another, limited user. The execution times differ significantly. What could be the source of the problem?

Further information

Here is how I compare the run time:

# time /share/binary/R/bin/R CMD BATCH s1n\=50.R

real    0m0.278s
user    0m0.217s
sys 0m0.032s
# su john
$ time /share/binary/R/bin/R CMD BATCH s1n\=50.R

The run as user john takes a very long time and never finishes. The output of perf during this interval is:

   PerfTop:     906 irqs/sec  kernel:19.3%  exact:  0.0% [1000Hz cycles],  (all, 8 CPUs)
-------------------------------------------------------------------------------------------------------------------------------------------------------------

             samples  pcnt function                      DSO
             _______ _____ _____________________________ _______________________________

              598.00 14.5% __GI_vfprintf                 /lib64/libc-2.12.so            
              194.00  4.7% intel_idle                    [kernel.kallsyms]              
              176.00  4.3% read_hpet                     [kernel.kallsyms]              
              170.00  4.1% bcEval                        /usr/local/R/lib64/R/bin/exec/R
              141.00  3.4% ___printf_fp                  /lib64/libc-2.12.so            
              138.00  3.4% __strchrnul                   /lib64/libc-2.12.so            
              121.00  2.9% Rf_cons                       /usr/local/R/lib64/R/bin/exec/R
              120.00  2.9% R_gc_internal                 /usr/local/R/lib64/R/bin/exec/R
               91.00  2.2% _IO_default_xsputn_internal   /lib64/libc-2.12.so            
               88.00  2.1% Rf_allocVector                /usr/local/R/lib64/R/bin/exec/R
               84.00  2.0% _IO_file_xsputn_internal      /lib64/libc-2.12.so            
               82.00  2.0% scientific                    /usr/local/R/lib64/R/bin/exec/R
               72.00  1.7% MatrixSubset                  /usr/local/R/lib64/R/bin/exec/R
               71.00  1.7% duplicate1                    /usr/local/R/lib64/R/bin/exec/R
               68.00  1.7% floor                         /lib64/libm-2.12.so            
               64.00  1.6% __strcmp_sse42                /lib64/libc-2.12.so            
               53.00  1.3% Rf_findVarInFrame3            /usr/local/R/lib64/R/bin/exec/R
               53.00  1.3% Rf_protect                    /usr/local/R/lib64/R/bin/exec/R
               50.00  1.2% _IO_str_init_static_internal  /lib64/libc-2.12.so            
               50.00  1.2% Rf_eval                       /usr/local/R/lib64/R/bin/exec/R
               43.00  1.0% Rf_formatReal                 /usr/local/R/lib64/R/bin/exec/R
               43.00  1.0% Rf_matchArgs                  /usr/local/R/lib64/R/bin/exec/R
               41.00  1.0% _int_malloc                   /lib64/libc-2.12.so            
               36.00  0.9% _itoa_word                    /lib64/libc-2.12.so            
               33.00  0.8% __ieee754_log                 /lib64/libm-2.12.so            
               31.00  0.8% Rf_EncodeReal                 /usr/local/R/lib64/R/bin/exec/R
               29.00  0.7% Rf_mkPROMISE                  /usr/local/R/lib64/R/bin/exec/R
               29.00  0.7% do_bind                       /usr/local/R/lib64/R/bin/exec/R
               28.00  0.7% Rf_install                    /usr/local/R/lib64/R/bin/exec/R
               27.00  0.7% __vsnprintf                   /lib64/libc-2.12.so            
               27.00  0.7% _IO_no_init                   /lib64/libc-2.12.so            
               23.00  0.6% _IO_old_init                  /lib64/libc-2.12.so            
               22.00  0.5% __GI__nss_files_parse_servent /lib64/libnss_files-2.12.so    
               22.00  0.5% Rconn_printf                  /usr/local/R/lib64/R/bin/exec/R
               22.00  0.5% finite                        /lib64/libm-2.12.so            
               22.00  0.5% findVarLocInFrame             /usr/local/R/lib64/R/bin/exec/R
               21.00  0.5% Rf_getAttrib                  /usr/local/R/lib64/R/bin/exec/R

I suspected `ulimit` and disk quota. After disabling the disk quota, the problem still exists. Unfortunately, the limits are identical under `root` and `john`. Here is the output of `ulimit -a` (thanks to @Eric DANNIELOU):

# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127383
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
# su john
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127383
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Operating System: CentOS 6.2

HW: Intel Core-i7 16GB RAM

Thank you in advance!

lashgar
  • "suspect to ulimit and disk quota. After disabling the disk quota, the problem is still exist": Then try increasing john's user limits? –  Dec 24 '12 at 09:59
  • But what kinds of limitation can be the reason? – lashgar Dec 24 '12 at 14:49
  • The maximum number of file descriptors, for example: `ulimit -a` as john will show you all the limits in effect. Feel free to edit /etc/security/limits.conf accordingly once you have found the culprit. –  Dec 24 '12 at 14:59
  • Thanks! But no luck! `root` and `john` have the same limits (added to the question). Any other ideas? :( – lashgar Dec 24 '12 at 15:07
  • strace the R process to see if it hangs somewhere? –  Dec 24 '12 at 15:30
  • Many thanks! Very interesting: it hangs, and the last printed line is `wait4(-1,`. So I should find out what `wait4` is waiting for. – lashgar Dec 24 '12 at 15:40
  • You should look a few lines further up; `wait4` alone won't give you much information. –  Dec 24 '12 at 15:46
  • I've compared the `strace` output for `root` and `john`. The outputs are the same (except for the process IDs), but `john` stalls on `wait4`. `root` continues and prints a few more lines until the end of the `R` execution. I cannot dump the whole log here in the comments, but exactly where `john` stalls, the `root` run continues and calls `rt_sigprocmask`, `rt_sigreturn`, `rt_sigaction`, `read`, and `exit_group`. – lashgar Dec 24 '12 at 15:59
  • What is the child trying to do? – John Siu Dec 25 '12 at 06:24
  • @JohnSiu I have no idea! Unfortunately, it is the `R` application not mine. – lashgar Dec 25 '12 at 07:16
  • I really don't know R, but does your account have full permission to the files related to the `BATCH`? – John Siu Dec 25 '12 at 07:19
  • @JohnSiu `BATCH` runs `R` non-interactively and reads from a file. Anyway, I guess the problem is related to some environment variables in `bash`. Many thanks. – lashgar Dec 25 '12 at 09:16

1 Answer

I'm not sure it will be relevant, but you should really use `su - john` instead of `su john`: that way it invokes a clean login shell. Please check whether, with that change, `ulimit -a` now shows any relevant differences.
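
One way to check this is to dump the environment and limits under both invocations and diff them. The following is only a sketch: `env_diff` is a hypothetical helper name, the file names under /tmp are placeholders, and it assumes `mktemp` is available and that john's shell accepts `-c`.

```shell
#!/bin/sh
# env_diff FILE_A FILE_B
# Hypothetical helper: sorts two saved dumps and prints only the lines that
# differ. Returns the exit status of diff (0 = identical, 1 = different).
env_diff() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || return 2
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"
    diff "$a_sorted" "$b_sorted"
    status=$?
    rm -f "$a_sorted" "$b_sorted"
    return "$status"
}

# Example usage as root (file names are placeholders, not from the question):
#   su john   -c 'env; ulimit -a' > /tmp/john.su.txt
#   su - john -c 'env; ulimit -a' > /tmp/john.login.txt
#   env_diff /tmp/john.su.txt /tmp/john.login.txt
```

Lines marked `<` come from the plain `su john` session and lines marked `>` from the login shell, so any variable that is only inherited from root's environment should stand out.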

Another thing: use `strace -f R` instead of `strace R`, so that when R spawns a child process, strace traces that child too and shows exactly where it hangs.
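
As a rough sketch of how that trace could be inspected: with `-f` and `-o FILE`, strace writes every process into one file and prefixes each line with its PID, so the last line per PID shows where each process stopped. `last_call_per_pid` is a hypothetical helper name and the trace path is a placeholder; the R command line is the one from the question.

```shell
#!/bin/sh
# last_call_per_pid TRACE_FILE
# Hypothetical helper: for a trace written by `strace -f -o FILE`, where each
# line starts with the PID, print the last line recorded for every PID.
# The order in which PIDs are printed is not guaranteed.
last_call_per_pid() {
    awk '{ last[$1] = $0 } END { for (pid in last) print last[pid] }' "$1"
}

# Example usage as john (the trace path is a placeholder):
#   strace -f -o /tmp/R.trace /share/binary/R/bin/R CMD BATCH s1n\=50.R
#   last_call_per_pid /tmp/R.trace
```

If the parent's last line is the `wait4(-1,` seen in the comments, the line to look at is the last call of the child PID, since that is the process the parent is waiting for.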

Olivier Dulac