All of the sudden an application that I've been working on is hammering the server's cpu and memory. I have not had any code changes since this "pegging" has started. I did some digging into strace to try and find out what is going on but I'm needing help deciphering what the output really means. I took one pid that was running for several minutes with cpu usage at 100% and using about 1.5GB of memory and ran strace -c with it. I got the following output. I'm taking a long time in the clone and wait4 commands. Can anybody give me a direction to move in with this info?
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
54.10 1.011982 252996 4 clone
30.67 0.573741 51 11296 brk
11.98 0.224099 56025 4 wait4
1.96 0.036701 53 687 munmap
0.68 0.012642 1580 8 mremap
0.58 0.010928 2 4886 5 read
0.01 0.000135 0 2854 fstat
0.01 0.000112 0 845 25 open
0.00 0.000073 0 935 115 lstat
0.00 0.000044 0 97 23 access
0.00 0.000043 0 464 40 stat
0.00 0.000037 0 466 write
0.00 0.000037 0 466 gettimeofday
0.00 0.000035 0 840 close
0.00 0.000000 0 173 poll
0.00 0.000000 0 210 lseek
0.00 0.000000 0 688 mmap
0.00 0.000000 0 5 rt_sigaction
0.00 0.000000 0 5 rt_sigprocmask
0.00 0.000000 0 16 writev
0.00 0.000000 0 55 setitimer
0.00 0.000000 0 2 socket
0.00 0.000000 0 2 2 connect
0.00 0.000000 0 4 accept
0.00 0.000000 0 6 shutdown
0.00 0.000000 0 4 getsockname
0.00 0.000000 0 10 setsockopt
0.00 0.000000 0 2 getsockopt
0.00 0.000000 0 8 semop
0.00 0.000000 0 38 fcntl
0.00 0.000000 0 168 flock
0.00 0.000000 0 12 getdents
0.00 0.000000 0 25 getcwd
0.00 0.000000 0 10 chdir
0.00 0.000000 0 2 unlink
0.00 0.000000 0 2 chmod
0.00 0.000000 0 15 umask
0.00 0.000000 0 7 times
0.00 0.000000 0 2 1 futex
0.00 0.000000 0 4 epoll_wait
0.00 0.000000 0 6 openat
0.00 0.000000 0 4 pipe2
------ ----------- ----------- --------- --------- ----------------
100.00 1.870609 25337 211 total
NEW ---ADDITIONAL INFO
I have some more info on this. User's of my application upload images. The directory where the images get uploaded to now has 80k plus image files that have been uploaded. When I do a strace of a process that is hogging cpu and memory I get a lot of reads in that directory followed by brk. Below is an exceprt from the strace.
poll([{fd=18, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
write(18, "e\2\0\0\3SELECT \"File\".\"ClassName\", "..., 617) = 617
read(18, "\1\0\0\1\0244\0\0\2\3def\4live\4File\4File\tCla"..., 16384) = 11488
read(18, "\0010\0010\00266\10MMSImage\343\0\0D\10MMSImage\02320"..., 16384) = 13032
read(18, "\373\0011\0010\0010\003122\10MMSImage\345\0\0|\10MMSImag"..., 16384) = 11584
read(18, "257\373\0011\0010\0010\003179\10MMSImage\345\0\0\255\10MMSI"..., 16384) = 16384
read(18, "nswer_Image/0_9373043939_1321407"..., 16384) = 16384
read(18, "or question #1054.Eassets/UserCo"..., 16384) = 16384
then a bit later a ton of brk's
brk(0x7f2ba285e000) = 0x7f2ba285e000
brk(0x7f2ba291e000) = 0x7f2ba291e000
brk(0x7f2ba295e000) = 0x7f2ba295e000
brk(0x7f2ba299e000) = 0x7f2ba299e000
brk(0x7f2ba29de000) = 0x7f2ba29de000
brk(0x7f2ba2a1f000) = 0x7f2ba2a1f000
brk(0x7f2ba2a5f000) = 0x7f2ba2a5f000
brk(0x7f2ba2a9f000) = 0x7f2ba2a9f000
brk(0x7f2ba2adf000) = 0x7f2ba2adf000
brk(0x7f2ba2b1f000) = 0x7f2ba2b1f000
brk(0x7f2ba2b5f000) = 0x7f2ba2b5f000
brk(0x7f2ba2b9f000) = 0x7f2ba2b9f000
brk(0x7f2ba2bdf000) = 0x7f2ba2bdf000
brk(0x7f2ba2c1f000) = 0x7f2ba2c1f000
brk(0x7f2ba2c5f000) = 0x7f2ba2c5f000
brk(0x7f2ba2c9f000)
Anybody know what may be going on there. Do I have too many files in the directory or something like that? Does the harddrive have to seek to the directory too much?