We have a web application that ran for 2 years without any problems. About a week ago the response times suddenly became very bad: about 10-50 times slower than normal.
At any given time there are maybe 10-20 users on the system. 90% of the user requests result in a database request. The system responds normally early in the morning and in the evening, when not many users are online.
- How can we detect the problem? Is there step-by-step documentation for resolving this kind of problem?
- Are there specialised companies or specialists who could help us solve the problem?
Environment
Windows Server 2003
Quad-core Intel Xeon X3220, 2.4 GHz, 2 GB RAM
Sybase Anywhere 9 Database - Driver: jconn3.jar
Glassfish 2.1
Internet bandwidth of server: 100MB/s
Applications
Web application with SmartGWT frontend (SmartGWT 2.4)
Web service accessed by an external company
No EJBs, only the web container
First of all, it doesn't seem that the hardware is at its limit.
java.exe: sometimes at 25% CPU usage when heavy requests are processed, using 374 MB RAM
Sybase DB server: 220 MB RAM
Available memory: always around 1 GB
Snapshot of requests
I took a snapshot of all requests over 8 minutes:
- Client requests (gwtservice): 210 seconds (45%), 967 requests in total, 212 milliseconds per request
- Web service (BankOrderService): 100 seconds (20%), 86 requests in total, 1170 milliseconds per request
- Loading frontend elements into the browser (.js, .png, .jpg, .css etc.): 160 seconds (35%), 623 requests in total, 250 milliseconds per request
Examples of the most time-consuming requests (duration in milliseconds; the status "Erfolg" is German for "success"):
15427.302 25.07.2012 11:50 Erfolg user1 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/091FF14E7C1D1187C770833D67B13321.cache.html
13558.571 25.07.2012 11:50 Erfolg user1 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Core.js
12631.877 25.07.2012 11:50 Erfolg user1 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Grids.js
11238.439 25.07.2012 11:50 Erfolg user1 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Forms.js
10535.141 25.07.2012 11:50 Erfolg user1 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/sc/modules/ISC_DataBinding.js
10003.115 25.07.2012 11:55 Erfolg anonymous REMOTE_WEB xx.yy.zz.25 URI:/BankWebService/BankOrderService
9999.412 25.07.2012 11:49 Erfolg anonymous REMOTE_WEB xx.yy.zz.25 URI:/BankWebService/BankOrderService
9999.229 25.07.2012 11:55 Erfolg anonymous REMOTE_WEB xx.yy.zz.25 URI:/BankWebService/BankOrderService
9992.415 25.07.2012 11:49 Erfolg anonymous REMOTE_WEB xx.yy.zz.25 URI:/BankWebService/BankOrderService
9990.473 25.07.2012 11:55 Erfolg anonymous REMOTE_WEB xx.yy.zz.25 URI:/BankWebService/BankOrderService
9132.848 25.07.2012 11:55 Erfolg user1 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/gwtservice
5933.174 25.07.2012 11:50 Erfolg user2 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Grids.js
5864.426 25.07.2012 11:50 Erfolg user2 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Core.js
5571.739 25.07.2012 11:50 Erfolg user2 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/org.Bank.Main/sc/modules/ISC_DataBinding.js
5473.637 25.07.2012 11:50 Erfolg user2 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Forms.js
5158.104 25.07.2012 11:50 Erfolg user3 REMOTE_WEB xx.yy.zz.237 URI:/BankApp/org.Bank.Main/gwtservice
4488.047 25.07.2012 11:50 Erfolg user2 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/images/chf.jpg
4442.574 25.07.2012 11:56 Erfolg user2 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Core.js
4072.268 25.07.2012 11:54 Erfolg anonymous REMOTE_WEB xx.yy.zz.25 URI:/BankWebService/BankOrderService
3939.546 25.07.2012 11:56 Erfolg user2 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Grids.js
3876.443 25.07.2012 11:50 Erfolg user1 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/sc/modules/ISC_Foundation.js
3727.795 25.07.2012 11:50 Erfolg user4 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/org.Bank.Main/gwtservice
3630.225 25.07.2012 11:48 Erfolg user4 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/org.Bank.Main/091FF14E7C1D1187C770833D67B13321.cache.html
3552.007 25.07.2012 11:50 Erfolg user5 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/gwtservice
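For reference, a per-category breakdown like the one in the snapshot can be computed from log lines in this layout with something like the following. This is only a minimal sketch and not code from the application: the class name RequestLogSummary, the three category names and the rule that maps a URI to a category are assumptions made for illustration, as is the assumption that every line has the eight whitespace-separated fields shown above (duration first, "URI:..." last).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: sums request durations per category from log lines
// in the layout shown above (duration first, "URI:..." as the eighth field).
final class RequestLogSummary {

    /** Running totals for one request category. */
    static final class Stats {
        int count;
        double totalMillis;

        double averageMillis() {
            return count == 0 ? 0.0 : totalMillis / count;
        }
    }

    /** Assumed mapping from a URI to the three categories of the snapshot. */
    static String categorize(String uri) {
        if (uri.endsWith("/gwtservice")) {
            return "gwtservice";
        }
        if (uri.startsWith("/BankWebService/")) {
            return "webservice";
        }
        return "static"; // .js, .png, .jpg, .css, .cache.html etc.
    }

    static Map<String, Stats> summarize(List<String> lines) {
        Map<String, Stats> result = new LinkedHashMap<String, Stats>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 8 || !fields[7].startsWith("URI:")) {
                continue; // line does not match the assumed layout
            }
            double millis;
            try {
                millis = Double.parseDouble(fields[0]);
            } catch (NumberFormatException e) {
                continue; // first field is not a duration
            }
            String category = categorize(fields[7].substring("URI:".length()));
            Stats stats = result.get(category);
            if (stats == null) {
                stats = new Stats();
                result.put(category, stats);
            }
            stats.count++;
            stats.totalMillis += millis;
        }
        return result;
    }

    public static void main(String[] args) {
        // Three lines copied from the log excerpt above.
        List<String> sample = Arrays.asList(
            "10003.115 25.07.2012 11:55 Erfolg anonymous REMOTE_WEB xx.yy.zz.25 URI:/BankWebService/BankOrderService",
            "9132.848 25.07.2012 11:55 Erfolg user1 REMOTE_WEB xx.yy.zz.228 URI:/BankApp/org.Bank.Main/gwtservice",
            "4488.047 25.07.2012 11:50 Erfolg user2 REMOTE_WEB xx.yy.zz.162 URI:/BankApp/images/chf.jpg");
        for (Map.Entry<String, Stats> entry : summarize(sample).entrySet()) {
            Stats s = entry.getValue();
            System.out.printf("%s: %d requests, %.0f ms total, %.0f ms per request%n",
                    entry.getKey(), s.count, s.totalMillis, s.averageMillis());
        }
    }
}
```

Running it over the full log of a slow period and of a quiet period (early morning) would show whether all three categories slow down together or only one of them.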
Sessions
18 active sessions
The first login is provided by Glassfish (HTTPS). Once the user is authenticated by Glassfish, there is a second login in the application itself, where the user has to choose which branch to log in to. After the second login, 3 attributes (username, branch, IP address) are stored in the session.
About 40%-50% of the sessions are always without these 3 attributes. My interpretation is that the first login was made but the second was not.
examples:
session id:e6df980ab67cf0456d78761eefa1
8 sessions without the 3 attributes
session id:d72d16bdabb5500e73f721475440:{username=user1, branch=000x, ipadr=xx.xx.xx.xx}
10 sessions with the 3 attributes
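To make the counting rule explicit: a session is treated as "second login done" only if all three attributes are present. Below is a minimal sketch of that check, not the application's actual code. Sessions are modelled as plain maps so the example runs on its own; in the real application the attributes would be read from the HttpSession. The class name SessionAudit is hypothetical, and the attribute names are taken from the example session above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: classifies sessions by whether the second
// (application-level) login has stored its three attributes.
final class SessionAudit {

    // Attribute names as they appear in the example session above.
    static final List<String> REQUIRED = Arrays.asList("username", "branch", "ipadr");

    /** True only if all three attributes are present and non-null. */
    static boolean hasSecondLogin(Map<String, Object> attributes) {
        for (String name : REQUIRED) {
            if (attributes.get(name) == null) {
                return false;
            }
        }
        return true;
    }

    /** Counts sessions that passed the first login but not the second. */
    static int countWithoutSecondLogin(Map<String, Map<String, Object>> sessions) {
        int count = 0;
        for (Map<String, Object> attributes : sessions.values()) {
            if (!hasSecondLogin(attributes)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> sessions = new HashMap<String, Map<String, Object>>();

        // Session without attributes (first login only).
        sessions.put("e6df980ab67cf0456d78761eefa1", new HashMap<String, Object>());

        // Session with all three attributes (both logins done).
        Map<String, Object> complete = new HashMap<String, Object>();
        complete.put("username", "user1");
        complete.put("branch", "000x");
        complete.put("ipadr", "xx.xx.xx.xx");
        sessions.put("d72d16bdabb5500e73f721475440", complete);

        System.out.println(countWithoutSecondLogin(sessions) + " of " + sessions.size()
                + " sessions are without the 3 attributes");
    }
}
```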
I thought these 8 sessions might come from a hacker. I ran Wireshark to look for suspicious IP addresses, but I haven't found much. One day there was an IP from Sweden, and we have nothing in Sweden. However, this wasn't a lot of traffic, just a few lines in the Wireshark capture log over a few seconds.
On 7/17/2012 the MSN account of one of the users was hacked.
The problems started around that date as well, maybe with a delay of 1-2 days.
Coincidence?
Any help is highly appreciated.