3

I have Domino servers in geographically dispersed data centers in the U.S.

Sometimes when I open an NSF on one of those servers the connection times out, but when I open the NSF again it connects immediately.

This has been going on for years and during that time I have upgraded and changed my own internet connection and moved servers to different data centers. Of course I have direct connection documents using fixed IP addresses. When I do a Notes client Trace nothing is out of the ordinary.

My business partner experiences the same thing from an entirely different city and different ISP but to the same servers.

Never have any trouble connecting to the HTTP server, just over port 1352.

Does anyone have any recommendations on a process to determine what is causing this problem?

4 Answers

3

Is there any chance you can set up a VPN from your machine directly to the server? Some "security solutions" drop packets, connections, and so on. Using a VPN, you can rule out problems with the network path in between.
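
To compare the direct route with the VPN route, one option is to time fresh TCP connections to port 1352 at intervals. The sketch below is only an illustration: the function names `time_connect` and `probe`, the default pause, and the host name in the usage comment are assumptions, and it measures only the TCP handshake, not the Notes session on top of it.

```python
import socket
import time


def time_connect(host, port=1352, timeout=10.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None


def probe(host, port=1352, attempts=5, pause=60.0):
    """Time several fresh connections, sleeping `pause` seconds between them."""
    results = []
    for i in range(attempts):
        results.append(time_connect(host, port))
        if i < attempts - 1:
            time.sleep(pause)
    return results


# Hypothetical usage (host name is a placeholder):
# print(probe("domino.example.com", attempts=10, pause=300.0))
```

If fresh connections are always fast while the Notes client still stalls after idle periods, that would point at something dropping idle sessions rather than at general reachability.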

leyrer
  • Yup. I'll give that a shot. – Russell Maher Oct 10 '12 at 18:47
  • Was able to test this yesterday and I think that is the issue. If I establish a VPN connection and then set up connection documents to use the internal IP address there seems to be no unusual pausing to connect to the server. After switching back today the pause is back. The only real difference in these two scenarios is that the server firewall is not engaged when using the VPN connection to that server. That also seems to jibe historically. I can live with this scenario now that I am pretty confident in the cause. Thanks very much. Great suggestion. – Russell Maher Oct 23 '12 at 14:17
1

I haven't had this problem, so I'm not sure if this helps.

In the Notes Client. File -> Preferences -> Notes Ports

Select TCPIP and click the Options... button. Change the timeout value to something higher.

Tommy Valand
  • Another good idea! Trying it. – Russell Maher Oct 02 '12 at 18:25
  • This has stopped the failure but I am trying Simon's suggestion below because even though I have stopped getting the "Can't connect" message after bumping timeout to 30, it still clearly needs more time sometimes. Thanks! – Russell Maher Oct 05 '12 at 03:02
1

We also had this problem; the only fix we found was this:

  • Open the following key in the Windows registry: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters
  • Add a new DWORD value: KeepAliveTime
  • Value (decimal): 60000
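
KeepAliveTime is given in milliseconds, so 60000 means a keep-alive probe after 60 seconds of idle time instead of the Windows default of 7,200,000 ms (two hours). As a rough sketch, the same change can be written as a .reg fragment; the helper name `keepalive_reg_fragment` is made up for this example, and the hex conversion is the only thing it really demonstrates.

```python
def keepalive_reg_fragment(milliseconds=60000):
    """Render a .reg fragment that sets KeepAliveTime to the given value."""
    key = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters"
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{key}]\n"
        # .reg files store DWORD values as eight hex digits.
        f'"KeepAliveTime"=dword:{milliseconds:08x}\n'
    )


print(keepalive_reg_fragment())
```

For the value above, the last line of the fragment comes out as "KeepAliveTime"=dword:0000ea60.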

Hope this will help

for more background info see:

0

It is a very broad question and there is no single answer.

First, you can add the following to the notes.ini:

CONSOLE_LOG_ENABLED=1
DEBUG_THREADID=1
CLIENT_CLOCK=3 

(maybe 30 for CLIENT_CLOCK)

This will generate console logs in the IBM_TECHNICAL_SUPPORT folder which will track response times/traffic.

Look for entries that show high ms numbers and check the related information. If you want to post any related lines here I can look at them (make sure to remove any confidential information from the log lines!)
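
Scanning for the high ms values can be scripted. The following is a minimal sketch that assumes transaction entries look like `NAME(target): 1234 ms`, which is the shape of the excerpts quoted in the comments on this answer; the regular expression, the function name `slow_transactions` and the 1000 ms threshold are all assumptions, and real logs may need a looser pattern.

```python
import re

# Matches entries such as: OPEN_DB(CN=XXX/O=XXX!!mail\XXX.nsf): 21301 ms
TRANSACTION = re.compile(r"([A-Z_]+)\(([^)]*)\): (\d+) ms")


def slow_transactions(lines, threshold_ms=1000):
    """Return (name, target, ms) for every transaction at or above the threshold."""
    hits = []
    for line in lines:
        for name, target, ms in TRANSACTION.findall(line):
            if int(ms) >= threshold_ms:
                hits.append((name, target, int(ms)))
    return hits
```

Passing an open log file works as well as a list of strings, for example `slow_transactions(open(path, errors="replace"))`, where `path` points at one of the generated console logs.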

I'd also recommend reviewing the following.

http://www-01.ibm.com/support/docview.wss?uid=swg27008849

Also, you should mention what version of Notes and Domino you are using, as there have been performance improvements across versions (e.g. managed mail replicas).

Simon O'Doherty
  • Thanks Simon. I have updated my notes.ini and did a quick baseline to see what a normal time looks like. I am running 8.5.3 servers and clients on whatever latest FP was. Pretty much the same problem though over all versions. Will update later. – Russell Maher Oct 05 '12 at 03:05
  • I added these settings and have been examining the log file when the delay in opening the NSF occurs. One time it took 21,000+ ms to open after the console log file said "Server not responding" and then it opened the NSF. By following Tommy's suggestion to increase the TCP/IP port timeout, I think the NSF now opens after the long delay instead of just giving me the client error message indicating the server was not responding, but that doesn't really indicate where the problem is. I wish I could figure out a rhythm to it, but it just seems so random. – Russell Maher Oct 08 '12 at 14:40
  • [16FC:0013-1094:alarm] (OPEN_SESSION: 31 ms) [16FC:0013-1094:alarm] 538 ms. [138+294=432] (Session Closed) [16FC:0002-1700] [16FC:0002-1700] (158-4579 [158]) OPEN_DB(CN=XXX/O=XXX!!mail\XXX.nsf): 21301 ms. [138+0=138] (Remote system no longer responding) [16FC:0002-1700] [16FC:0002-1700] (159-4601 [159]) OPEN_DB(CN=XXX/O=XXX!!mail\XXX.nsf): (Connect to XXX/xxx: 140 ms) (Exch names: 0 ms)(Authenticate: 0 ms.) – Russell Maher Oct 08 '12 at 15:05
  • [16FC:0002-1700] (OPEN_SESSION: 30 ms) [16FC:0002-1700] (Opened: REP862579C2:00190929) 37 ms. [138+294=432] [16FC:0002-1700] [16FC:0002-1700] (160-4601 [160]) SERVER_AVAILABLE_LITE(CN=XXX/O=XXX): 79 ms. [30+86=116] – Russell Maher Oct 08 '12 at 15:06
  • It seems fine because it connects in 31 ms, and then a little later, when nothing has changed and maybe five minutes have passed, I get the whole "no longer responding" thing. Maybe the issue really is with the network at the server. The server log shows no issues and no lapses. There is a firewall in front of these servers but, as I said, that has always been the case, and during the course of three years this server has been moved from one data center to another with the same results. This is not a huge issue. Just inconvenient. If I could locate the issue though I would feel better about it. – Russell Maher Oct 08 '12 at 15:11
  • Leyrer's response is the best way to proceed at this point. – Simon O'Doherty Oct 09 '12 at 12:45
  • The only thing I can add is that if you are getting it consistently on the same database, then it may be corruption or a design issue, for example a badly designed view. – Simon O'Doherty Oct 10 '12 at 21:01