I am hoping to get an answer, or at least some kind of clue, to a weird problem we are having with our web servers.

As part of a web server's normal activity, we log some key interactions to a log file. The logging is nothing extraordinary: it's a simple LogInteraction function, encapsulated in a DLL, that the web servers call to log these interactions. Here is the DLL's logging code:

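' Appends one entry to the given file; returns True if the write succeeded.
Private Shared Function Log(messageLog As String, fileName As String) As Boolean
    Dim result As Boolean = False
    ' Second argument True = append mode; Using closes the file even on error.
    Using streamWriter As StreamWriter = New StreamWriter(fileName, True)
        ' WriteLine plus the extra NewLine leaves a blank line after each entry.
        streamWriter.WriteLine(messageLog + Environment.NewLine)
        result = True
    End Using
    Return result
End Function

' Builds a per-account, per-day log name and writes one timestamped entry.
Public Shared Function LogInteraction(accountName As String, logMessage As String) As Boolean
    Dim logName As String = String.Empty
    Dim result As Boolean = False
    Try
        ' One log per account per day, e.g. "<accountName>_<yyyyMMdd>".
        logName = String.Format("{0}_{1}",accountName, DateTime.Now.ToString("yyyyMMdd"))
        result = Log(String.Format("{0}, {1}, {2}", DateTime.Now.ToLongTimeString(), accountName, logMessage), logName)
    Catch ex As Exception
        ' Any failure is swallowed here and reported only through LogError.
        result = False
        LogError("Could not log activity: " + logMessage + Environment.NewLine + ex.ToString())
    End Try
    Return result
End Function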

Now, the weird issue is that ever since we upgraded the project's .NET Framework from 3.5 to 4.7, we have been seeing a strange effect: sometimes web servers do not produce log files. We have half a dozen web servers serving a single customer; depending on the load, a user trying to access the website is redirected to a specific web server. On a bad day, more than half of the servers have zero logs, yet the IIS logs show normal traffic, so it's not a load balancer issue where a web server simply stopped getting traffic.

As far as we know, client experience is not affected. We have not received any complaints from our customers. From their perspective everything is working fine. But from our perspective, we see no logs on what looks like random servers.

Here are some key data we gathered when investigating this weird issue:

  • sometimes just one server is missing logs, sometimes about half;
  • the affected servers are not the same from day to day. For example: on Monday, servers #3, 5 and 9 have no logs; on Tuesday, servers 2, 7 and 10 don't have them;
  • we never had this issue in .NET Framework 3.5;
  • while the DLLs that do the logging did change (they had to be recompiled because we went from 3.5 to 4.7), the code did not;
  • resetting/recycling the app pool magically fixes the issue or causes it (we had an automated job recycling the app pool at midnight and noticed the issue appearing or disappearing at that precise time).

We currently have no clue why this issue is happening or what's causing it. We've gone through the code over and over and it does not look like a code issue - again, the code works just fine in 3.5. Has anyone experienced anything remotely similar?

Thank you

George

1 Answer

The issue was a code issue.

In the DLL that logs to a file, there was a line of code that appended a slash to the log-file path every time a log entry was written. So the file path would become something like "c:\mylogpath\\\logfilename.txt", which, surprisingly, works fine in Windows.

This file path is a shared variable, so it does not reset on its own (web.config must be changed or the app pool recycled for the website to reset it). With each log entry, the run of slashes in front of logfilename.txt would grow until the file-path variable reached 32767 characters (pre-Windows 10), at which point logging failed with a "Path too long" exception.
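
The offending line is not shown above, so the following is only a minimal sketch of the kind of pattern that produces this behaviour, not the actual DLL code. The names LogPathDemo, sharedLogFolder, baseLogFolder, BuggyLogPath and FixedLogPath are all hypothetical, and the folder is the example path from this answer. It contrasts mutating a shared path on every call with building the path fresh from a read-only base:

```vbnet
Imports System
Imports System.IO

Module LogPathDemo
    ' Hypothetical shared state standing in for the DLL's log-path variable.
    ' It lives as long as the process (i.e. until the app pool recycles).
    Private sharedLogFolder As String = "c:\mylogpath"

    ' Buggy pattern (assumed): the shared folder is modified on every call,
    ' so it gains one more backslash per log entry and never shrinks.
    Function BuggyLogPath(fileName As String) As String
        sharedLogFolder = sharedLogFolder & "\"
        Return sharedLogFolder & fileName
    End Function

    ' Safer pattern: the base folder is never modified, and the full path
    ' is recomputed from it each time.
    Private ReadOnly baseLogFolder As String = "c:\mylogpath"

    Function FixedLogPath(fileName As String) As String
        Return Path.Combine(baseLogFolder, fileName)
    End Function

    Sub Main()
        ' Each call to the buggy version returns a longer path than the last.
        For i As Integer = 1 To 3
            Console.WriteLine(BuggyLogPath("logfilename.txt"))
        Next
        ' The fixed version returns the same path no matter how often it runs.
        Console.WriteLine(FixedLogPath("logfilename.txt"))
    End Sub
End Module
```

In the buggy version the third call already yields the triple-slash path quoted above, and the growth only stops when the process restarts, which would match the way recycling the app pool made the problem disappear for a while.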

George