3

I am trying to create a simple file monitor that periodically checks a log file for updates and processes them. I tried using FileSystemWatcher, but that requires my application to keep running forever. I am thinking more along the lines of: read the changes, quit, wait for a timer, read the changes again.

I have created a service that runs periodically and reads the whole file. Below is the simple code.

private void SchedularCallback(object e)
{
    string logFile = ReadFromFile("C:\\test.log");
    this.WriteToFile(logFile);
    this.ScheduleService();
}

The WriteToFile function writes the data to a separate file (this stands in for processing the data; the actual processing can involve other tasks such as calling WCF services or checking internet access). ReadFromFile reads the log file every time the callback fires. Below is the code that reads the file.

private string ReadFromFile(string path)
{
    try
    {
        // Disposing the reader at the end of the using block also closes it,
        // so no explicit Close() call is needed.
        using (StreamReader reader = new StreamReader(path, true))
        {
            return reader.ReadToEnd();
        }
    }
    catch (Exception ex)
    {
        WriteToFile("Simple Service Error on: {0} " + ex.Message + ex.StackTrace);

        //Stop the Windows Service.
        using (System.ServiceProcess.ServiceController serviceController = new System.ServiceProcess.ServiceController("SimpleService"))
        {
           serviceController.Stop();
        }
        return "";
    }
}

As you can see, this code reads the whole file every time the callback fires. Since the log file can become very large, reading and processing the whole file every time is not feasible. To improve this, I thought of using FileSystemWatcher, but that would keep my service running forever and be a real performance drain. If I could instead read just the changes in the file, it would be faster.

I also thought of storing the last offset read from the stream, but that works only if data is appended. If someone deletes the whole log or changes a line or two, the last offset won't work.

In this case, what would be the best approach? The log file obviously won't change constantly, so I don't need to keep my service running. I am unsure whether reading the file as a binary stream and comparing it with the previous binary stream would be a good idea. Any suggestion on a possible approach is appreciated. Basically, I am looking for something like what git does to identify changes since the last commit.

Thanks.

boop_the_snoot
jitendragarg

2 Answers

1

Have a look at the USN Journal for NTFS.

It basically logs all changes to files on an NTFS disk.

Here are some links which might prove useful:

  1. Creating, Modifying, and Deleting a Change Journal
  2. Fsutil usn
  3. File Path from USN Journal
boop_the_snoot
Rick van Lieshout
  • Will improve answer later, gotta catch a train! – Rick van Lieshout Sep 14 '17 at 11:31
  • Have a safe journey! :P – boop_the_snoot Sep 14 '17 at 11:31
  • Just had a look at the USN journal. Sounds like the perfect place to start. Let me do some research and try to implement it. Will accept it as answer, in a while. Want to make sure, that I can actually do this in the service. – jitendragarg Sep 14 '17 at 11:53
  • Ok, so, it does work. Using USN journal, I can get the data. Now, I just need to figure out how to run the `fsutil usn readdata c:\temp\sample.txt` command using C# code. That I think is a different problem that should come as its own question. – jitendragarg Sep 14 '17 at 12:38
  • @RickvanLieshout - Please remove the reliance on the external links in this answer. You certainly can put links in answers, but the links should only be there to support content you have already put in the answer. Another way of looking at this is that your answer should remain valid if the external site changes its links. – Enigmativity Sep 14 '17 at 13:09
  • If you intend using your program on other computers don't forget that some users turn off their USN journal. – Chibueze Opata Sep 14 '17 at 14:23
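The `fsutil usn readdata` command mentioned in the comments above could be launched from C# roughly as follows. This is only a minimal sketch: the `UsnReader` class and its method names are hypothetical, it assumes Windows with `fsutil.exe` on the PATH, and the call may need elevated privileges depending on the system. Parsing the text that `fsutil` prints is left out.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not from the thread) that shells out to fsutil.
public static class UsnReader
{
    // Describes the process for: fsutil usn readdata "<path>"
    public static ProcessStartInfo BuildStartInfo(string path)
    {
        return new ProcessStartInfo
        {
            FileName = "fsutil.exe",
            Arguments = "usn readdata \"" + path + "\"",
            UseShellExecute = false,       // required to redirect output
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
    }

    // Runs fsutil and returns its raw text output.
    // Throws if fsutil reports a non-zero exit code.
    public static string ReadUsnData(string path)
    {
        using (Process process = Process.Start(BuildStartInfo(path)))
        {
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "fsutil exited with code " + process.ExitCode + ": " + output);
            }
            return output;
        }
    }
}
```

A service could call `UsnReader.ReadUsnData(@"c:\temp\sample.txt")` on each timer tick and compare the returned text with the text saved from the previous run. As noted in the comments, this only helps on machines where the USN journal is enabled.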
-1

This is exactly what a FileSystemWatcher is good for. As long as it is a single file, the resource usage is going to be minimal.

Update: Indeed, listening to the API/kernel for changes, rather than polling, might be overkill for something like a log. It might be better to search a change log/journal (assuming it is turned on). At worst, and most reliably, you could use your own timer to monitor the file's size and last-modified time. Using an MD5 checksum should also be okay and fast.
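The size and last-modified check could look something like the sketch below. `FileSnapshot` is a hypothetical helper, not an existing API; it relies only on `System.IO.FileInfo`, and it assumes the caller persists the previous snapshot between runs (for example in a small state file).

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the thread): a cheap fingerprint of a file
// built from its size and last write time, without reading its content.
public sealed class FileSnapshot
{
    public long Length { get; }
    public DateTime LastWriteTimeUtc { get; }

    private FileSnapshot(long length, DateTime lastWriteTimeUtc)
    {
        Length = length;
        LastWriteTimeUtc = lastWriteTimeUtc;
    }

    // Captures the current state of the file. A missing file is
    // represented by a length of -1 so that deletion counts as a change.
    public static FileSnapshot Capture(string path)
    {
        var info = new FileInfo(path);
        if (!info.Exists)
        {
            return new FileSnapshot(-1, DateTime.MinValue);
        }
        return new FileSnapshot(info.Length, info.LastWriteTimeUtc);
    }

    // True when there is no previous snapshot or either value changed.
    public bool DiffersFrom(FileSnapshot previous)
    {
        return previous == null
            || Length != previous.Length
            || LastWriteTimeUtc != previous.LastWriteTimeUtc;
    }
}
```

On each timer tick the service would call `FileSnapshot.Capture`, compare it with the saved snapshot via `DiffersFrom`, and only read the file when that returns true. A checksum would additionally catch an edit that leaves both size and timestamp unchanged, but computing it means reading the whole file.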

Then, if there are changes, you could use a diff library such as diffplex to sync.

If you could test this and later post benchmark results here, that would be really helpful for me as well as for other users, as I've actually implemented something like this before with FileSystemWatcherEx.

Chibueze Opata
  • As the OP states "**I thought of using FileSystemWatcher, but that will keep my service running forever and just be a real performance drain.**" – boop_the_snoot Sep 14 '17 at 11:33
  • I tried using FileSystemWatcher. It keeps the service running, nonstop, and then calls the `OnChanged` event for every keystroke. So, if I add `Hello` in the log file, FileSystemWatcher calls the event 6 times. I will look into the diffplex though. That might be what I am looking for. – jitendragarg Sep 14 '17 at 11:37
  • Checked diffplex. It works for text that is already in memory. It doesn't work for text in a file. I want to read changes in file, without reading whole file, if that makes sense. – jitendragarg Sep 14 '17 at 11:47
  • Oh, that's a bit complicated. If you are sure your files are text files and **only ever changes from the bottom**, you would store the last stream position yourself, then seek to that position any time there are changes, and read from that position. You would of course have to make a case for when the log is cleared/deleted. – Chibueze Opata Sep 14 '17 at 11:51
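The stored-position idea from the last comment can be sketched as below. `LogTail.ReadNewContent` is a hypothetical name, and the sketch assumes an append-only UTF-8 text log. A file that is shorter than the stored offset is treated as cleared and read again from the start; in-place edits that do not shrink the file are not detected by this approach.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not from the thread) for reading only appended data.
public static class LogTail
{
    // Returns the text added since lastOffset. newOffset receives the
    // position the caller should persist for the next run.
    public static string ReadNewContent(string path, long lastOffset, out long newOffset)
    {
        // FileShare.ReadWrite lets another process keep writing to the log.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            // A shorter file means the log was cleared or replaced: start over.
            if (lastOffset < 0 || stream.Length < lastOffset)
            {
                lastOffset = 0;
            }

            stream.Seek(lastOffset, SeekOrigin.Begin);
            using (var reader = new StreamReader(stream, Encoding.UTF8))
            {
                string added = reader.ReadToEnd();
                newOffset = stream.Position; // end of the data just read
                return added;
            }
        }
    }
}
```

The service would load the saved offset at the start of each run, call `ReadNewContent`, process the returned text, and then save `newOffset` (for instance in a small state file) before exiting.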