I am reading a CSV file line by line. I will process the line and then dump it into another CSV file (simplified version of the problem). I am using MSMQ as a "fire and forget" mechanism. However, it's significantly slower than expected (more than 100%). Is there something I am missing here, like a setting?

I tried running the Send call in a separate thread. It didn't help.

Sender code:

    using System;
    using System.Diagnostics;
    using System.IO;
    using System.Messaging;

    class Program
    {
        static void Main(string[] args)
        {
            string queueName = ".\\Private$\\TestQueue";

            if (!MessageQueue.Exists(queueName))
            {
                MessageQueue.Create(queueName, false);
            }

            MessageQueue messageQueue = new MessageQueue(queueName);
            string path = @"file.csv";

            int lines = 0;
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            using (StreamReader reader = new StreamReader(path))
            {
                string readLine;
                while ((readLine = reader.ReadLine()) != null)
                {
                    lines++;
                    messageQueue.Send(new Message(readLine));

                }
            }
            stopwatch.Stop();
            Console.WriteLine("Lines: " + lines);
            Console.WriteLine($"Time: {stopwatch.Elapsed.ToString("g")}");
        }
    }

Receiver code:

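    using System;
    using System.IO;
    using System.Messaging;

    class Program
    {
        static void Main(string[] args)
        {
            MessageQueue queue = new MessageQueue(".\\Private$\\TestQueue");
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
            WriteToReport(queue);
        }

        private static void WriteToReport(MessageQueue queue)
        {
            // "HH" is the 24-hour clock and "fff" is milliseconds ("ms" would repeat minutes and seconds).
            using (StreamWriter writer = new StreamWriter($@"Log_{DateTime.Now:yyyyMMddHHmmssfff}.csv"))
            {
                while (true)
                {
                    // Receive() blocks until a message arrives, so the loop only ends on an empty row.
                    Message message = queue.Receive();
                    string row = message?.Body as string;
                    if (message == null || string.IsNullOrEmpty(row)) break;

                    // WriteLine keeps each row on its own line; Write would run them together.
                    writer.WriteLine(row);
                }
            }
        }
    }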

Note:

  1. I tried to just read the messages in the receiver program without writing them to anywhere. It's still slow.

  2. Sending the messages in a separate thread did not help.

  3. Increasing the queue size didn't help.

Shouldn't the Send method be asynchronous? I was expecting it to just Send the message and continue processing.

borat
  • How can something be 100% slower than expected? Regardless of that oddity, MSMQ provides a mechanism for reliable messaging; it's not guaranteed to be speedy, as in your scenario you have no control over the delivery of the messages. – Martin Oct 29 '19 at 09:30
  • You are really tunneling through the windows operating system and sending a secure message between machines. So a secure connection has to be established which is probably slowing down the process. What is the actual delay you are seeing? I wouldn't be surprised if it was a couple of seconds. Try two messages in a row and see if second is much quicker than first which will indicate the delay is the setup of the connection. I call it tunneling because you are using a Microsoft proprietary protocol for establishing the connection instead of a standard (like TCP) protocol for the connection. – jdweng Oct 29 '19 at 09:33
  • Is the CSV idea just to demo the use of MSMQ? Because it doesn't seem like the most efficient way to transfer a file. – ADyson Oct 29 '19 at 09:39
  • No. The only thing to worry about is why it is so fast. One obvious way to get ahead is to store the file on a file share that multiple machines have access to and generate only *one* message about it. – Hans Passant Oct 29 '19 at 09:41
  • Everything is happening on the same machine, but in production, the user might want to write the log file to a network drive. @ADyson, the customer requested that we log every line that is processed by our application. The reason I chose MSMQ was to somehow fire and forget the logging of the line so that it doesn't slow down the actual execution. Do you know a better way of achieving that? – borat Oct 29 '19 at 09:44
  • I experimented with streaming the log data to a file. That is pretty fast (only 7% impact on the actual execution). However, it got very slow when I tried to log it to a network drive (unsurprisingly). I just want to be able to write the lines to the log without it having a large impact on the core execution. – borat Oct 29 '19 at 09:48
  • @borat That sounds more like a task for a logging framework (like [NLog](https://nlog-project.org/2010/05/09/asynchronous-logging-in-nlog-2-0.html), [log4net](https://www.nuget.org/packages/Log4Net.Async), [Serilog](https://github.com/serilog/serilog-sinks-async), ...). Most of them will allow for asynchronous logging, too. – Fildor Oct 29 '19 at 09:48
  • Can you be clearer about the purpose of this? Because all I see is you taking one CSV file, sending it one line at a time and then writing that into another CSV file one line at a time. You don't appear to change the format of the data, and you don't appear to be appending to an existing file. Why not just copy the file to the network drive directly? I'm not seeing the point of all the streaming. – ADyson Oct 29 '19 at 09:50
  • @ADyson, each line is read from files A and B, and checked if there are differences in the cells (it's a long story, but that's the gist of it). If there're no differences (full match), the row is logged to a file (stupid customer request). – borat Oct 29 '19 at 09:53
  • I see. So not all lines are placed in the message queue. That isn't clear from the code. – ADyson Oct 29 '19 at 09:57
  • No it's not. I wanted to simplify the problem and not clutter it with business logic. – borat Oct 29 '19 at 09:58
  • Ok, but in this case, in doing so, you removed the entire rationale for the code. Maybe just leave an "if" block in there with a comment before it saying "//do some processing and decide whether to send the row in the message queue or not", so we understand it's a filtered list. – ADyson Oct 29 '19 at 10:00
  • Nonetheless, you could just write them to another local file and then move/copy the file to the network drive at the end. There's a reasonable chance that would be quicker. Unless there could be other machines also writing into the same log file simultaneously? In which case a publisher-subscriber messaging system like this makes a bit more sense. – ADyson Oct 29 '19 at 10:00
  • That occurred to me as well. However, I'm afraid the customer will not have enough local space (hence, they chose a network drive). So the idea was to stream the data to wherever it will eventually end up. – borat Oct 29 '19 at 10:10
  • I will try the async logging suggested by @Fildor and see if this solves my weird problem. – borat Oct 29 '19 at 10:14
  • MSMQ only uses the network when it needs to and that is generally controlled by which queue the caller is writing to. Since you are writing to a **local** queue (`.\\Private$\\TestQueue`), the network is **not** required and messages will be stored in _flat files_ in `%windir%\system32\msmq` on the _local machine_. _[How does MSMQ manage messages?](https://stackoverflow.com/questions/802661/how-does-msmq-manage-messages)_. It's clear your queue isn't transactional (which are slower) because if it were your send/receive would fail silently. –  Oct 29 '19 at 10:15
  • @borat how big is this file exactly? And how small is the local disk going to be? Storage is pretty cheap these days, even on a laptop or something. I'm just trying to ascertain whether this is a realistic fear or not. But yes you may be better going with a proper logging framework anyway. – ADyson Oct 29 '19 at 10:29
  • This is a perfect example of an XY Problem. Don't try and fix any issues you have with using MSMQ for what you're doing, because really you just shouldn't be using it. As already suggested above by @Fildor, you'd be much better off using a logging framework. We use Serilog a hell of a lot (100s of 1000s of entries every hour) and it doesn't slow anything down - it truly is "fire and forget", but for logging, which is what you really need. MSMQ is ideally for communication between applications, not passing around logging information - that just seems like overkill and misuse. – Reinstate Monica Cellio Oct 30 '19 at 12:32
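
For anyone wondering what the "fire and forget" logging suggested in the comments looks like without a framework: it is essentially an in-process producer/consumer queue with one background writer. Below is a minimal sketch of that idea, not what NLog, log4net or Serilog actually do internally. The class name `AsyncLineLogger`, its `Log` method and the default capacity of 10000 are hypothetical and chosen only for illustration; the buffer is assumed to be bounded so that a slow target such as a network drive cannot exhaust memory.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper (not from the question): an in-process "fire and forget" line logger.
public sealed class AsyncLineLogger : IDisposable
{
    // Bounded buffer: when it is full, Log blocks until the writer catches up.
    private readonly BlockingCollection<string> _lines;
    private readonly Task _writerTask;

    public AsyncLineLogger(string path, int capacity = 10000)
    {
        _lines = new BlockingCollection<string>(capacity);
        // A single long-running background task owns the file and does all the I/O.
        _writerTask = Task.Factory.StartNew(() => WriteLoop(path), TaskCreationOptions.LongRunning);
    }

    // Called from the processing loop; returns as soon as the line is queued.
    public void Log(string line) => _lines.Add(line);

    private void WriteLoop(string path)
    {
        using (var writer = new StreamWriter(path, append: true))
        {
            // Blocks while the queue is empty; ends once CompleteAdding was called and the queue is drained.
            foreach (string line in _lines.GetConsumingEnumerable())
            {
                writer.WriteLine(line);
            }
        }
    }

    // Stops accepting lines, waits for the writer to drain the queue, then releases resources.
    public void Dispose()
    {
        _lines.CompleteAdding();
        _writerTask.Wait();
        _lines.Dispose();
    }
}
```

In the sender loop this would replace `messageQueue.Send(...)` with something like `logger.Log(readLine)`, with the logger wrapped in a `using` block so that the remaining lines are flushed on exit. The trade-off compared with MSMQ is that queued lines live only in memory, so anything not yet written is lost if the process crashes.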