
I'm evaluating various interprocess communication methods for a couple of .NET 2.0 processes residing on the same machine. Naturally, .NET Remoting is a candidate, and in theory the fastest configuration should be IpcChannel (named pipes) + BinaryFormatter.

My benchmarks do show that Remoting over IpcChannel is mostly faster than over TcpChannel, but IpcChannel throughput drops steeply once messages get large (around 30 MB):

Message Size    30 MB       3 MB        300 KB      3 KB
Remoting / TCP  120 MB/s    115.4 MB/s  109.5 MB/s  13.7 MB/s
Remoting / IPC  55 MB/s     223.3 MB/s  218.5 MB/s  20.3 MB/s

Does anyone have any idea why, or any idea how to optimize performance of either channel? I do need to pass 30 MB BLOBs around, and would like to avoid having to deal with shared memory / memory mapped files. Also, I can't afford writing these to disk (much slower).


The following method was used for the benchmarks (called repeatedly; throughput is the total payload size divided by the total elapsed time).

private byte[] _bytes = null;

// Remoted benchmark method: returns a buffer of the requested size,
// reusing the allocation across calls when the size doesn't change.
public byte[] HelloWorld(long size)
{
    if (_bytes == null || _bytes.Length != size)
        _bytes = new byte[size];
    return _bytes;
}
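For context, here is a minimal sketch of how a Remoting host for such a benchmark could be wired up in .NET 2.0; the pipe name, port and object URI are illustrative assumptions, not the original configuration:

using System;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Channels;
using System.Runtime.Remoting.Channels.Ipc;

// The remoted type exposing HelloWorld(long) from the snippet above.
public class BenchmarkService : MarshalByRefObject
{
    private byte[] _bytes = null;

    public byte[] HelloWorld(long size)
    {
        if (_bytes == null || _bytes.Length != size)
            _bytes = new byte[size];
        return _bytes;
    }
}

public class ServerHost
{
    public static void Main()
    {
        // IpcChannel defaults to the binary formatter; swap in
        // new TcpChannel(9000) to benchmark the TCP transport instead.
        ChannelServices.RegisterChannel(new IpcChannel("BenchmarkPipe"), false);

        RemotingConfiguration.RegisterWellKnownServiceType(
            typeof(BenchmarkService), "Benchmark.rem", WellKnownObjectMode.Singleton);

        Console.WriteLine("Server running; press Enter to exit.");
        Console.ReadLine();
    }
}

On the client side, Activator.GetObject(typeof(BenchmarkService), "ipc://BenchmarkPipe/Benchmark.rem") would return the proxy on which HelloWorld is called (a tcp://localhost:9000/Benchmark.rem URL for the TCP variant).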
Yodan Tauber
  • Just a thought, have you confirmed that with the larger payloads the messages are NOT getting cached to disk somewhere? Here's the reason I ask- IIS 7 automatically spools large requests to disk to avoid consuming too much RAM...I'm wondering if either .Net or Windows implements a similar behavior and when the message size increases, hidden disk i/o occurs. If not disk i/o, my next guess would be the message chunk size. – Tim M. Dec 12 '10 at 18:18
  • @Tim: I haven't. Assuming that I do find out that it has to do with hidden disk i/o, is there anything I can actually do about it? e.g. reconfigure Remoting or IpcChannel to behave differently? – Yodan Tauber Dec 13 '10 at 08:22
  • @Yodan - I don't know how you would reduce hidden disk caching but it might have something to do with how much RAM is allocated to the process(es) in question. Have you looked at resource consumption on the machine when sending large messages repeatedly? Even with very large files (I've tried with 500MB+) pure stream manipulation (like transferring data from one process to another) results in very little memory consumption and no disk i/o. Therefore, if you see RAM/disk spikes (especially if you see differences between TCP and IPC) it may give you an indication of what is going on. – Tim M. Dec 13 '10 at 16:09
  • @Yodan - If you try a larger message (maybe 100 MB) does performance degrade in a linear fashion? – Tim M. Dec 13 '10 at 16:11
  • @Tim: with 100 MB messages - TCP: 112 MB/s, IPC: 18 MB/s. – Yodan Tauber Dec 14 '10 at 08:07
  • @Yodan - that's a steep performance drop...anything noteworthy with resource consumption on the machine when you ran that test? Also, is there a reason that you are using remoting versus raw sockets (http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.aspx)? I read the links that @Alois posted below, and I agree that the formatter is probably creating some of the overhead (although I don't know why it would be different between TCP and IPC). – Tim M. Dec 14 '10 at 14:27
  • @Tim: Of course there's a reason I'd rather not use raw sockets - Remoting is a complete RPC/IPC solution, sockets are just a means of transport. I'd have to reinvent a few wheels. Regarding the BinaryFormatter - it is indeed much slower than it should be, but is certainly not the cause for the steep performance drop of IpcChannel. – Yodan Tauber Dec 14 '10 at 16:50
  • Also, other than the CPU being utilized a bit *more* with the TcpChannel, I can't find any notable difference between the resource consumption in the two cases. – Yodan Tauber Dec 14 '10 at 16:53

3 Answers


Why do you want to avoid shared memory? It is the most obvious choice for moving large BLOBs.

zvrba
  • Because it introduces a whole new layer of complications and synchronizations (and strong coupling), while these two processes are being developed by two separate companies. Debugging this will surely be a nightmare. I guess I just don't want to pull out the big guns before I have to. – Yodan Tauber Dec 13 '10 at 07:37

The "strange" behaviour for big messages sizes (30MB) does most certainly orginate from GC pressure. By the way BinaryFormatter should be the slowest of all possible formatters. DataContractFormatter might be much better or a hand written one like this beauty http://codebetter.com/blogs/gregyoung/archive/2008/08/24/fast-serialization.aspx should be about 16 times faster. How did you measure the times? Was the sending and receiving process the same one? I think 120 MB/s send receive are quite good for .net with a very busy garbage collector. You should have a look a the % GC Time Performance counter to check if it is high. If it is > 95% you should use memory more sparingly. As other commenters have already pointed out memory mapped files are the way to go if you need to pass huge amounts of data between processes. There are many free implementations around like

http://www.codeproject.com/KB/recipes/MemoryMappedGenericArray.aspx

and

http://msdn.microsoft.com/en-us/library/ff650497.aspx (the Smart Client Offline Application Block contains a DLL with a nice implementation).
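As a minimal sketch of how that "% Time in GC" counter can be read from code (the category and counter names are the standard .NET CLR Memory counters; the instance name is assumed to be the monitored process's name):

using System;
using System.Diagnostics;
using System.Threading;

class GcTimeProbe
{
    static void Main()
    {
        // Watching the current process here; to watch the Remoting server
        // instead, pass that process's name as the instance.
        string instance = Process.GetCurrentProcess().ProcessName;

        using (PerformanceCounter gcTime = new PerformanceCounter(
            ".NET CLR Memory", "% Time in GC", instance))
        {
            gcTime.NextValue();          // first read primes the counter
            Thread.Sleep(1000);          // let the runtime do some work
            Console.WriteLine("% Time in GC: {0:F1}", gcTime.NextValue());
        }
    }
}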

Yours, Alois Kraus

Alois Kraus
  • First, thank you for your suggestions. I will look into them. DataContractSerializer is unfortunately not an option (stuck with .NET 2.0). When I measured the times, I used two separate processes for the server and the client. – Yodan Tauber Dec 13 '10 at 07:40
  • The client process uses around 25% CPU and spends <1% of time in GC. The server process spends between 10-80% of time in GC, but uses only around 5% CPU. This is true for both IpcChannel and TcpChannel, and therefore, I find it hard to believe that this is causing IpcChannel to become 4 times slower when the message size is 30 MB (as TcpChannel doesn't suffer from the same symptom). – Yodan Tauber Dec 13 '10 at 09:17
  • If the sender has low CPU usage then it could be that your receiver is blocking the communication. How many cores does your machine have? If your client process consumes 25% on a quad-core machine then it is saturating one CPU. – Alois Kraus Dec 16 '10 at 22:33

A gun smaller than shared memory but still powerful enough for the job would be sockets. When the remote procedure executes, have it create a listening socket on some fixed or ad-hoc port number, connect to it from the client, and use a NetworkStream to write data from one side to the other.

It will work like a charm, I'm sure.

This article should get you started.

And even though you don't mention needing the server and client on separate machines, you'll still have that ability, which vanishes if you use shared memory.
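A minimal sketch of that side-channel idea, with the port number and the loopback binding as illustrative assumptions: the Remoting call would only negotiate the port and payload size, while the bytes travel over the NetworkStream.

using System;
using System.IO;
using System.Net;
using System.Net.Sockets;

public static class BlobSideChannel
{
    // Server side: accept a single connection and push the payload.
    public static void Send(byte[] payload, int port)
    {
        TcpListener listener = new TcpListener(IPAddress.Loopback, port);
        listener.Start();
        using (TcpClient client = listener.AcceptTcpClient())
        using (NetworkStream stream = client.GetStream())
        {
            stream.Write(payload, 0, payload.Length);
        }
        listener.Stop();
    }

    // Client side: connect and read exactly 'size' bytes.
    public static byte[] Receive(int port, int size)
    {
        byte[] buffer = new byte[size];
        using (TcpClient client = new TcpClient("localhost", port))
        using (NetworkStream stream = client.GetStream())
        {
            int offset = 0;
            while (offset < size)
            {
                int read = stream.Read(buffer, offset, size - offset);
                if (read == 0)
                    throw new EndOfStreamException("Connection closed before the full payload arrived.");
                offset += read;
            }
        }
        return buffer;
    }
}

Splitting it this way keeps the Remoting interface as the control channel while the bulk bytes bypass the formatter entirely.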

Daniel Mošmondor
  • Interesting idea. I am able to achieve 230 MB/s by invoking a method by either IpcChannel or TcpChannel Remoting and passing the actual binary data via a separate socket. I will need to consider this - thanks! – Yodan Tauber Dec 19 '10 at 13:54
  • @Yodan - glad to help - I won't ever consider using SOAP for passing large objects of several MB in size - at first glance it kind of smells funny. – Daniel Mošmondor Dec 19 '10 at 20:39
  • @Yodan Tauber - you probably also want to evaluate a similar design but using a named pipe side channel instead of sockets for the binary data. I'd guess this would be quicker still. – Chris Dickson Dec 21 '10 at 12:58
  • Who said anything about SOAP? :) – Yodan Tauber Dec 27 '10 at 07:21