Tuesday 30 March 2010

Receive-Side Scaling & MPI

I recently spent an interesting few days discovering, diagnosing & resolving an MPI performance issue on a small cluster with a Gigabit Ethernet interconnect. The problem first came to light after running the MPI Ping-Pong Quick Check and Throughput diagnostics from within the HPC Cluster Manager diagnostics suite. Results were significantly down on expected values, returning an average of just over 200 microseconds latency and just over 60 MB/s throughput. Expected figures are closer to 50 microseconds and 105 MB/s, so something was quite obviously amiss.
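
Incidentally, those GUI diagnostics are (as far as I can tell) a wrapper around the mpipingpong tool that ships with HPC Pack, so you can reproduce the numbers by hand from a command prompt. The node names below are placeholders for two of your compute nodes, and the mpipingpong switches vary between HPC Pack versions, so run mpipingpong /? first:

REM node01 and node02 are placeholder compute-node names - substitute your own
mpiexec -hosts 2 node01 1 node02 1 mpipingpong
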
The issue persisted even after checking that all appropriate firmware and drivers were up to date. Time for some good old fashioned detective work, driving the problem into an ever smaller box until the answer popped out. After trying many combinations of driver version and network settings, the culprit turned out to be the Receive-Side Scaling (RSS) feature when enabled on newer versions of the driver in question. When RSS was turned on and configured to use more than one queue, performance was degraded. When it was turned off, or turned on but configured to use a single queue, performance was as expected. Interestingly, with older versions of the driver, RSS could be on and configured to use multiple queues without any performance degradation.
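
If you want to check or flip the global RSS state yourself, netsh can do it from an elevated command prompt. Note this is the OS-wide setting; the per-adapter queue count lives in the driver's Advanced properties in Device Manager, and its exact name varies by vendor:

REM show the current global TCP settings, including the RSS state
netsh int tcp show global
REM disable / re-enable Receive-Side Scaling globally
netsh int tcp set global rss=disabled
netsh int tcp set global rss=enabled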

During the investigation I spoke to Xavier Pillons, a Windows Server performance guru at Microsoft, and he came up with some very useful tips which I'm sure he won't mind me sharing:

1. Check the driver release version.
2. Check the TCP global parameters (use the command line netsh int tcp show global).
3. On Windows Server 2008, try disabling RSS, and play with Chimney off/on (example netsh commands follow this list).
4. For better latency, you can disable the Interrupt Moderation Rate on the network interface.
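
For tip 3, Chimney (TCP Chimney Offload) can be toggled globally with netsh too. Interrupt moderation (tip 4) is a per-adapter driver setting, so look for it on the Advanced tab of the NIC's properties rather than in netsh:

REM disable / re-enable TCP Chimney Offload globally
netsh int tcp set global chimney=disabled
netsh int tcp set global chimney=enabled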

Problem solved, or at least a workaround found. Happy Days! :)
