Friday 21 May 2010

Performance enhancing shrugs, featuring Lizard

So you've just set up your brand spanking new cluster, and it all seems to work as designed. Diagnostic tests pass with flying colours, and you've run a handful of small test jobs through to get your eye in.
But could it perform better?
Well, that's a good question, and before you start looking for the answer you need to think about how you define current performance - in other words, how best to baseline your cluster.
Personally I perform a set of standard, well-defined tests. The results of these, coupled with a good awareness of expected results for the type of hardware, not only give benchmarks for the cluster, but also indicate whether everything is behaving itself. The tests I use are:
1. mpipingpong
2. High Performance Linpack
3. An appropriate ISV application benchmark
As you can see they will provide fairly high level results, and aim to increase confidence rather than troubleshoot. Let's look at them a little more closely.


mpipingpong
mpipingpong is used to analyse the latency and bandwidth when passing a message between two processes on one or more computers using MPI. This is a lightweight test which completes in a short time, and a couple of simple runs are available in the diagnostic test suite which comes as part of Windows HPC Server. Result! Simply run the MPI Ping-Pong: Lightweight Throughput and MPI Ping-Pong: Quick Check tests across all cluster nodes to produce bandwidth and latency results respectively.
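
If you'd rather script the runs than click through the UI, something like this should do it. This is a sketch only - the Invoke-HpcTest commandlet usage and the exact test names are my assumptions, so check them against Get-HpcTest on your own system:

# Sketch: run both ping-pong diagnostics across every node in the cluster
$nodes = Get-HpcNode
Invoke-HpcTest -Name "MPI Ping-Pong: Lightweight Throughput" -Node $nodes
Invoke-HpcTest -Name "MPI Ping-Pong: Quick Check" -Node $nodes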


High Performance Linpack (HPL)
HPL is a software package that solves a (random) dense linear system in double precision (64-bit) arithmetic on distributed-memory computers. This is the application used to quantify supercomputer performance for Top500 qualification, and is generally regarded as the industry standard HPC benchmark. That's not to say that it shows how your cluster performs under your real world workloads, but it certainly allows for analysis of performance when compared against other, similar machines. Once again, Microsoft have come up trumps and provide a packaged version of HPL wrapped in a marvellous application called Lizard. Lizard de-stresses the HPL run process by:
1. Providing a consistent, compiled HPL executable. If you've ever tried to compile HPL yourself you'll know exactly what a benefit this is.
2. Automatically tweaking HPL input parameters in order to obtain the best possible result for your cluster configuration. There are many Linpack parameters, and automation makes the tuning process very simple (see the illustrative HPL.dat extract below).
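
For a flavour of what Lizard is tuning on your behalf, here's an extract from an HPL.dat input file. The values shown are placeholders for illustration, not recommendations:

1            # of problems sizes (N)
80000        Ns (problem size - typically sized to use most of total RAM)
1            # of NBs
192          NBs (block size - commonly somewhere in the 64-256 range)
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
4            Ps
8            Qs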


ISV application
This is one that you'll almost certainly have more knowledge of for your environment than me. Let's just say that firing a known real world workload across your cluster will give excellent feedback, particularly if you're able to compare results against other machines you own, or published benchmarks. As an example, in the Engineering field Ansys publish Fluent benchmark results online, which give independent comparisons for onsite test runs.

But what do the results mean? Well, let's think about them one by one.
mpipingpong
Obviously the results you achieve will depend on the hardware configuration of your machine, but for guidance, you should expect the following:

Network Type    Throughput     Latency
GigE            ~110 MB/s      40-50 microseconds
10GigE          ~800 MB/s      10-15 microseconds
DDR IB          ~1400 MB/s     <2 microseconds
QDR IB          ~3000 MB/s     <2 microseconds

If you're way off these numbers you should start troubleshooting.

Lizard
Lizard will provide both an actual performance number, in Flops, and a cluster efficiency number, which should give you a good idea of how well your cluster is performing against expected results (based on comparison with your head node processor). This is a good starting figure, but it's worth digging a bit deeper to determine how well your cluster is really doing. There are lots of resources out there which will tell you the optimum result for your processor type, but these should be taken with a pinch of salt, as you will lose performance through inefficiencies outside the processor (memory, interconnect etc.).
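
The efficiency sum itself is just achieved Flops (Rmax) divided by theoretical peak (Rpeak). Here's a quick sketch - every figure in it is made up for illustration, and the 4 flops per cycle per core assumption needs checking against your own processor:

# Made-up cluster shape: 16 nodes, 2 sockets/node, 4 cores/socket
$rpeak = 16 * 2 * 4 * 2.66e9 * 4           # cores x 2.66 GHz x 4 flops/cycle = ~1.36 TFlops
$rmax  = 1.1e12                            # the Rmax figure Lizard reported (invented)
"Efficiency: {0:P1}" -f ($rmax / $rpeak)   # ~80% in this invented case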

ISV application
Many ISVs publish well defined benchmark figures which can be used as comparisons. Just be aware that it's unlikely they will have benchmarked a hardware configuration exactly the same as yours. It's also well worth running a job on (for example) a powerful workstation or an alternative cluster. This will help form a good view of where your cluster should be performing.

So what's next?
Well, that all depends on your results. Are they looking good? Great - keep an eye on things, let the users loose, and ask for their feedback. Users are quick to notice when their jobs are not running as they like. Slightly concerned about your benchmark results? Nice, we're into performance troubleshooting and diagnosis, which I'll cover in a separate post.

One last thing - make benchmarking a regular thing. It's not only a good thing to report on, but it can also be a good early warning system for cluster issues.

Thursday 20 May 2010

PowerShell is Powerful

A pretty obvious thing to say I'm sure you'll agree, but for the sysadmin it is an incredibly useful link between the old WSH scripting world and full-on object-based C# or similar. I've been working with Windows Server for many years, and I'm struggling to come up with another administrative advancement which brings as much to the table as PowerShell.
Anyway, after that generic praise, let's get down to the real topic for this post. I've formed a close allegiance with several Windows HPC Server PS commandlets over the past few years, so I thought I'd write about some of my favourites. Apologies to those left out, but I know they'll be waiting for me to come their way in the future despite not being given love here :)

Deployment
When you're building out multiple clusters, creating a PowerShell script which removes the need to manually work through the cluster configuration wizard is a sweet deal. As you're probably aware, the cluster set up wizard runs through the following steps:

[Screenshot: the cluster configuration to-do list, with each step showing a green check]

See the nice green checks? All achieved using PS commandlets. Let's break it down a bit...
Configure your network. This could be quite complicated, as it takes into account various interrelated options such as network topology, subnet details, firewall settings, DHCP and NAT. But it can all be set up using Set-HpcNetwork. As an example, say you want to create a Topology 1 setup, using NAT, with the firewall on for the enterprise network and off for the private network, the public address as currently assigned to the NIC, a private address of 192.168.1.253, a private DHCP range of 192.168.1.235 - 192.168.1.250, DHCP on for the private network, and the head node providing the NAT function. Check this out...

Set-HpcNetwork -topology private -enterprise 'NICName' -EnterpriseFirewall $True -private 'NICName2' -PrivateIpAddress 192.168.1.253 -PrivateSubnetMask 255.255.255.0 -PrivateDHCP $true -PrivateDhcpStartAddress 192.168.1.235 -PrivateDhcpEndAddress 192.168.1.250 -privateDHCPGateway 192.168.1.253 -PrivateDHCPDns 192.168.1.253 -PrivateNat $True -PrivateFirewall $False

Sweet!
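
If you fancy a quick sanity check afterwards, there's a matching get commandlet - assuming it's available on your version of the pack:

# Report the currently configured network topology
Get-HpcNetworkTopology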
Now then, provide installation credentials. Want to add domain\nodebuild as the node installation account?

Set-HpcClusterProperty -InstallCredential (Get-Credential DOMAIN\nodebuild)

Configure the naming of new nodes. Let's say you fancy naming the nodes after the cluster, so something like CLUSTER-CN001 onwards. The commandlet you're looking for here is

Set-HpcClusterProperty -NodeNamingSeries

but you need to find the cluster name first. I use PS to grab the CCP_SCHEDULER environment variable.
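
Pulled together it looks something like this. Note the %...% number pattern is my assumption of the naming series syntax, so check it against your cluster's default first:

# Build the node naming series from the cluster name held in CCP_SCHEDULER
$cluster = $env:CCP_SCHEDULER
Set-HpcClusterProperty -NodeNamingSeries "${cluster}-CN%001%"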

Create a node template? Simple - knock up a template with the appropriate steps, then export it. If your template has an image associated with it, get that ready, then import the image using

Add-HpcImage -Path

Now import your node template (which should reference your image if applicable)

Import-HpcNodeTemplate -Path

All done right? Well, you may need to import drivers for the deployment process etc.:

Add-HpcDriver -Path
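
Strung together with some example paths (hypothetical ones - substitute your own), the whole import sequence looks like this:

# Hypothetical paths for illustration
Add-HpcImage -Path \\headnode\share\ComputeNode.wim
Import-HpcNodeTemplate -Path C:\Exports\ComputeNodeTemplate.xml
Add-HpcDriver -Path C:\Drivers\NIC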

You're all set, and of course by using a scripted method this is all nicely documented and repeatable.

Node Management

How do you control node group membership? I used to use the UI to manually add nodes to groups, which was a little bit painful, so another delve into PS gave me a better way. I use Get-HpcNode in conjunction with Add-HpcGroup like this:


Example - if you populate the node description with the software it has installed (softwarepackage2):

Get-HpcNode | where {$_.Description -eq "softwarepackage2"} | Add-HpcGroup -Name SoftwarePackage2Group

Example - if you want to add a node to a group based on its config, e.g. installed memory:

Get-HpcNode | where {$_.Memory -ge 8000} | Add-HpcGroup -Name Over8GBGroup
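
And to check your handiwork afterwards - assuming the -GroupName parameter is present on your version:

# List the nodes that ended up in the group
Get-HpcNode -GroupName Over8GBGroup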

Keeping an Eye on Business


The Operation log contains many juicy bits of info which are sometimes easy to miss. I tend to run up a monthly report which includes details of recent warnings and errors in the log.  PS again comes to my assistance, with the help of the Select-Object commandlet to hit the last 500 entries (appropriate for my needs).

Get-HpcOperation -State Committed | Select-Object -Last 500 | Get-HpcOperationLog -Severity Error,Warning
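
If, like me, you want that in a report rather than on a console, one option is to bolt an export onto the end (the path here is hypothetical):

# Dump the warnings and errors to CSV for the monthly report
Get-HpcOperation -State Committed | Select-Object -Last 500 | Get-HpcOperationLog -Severity Error,Warning | Export-Csv C:\Reports\OperationLog.csv -NoTypeInformation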

Also in the report is output of failed / failed to run diagnostics. I have an Ops Manager environment in place which of course provides overall management and reporting, but it also periodically runs diagnostic jobs. These are useful, and I grab the results like this:

Get-HpcTestResult -TestState Failed,FailedToRun -LastRunTime (Get-Date).AddMonths(-1)

So, to sum up, PowerShell is awesome! 

Monday 10 May 2010

A job a day keeps the user at bay

There's always one isn't there? A pesky user who wants to try something fancy on a cluster but isn't entirely sure how to do it. This is the coal face of HPC. It might not have the glamour of procurement, or the technical challenge of system administration, but being able to help a stuck user is possibly the most rewarding part of the job.
I like to set myself a challenge every now and then (job a day? well, maybe not quite that often) to submit a job with crazy parameters, or oddball resource requirements. You never know when a user will want to do something similar, and then a) you'll be able to help, and b) you'll look pretty smart ;)
Thinking more about this, I might fire in a blog post every now and then containing an interesting job submission technique - might be interesting to see if anyone can come up with any nifty alternatives...

Know your underlying infrastructure

So, you're looking to deploy and support a Windows HPC Server cluster, but you have a sneaky suspicion that there's more under the covers than you bargained for. What do you know, your instincts are correct, and you're suddenly in a world of Microsoft technologies which you should at least be aware of.
The good news is (and this is a big advantage) that these underlying technologies are common, and it may be that your company/organisation has experience in those areas, for example in a corporate IT team. If those skills can be leveraged for your Windows HPC deployment, then it's all gravy - you can stick to what you do best & eke out tip top performance, and write some killer submission scripts for your cluster users.
But wait, what if you have no such skills in house? Well, here's a quick rundown of some of the things involved...


Windows Server 2008
The base Operating System. Note that Windows HPC Server can run on various editions of the OS (e.g. HPC Edition, Standard Edition, Enterprise Edition), which should show that the HPC Pack is a separate entity from the OS. License-wise it's important to note that you can only run Windows HPC related technologies on 2008 HPC Edition - no chance of saving a few notes by using it as a base OS for your corporate Exchange system ;)
When thinking about the OS, pay particular attention to driver versions and settings. Use the built in reliability, performance & logging tools to your advantage.


Active Directory
There's no getting round the Active Directory thing. It's at the core of everything Windows HPC Server does, from deployment to running jobs to data authorisation. There are several potential options here depending on your environment. If you have a corporate AD I would strongly suggest that you work with your corp IT guys to integrate the cluster. This type of configuration will smooth the wheels of progress significantly. Using an existing authentication regime, one in which users already have accounts set up, can save a bunch of user admin overhead. If this is not an option, it's worth spending at least a bit of time pondering your AD architecture. It's nice and easy to simply promote your head node to a domain controller, but I would suggest that you also run up another, separate Domain Controller, as losing your AD can be a royal PITA to recover from.
Either way, take some time to get this bit right, and to learn the basics of AD operation & you'll likely see payback in future.


SQL Server
I'm planning to dive into SQL Server a bit deeper in another post, but suffice to say it's well worth picking up some knowledge in this area.


Windows Deployment Services
WDS provides the platform for the super slick node deployment mechanism within Windows HPC Server. It's wrapped so nicely that you may never need to poke about under the covers, but definitely pay attention to imagex, diskpart, and the \\<headnode>\reminst share and its contents.


DHCP and DNS
OK, so these are not necessarily Windows specific, but getting your DNS and DHCP knowledge down is very useful. I'm going to cover network configuration in a separate post, so will try to include some DNS / DHCP tips there too.


RRAS
Routing and Remote Access Services is an umbrella for a bunch of useful Windows features. These include dial-up; VPN (both client-server and site-to-site); IP subnet routing; and Network Address Translation (NAT). In the case of Windows HPC Server it's the NAT part that is of interest. It plays an integral part in the operation of those network topologies which rely on compute nodes only having a connection to the private network. In these cases traffic destined for hosts on subnets other than the private network travels via the head node (as gateway), and NATs out onto the enterprise network. This may be pertinent e.g. for Windows Updates and the like.


Windows Failover Clustering
If you're going for a high availability head node solution, take some time to become acquainted with how Windows Failover Clustering works: behaviour and types of shared disks; which quorum model to use (node majority, node and disk majority, node and file share majority); Failover Cluster network configuration; cluster resource DNS registration; and verification and support of cluster components. This is a big subject in itself, and an awareness of how the technology works is important.