Thursday 29 April 2010

It's all in the name





I like names; they have a funny way of telling a story in a single word. Take my name, for instance. I'm sure you read the word 'Dan' and assume that I'm a strong, intelligent, handsome guy who's no end of fun to hang out with, right? Or maybe it's the other way round, and the person defines your perception of the name. I mean, if all people named Dan are super cool, does that make the name Dan super cool?
Anyway, what I really want to talk about here is how Windows HPC handles name resolution, particularly across private and application networks.
First off, let's think a little about the general Windows host name resolution order. The steps, in order:

1. Checks its own name.
2. Looks in the local DNS cache (you can list entries using ipconfig /displaydns).
3. Checks the local HOSTS file (C:\Windows\System32\Drivers\etc\hosts).
4. Appends the DNS search suffix configured on the machine (if the name is not an FQDN) and queries DNS.
5. Queries WINS (NetBIOS name resolution).
6. Broadcasts on the local subnet.
7. Checks the local LMHOSTS file (C:\Windows\System32\Drivers\etc\lmhosts).

Pretty thorough, I'm sure you'll agree.
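
To make the "first answer wins" idea concrete, here is a minimal toy model in Python. It is only a sketch: the `resolve` and `make_table_source` helpers are hypothetical names, the two lookup tables are made-up in-memory stand-ins for the hosts file and DNS, and real Windows resolution is of course not implemented this way.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A source maps a host name to an address, or returns None on a miss.
Source = Callable[[str], Optional[str]]


def make_table_source(table: Dict[str, str]) -> Source:
    """Build a case-insensitive lookup over an in-memory table."""
    lowered = {name.lower(): addr for name, addr in table.items()}
    return lambda name: lowered.get(name.lower())


def resolve(name: str, sources: List[Tuple[str, Source]]) -> Optional[Tuple[str, str]]:
    """Return (source label, address) from the first source that answers."""
    for label, lookup in sources:
        address = lookup(name)
        if address is not None:
            return label, address  # later sources are never consulted
    return None


# Made-up sample data (assumption): both tables know the same node,
# but with different addresses.
hosts_file = make_table_source({"HPCDEV-CN001": "192.168.0.11"})
dns = make_table_source({"HPCDEV-CN001": "10.0.0.11", "fileserver": "10.0.0.50"})

# Order matters: the hosts file is checked before DNS.
chain = [("hosts", hosts_file), ("dns", dns)]
```

With this chain, `resolve("HPCDEV-CN001", chain)` is answered by the hosts table and the DNS table is never asked, while a name that only DNS knows falls through to the second source.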
Now let's look at this with our high performance and management hats on. We want to get an answer to our name resolution queries as quickly as possible, while ensuring consistency across all nodes in the cluster. We also need to resolve a host name to the appropriate network address for the cluster network we're after. Running through the resolution order above, we can ignore step 1 for obvious reasons. Step 2 may be interesting performance-wise, but as the cache may not contain records for all nodes, it's an inconsistent choice. Step 3: hmmm, the good old local hosts file. Sounds kinda antiquated and simplistic, don't you think? But it's at number 3 on the list, crucially checked before DNS resolution. And maybe it can be managed by one of the HPC services running on all nodes? Oh, this is starting to sound decent. Just to be sure, though, let's continue. Step 4 is our old friend DNS, which sounds like the way to go. But each DNS lookup is a network round trip, which can take a relatively long time compared with reading a local file. It seems good for management, but is it as good as a cluster-managed solution? Once we get to 5, 6 and 7, things are drifting off into desperation, so let's not say too much about those guys.

Well, what do you know: Hosts seems to be a very good choice here, and lo and behold, that's how it works in practice! Check out this example hosts file taken from a handy dev cluster...

# Copyright (c) 1993-1999 Microsoft Corp.
# This host file is maintained by the Compute Cluster Configuration
# Management Service. Changes made to the file that match the netbios names
# for existing nodes in the cluster will be removed and replaced by entries
# calculated by the management service.
# Modify the following line to set the property to 'false' to disable this
# behavior. This will prevent the management service from making any
# further modifications to the file
# ManageFile = true

127.0.0.1                localhost
192.168.5.23             HPCDEV-HN02                    #HPC
192.168.100.11           HPCDEV-HN02                    #HPC
192.168.0.11             HPCDEV-CN001                   #HPC
192.168.0.10             HPCDEV-CN002                   #HPC
192.168.0.134            HPCDEV-CN003                   #HPC
192.168.0.1              HPCDEV-HN01                    #HPC
192.168.0.2              HPCDEV-HN02                    #HPC
192.168.0.3              HPCDEV-VHN01                   #HPC
192.168.1.1              HPCDEV-HN01                    #HPC
192.168.1.2              HPCDEV-HN02                    #HPC
192.168.1.3              HPCDEV-VHN01                   #HPC
192.168.5.22             Enterprise.HPCDEV-HN01         #HPC
192.168.5.23             Enterprise.HPCDEV-HN02         #HPC
192.168.5.28             Enterprise.HPCDEV-VHN01        #HPC
192.168.0.11             Private.HPCDEV-CN001           #HPC
192.168.0.10             Private.HPCDEV-CN002           #HPC
192.168.0.134            Private.HPCDEV-CN003           #HPC
192.168.0.1              Private.HPCDEV-HN01            #HPC
192.168.0.2              Private.HPCDEV-HN02            #HPC
192.168.0.3              Private.HPCDEV-VHN01           #HPC
192.168.1.11             Application.HPCDEV-CN001       #HPC
192.168.1.10             Application.HPCDEV-CN002       #HPC
192.168.1.134            Application.HPCDEV-CN003       #HPC
192.168.1.1              Application.HPCDEV-HN01        #HPC
192.168.1.2              Application.HPCDEV-HN02        #HPC
192.168.1.3              Application.HPCDEV-VHN01       #HPC






This file reflects the current addressing of an HA head node cluster configuration with three compute nodes. The network topology is topology 3: compute nodes isolated on private and application networks. It's interesting to note that, for networks other than private and application (in this case the failover cluster heartbeat network and enterprise), only the active head node's addresses (HPCDEV-HN02) appear in the standard-format listing.
Check out those funky Enterprise., Private. and Application. entries. These prefixes allow the cluster service to be very specific in its address resolution requests, ensuring it always gets back the address on the appropriate network.
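
As a rough illustration of how those prefixed entries pin a lookup to one network, the Python sketch below parses a few lines copied from the file above and looks up a node by network name. The `parse_hosts` and `address_on` helpers are hypothetical names invented for this example; this is not how the HPC services themselves do it, just the same idea in miniature.

```python
from typing import Dict, List, Optional

# A few entries copied from the example hosts file above.
SAMPLE = """\
# ManageFile = true
127.0.0.1                localhost
192.168.5.23             HPCDEV-HN02                    #HPC
192.168.0.11             HPCDEV-CN001                   #HPC
192.168.5.22             Enterprise.HPCDEV-HN01         #HPC
192.168.0.11             Private.HPCDEV-CN001           #HPC
192.168.1.11             Application.HPCDEV-CN001       #HPC
"""


def parse_hosts(text: str) -> Dict[str, List[str]]:
    """Map each lower-cased host name to its addresses, in file order."""
    table: Dict[str, List[str]] = {}
    for line in text.splitlines():
        entry = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(entry) < 2:
            continue  # blank or comment-only line
        address, names = entry[0], entry[1:]
        for name in names:
            table.setdefault(name.lower(), []).append(address)
    return table


def address_on(table: Dict[str, List[str]], network: str, node: str) -> Optional[str]:
    """Look up a node's address on a named network, e.g. 'Private'."""
    matches = table.get(f"{network}.{node}".lower(), [])
    return matches[0] if matches else None
```

Calling `address_on(parse_hosts(SAMPLE), "Application", "HPCDEV-CN001")` picks out the application-network address, whereas the plain `HPCDEV-CN001` entry only ever gives the private one.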

But what if you want to host machines that are not managed by HPC Server on your private network, and therefore need the cluster nodes to register their private network addresses in DNS (they do not by default)? Well, you can use the awesomeness that is PowerShell...
Set-HpcNetwork -PrivateDnsRegistrationType WithConnectionDnsSuffix

One thing to beware of: check out the warning at the top of the hosts file. If you set ManageFile = false and manually alter entries previously managed by HPC Server, things may get a little broken.
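
If you want to check which way the flag is set before touching anything, a small Python sketch along these lines would do. The `manage_file_setting` helper is a hypothetical name, and the pattern it matches is an assumption based purely on the header shown in the example file; exactly how the management service parses that line isn't documented here.

```python
import re
from typing import Optional

# Assumed format, taken from the example header: "# ManageFile = true"
_FLAG = re.compile(r"^\s*#\s*ManageFile\s*=\s*(true|false)\s*$", re.IGNORECASE)


def manage_file_setting(hosts_text: str) -> Optional[bool]:
    """Return the ManageFile value from the header comments, or None if absent."""
    for line in hosts_text.splitlines():
        match = _FLAG.match(line)
        if match:
            return match.group(1).lower() == "true"
    return None
```

On a node you would pass in the contents of C:\Windows\System32\Drivers\etc\hosts; a result of `None` means no such line was found.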
