RHEL cluster question

Flaherty, Patrick pflaherty at wsi.com
Tue Jun 14 14:03:54 EDT 2011


> Greetings, fellow Linux lovers;
> 
> Ran into a little situation today where we need to cycle power/reboot a
> bunch of nodes that are down and out, by telnet to the relevant
> terminal server ports and the advanced management module.  This
> involves multiple consoles, windows, command line, GUI, the works, as
> follows:
> 
> 
> 
> Subject:  RHEL cluster, 4.0 through 5.3.
> 
> Issue:  How to find IP addresses of terminal server ports which service
> individual nodes which are down and out.  (need to telnet to them for
> troubleshooting/maintenance/rebooting)
> 
> And:  IP address and/or hostname of advanced management module which
> runs on the clusters.
> Some clusters have a "magic decoder ring" file that gives this
> information;  most don't.
> 
> Any thoughts?  Workaround so far has been via eyeballing racks of
> blades and doing various arithmetic problems in our heads.

It sounds like you work on a bunch of clusters that are configured like
crap by academics. You can keep most of that stuff organized with
proper DNS naming or static DHCP. It takes a bit to set up, but you map
fixed addresses to MAC addresses in DHCP, and CNAME
hostname.console.blah.com to the management module. Since you probably
can't tell them to rip out all of their DNS/DHCP infrastructure, maybe
something like netdisco.org would help. It uses CDP + SNMP to grab ARP
tables and map out your network. You should be able to tell which blades
are hooked to which switch ports, and from there figure out the
management module.
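
As a rough illustration of the static-DHCP plus CNAME idea, here is a
minimal Python sketch that turns a small node inventory into ISC
dhcpd-style host stanzas and BIND-style CNAME records. Everything in the
inventory (node names, MAC addresses, IPs, and the amm1.blah.com
management module name) is made up for the example, and the output
formats assume ISC dhcpd and a BIND zone file; adjust for whatever your
site actually runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    hostname: str      # short node name, e.g. "node01" (hypothetical)
    mac: str           # MAC address of the node's NIC (hypothetical)
    ip: str            # fixed IP to hand out via DHCP (hypothetical)
    mgmt_module: str   # FQDN of the management module serving this node


def dhcp_host_stanza(node: Node) -> str:
    """Return an ISC dhcpd-style host declaration pinning an IP to a MAC."""
    return (
        f"host {node.hostname} {{\n"
        f"  hardware ethernet {node.mac};\n"
        f"  fixed-address {node.ip};\n"
        f"}}"
    )


def console_cname(node: Node, domain: str = "blah.com") -> str:
    """Return a zone-file line aliasing <host>.console.<domain> to the
    management module that serves the node."""
    return f"{node.hostname}.console.{domain}. IN CNAME {node.mgmt_module}."


# Hypothetical inventory; in practice this would come from whatever
# record of the racks you can assemble.
INVENTORY = [
    Node("node01", "00:11:22:33:44:01", "10.0.0.11", "amm1.blah.com"),
    Node("node02", "00:11:22:33:44:02", "10.0.0.12", "amm1.blah.com"),
]

if __name__ == "__main__":
    for node in INVENTORY:
        print(dhcp_host_stanza(node))
    for node in INVENTORY:
        print(console_cname(node))
```

With records like these in place, looking up node01.console.blah.com
tells you which management module to connect to, so nobody has to count
blades in the rack.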

Patrick




More information about the gnhlug-discuss mailing list