Does sensors report the memory and the voltages/temperature correctly?
No alarms shown?
Are there any error or warning messages on bootup? Any errors
or warnings in /var/log/messages?
In /var/log/messages there is a suggestion that etherwake might need to
be updated:
  kernel: ether-wake uses obsolete (PF_INET,SOCK_PACKET)
This can be ignored for now.
There is no newer version of ether-wake that I know of.
Also, I found:
  modprobe: modprobe: Can't locate module net-pf-10
I thought this might be network packet filtering, but net-pf-10 is a request
for protocol family 10, which is IPv6 (AF_INET6). Adding the corresponding
module (IPv6 support) to the kernel should fix it; alternatively, the line
"alias net-pf-10 off" in /etc/modules.conf (or /etc/conf.modules on older
setups) stops modprobe from looking for it. There was no clear indication of
what it was in the kernel config.
The error messages about the missing random
number generator are OK.
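A minimal sketch for the two log questions above, assuming a syslog-style
/var/log/messages. The function name scan_log and the keyword list are
illustrative only and would need tuning for a real node.

```shell
# scan_log [LOGFILE]: print numbered lines that look like errors or warnings.
# Defaults to /var/log/messages (reading it usually requires root).
# The keyword list is an assumption, not an exhaustive filter.
scan_log() {
    log="${1:-/var/log/messages}"
    # -i: ignore case, -n: prefix each hit with its line number,
    # -E: extended regex so that | means alternation
    grep -inE "error|warn|fail|obsolete|can't locate" "$log"
}
```

Running scan_log right after bootup and comparing against the known-harmless
messages noted above (ether-wake, random number generator) leaves only the
lines that still need attention.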
Does the /data directory have reduced # of inodes and reserved space for
Does X run properly using startx as root?
Do gcc and ddd work correctly?
Does networking run properly using dhcp? Is it running full duplex
Does the system shut down and power off with shutdown -h now?
Does the system reboot with shutdown -r now?
After power-down, if AC power is cycled, does system remain off?
If a running system is unplugged, then plugged in, does it remain off?
Can that system now be powered on from another machine using etherwake?
Does it fsck and boot up correctly?
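The etherwake power-on test can be sketched as below. This assumes a
hypothetical plain-text file listing "nodename MAC" pairs, one per line; the
file format and the names mac_for and wake_node are made up for illustration.
Only the ether-wake command itself, called with a MAC address, comes from the
checklist.

```shell
# Command used to send the wake-up packet; override WAKE for a dry run.
WAKE=${WAKE:-ether-wake}

# mac_for NODE FILE: print the MAC address listed for NODE in FILE.
# FILE is a hypothetical table with lines like: s012 00:11:22:33:44:55
mac_for() {
    awk -v n="$1" '$1 == n { print $2; exit }' "$2"
}

# wake_node NODE FILE: look up NODE's MAC address and wake it.
wake_node() {
    mac=$(mac_for "$1" "$2")
    if [ -z "$mac" ]; then
        echo "no MAC address known for $1" >&2
        return 1
    fi
    $WAKE "$mac"
}
```

For a dry run, setting WAKE=echo prints the MAC address that would be used
instead of sending a packet.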
When plugged into the UPS with a serial line, does the UPS properly
shut down the machine when its battery gets low?
If power to the UPS is then restored, does the node remain turned
Is NTP running correctly?
Is the hardware clock synchronized with the software clock after the software
clock has had its time synched with an ntp server?
Is the hardware clock synchronized with the software clock by a crontab
Is there a running script that calls vga_screenoff 10 minutes after keyboard
input stops, and then calls vga_screenon when input starts again?
Does running drag 1.2 show > 290 Mflops provided the screen is blanked
by the above?
Does hdparm -tT /dev/hda report good speeds (~120 and ~28 MB/s)?
Does hdparm /dev/hdd report tuned parameters for the CDROM?
Is automount on the slave configured so that cd /mnt/floppy and cd /mnt/cdrom
work correctly if a floppy or cd is present?
PASS. Does this work with both ext2 and msdos floppies?
Should we install fftw, lam, and mpich on each machine (just to have the
libraries local on each machine to cut down network use)?
PASS - in /ldcg
Does man -k work?
Is there a decent /root/.netscape file with some
Does /proc/fans/ exist and contain sensible information?
Are big files properly supported? Does repeated concatenation, e.g.
cp /etc/termcap a; cat a a a a a > b; cat b b b b b > c; etc.,
allow creation of files > 2 GB?
PASS using bash2
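The big-file test above can be scripted roughly as follows. This is a sketch,
not the procedure actually used for the PASS: the function names, the big.N
file naming, and the choice to delete each intermediate file are assumptions.
It needs a scratch directory with well over 2 GB free.

```shell
# rounds_needed SIZE_BYTES: how many 5x concatenation rounds are needed
# before a file of SIZE_BYTES grows past 2 GiB.
rounds_needed() {
    size=$1
    [ "$size" -gt 0 ] || return 1   # a zero-length file never grows
    limit=2147483648                # 2 GiB in bytes
    rounds=0
    while [ "$size" -le "$limit" ]; do
        size=$((size * 5))
        rounds=$((rounds + 1))
    done
    echo "$rounds"
}

# grow_file SRC ROUNDS WORKDIR: copy SRC into WORKDIR, then repeatedly
# concatenate five copies of the previous file, as in "cat a a a a a > b".
# Prints the path of the final file.
grow_file() {
    src=$1; rounds=$2; work=$3
    cp "$src" "$work/big.0" || return 1
    i=0
    while [ "$i" -lt "$rounds" ]; do
        next=$((i + 1))
        prev="$work/big.$i"
        cat "$prev" "$prev" "$prev" "$prev" "$prev" > "$work/big.$next" || return 1
        rm -f "$prev"               # keep only the newest file to save space
        i=$next
    done
    echo "$work/big.$i"
}
```

A run would look like: grow_file /etc/termcap "$(rounds_needed "$(wc -c < /etc/termcap)")" /data/tmp
followed by ls -l on the printed path to confirm the size exceeds 2 GB
(/data/tmp is a placeholder for any scratch directory).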
Here is a set of tests that the cloned node should pass when plugged into
the master via a switch:
Correctly gets its identity (eg, n012) using dhcp from the master on bootup.
Does automount work?
On the master, does cd /net/s012 work properly (with whatever the correct
slave node # is)?
PASS Does cd /net/m001 on the slave properly mount the master?
PASS Does cp file /net/s012/ work correctly?
When connected to the master via a switch, does an mpi job run properly?
Can root on the master log into a node with just rsh s012 (no password
Does rsh work correctly, e.g., does rsh s012 uptime correctly return the uptime
on s012 from m001?
Do the above automount and rsh tests work correctly between two slave nodes?
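The rsh checks above can be looped over several nodes. A minimal sketch,
assuming the node names are passed as arguments; the function name check_rsh
and the PASS/FAIL output format are illustrative. Since rsh generally does not
pass back the remote command's exit status, the check also requires non-empty
output from uptime.

```shell
# Remote shell command; override RSH to test with something else.
RSH=${RSH:-rsh}

# check_rsh NODE...: run uptime on each node and print PASS or FAIL per node.
# Returns non-zero if any node failed.
check_rsh() {
    status=0
    for node in "$@"; do
        # stdin is redirected so the remote shell cannot swallow our input
        if out=$($RSH "$node" uptime </dev/null 2>/dev/null) && [ -n "$out" ]; then
            echo "PASS $node"
        else
            echo "FAIL $node"
            status=1
        fi
    done
    return $status
}
```

Running check_rsh s012 on m001 covers the master-to-slave case; running the
same function on one slave with another slave's name covers the slave-to-slave
case.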
Are warning messages from the logging daemon correctly logged to the master?
We are not taking this approach anymore.
Rather, a script will be written to pick up logs on demand (and/or cron'd).