
Slave Tests for 9 version 1.0

This is a list of tests to perform on a slave node after kickstarting it. Please mark passed tests with *PASS* and failed tests with *FAIL*, and provide additional information whenever possible. Also, please alter the above _ to the current kickstart/clone version you are testing.


  1. Does sensors report the memory, voltages and temperature correctly? No alarms shown? (Run sensors to find out.)
    *FAIL*
    -bash: sensors: command not found

  2. Are there any error or warning messages...
    1. On boot up (run dmesg): *PASS*
    2. During and after latest boot in /var/log/messages: *FAIL*
      1. ifup: ./ifup: line 268: [: : integer expression expected
      2. exportfs[2488]: No 'sync' or 'async' option specified for export "129.89.200.0/255.255.254.0:/". Assuming default behaviour ('sync'). NOTE: this default has changed from previous versions
      3. Roughly ten further messages similar (but not identical) to the previous one.
      4. xfs: ignoring font path element /usr/X11R6/lib/X11/fonts/cyrillic (unreadable)
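
A first pass over the logs in item 2 can be scripted. The following is a minimal sketch, not a replacement for reading the log: the function name scan_log and the default keyword list are assumptions, and messages such as the ifup one recorded above contain none of the default keywords, so a second pattern can be passed in.

```shell
# scan_log FILE [PATTERN]: print lines of FILE matching PATTERN (an extended
# regex, case-insensitive). The default pattern is an assumption, not a
# complete list of what counts as an error or warning.
scan_log() {
    grep -iE "${2:-error|warn|fail}" "$1"
}

# Typical use on a node (reading /var/log/messages usually needs root):
#   scan_log /var/log/messages
#   scan_log /var/log/messages 'expected|unreadable'
```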

  3. Does the /data directory have reduced # of inodes and reserved space for root? (Checking that the total space for /data is approximately 75360860 with df is reliable enough.)
    *PASS*
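
The size comparison in item 3 could be automated along these lines. This is a sketch under assumptions: check_total_kb is a hypothetical helper, "approximately" is taken to mean within a caller-supplied percentage, and the input is expected in the `df -Pk` layout (one header line, then the total in 1K blocks as the second field).

```shell
# check_total_kb EXPECTED TOLERANCE_PERCENT
# Reads `df -Pk` output on stdin and succeeds if the 1K-block total on the
# first data line is within TOLERANCE_PERCENT of EXPECTED.
check_total_kb() {
    awk -v want="$1" -v tol="$2" '
        NR == 2 {
            diff = $2 - want
            if (diff < 0) diff = -diff
            ok = (diff * 100 <= want * tol)
        }
        END { exit (ok ? 0 : 1) }
    '
}

# On a node (the 1 percent tolerance is an assumption):
#   df -Pk /data | check_total_kb 75360860 1 && echo PASS || echo FAIL
```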

  4. Do /boot, /lib/modules not contain vestiges of old kernels, etc?
    *PASS*
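
For item 4, one possible way to spot leftover module trees is to list every directory under /lib/modules that does not match the running kernel. This is only a sketch: stale_module_dirs is a hypothetical name, and it does not look at /boot.

```shell
# stale_module_dirs DIR RELEASE: print the names of subdirectories of DIR
# other than RELEASE, one per line.
stale_module_dirs() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue      # an unmatched glob stays literal; skip it
        name=$(basename "$d")
        [ "$name" = "$2" ] || printf '%s\n' "$name"
    done
}

# On a node; no output suggests no old module trees remain:
#   stale_module_dirs /lib/modules "$(uname -r)"
```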

  5. Does X run properly using startx as root? This should work on a generic monitor such as a Samsung SyncMaster 770.
    *FAIL*
    Inconsistent behaviour. Once X started with a blank screen, once it started with semi-faulty video and twm, and many times it failed to start ("Fatal server error: no screens found").

  6. Do gcc and ddd work correctly?
    *FAIL*
    -bash: ddd: command not found

  7. Does networking run properly? Is it running full duplex 100baseT (you can use mii-tool to test)?
    *PASS*
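
The mii-tool check in item 7 can be reduced to a pass/fail filter. A sketch, assuming mii-tool reports a negotiated full-duplex 100 Mbit link with a line such as "eth0: negotiated 100baseTx-FD, link ok"; that output format, the interface name eth0, and the function name link_is_100_full are assumptions to verify against the installed version.

```shell
# link_is_100_full: read mii-tool output on stdin; succeed if a line reports
# 100baseTx full duplex with the link up.
link_is_100_full() {
    grep -Eq '100baseTx-FD.*link ok'
}

# On a node (mii-tool normally needs root):
#   mii-tool eth0 | link_is_100_full && echo PASS || echo FAIL
```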

  8. Does it correctly get its identity (e.g., s012) using DHCP from the master on bootup?
    *PASS*

  9. Does the system shut down and power off with shutdown -h now?
    *PASS*

  10. After power-down, if AC power is cycled, does system remain off?
    *PASS*

  11. Does the system reboot with shutdown -r now?
    *PASS*

  12. If a running system is unplugged, then plugged in, does it remain off?
    *PASS*

  13. Can that system now be powered on from another machine using etherwake? Does it fsck and boot up correctly?
    *FAIL*
    The node does not power on in response to etherwake.

  14. Does the machine act properly as a UPS master and UPS slave? If a UPS master, does it properly issue commands to shut down other machines under its control when the battery is low on the UPS? If a UPS slave, does it understand commands given to it by a UPS master? If power is restored to the UPS, does the slave remain turned off?
    *FAIL*
    nut not installed.

  15. Are there no files with dates in the future? (One way to check this is to do the following:
    cd / ; touch temp.data ; find . -xdev -cnewer temp.data -print ; rm -f temp.data
    If any files are listed, they have dates set later than the moment you executed the above command.)
    *PASS*
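
The check in item 15 can also be wrapped in a function that creates its reference file outside the tree being scanned and cleans up afterwards. A sketch under assumptions: future_files is a hypothetical name, and it compares modification times with find -newer rather than the inode change times tested by -cnewer, so results may differ slightly from the command above.

```shell
# future_files DIR: print files under DIR (same filesystem only) whose
# modification time is later than the moment the function was called.
future_files() {
    ref=$(mktemp) || return 1        # reference file stamped with "now"
    find "$1" -xdev -newer "$ref" -print
    rm -f "$ref"
}

# On a node, as root:
#   future_files /
```

Placing -xdev before the tests keeps GNU find from warning about option order.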

  16. Is NTP running correctly? Is the hardware clock synchronized with the software clock after the software clock has had its time synched with an NTP server? (Look in /var/log/messages for messages from xntpd. If NTP is working correctly, the /etc/ntp/drift file will contain a non-zero entry. Set the hardware clock in the BIOS to something incorrect (like last week). Boot up. Check the BIOS clock again. It should now be set correctly to GMT. This might not work if you are off the UWM subnet; it depends on how tightly the NTP servers are secured.)
    *FAIL*
    An incorrect time set in the BIOS does not get corrected.
    Drift file is zero.
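
The drift-file part of item 16 is easy to script. A minimal sketch, assuming the drift value is the first field on the first line of the file; drift_nonzero is a hypothetical helper and says nothing about whether the hardware clock was updated.

```shell
# drift_nonzero FILE: succeed if FILE is readable and its first field is a
# non-zero number.
drift_nonzero() {
    [ -r "$1" ] || return 1
    awk 'NR == 1 { found = ($1 + 0 != 0) } END { exit (found ? 0 : 1) }' "$1"
}

# On a node:
#   drift_nonzero /etc/ntp/drift && echo PASS || echo FAIL
```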

  17. Is vgablank running correctly? (It should call vga_screenoff ten minutes after keyboard input stops, and call vga_screenon when input resumes.)
    *FAIL*

  18. Does running drag 1.2 show > 290 Mflops (at n=220) provided the screen is blanked by vgablank?
    *FAIL*
    no vgablank

  19. Does hdparm -tT /dev/hda report good speeds (~120 and ~28 MB/s)?
    *PASS*

  20. Does hdparm -tT /dev/hdc report good speeds (~120 and ~28 MB/s)?
    *PASS*

  21. Does hdparm -tT /dev/hdd show good results?
    *PASS* (Assuming 139 and 3.6 MB/s are good)
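
The hdparm checks in items 19 to 21 could be compared against thresholds automatically. A sketch under assumptions: hdparm_ok is a hypothetical helper, the two "Timing" lines are assumed to end in "= <rate> MB/sec" (the exact wording varies between hdparm versions), and the minimum rates are chosen by the caller since the checklist only gives approximate targets.

```shell
# hdparm_ok CACHED_MIN DISK_MIN
# Reads `hdparm -tT` output on stdin; succeeds if the cached-read rate and the
# buffered-disk-read rate (MB/sec) both meet the given minimums.
hdparm_ok() {
    awk -F'=' -v cmin="$1" -v dmin="$2" '
        /Timing .*cache/       { cached = $NF + 0 }   # number after the "="
        /Timing buffered disk/ { disk   = $NF + 0 }
        END { exit ((cached >= cmin && disk >= dmin) ? 0 : 1) }
    '
}

# On a node, as root (the thresholds 100 and 25 are assumptions):
#   hdparm -tT /dev/hda | hdparm_ok 100 25 && echo PASS || echo FAIL
```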

  22. Is automount on the slave configured so that cd /mnt/floppy , cd /mnt/floppy_msdos and cd /mnt/cdrom work correctly if a floppy or CD is present? *FAIL*
    `cd /dev/cdrom` did not work, plus there isn't a floppy drive anymore!

  23. Are big files (2 GB and higher) properly supported with bash2? If the following commands do not dump core, they should be supported:
    bash2
    cp /etc/termcap a; cat a a a a a > b; cat b b b b b > c; cat c c c c c > d; cat d d d d d > e; cat e e e e e e > f

    *PASS* (default shell is bash2)

  24. Does automount work?
    1. On the master, does cd /net/s012 and cd /netdata/s012 work properly (with whatever the correct slave node # is)? *PASS*

    2. Does cd /net/m001 on the slave properly mount the master? *PASS*

    3. Does cp file /net/s012/ work correctly? *PASS*


  25. Can root on the master log into a node with just rsh s012 (no password needed)? *PASS*

  26. Does rsh work correctly, e.g. does rsh s012 uptime correctly return the uptime on s012 from m001? *PASS*

  27. Do the above automount and rsh tests work correctly between two slave nodes? *PASS*
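
The rsh checks in items 25 to 27 can be repeated over a range of nodes. A sketch, assuming node names are "s" followed by a three-digit, zero-padded number (inferred from the s012 example) and that passwordless rsh is already set up; node_name, check_uptime, and the RSH override variable are hypothetical.

```shell
# node_name N: print the host name for node number N (pass plain decimals
# such as 12; a leading zero would be read as octal by printf).
node_name() {
    printf 's%03d\n' "$1"
}

# check_uptime FIRST LAST: run uptime on each node in the range and print
# "<node> PASS" or "<node> FAIL". Set RSH to substitute another remote shell.
check_uptime() {
    i=$1
    while [ "$i" -le "$2" ]; do
        n=$(node_name "$i")
        if ${RSH:-rsh} "$n" uptime >/dev/null 2>&1; then
            echo "$n PASS"
        else
            echo "$n FAIL"
        fi
        i=$((i + 1))
    done
}

# From m001 or from a slave:
#   check_uptime 1 16
```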

  28. Does recloning over the network preserve the /data partition and remake the / and /boot partitions?
    *PASS*

  29. Does /proc/fans contain 'files' that correctly report the fan RPMs? *FAIL*

  30. Is sensorscheck operational and reporting errors if they occur? *FAIL*


$Id: slavetests-9system-1.0.html,v 1.9 2003/10/07 21:11:07 kflasch Exp $
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.