   UPS Testing

Power Failure Results

On Tuesday, May 4, a large portion of the campus lost power for about an hour. The details of what happened to the cluster are below.

  1. Power failed at around 15:27
  2. Nodes shut down
    Between 15:29:24 and 15:29:48 the nodes started their shutdown process.
    Between 15:29:55 and 15:30:02 upsd on the UPS masters "disconnected".
  3. UWMLSC Powered Down
    15:27:21 UWMLSC reported being on battery
    15:47:21 User requested FSD!
    15:47:37 System is being shutdown by UPS
    15:47:52 127.0.0.1 disconnected
    15:48:03 UWMLSC exited on signal 15
  4. Powerware UPS failed on switch at about 16:06:32
  5. The following machines were then shut down by hand:
    15:34:42 contra told to shutdown - 15:35:22 contra went down
    15:34:55 hydra told to shutdown - 15:35:37 hydra went down
    15:37:02 condor told to shutdown - 15:37:14 condor went down
    15:39:40 hades
    15:42:54 tigger
    15:43:42 watchtower
    15:45:26 nest
    15:46:26 kanga
    15:47:24 dataserver started shutdown (nuts) - 15:47:52 down
    15:51:56 gravity FSD -> 15:53:52 shutdown (nuts) - 15:56:12 gravity (powered back on for email) - 16:03:41 down again
    15:55:40 medusa rebooted - 16:03:51 medusa shutdown
  6. The following machines shut down hard after power was lost:
    15:43:11 storage2 last log message
    16:04:13 storage1 started to go down - 16:09:10 last log entry, still not down
  7. Power was restored at around 16:25
  8. 17:10:56 slave nodes started coming up - a few had problems (3 batteries and 1 cabling)
  9. 16:30:56 uwmlsc partly up - 16:39:44 uwmlsc partly up - 16:47:48 uwmlsc partly up - 16:54:07 uwmlsc up, yay!
    ** medusa set as primary DNS; "Starting NFS Services" hung but worked after medusa turned on.
    When other machines came up:
    17:07:09 hydra
    17:08:41 contra
    17:05:54 condor
    17:38:46 kanga
    17:30:45 hades
    16:26:56 nest
    17:11:42 watchtower
    17:38:49 tigger
    16:55:51 gravity
    16:51:16 medusa
    16:35:51 storage1 partly up - 16:35:51 partly up - 16:52:42 up
    16:35:23 storage2 started coming up - 16:51:36 storage2 finished coming up (after medusa and thus DNS came up)
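
The intervals implied by the timeline above can be worked out with a short script. This is only a sketch: the timestamps are copied from the UWMLSC entries in items 3 and 9, and the names EVENTS, elapsed and report are made up for illustration. It assumes all events fall on the same day.

```python
from datetime import datetime, timedelta

# Timestamps copied from the timeline (same day, 24-hour clock).
# The dictionary keys are hypothetical labels, not names from any log.
EVENTS = {
    "on_battery": "15:27:21",     # UWMLSC reported being on battery
    "fsd_requested": "15:47:21",  # User requested FSD
    "uwmlsc_exited": "15:48:03",  # UWMLSC exited on signal 15
    "uwmlsc_up": "16:54:07",      # uwmlsc fully up again
}


def elapsed(start: str, end: str) -> timedelta:
    """Return the time between two HH:MM:SS stamps on the same day."""
    fmt = "%H:%M:%S"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


def report(events: dict) -> dict:
    """Compute the intervals of interest for the UWMLSC machine."""
    return {
        "battery_to_fsd": elapsed(events["on_battery"], events["fsd_requested"]),
        "fsd_to_exit": elapsed(events["fsd_requested"], events["uwmlsc_exited"]),
        "battery_to_up": elapsed(events["on_battery"], events["uwmlsc_up"]),
    }


if __name__ == "__main__":
    for name, delta in report(EVENTS).items():
        print(f"{name}: {delta}")
```

With these numbers, UWMLSC ran on battery for 20 minutes before the forced shutdown was requested, took 42 seconds to exit after that, and was fully up again 1 hour 26 minutes 46 seconds after first going on battery.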
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.