This is a listing of the files that were changed in building the master node (front end) of the Beowulf system.
etc/passwd  This file was modified by adding our users and the other standard password information. Note that we ensure that this file is identical on the master and the other nodes. 
etc/group  This file was modified by adding our standard groups.  As with the password file, it is identical on the master and the other nodes. 
etc/hosts This file lists all the IP numbers assigned to the machines in our private network.  Because our subnet is a private one, there are NO nameservers anywhere that can translate symbolic hostnames like n030 into IP numbers.  To keep our systems simple, we choose not to run the nameserver daemon (named).  Notice that the entry for the master node n001 differs slightly from that of the other nodes because it has an additional name, which is the name by which it is known to the outside world.  There is also an entry for our networking switch (switch1), which has a web/telnet interface and can be accessed over the web.
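Because every machine needs the same table, it can be convenient to generate it rather than type it by hand.  The following is only a minimal sketch: the 192.168.1.x addresses, the node count, and the alias master.example.org are placeholders, not the values used on our cluster.

```python
def hosts_lines(node_count, subnet="192.168.1", master_alias=None):
    """Return /etc/hosts lines for nodes n001..nNNN on a private subnet.

    subnet and master_alias are hypothetical placeholders.
    """
    lines = ["127.0.0.1\tlocalhost"]
    for i in range(1, node_count + 1):
        names = [f"n{i:03d}"]
        # Only the master (n001) carries the extra, externally known name.
        if i == 1 and master_alias:
            names.append(master_alias)
        lines.append(f"{subnet}.{i}\t{' '.join(names)}")
    return lines


# Print a three-node table as a demonstration.
for line in hosts_lines(3, master_alias="master.example.org"):
    print(line)
```

Writing the result to one file and copying it to every node keeps the tables identical, which matters because no nameserver exists to paper over differences.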
etc/hosts.allow This file gives our "software development environment" machines (chandra .. weyl) access to all of the internet services on the master, such as FTP and rlogin, or anything else which is started by the inet daemon.   It ALSO gives access to these services to ANY machine on our private network, ie any machine which matches network address / netmask.  The bootpd line appears necessary so that the bootp server which runs on this machine (explained below) can be accessed by any other node.  Note: this entry may be wrong or unnecessary (we're not sure!).
etc/hosts.deny This file works in conjunction with etc/hosts.allow to define which machines have access to our internet services.  In this case, we are denying access to all machines except those listed in the hosts.allow table (ie, we are taking the conservative, xenophobic approach!).
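The way the two files combine can be sketched as follows.  This is an illustration of the decision order only (hosts.allow is consulted first, then hosts.deny), not the TCP wrapper code itself.  The addresses in DEV_MACHINES and PRIVATE_NET are hypothetical stand-ins for our development machines and private subnet.

```python
import ipaddress

# Hypothetical values; the real hosts.allow lists our own machines and subnet.
DEV_MACHINES = {"192.0.2.10", "192.0.2.11"}
PRIVATE_NET = ipaddress.ip_network("192.168.1.0/255.255.255.0")


def access_granted(client_ip):
    """Mimic the allow-then-deny order for a single client address."""
    addr = ipaddress.ip_address(client_ip)
    # 1. hosts.allow: a listed host, or any address inside network/netmask.
    if client_ip in DEV_MACHINES or addr in PRIVATE_NET:
        return True
    # 2. hosts.deny: everything that was not explicitly allowed is refused.
    return False


print(access_granted("192.168.1.30"))   # a node on the private network
print(access_granted("198.51.100.7"))   # an unlisted outside machine
```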
etc/host.conf This is the standard file produced by a RedHat installation.  It is different on the non-master nodes.  It tells the master to resolve its IP addresses first from the /etc/hosts table if possible, and then if not, through a nameserver. In particular, the master node has access to Domain Name Service through its gateway to the outside world, so it is allowed (in the bind option) to use this to attempt to resolve IP addresses that are not listed in the etc/hosts file. 
etc/hosts.equiv This file permits users from any of the listed machines, including all the nodes, to have instant access to the master through rlogin, rsh, and other commands without having to give a login password.  This is particularly important for running MPI and MPICH code, which uses rsh to start processes on different machines.
etc/resolv.conf This is the standard file produced by a RedHat installation.  It lists the two nameservers that are available to the master through its gateway connection to the outside world.  It also specifies the alternate domain name to search for a hostname with the "search" line.  In particular, if the DNS server fails to return an IP address for a given name, the domain will be appended to that name, and the DNS server will be asked to try again to resolve that new, longer name. Note that this file should be removed from all the other nodes, to prevent them from even THINKING about getting Domain Name Services to resolve unknown addresses (ie, those not listed in the etc/hosts file of these other nodes).  If they do attempt such resolution, they will hang for minutes at a time. 
etc/sysconfig/network This is the standard file produced by a RedHat installation.  It's what would be produced if there were only a single ethernet card and it were connected to the 129.89.57.* network.  In particular, it specifies that the eth0 device functions as a gateway device.
etc/sysconfig/network-scripts/ifcfg-eth0 This is a standard file produced by a normal linux installation, and is used by ifconfig to configure the eth0 100Base-T card network interface.  It assigns an IP address, a netmask, and a network to the interface, and specifies that the interface should be turned on at bootup.  It also tells the master where broadcast packets to the "outside world" should go.  It does not tell the card how to broadcast packets to the private network, since this ethernet card is not connected to that private network.  WARNING: in experimenting with different ethernet card configurations, DO NOT create files of the form ifcfg-eth0.SAVE or anything of the form ifcfg-*.  Such files will be read by the startup scripts and used to configure your interfaces!  Use names of the form SAVE.* instead.
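The warning about backup file names can be checked with a small sketch.  It assumes, as described above, that the startup scripts pick up every file whose name matches the shell pattern ifcfg-*; the function name and the sample file names are made up for illustration.

```python
import fnmatch


def configured_at_boot(filenames, pattern="ifcfg-*"):
    """Return the names a startup script globbing for `pattern` would pick up."""
    return sorted(n for n in filenames if fnmatch.fnmatchcase(n, pattern))


names = ["ifcfg-eth0", "ifcfg-eth0.SAVE", "SAVE.ifcfg-eth0", "ifcfg-lo"]
# ifcfg-eth0.SAVE still matches the pattern; SAVE.ifcfg-eth0 does not.
print(configured_at_boot(names))
```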
etc/sysconfig/network-scripts/ifcfg-eth1 This is a file that we added to "turn on" the other network card and assign an IP address to it.  This other network card is on the "private" subnet and does not act as a gateway.
etc/sysconfig/network-scripts/ifcfg-lo Standard loopback device file produced by a normal RedHat installation.
etc/fstab Local and NFS mounted disks and partitions. The local file systems include: 
  • /dos is a DOS disk partition (the only type that can be recognized by the Windows-NT firmware to run linload.exe) 
  • /home is a large disk for user home directories that is exported to all the other nodes 
  • /data is a large disk for data that is exported to the other nodes 
  • /nfsr  stands for NFS-root.  It stores a complete copy of the file systems for any of the "slave" nodes, and they boot from it as part of the automated cloning process. 
  • /usr/local  is used for any locally installed software.  It is also exported to all the nodes. 
etc/exports This is a list of the file systems that are exported from the master node to all of the other nodes.  In particular: 
  • /nfsr is the NFS-root file system: it is exported so that whenever we want to clone a node, the node can mount this as its root file system and then work comfortably on formatting and doing other things to its own disk. 
  • /home is the directory for users' files. 
  • /data is the directory for data files. 
  • /usr/local is the directory for locally installed software.  To simplify maintenance this lives on a single partition on the master. 
All of these file systems are exported with the no_root_squash option, which enables root on any node to have all the usual disk read/write privileges associated with being root. 
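As a rough illustration of what such an exports table looks like, the sketch below prints one line per exported file system in the usual "path client(options)" form.  The client specification 192.168.1.0/255.255.255.0 and the rw option are assumptions made for the example; only no_root_squash and the four paths come from the description above.

```python
# The four file systems listed above.
EXPORTED = ["/nfsr", "/home", "/data", "/usr/local"]


def exports_lines(clients="192.168.1.0/255.255.255.0",
                  options=("rw", "no_root_squash")):
    """Build /etc/exports lines; `clients` and "rw" are hypothetical."""
    opts = ",".join(options)
    return [f"{path}\t{clients}({opts})" for path in EXPORTED]


for line in exports_lines():
    print(line)
```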
etc/rc.d/rc3.d/S99rdate This script has been added.  It uses the rdate command to set the time on the master at bootup, obtaining the correct time from a trusted host on the net.
etc/rc.d/rc.local Lines have been added which (1) start the bootp server (needed to clone nodes) and which (2) start the time daemon to synchronize times on all the nodes with the time on the master node.
etc/bootptab This lists the hardware ethernet address of every machine.  That way, during the cloning process, a node can determine or discover its name.
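A bootptab entry ties a host name to a hardware address and an IP address using colon-separated tags (ht for hardware type, ha for hardware address, ip for the IP address).  The sketch below formats one such entry; the host name, MAC address, and IP address in the demonstration are invented, and our real table may carry additional tags.

```python
def bootptab_entry(name, mac, ip):
    """Format one bootptab line from a host name, MAC address and IP address.

    Colons separate tags in bootptab, so the MAC is written as 12 bare hex digits.
    """
    hw = mac.replace(":", "").upper()
    if len(hw) != 12 or any(c not in "0123456789ABCDEF" for c in hw):
        raise ValueError(f"not an ethernet address: {mac!r}")
    return f"{name}:ht=ethernet:ha={hw}:ip={ip}"


# Hypothetical node: the values are placeholders, not taken from our table.
print(bootptab_entry("n002", "00:a0:c9:12:34:56", "192.168.1.2"))
```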
etc/ethers This file is almost certainly not needed!
etc/inetd.conf I think the only modification here was to enable the bootp server, but this didn't work properly and we decided to run it in the background from an init script.
etc/pam.d/login Modified so that root can login from non-console locations.
etc/pam.d/rlogin Ditto for rlogin
etc/pam.d/rexec  Ditto for rexec
nfsr/boot/vmlinux.gz We've created a directory called /nfsr which means "Network File System -- Root" on the master.  It's a copy of the entire file system of a single node, and is used for cloning, as the root file system for network booting.
nfsr/boot/ptable  This is a copy of the partition table of the disk that we intend to clone.  It's created by putting the first 512 bytes of the raw disk device into a file, with the command 
    dd ibs=512 count=1 if=/dev/sda of=ptable
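The 512-byte sector saved this way has a fixed layout on PC-style (MBR) disks: four 16-byte partition entries start at byte 446, and the last two bytes are the signature 0x55 0xAA.  As a sanity check on a saved copy, a sketch like the following can decode it.  The function name and the dictionary keys are our own choices, and the demonstration uses a synthetic sector rather than a real disk.

```python
import struct


def parse_ptable(sector):
    """Decode the four primary partition entries from a 512-byte MBR sector."""
    if len(sector) != 512:
        raise ValueError("expected exactly 512 bytes")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        raw = sector[446 + 16 * i: 446 + 16 * (i + 1)]
        # status byte, 3 CHS bytes, type byte, 3 CHS bytes, start LBA, sector count
        status, ptype, start, count = struct.unpack("<B3xB3xII", raw)
        if ptype != 0:  # type 0 marks an unused slot
            entries.append({"slot": i + 1, "bootable": status == 0x80,
                            "type": ptype, "start": start, "sectors": count})
    return entries


# Synthetic example: one bootable Linux (type 0x83) partition.
demo = bytearray(512)
demo[446:462] = struct.pack("<B3xB3xII", 0x80, 0x83, 63, 1000)
demo[510:512] = b"\x55\xaa"
print(parse_ptable(bytes(demo)))
```

To inspect a real copy, read the file produced by dd in binary mode and pass its contents to parse_ptable.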
nfsr/sbin/init.normal  This is a copy of /sbin/init, which is the normal init program that runs on bootup.
nfsr/sbin/init.cloning This is a special script used in cloning.   It contains the sequence of commands executed by the clone to copy its files from the /nfsr partition of the master.
nfsr/sbin/init A copy of one of the two previous files.
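Switching the NFS-root between normal booting and cloning therefore amounts to copying one of the two saved files over init.  A minimal sketch of that step follows; the function name select_init is hypothetical, and the directory is passed in explicitly so that nothing is touched unless the function is called.

```python
import shutil
from pathlib import Path


def select_init(sbin_dir, mode):
    """Copy init.normal or init.cloning over init inside sbin_dir."""
    if mode not in ("normal", "cloning"):
        raise ValueError("mode must be 'normal' or 'cloning'")
    sbin = Path(sbin_dir)
    target = sbin / "init"
    # copy2 also preserves permission bits, so init stays executable.
    shutil.copy2(sbin / f"init.{mode}", target)
    return target
```

For example, select_init("/nfsr/sbin", "cloning") would prepare the NFS-root for cloning, and select_init("/nfsr/sbin", "normal") would restore the ordinary init.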
nfsr/mnt/root Create this directory (mount point).  It will be used in cloning the nodes, as the mount point for the machine's hard disk.
root/.rhosts  This list of all the Beowulf nodes, placed in root's home directory, allows root to log in from any node onto the master without a password.
var/spool/cron/root Root's crontab.  It runs the S99rdate script described above every 12 hours to update the time on the master node, using a local machine as the trusted time source.
sbin/reread.c This is a simple program which uses an ioctl() call to force the disk controller to write the partition table at the start of the disk (rather than simply leaving it cached in memory).  This program is used in the cloning script sbin/init.cloning above.
sbin/reread Executable for previous program.
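The source of reread.c is not reproduced here.  For readers who want to experiment, the sketch below shows the general shape of such an ioctl() call, written in Python.  It assumes that the request involved is the Linux BLKRRPART request, which asks the kernel to re-read a disk's partition table; whether reread.c uses exactly this request is an assumption on our part.  Nothing runs until the function is called, and calling it requires root and a real block device.

```python
import fcntl
import os

# Linux request code _IO(0x12, 95), named BLKRRPART in <linux/fs.h>.
BLKRRPART = 0x125F


def reread_partition_table(device):
    """Ask the kernel to re-read the partition table of `device` (e.g. /dev/sda)."""
    fd = os.open(device, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKRRPART)
    finally:
        os.close(fd)
```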
Some comments: