LIGO Data Grid

Configuring and Deploying Condor

When the LSC DataGrid Server (or, more properly, the VDT Server on which the LSC DataGrid Server is based) is installed, Condor is set up to run only on that single machine.

Primarily this is because it would be very difficult for the installation software (Pacman) to detect the details of your cluster configuration, and for the cache authors to build caches to handle all possible variations.

Most likely the default installation and configuration is not what you want.

The instructions below deploy Condor onto your cluster using one particular method and configuration style, under some basic assumptions. Condor is very flexible, so you may choose to install, configure, and deploy it in other ways. If these instructions do not suit your needs, please see the Condor manual.

  1. If you have not already, source the setup.sh file:
    source /opt/ldg/setup.sh
    
  2. Run condor_configure to slightly modify your current installation:
    /opt/ldg/condor/condor_configure --local-dir=/opt/ldg/condor/home
    
  3. Create a condor_config.local file:
    touch /opt/ldg/condor/home/condor_config.local
    
  4. Edit /opt/ldg/condor/etc/condor_config:
    • Set CONDOR_HOST to the FQDN of the machine on which you installed the server package. If this machine has a separate network interface just for access to the cluster nodes, use the FQDN or equivalent IP address of that interface.
    • Set RELEASE_DIR to be /opt/ldg/condor or equivalent for your system.
    • Set LOCAL_DIR to be $(RELEASE_DIR)/home
    • Set LOCAL_CONFIG_FILE to be $(LOCAL_DIR)/condor_config.local
    • Set CONDOR_ADMIN to an appropriate email address
    • Set UID_DOMAIN to the subnet domain for your cluster. For example at UWM a typical node has FQDN medusa-slave001.medusa.phys.uwm.edu so the subnet domain is medusa.phys.uwm.edu.
    • Set FILESYSTEM_DOMAIN to be $(FULL_HOSTNAME) if your cluster does NOT have a shared filesystem for users, or set it to the subnet domain if it does have a shared filesystem for users.
    • Set USE_NFS to be True if your cluster has a shared filesystem for users.
  5. Edit /opt/ldg/condor/etc/examples/condor.boot and set MASTER=/opt/ldg/condor/sbin/condor_master
  6. Create a tar file that you can deploy onto each node of your cluster:
    tar -cf condor.tar /opt/ldg/condor
    
  7. Create a condor user and group on each node of your cluster. Neither a home directory nor a login shell is necessary. Defining the user and group through NIS is fine too.
  8. Deploy the tar file onto each node of your cluster, creating /opt/ldg/condor on each node.
  9. On each node copy /opt/ldg/condor/etc/examples/condor.boot to /etc/init.d/condor.
  10. On each node execute chkconfig --add condor.
  11. On each node create the symlink /etc/condor/condor_config -> /opt/ldg/condor/etc/condor_config
  12. Back on the machine on which you installed the server package, create the file /opt/ldg/condor/home/condor_config.local if necessary and edit it to add the following:
    COLLECTOR_NAME = FQDN
    DAEMON_LIST    = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD
    COLLECTOR      = $(SBIN)/condor_collector
    NEGOTIATOR     = $(SBIN)/condor_negotiator
    
    If this machine has a separate network interface for the cluster nodes, also add the line
    NETWORK_INTERFACE = your ip address
    
    where the right-hand side is the IP address of the separate network interface.
  13. Start condor on the machine on which you installed the server package:
    /etc/init.d/condor start
    
  14. Start condor on the nodes by doing the same on each node.
  15. On the server (the Condor Central Manager machine), run /opt/ldg/condor/bin/condor_status to see the status of your Condor pool.
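
The choices in step 4 depend mainly on whether users have a shared filesystem across the cluster. The following is a minimal sketch, not part of the LDG tooling, that prints the step 4 settings for review before you edit condor_config by hand. The function name render_condor_settings and its arguments are hypothetical; the setting names and values come from step 4. For a cluster without a shared filesystem the sketch prints no USE_NFS line, because step 4 only asks for it when one exists.

```shell
#!/bin/sh
# Sketch only: print the step 4 condor_config settings for one cluster.
# render_condor_settings is a hypothetical helper, not a Condor or LDG tool.
#
# usage: render_condor_settings CENTRAL_MANAGER SUBNET_DOMAIN SHARED_FS ADMIN_EMAIL
#   CENTRAL_MANAGER  FQDN (or cluster-side IP) of the server machine
#   SUBNET_DOMAIN    subnet domain of the cluster nodes
#   SHARED_FS        "yes" if users have a shared filesystem, anything else if not
#   ADMIN_EMAIL      address for CONDOR_ADMIN
render_condor_settings() {
    central_manager=$1
    subnet_domain=$2
    shared_fs=$3
    admin_email=$4

    printf '%s\n' "CONDOR_HOST = $central_manager"
    printf '%s\n' "RELEASE_DIR = /opt/ldg/condor"
    # Single quotes keep the $(...) macros literal so that Condor,
    # not the shell, expands them.
    printf '%s\n' 'LOCAL_DIR = $(RELEASE_DIR)/home'
    printf '%s\n' 'LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local'
    printf '%s\n' "CONDOR_ADMIN = $admin_email"
    printf '%s\n' "UID_DOMAIN = $subnet_domain"

    if [ "$shared_fs" = "yes" ]; then
        # Shared user filesystem: every node reports the same domain.
        printf '%s\n' "FILESYSTEM_DOMAIN = $subnet_domain"
        printf '%s\n' "USE_NFS = True"
    else
        # No shared filesystem: each node is its own filesystem domain.
        printf '%s\n' 'FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)'
    fi
}
```

For example, render_condor_settings head.medusa.phys.uwm.edu medusa.phys.uwm.edu yes admin@example.org prints the settings for a cluster with a shared filesystem. The host name head.medusa.phys.uwm.edu and the email address are made-up values for illustration.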

This completes a basic deployment and configuration of Condor. You are strongly encouraged to read the Condor Manual and learn how to configure Condor in the way that best suits your particular cluster.
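
On a cluster with many nodes, the per-node steps (8 to 11 and 14) are repetitive enough to script. The sketch below only prints the commands for each node so that you can review them first; it runs nothing itself. It rests on assumptions that are not part of the instructions above: root access to each node over ssh and scp, GNU tar (which drops the leading / when creating condor.tar, so extracting under / recreates /opt/ldg/condor), and /tmp as a staging directory. The function names node_commands and deployment_plan are hypothetical. Step 7 (the condor user and group) is not covered.

```shell
#!/bin/sh
# Sketch only: print, without executing, the commands for steps 8-11 and 14.
# Assumes root ssh/scp access to each node and GNU tar; adjust to your site.

# Print the commands for a single node.
node_commands() {
    node=$1
    cat <<EOF
scp condor.tar root@${node}:/tmp/condor.tar
ssh root@${node} tar -xf /tmp/condor.tar -C /
ssh root@${node} cp /opt/ldg/condor/etc/examples/condor.boot /etc/init.d/condor
ssh root@${node} chkconfig --add condor
ssh root@${node} mkdir -p /etc/condor
ssh root@${node} ln -s /opt/ldg/condor/etc/condor_config /etc/condor/condor_config
ssh root@${node} /etc/init.d/condor start
EOF
}

# Read node hostnames from stdin, one per line, and print the plan
# for each. Blank lines are skipped.
deployment_plan() {
    while read -r node; do
        if [ -n "$node" ]; then
            node_commands "$node"
        fi
    done
}
```

With a hypothetical file nodes.txt listing one hostname per line, deployment_plan < nodes.txt > deploy-plan.sh writes the plan to a file, which can then be inspected and run with sh deploy-plan.sh from the directory that holds condor.tar. Running from a file, rather than piping the plan straight into a shell, keeps ssh from consuming the remaining commands on standard input.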





Supported by the National Science Foundation. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
$Id: condordeploy.html,v 1.4 2007/11/06 03:41:04 patrick Exp $