Using Pegasus to Run LSC Code on the Grid

Requirements on the user's system

The user's system is the machine used to generate the abstract workflow, run Pegasus to generate the concrete DAG, and then run the concrete DAG.

The user's machine should have the following software installed and configured correctly:

  1. LSC approved operating system: currently Fedora Core 3.
  2. Installation of Condor version 6.7.8 or greater.
  3. Installation of LSC Data Grid Server version 3.0 or greater. The LSC Data Grid server should be correctly configured so that the following services are enabled:
    1. The Condor pool on the local machine should be up and running. In particular:
      • Condor should accept and run jobs in the scheduler universe
      • Condor should accept and run jobs in the globus universe
    2. Incoming and outgoing gsissh should work correctly.
    3. Incoming and outgoing gsiftp should work correctly.
    4. The Globus GRAM job manager should work correctly to accept jobs. In particular:
      • The globus fork job manager should be correctly configured and accept incoming jobs.
  4. The lscsoft repository should be installed so that lal and lalapps can be built.
  5. The ligo.sh scripts should be installed so that environment variables for the installed software are correctly configured when the user logs in.
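The requirements above can be sketched as a short pre-flight check. The command names below are the standard Condor and Globus client tools that should be on your PATH after the LDG setup scripts run; this only verifies they are installed, not that the services behind them are configured.

```shell
# Minimal pre-flight check for the requirements listed above.
# Only tests that each client tool is on the PATH.
check() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "MISSING: $1"
  fi
}

check condor_submit      # Condor job submission (item 2)
check condor_q
check gsissh             # GSI-enabled ssh (item 3.2)
check globus-url-copy    # gsiftp client (item 3.3)
check globus-job-run     # GRAM job submission (item 3.4)
```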

Stuart Anderson should be able to set up a machine at Caltech which is properly configured by following the configuration that was used to set kitalpha up as an FC3 machine; most of the necessary software is already installed in /ldcg.

Other users starting from a blank Fedora Core 3 machine should:

  1. Follow the LSC Data Grid Server Install Instructions
  2. Follow the instructions to Install lscsoft from the yum repository

The following LSC Data Grid machines are known to be correctly configured (as of 7/21/2005):

ldas-grid.ligo.caltech.edu

You will also need a valid grid certificate and private key installed in ${HOME}/.globus

If you do not have a certificate, follow the user instructions to get a digital certificate. If you already have a certificate and key on a different machine, you can copy them to the machine you will be running Pegasus from. Make sure they are in the directory ${HOME}/.globus with the correct permissions: 400 for userkey.pem and 644 for usercert.pem.
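The permission settings above can be applied as follows (paths and modes as stated in the text):

```shell
# Ensure the certificate and key in ${HOME}/.globus have the permissions
# described above: key readable only by the owner, certificate world-readable.
mkdir -p "${HOME}/.globus"
if [ -f "${HOME}/.globus/userkey.pem" ]; then
  chmod 400 "${HOME}/.globus/userkey.pem"
fi
if [ -f "${HOME}/.globus/usercert.pem" ]; then
  chmod 644 "${HOME}/.globus/usercert.pem"
fi
```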


Access to the LIGO data and LDR

To generate a DAX using the inspiral pipeline, you will need access to an LDRdataFindServer (i.e. the ability to run LSCdataFind). If you do not already have this, fill in the LSC Data Grid account request form to request an account on the LSC Data Grid, making sure that you specify that you need to use LSCdataFind.

To run Pegasus you will need access to the LIGO RLI servers that know where the data is and access to the grid ftp servers that have the data. You will need to contact Scott Koranda and Stuart Anderson (at least) to be added to the UWM and CIT servers. You will need to send them your DN and request that they do the following:

  1. Edit the file $LDR_LOCATION/globus/etc/globus-rls-server.conf
  2. Look for the section that starts
    # permission for Pegasus users to query LRC RLI
  3. Add a line of the form (replace my DN with the user's DN)
    acl /DC=org/DC=doegrids/OU=People/CN=Duncan Brown 792417: lrc_read rli_read stats
  4. Send a kill -HUP to the globus-rls-server process. There is no need to stop and restart the rest of LDR.
  5. Edit the file $LDR_LOCATION/globus/etc/grid-mapfile.gridftp and add the user's DN to this file, mapping it to the user who has access to the data (i.e. grid or datarobot). Note that when this goes into production, users should be mapped to a user who has read-only access to the data to prevent them from deleting it using uberftp.
  6. Edit the file $LDR_LOCATION/globus/etc/grid-mapfile.gram and add the user's DN, mapping it to a user who has access to the data.
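Steps 5 and 6 can be sketched with a small helper. The function name is hypothetical, but the quoted DN followed by a local account name is the standard grid-mapfile entry format:

```shell
# Hypothetical helper for steps 5-6: append a DN-to-account mapping to a
# grid-mapfile. The quoted-DN syntax is the standard grid-mapfile format.
add_gridmap_entry() {
  dn="$1"; account="$2"; mapfile="$3"
  printf '"%s" %s\n' "$dn" "$account" >> "$mapfile"
}

# Example using the DN from step 3 and 'datarobot' as the data account:
# add_gridmap_entry "/DC=org/DC=doegrids/OU=People/CN=Duncan Brown 792417" \
#     datarobot "$LDR_LOCATION/globus/etc/grid-mapfile.gridftp"
```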

Compile the inspiral code

Follow the LAL and LALApps install instructions to build and install the inspiral software.

Note that you should always use the --enable-condor option when configuring LALApps so that static, standard universe executables are built which can be easily run on the grid.


Download and install glue

The Grid LSC User Environment (Glue) is not yet included in the lscsoft repository so you will need to download it from CVS.

  1. Create an empty file called .nolscsoft-glue in your home directory by running
    touch ${HOME}/.nolscsoft-glue
    Log out and log back in. This will disable any system installed copies of Glue.
  2. Follow the Glue README file to install Glue.

Installation of VDS

The LSC Data Grid server is built on top of the VDT, which includes the VDS (the package that contains Pegasus); however, the version of VDS installed in LDG 3.0 and LDG 3.5 does not have the features we wish to test.

A recent version of vds-binary should be installed for running Pegasus. Nightly builds are available from the Pegasus cvs build web page. The correct version for FC3 is linux-i686-glibc235. The version vds-binary-1.3.10-linux-i686-glibc235-20050901.tar.gz is known to have all the necessary bug fixes to create concrete DAGs for the inspiral pipeline.

Install the VDS as follows:

  1. Create a directory to contain VDS. We assume this will be created under ${HOME} here:
    mkdir ${HOME}/vds
  2. Download the VDS binary tarball into this directory and uncompress it:
    cd ${HOME}/vds
    wget http://vds.isi.edu/cvs-nightly/vds-binary-1.3.10-linux-i686-glibc235-20050901.tar.gz
    tar -zxvf vds-binary-1.3.10-linux-i686-glibc235-20050901.tar.gz
  3. This will create a directory with a version number in it containing the VDS binaries. Create a symbolic link called vds-current that points to this directory:
    ln -s vds-1.3.10 vds-current
  4. Add the following lines to your .bashrc
    unset CLASSPATH
    VDS_HOME=${HOME}/vds/vds-current
    export VDS_HOME
    source ${VDS_HOME}/setup-user-env.sh
    export PATH=${VDS_HOME}/bin:${PATH}
    These should be added AFTER sourcing the LDG setup script.
  5. Log out and log back in to update your environment variables.
If you need to get a newer version of the VDS, simply download it into ${HOME}/vds and change the symbolic link to point to the new directory.
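The upgrade described above amounts to repointing the symlink. A sketch, as a hypothetical helper function (the version number in the usage comment is an example, not a real release):

```shell
# Hypothetical helper for the upgrade described above: repoint the
# vds-current symlink to a newly unpacked VDS directory.
switch_vds() {
  vdsdir="$1"    # e.g. ${HOME}/vds
  newver="$2"    # e.g. vds-1.3.11 (hypothetical newer version)
  cd "$vdsdir" || return 1
  rm -f vds-current
  ln -s "$newver" vds-current
}

# Usage after unpacking the new tarball into ${HOME}/vds:
#   switch_vds "${HOME}/vds" vds-1.3.11
```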

Generate an inspiral DAX

Once the inspiral code is installed, you can generate inspiral workflows. The following shows how to create a simple DAX that can be used to test the LSC Data Grid and/or the OSG.

  1. Make sure that you can talk to an LSC data find server by running:
    LSCdataFind --ping
    it should respond with
    LDRdataFindServer at ldas-cit.ligo.caltech.edu is alive
    where ldas-cit.ligo.caltech.edu is replaced by your local LSC data find server.
  2. Make a directory to work in, for example:
    mkdir ${HOME}/grid_inspiral
  3. Download the files into this directory.
  4. Uncompress the cache file directory (this contains the names of the calibration frames)
    tar -zxvf cache_files.tar.gz
  5. Make sure you have a valid grid proxy by running
    grid-proxy-init
  6. Run the script lalapps_inspiral_pipe to generate a DAX by executing it with the arguments
    lalapps_inspiral_pipe --datafind --template-bank --inspiral --triggered-bank 
    --triggered-inspiral --coincidence --config-file inspiral_pipe.ini --log-path . --dax
  7. This should create a file called inspiral_pipe.dax which can be given to Pegasus.
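As a rough sanity check on the output, the DAX is an XML file in which each workflow job appears as a <job> element; counting them gives the workflow size. The helper name below is hypothetical:

```shell
# Hypothetical helper: count the <job> elements in a generated DAX file
# as a rough sanity check on the workflow size.
dax_job_count() {
  grep -c '<job ' "$1"
}

# e.g.  dax_job_count inspiral_pipe.dax
```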

Use Pegasus to generate a concrete DAG from the DAX

The first step is to obtain a valid pool configuration file and transformation catalog for Pegasus.

If you want to run on the Open Science Grid

You can run the program vds-get-sites to obtain a site config and transformation catalog.

  1. Run
    vds-get-sites --grid osg-itb
    replace osg-itb with osg to use the OSG proper instead of the testbed.
  2. Copy the resulting files to the directory containing the DAX:
    cp ${VDS_HOME}/var/tc.data .
    cp ${VDS_HOME}/etc/sites.xml .

If you want to run on the LSC Data Grid

There is currently no automated way of generating a pool config and transformation catalog for the LSC Data Grid. You can generate your own by downloading the files

You will need to edit these files to get the correct paths to the required directories where you have permission to write files. Once you have done this, run the command

genpoolconfig --poolconfig sites.txt --output sites.xml
to generate the XML pool config file required by Pegasus.

For both the OSG and the LSC Data Grid

Tell Pegasus where to find the inspiral executable you have built by doing the following:

  1. Edit the file tc.data and add the locations of the inspiral executables to be staged onto the grid. At the bottom of the file, add the lines
    local ligo::lalapps_tmpltbank:1.0 GSIFTPPATH/bin/lalapps_tmpltbank STATIC_BINARY INTEL32::LINUX
    local ligo::lalapps_inspiral:1.0 GSIFTPPATH/bin/lalapps_inspiral STATIC_BINARY INTEL32::LINUX
    local ligo::lalapps_inca:1.0 GSIFTPPATH/bin/lalapps_inca STATIC_BINARY INTEL32::LINUX
    NOTE: You should replace the string GSIFTPPATH with the gsiftp URL of the inspiral binaries on your machine. The command
    echo gsiftp://`hostname -f`${LAL_PREFIX}
    should give you the correct string to replace GSIFTPPATH. For me this command returns
    gsiftp://ldas-grid.ligo.caltech.edu/archive/home/dbrown
    You can check these URLs are correct before running Pegasus by using uberftp to copy them to your home directory.
  2. If you are using the LSC data grid tc.data, add the locations of the transfer and dirmanager executables on your local pool to the tc.data file. NOTE: this is not needed if you created your tc.data with vds-get-sites. At the bottom of the file add the lines
    local transfer VDS_HOME/bin/transfer INSTALLED INTEL32::LINUX vds::bundle_stagein=1
    local dirmanager VDS_HOME/bin/dirmanager INSTALLED INTEL32::LINUX
    NOTE: you should replace the string VDS_HOME with the value of the environment variable VDS_HOME that you defined previously. For example, for me this is set to
    [dbrown@kitalpha.ligo pegasus]$ echo $VDS_HOME 
    /archive/home/dbrown/projects/grid/vds/vds
    and so the correct path to the transfer executable is
    /archive/home/dbrown/projects/grid/vds/vds/bin/transfer
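The placeholder substitution in steps 1 and 2 can be scripted. This sketch assumes GNU sed (for -i) and that LAL_PREFIX and VDS_HOME are already set by your login scripts, as described earlier:

```shell
# Replace the GSIFTPPATH and VDS_HOME placeholders in tc.data.
# GSIFTPPATH is constructed exactly as the echo command above shows.
GSIFTPPATH="gsiftp://$(hostname -f)${LAL_PREFIX}"
if [ -f tc.data ]; then
  sed -i \
    -e "s|GSIFTPPATH|${GSIFTPPATH}|g" \
    -e "s|VDS_HOME|${VDS_HOME}|g" \
    tc.data
fi
```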

Obtain a VDS properties file for Pegasus. Download the file

into the directory containing the DAX.

Since the calibration frames are not yet retrieved via LDR, you will need a PFN cache that tells Pegasus where it can find the calibration data. Download the file

to the directory containing the DAX. You will need gsiftp access to the machine ldas-grid.ligo.caltech.edu to obtain this data.

Now you should be able to run Pegasus to generate a concrete DAG.

  1. Make sure you have a valid grid proxy and you don't have any other proxies hanging around:
    unset X509_USER_PROXY
    grid-proxy-init
  2. Run gencdag to create the concrete DAG. This example creates a concrete dag that runs on the LSC data grid pools at Caltech, Penn State, Hanford, UWM and LSU and returns the results to the local pool:
    gencdag -Djava.net.preferIPv4Stack=true -Dvds.properties=./properties
    -vvvvv -a -r -p uwm,psu,cit,lho,supermike,helix -o local -d inspiral_pipe.dax 
    --dir all_clusters --cache calibration_pfn_cache.txt
    The option -p specifies a comma separated list of pools to use: for example -p UWMilwaukee,OSG_LIGO_PSU,BNL_ATLAS_1 will use three of the OSG production pools. The pool names can be obtained from GridCat for OSG or from the sites.xml file for the LSC data grid.
    The option --dir controls the name of the directory to which the concrete DAG is written. For example, you could specify --dir osg_test to create the DAG and submit files in the directory osg_test.

The final stage is to run the concrete DAG:

  1. Change into the working directory containing the concrete DAG. For the above example
    cd all_clusters
  2. Submit the concrete DAG by running
    condor_submit_dag inspiral-0.dag
  3. You can watch the progress of the concrete DAG with
    tail -f inspiral-0.dag.dagman.out
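Beyond tailing the log, condor_q -dag shows the DAG and its node jobs in the queue. A small helper (hypothetical, not part of Condor) can also count completed-node events recorded in the dagman.out log:

```shell
# Hypothetical helper: count job-termination events in the dagman.out log.
# DAGMan records one ULOG_JOB_TERMINATED event per finished node job.
dag_done_count() {
  grep -c 'ULOG_JOB_TERMINATED' "$1"
}

# Usage on the submit host:
#   condor_q -dag                              # show the DAG and node jobs
#   dag_done_count inspiral-0.dag.dagman.out   # finished node jobs so far
```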

Where to go for help

If everything is configured correctly, the DAG should run to completion. If you have problems, mail the griphynligo mailing list.

Supported by the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
$Id$