Using VDS to Convert DAX to DAG
These instructions assume you have already generated a DAX and that you are working on a machine on which the LSC DataGrid Server is installed. Note that you can install the LSC DataGrid Server on any workstation or desktop--it need not be the head node of a cluster.
The LSC DataGrid Server (rather than the client) is necessary because you need a GridFTP server and related services installed and running.
- Set up your environment
As with all the LSC Grid tools, you need to source ~/ldg-3.0/setup.sh before beginning. Then check that VDS_HOME is defined:
$ echo $VDS_HOME
~/ldg-3.0/vds
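The check above can be wrapped in a small guard so that later steps fail early when the environment has not been set up. This is only a minimal sketch; the function name check_vds_home is hypothetical and is not part of the LSC DataGrid tools.

```shell
# Hypothetical helper: confirm that VDS_HOME is set and points to a directory.
check_vds_home() {
    if [ -z "${VDS_HOME:-}" ]; then
        echo "VDS_HOME is not set; source ~/ldg-3.0/setup.sh first" >&2
        return 1
    fi
    if [ ! -d "$VDS_HOME" ]; then
        echo "VDS_HOME=$VDS_HOME is not a directory" >&2
        return 1
    fi
    echo "VDS_HOME=$VDS_HOME"
}
```

After sourcing ~/ldg-3.0/setup.sh, running check_vds_home should print the location of the VDS installation, or an error message if the variable is missing.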
- Download the LSC pool configuration file
VDS needs to understand what resources are available on which to run your workflow, and what the necessary properties or attributes are for each resource. VDS learns about the resources from a pool configuration file.
The pool configuration file is an XML file. Normally it would be auto-generated for you using the Globus MDS service. The LSC DataGrid, however, does not have MDS up and running yet.
So for now the pool configuration file is generated from a plain text file. Please download that file into your working directory.
- Edit the pool configuration file
The pool configuration file contains entries for various LSC resources, which you should not edit.
It also, however, contains an entry for the "local" pool which you must edit. Please see the comments in the file since they (hopefully) detail how to edit the "local" pool configuration.
The paths configured here (such as gridftp and workdir) can be affected by the configuration of the VDS system itself. See the later steps for details. You may want to return and edit the pool configuration file a second time.
- Convert the pool configuration file to XML
After editing the pool config file you need to convert it to XML so that the VDS tools can properly parse it. To convert the text file to XML run the following:
genpoolconfig -f pool.config.txt -o pool.config.xml
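If you expect to repeat the conversion after each edit, a small wrapper can catch the most common failures. The sketch below assumes only the -f and -o options shown above; the function name convert_pool_config is hypothetical.

```shell
# Hypothetical wrapper around the genpoolconfig call shown above.
# Usage: convert_pool_config pool.config.txt pool.config.xml
convert_pool_config() {
    txt=$1
    xml=$2
    if [ ! -r "$txt" ]; then
        echo "cannot read $txt" >&2
        return 1
    fi
    genpoolconfig -f "$txt" -o "$xml" || return 1
    # An empty or missing output file means the conversion did not succeed.
    if [ ! -s "$xml" ]; then
        echo "$xml was not created or is empty" >&2
        return 1
    fi
}
```

The wrapper only checks that an output file was produced; it does not validate the XML itself.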
- Download configuration file for VDS
The VDS tools (like Pegasus) have a lot of configuration options. The minimum set needed at this time by LSC scientists is given here.
Please download that file into your working directory and edit it, using the comments in the file as a guide.
- Download the inspiral transformation catalog file
The VDS tools require a transformation catalog file. This file gives details about the executables or transformations that run as part of the workflow.
Right now the executables must be available on each remote resource that you plan to run on. That is, the executables have to be "pre-staged". This will change in the near future.
For the inspiral workflow please download this transformation catalog file and then edit it so that the paths to the executables at each site point to the locations where you put your binaries.
The paths to the transfer and dirmanager executables are most likely correct already for each site.
- Create a proxy credential
VDS needs to authenticate on your behalf to the RLS servers as well as the jobmanagers. Please do the following:
- Unset the environment variable X509_USER_PROXY if it is set.
- Run grid-proxy-init to create a valid proxy.
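The two steps above can be combined into one helper. This is a sketch under the assumption that grid-proxy-init is on your PATH once the setup script has been sourced; the function name create_proxy is hypothetical.

```shell
# Hypothetical helper combining the two steps above.
create_proxy() {
    # Step 1: make sure no stale proxy location is inherited from the environment.
    unset X509_USER_PROXY
    # Step 2: create a new proxy (this prompts for your certificate passphrase).
    if ! command -v grid-proxy-init >/dev/null 2>&1; then
        echo "grid-proxy-init not found; source ~/ldg-3.0/setup.sh first" >&2
        return 1
    fi
    grid-proxy-init
}
```

Once the proxy exists, grid-proxy-info -timeleft reports the number of seconds remaining before it expires, which is worth checking before submitting a long workflow.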
- Convert the DAX to a DAG
With the pool configuration file in place to describe what resources are available, the transformation catalog to describe what executables (or transformations) are available, and the configuration file for VDS (Pegasus), you are now ready to convert the DAX to a concrete DAG.
First create the directory "CondorDir" in your working directory where all the Condor submit files and such will go:
mkdir CondorDir
Next convert the DAX to a DAG by running the following:
gencdag -Dvds.properties=./VDSproperties -p site1,site2,site3,... -o local --force --authenticate --verbose --random --dax inspiral_pipe.dax --dir ./CondorDir
Here site1,site2,site3,... is the list of sites, as named in the pool configuration, that you want VDS to schedule jobs across. For example, to schedule jobs across UWM, CIT, and LSU, enter -p uwm,cit,lsu.
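As an illustration, the site list can be assembled separately and the full command printed for review before it is run. The sketch below reuses the options shown above with the example sites uwm, cit, and lsu; the helper name join_sites is hypothetical.

```shell
# Hypothetical helper: join site names with commas for the -p option.
join_sites() {
    (IFS=,; echo "$*")
}

# Site names must match entries in the pool configuration file.
SITES=$(join_sites uwm cit lsu)

# Print the command instead of running it, so it can be reviewed first.
echo gencdag -Dvds.properties=./VDSproperties -p "$SITES" -o local \
    --force --authenticate --verbose --random \
    --dax inspiral_pipe.dax --dir ./CondorDir
```

Removing the leading echo runs the conversion for real.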
- Create directories if necessary
If you have defined directories or paths in your VDS properties file (such as PEGASUS_WORKDIR or PEGASUS_STORAGE) then make sure that these directories exist at all the sites on which you will run.
Eventually this restriction will be removed with a new version of Pegasus (VDS).
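One way to handle this is to run a short helper from a login shell at each site. The sketch below is an assumption about how you might do it, not part of VDS; the function name ensure_dirs is hypothetical, and the paths you pass must be the ones you set in your own VDS properties file.

```shell
# Hypothetical helper: create each directory passed as an argument if missing.
ensure_dirs() {
    for dir in "$@"; do
        mkdir -p "$dir" || return 1
        echo "ok: $dir"
    done
}
```

For example, ensure_dirs followed by your work and storage directory paths creates both if they do not already exist and leaves existing directories untouched.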
- Stage applications
In the transformation catalog you listed the location at each site where the executables are located. You should check now that the executables are actually available at those locations and, if not, stage them.
Eventually this restriction will be removed with a version of Pegasus (VDS) that stages executables.
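A quick way to perform this check at a given site is to test each path from the transformation catalog directly. The following is a minimal sketch to be run in a shell at that site; the function name check_executables is hypothetical, and the paths are whatever you entered in your catalog.

```shell
# Hypothetical helper: report any listed path that is not an executable file.
check_executables() {
    missing=0
    for exe in "$@"; do
        if [ -f "$exe" ] && [ -x "$exe" ]; then
            echo "found:   $exe"
        else
            echo "missing: $exe" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

The function returns a non-zero status if any path is missing or not executable, so it can be used to stop a script before the workflow is submitted.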