LIGO prioritized list of Condor RFE/bugs


Last edited: $Id$

  1. [condor-admin #15287] X509 certificate management enhancement request.

  2. [condor-admin #15277] DAGMan spool directory efficiency. How to avoid making O(10^5-10^6) copies of executables.

  3. Multi-core support for dynamic provisioning of resources.

  4. [condor-admin #17168] Shadow failures to connect to schedd

  5. Tool to gather all the log files associated with a given job.

  6. [condor-admin #15669] RFE to optionally delete stdout/stderr files automatically.

  7. [condor-admin #14006] Append to stdout/err files on re-execution.

  8. [condor-admin #17092] Local universe scheduling latencies.

  9. Automatic log file aggregation for large DAG runs, covering those log files that users want to keep.

  10. [condor-admin #14493] condor_hold/DAGMan enhancement request, leading to PR 776.

  11. [condor-admin #13160] condor_off fails to checkpoint in 6.7.14? Proper sequencing of daemon shutdown on machines that have both startd and ckptserver running.

  12. [condor-admin #17283] Ancillary suggestion for a crash report tool.

  13. [condor-admin #12616] Controlling resource-abusive jobs, for example by using setrlimit().

  14. [condor-admin #17239] condor_submit stuck in CPU spin-loop

  15. [condor-admin #17219] stdout occasionally lost for jobmanager-condor

  16. [condor-admin #17136] condor_run intermittently returning NULL results

  17. [condor-admin #14572] schedd deadlock on startup acquiring lock on user log file

  18. How to avoid thousands of jobs "flushing through a black hole machine".

  19. [condor-admin #17143] ImageSize update problem
    Peter is investigating what it would take to have the same level and frequency of reporting for the Standard Universe as for the Local Universe.
    Corollary: if this is too much for the Standard Universe, it is probably too much for vanilla as well, so how do we scale back on large pools?

  20. [condor-admin #17291] Job on hold without a reason

  21. Have Condor rename core files to unique names so that they are not overwritten when multiple problems occur.

  22. [condor-admin #14189] condor_history -backwards doesn't work

  23. [condor-admin #17465] condor_master core dump
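
Item 13 mentions setrlimit() as one way to rein in resource-abusive jobs. As a rough illustration of that idea only, the sketch below wraps a job command so that per-process limits are applied in the child just before it is executed. It does not use or reflect any Condor interface; the helper names `make_limiter` and `run_limited` are hypothetical, and it assumes a POSIX system where Python's standard `resource` module is available.

```python
import resource
import subprocess


def make_limiter(limits):
    """Return a function that applies the given rlimits to the calling process.

    `limits` maps resource constants (e.g. resource.RLIMIT_CPU) to a cap.
    Hypothetical helper for illustration; not part of Condor.
    """
    def apply_limits():
        for res, value in limits.items():
            _, hard = resource.getrlimit(res)
            # An unprivileged process cannot exceed the existing hard limit,
            # so clamp the requested cap to it.
            if hard != resource.RLIM_INFINITY:
                value = min(value, hard)
            # Set soft and hard limits together so the job cannot raise them.
            resource.setrlimit(res, (value, value))
    return apply_limits


def run_limited(argv, limits, **kwargs):
    """Run argv with the rlimits applied in the child just before exec."""
    return subprocess.run(argv, preexec_fn=make_limiter(limits), **kwargs)
```

A call such as `run_limited(["./my_job"], {resource.RLIMIT_CPU: 3600})` would cap a hypothetical job executable at one hour of CPU time; other limits such as `resource.RLIMIT_AS` or `resource.RLIMIT_CORE` can be passed the same way. Whether limits like these would be better enforced by the batch system itself is exactly the question the ticket raises.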

Supported by the National Science Foundation. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).