LIGO Data Grid

LIGO prioritized list of Condor RFE/bugs

Last edited: $Id: prioritized.html,v 1.11 2008/07/20 00:31:17 anderson Exp $

  1. [condor-admin #15287] X509 certificate management enhancement request.

  2. [condor-admin #15277] DAGMan spool directory efficiency. How to avoid making O(10^5-10^6) copies of executables.

  3. Multi-core support for dynamic provisioning of resources.

  4. [condor-admin #17168] Shadow failures to connect to schedd

  5. Tool to gather all the log files associated with a given job.

  6. [condor-admin #15669] RFE to optionally delete stdout/stderr files automatically.

  7. [condor-admin #14006] Append to stdout/err files on re-execution.

  8. [condor-admin #17092] Local universe scheduling latencies.

  9. Automatic aggregation, for large DAG runs, of the log files that users want to keep.

  10. [condor-admin #14493] condor_hold/DAGMan enhancement request, leading to PR 776.

  11. [condor-admin #13160] condor_off fails to checkpoint in 6.7.14? Proper sequencing of daemon shutdown on machines that have both startd and ckptserver running.

  12. [condor-admin #17283] Ancillary suggestion for a crash report tool.

  13. [condor-admin #12616] Controlling resource-abusive jobs, for example by using setrlimit().

  14. [condor-admin #17239] condor_submit stuck in CPU spin-loop

  15. [condor-admin #17219] stdout occasionally lost for jobmanager-condor

  16. [condor-admin #17136] condor_run intermittently returning NULL results

  17. [condor-admin #14572] schedd deadlock on startup acquiring lock on user log file

  18. How to avoid thousands of jobs "flushing through a black hole machine".

  19. [condor-admin #17143] ImageSize update problem
    Peter is investigating what it would take to have the same level and frequency of reporting for the standard universe as for the local universe.
    Corollary: if this is too much for the standard universe, it is probably too much for vanilla, so how do we scale back on large pools?

  20. [condor-admin #17291] Job on hold without a reason

  21. Have Condor rename core files to unique names so that they are not overwritten when there are multiple problems.

  22. [condor-admin #14189] condor_history -backwards doesn't work

  23. [condor-admin #17465] condor_master core dump
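
Item 13 mentions setrlimit() as one possible way to control resource-abusive jobs. As an illustration only, the following is a minimal sketch of that general technique on a POSIX system, written with the Python standard library. It is not Condor code and does not reflect how Condor does or would implement this; the helper name run_with_limits and its parameters are hypothetical.

```python
import resource
import subprocess
import sys


def run_with_limits(cmd, cpu_seconds=None, address_space_bytes=None):
    """Run cmd in a child process with POSIX resource limits applied.

    Hypothetical helper for illustration; not part of Condor.
    """

    def apply_limits():
        # Runs in the child between fork() and exec(), so the limits
        # apply only to the job and not to the parent process.
        if cpu_seconds is not None:
            resource.setrlimit(resource.RLIMIT_CPU,
                               (cpu_seconds, cpu_seconds))
        if address_space_bytes is not None:
            resource.setrlimit(resource.RLIMIT_AS,
                               (address_space_bytes, address_space_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limits,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Ask the child to report the CPU limit it actually inherited.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU))"
    result = run_with_limits([sys.executable, "-c", probe], cpu_seconds=5)
    print(result.stdout.strip())
```

With a CPU limit in place, the kernel sends SIGXCPU to a job that exceeds its soft limit, so a runaway job is stopped without any polling by the parent. Setting a limit above the current hard limit fails, so a real deployment would need to check the inherited limits first.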

Supported by the National Science Foundation. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).