How to Stop and Start the LSC segment database

Introduction

The components of the LSC segment database are reasonably robust against hard shutdowns of the hardware they run on, but under normal circumstances it is important to shut down the services cleanly before powering off or rebooting the machine.

Note that a single machine in the peer-to-peer network can be rebooted without affecting the other two peers.

The steps below to shut down and restart the services should be run as the user ldbd, with the exception of the semaphore allocation step, which should be done as root.
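
A wrapper script could guard against running the steps under the wrong account. The following is a minimal sketch only: the function name running_as is hypothetical, and it assumes an id command that accepts the POSIX -un options (on older Solaris releases that may be /usr/xpg4/bin/id rather than /usr/bin/id).

```shell
#!/bin/sh
# Sketch: succeed only if the current user matches the expected account name.
# "running_as" is a hypothetical helper, not part of the segment database tools.
running_as() {
    [ "$(id -un)" = "$1" ]
}

# Example guard for the top of a wrapper script (left commented out here):
# running_as ldbd || { echo "run this as user ldbd" >&2; exit 1; }
```

The same helper can be called with root before the semaphore allocation step.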

Stopping the LSC segment database

The procedure for stopping the LSC segment database is as follows:

  1. Shut down the ldbdd processes for LSCsegFindServer and LDBDServer.
  2. Shut down the Q replication capture and apply programs.
  3. Shut down the WebSphere MQ channels, queue managers and listeners.
  4. Clean IPC.
  5. Stop DB2.

To accomplish this, perform the following steps:
  1. First, shut down the LSCsegFindServer, the LSCsegFindServerDev, and the LDBDServer that serves the DMT by sending each process a SIGTERM. For example:
    [ldbd@ldas-cit ~]$ ps -ef | grep ldbdd
    ldbd 12839  1  0  Jun 11 ?  2:02 /usr/bin/python /usr1/ldbd/glue-1-17/bin/ldbdd -d -c /export/ldbd/etc/ldbdserver.ini
    ldbd 13058  1  0  Jun 11 ?  3:25 /usr/bin/python /usr1/ldbd/glue-1-17/bin/ldbdd -d -c /export/ldbd/etc/lscsegfindserver.ini
    ldbd 13261  1  0  Jun 11 ?  2:20 /usr/bin/python /usr1/ldbd/glue-1-17/bin/ldbdd -d -c /export/ldbd/etc/lscsegfindserverdev.ini
    kill -TERM 12839
    kill -TERM 13058
    kill -TERM 13261
    
    At the observatories, there is an additional LDBDServer process for the trigger database:
    ldbd 13305  1  0  Jun 11 ?  1:51 /usr/bin/python /usr1/ldbd/glue-1-17/bin/ldbdd -d -c /export/ldbd/etc/trgserver.ini
    
    This process should also be sent a SIGTERM.

  2. The following commands can be used to shut down Q replication and WebSphere MQ.
    • At Hanford use
      asnqccmd CAPTURE_SCHEMA=ASN CAPTURE_SERVER=SEG_LHO LOGSTDOUT=Y STOP
      asnqacmd APPLY_SCHEMA=ASN APPLY_SERVER=SEG_LHO LOGSTDOUT=Y STOP
      runmqsc QM1 << EOF
      stop channel (QM1_TO_QM2)
      stop channel (QM1_TO_QM3)
      end
      EOF
      endmqm -i QM1
      endmqlsr -m QM1
      
    • At Livingston use
      asnqccmd CAPTURE_SCHEMA=ASN CAPTURE_SERVER=SEG_LLO LOGSTDOUT=Y STOP
      asnqacmd APPLY_SCHEMA=ASN APPLY_SERVER=SEG_LLO LOGSTDOUT=Y STOP
      runmqsc QM2 << EOF
      stop channel (QM2_TO_QM1)
      stop channel (QM2_TO_QM3)
      end
      EOF
      endmqm -i QM2
      endmqlsr -m QM2
      
    • At Caltech use
      asnqccmd CAPTURE_SCHEMA=ASN CAPTURE_SERVER=SEG_CIT LOGSTDOUT=Y STOP
      asnqacmd APPLY_SCHEMA=ASN APPLY_SERVER=SEG_CIT LOGSTDOUT=Y STOP
      runmqsc QM3 << EOF
      stop channel (QM3_TO_QM1)
      stop channel (QM3_TO_QM2)
      end
      EOF
      endmqm -i QM3
      endmqlsr -m QM3
      

  3. Clean up IPC:
    ipclean

  4. Get and remove the IDs of any stale IPC resources:
    [ldbd@ldas.ldas-la ~]$ ipcs | grep ldbd | awk '{print " ipcrm  -"$1"  "$2}'
    ipcrm -q 117440760
    ipcrm -q 117440759
    [ldbd@ldas.ldas-la ~]$ ipcrm -q 117440760
    [ldbd@ldas.ldas-la ~]$ ipcrm -q 117440759
  5. Finally, shut down DB2 with
    db2stop
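
Step 1 above (finding the ldbdd processes and sending each a SIGTERM) can be sketched as a small filter over ps output. This is an illustration under assumptions, not a tested operational script: the function name ldbdd_pids is hypothetical, and it assumes the ps -ef column layout shown in step 1, where the second field is the PID.

```shell
#!/bin/sh
# Sketch: read `ps -ef` output on stdin and print the PID (second field) of
# every ldbdd daemon. Lines from the grep command itself are skipped.
# "ldbdd_pids" is a hypothetical helper name.
ldbdd_pids() {
    awk '/\/ldbdd / && !/grep/ { print $2 }'
}

# Example use (left commented out so nothing is killed by accident):
# ps -ef | ldbdd_pids | while read pid; do kill -TERM "$pid"; done
```

Printing the PIDs first and reviewing them before sending any signal keeps the manual check that the original procedure relies on.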
    

It is now safe to restart or power down the machine.
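
The three per-site command blocks in step 2 differ only in the queue manager number and the replication server name, so they can be generated from a site label. The sketch below only prints the commands for review and does not run them. The function names and the lowercase site labels are hypothetical; the queue manager, channel, and server names are the ones listed in step 2.

```shell
#!/bin/sh
# Sketch: print the Q replication / WebSphere MQ shutdown commands for a site.
# Function names and site labels are illustrative assumptions.

# Map a site label to its queue manager number (QM1, QM2, QM3).
site_number() {
    case "$1" in
        hanford)    echo 1 ;;
        livingston) echo 2 ;;
        caltech)    echo 3 ;;
        *) echo "unknown site: $1" >&2; return 1 ;;
    esac
}

# Map a site label to its replication server name.
site_server() {
    case "$1" in
        hanford)    echo SEG_LHO ;;
        livingston) echo SEG_LLO ;;
        caltech)    echo SEG_CIT ;;
        *) echo "unknown site: $1" >&2; return 1 ;;
    esac
}

# Print (do not execute) the shutdown commands for the given site.
print_stop_commands() {
    n=$(site_number "$1") || return 1
    server=$(site_server "$1") || return 1
    echo "asnqccmd CAPTURE_SCHEMA=ASN CAPTURE_SERVER=$server LOGSTDOUT=Y STOP"
    echo "asnqacmd APPLY_SCHEMA=ASN APPLY_SERVER=$server LOGSTDOUT=Y STOP"
    echo "runmqsc QM$n << EOF"
    # One channel to each of the other two peers.
    for peer in 1 2 3; do
        [ "$peer" = "$n" ] && continue
        echo "stop channel (QM${n}_TO_QM${peer})"
    done
    echo "end"
    echo "EOF"
    echo "endmqm -i QM$n"
    echo "endmqlsr -m QM$n"
}
```

For example, print_stop_commands livingston prints the Livingston block from step 2, which can then be compared against the documented commands before anything is executed.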

Starting the LSC segment database

The following commands should work after either a reboot or a hard power off, unless the database has been badly damaged. DB2 starts automatically at boot, so the DB2 process itself does not normally need to be started by hand; the detailed steps below stop and restart it around the semaphore adjustment. The procedure for restarting the LSC segment database is:

  1. Allocate a sufficient number of semaphores
  2. Start the WebSphere MQ queue manager, channels and listener
  3. Start the Q capture and apply programs
  4. Start the LSCsegFindServer and LDBDServer daemons

To accomplish this, perform the following steps:

  1. Make sure that the database is down. If the machine has just booted, the database will most likely be up. To shut it down, execute
    db2stop
    
    as user ldbd.
  2. Adjust the number of semaphores as user root:
    projmod -K "project.max-sem-ids=(priv,524288,deny)" default
    projmod -a -K "project.max-shm-ids=(priv,4096,deny)" default
    projmod -a -K "project.max-msg-ids=(priv,4096,deny)" default
    projmod -a -K "process.max-sem-nsems=(priv,32767,deny)" default
    projmod -a -K "project.max-shm-memory=(priv,4G,deny)" default
    For some reason these settings do not currently survive a reboot. This might be fixed in the near future, at which point this step will become unnecessary.
    The remaining steps are done as user ldbd.
  3. Start the database:
    db2start
    
  4. The following commands can be used to start WebSphere MQ and Q replication.
    • At Hanford use
      strmqm QM1
      runmqsc QM1 << EOF
      start channel (QM1_TO_QM2)
      start channel (QM1_TO_QM3)
      end
      EOF
      nohup runmqlsr -t tcp -m QM1 &> /export/ldbd/var/log/mqlsr.out </dev/null &
      export CAP_PATH=/export/ldbd/var/db2/capture
      export APP_PATH=/export/ldbd/var/db2/apply
      asnqcap capture_server=seg_lho capture_schema=ASN CAPTURE_PATH=${CAP_PATH} 1>> ${CAP_PATH}/asnqcap.stdout 2>> ${CAP_PATH}/asnqcap.stderr < /dev/null &
      asnqapp apply_server=seg_lho apply_schema=ASN APPLY_PATH=${APP_PATH} 1>> ${APP_PATH}/asnqapp.stdout 2>> ${APP_PATH}/asnqapp.stderr < /dev/null &
      
    • At Livingston use
      strmqm QM2
      runmqsc QM2 << EOF
      start channel (QM2_TO_QM1)
      start channel (QM2_TO_QM3)
      end
      EOF
      nohup runmqlsr -t tcp -m QM2 &> /export/ldbd/var/log/mqlsr.out </dev/null &
      export CAP_PATH=/export/ldbd/var/db2/capture
      export APP_PATH=/export/ldbd/var/db2/apply
      asnqcap capture_server=seg_llo capture_schema=ASN CAPTURE_PATH=/export/ldbd/var/db2/capture 1>> ${CAP_PATH}/asnqcap.stdout 2>> ${CAP_PATH}/asnqcap.stderr < /dev/null &
      asnqapp apply_server=seg_llo apply_schema=ASN APPLY_PATH=/export/ldbd/var/db2/apply 1>> ${APP_PATH}/asnqapp.stdout 2>> ${APP_PATH}/asnqapp.stderr < /dev/null &
      
    • At Caltech use
      strmqm QM3
      runmqsc QM3 << EOF
      start channel (QM3_TO_QM1)
      start channel (QM3_TO_QM2)
      end
      EOF
      nohup runmqlsr -t tcp -m QM3 &> /export/ldbd/var/log/mqlsr.out </dev/null &
      export CAP_PATH=/export/ldbd/var/db2/capture
      export APP_PATH=/export/ldbd/var/db2/apply
      asnqcap capture_server=seg_cit capture_schema=ASN CAPTURE_PATH=/export/ldbd/var/db2/capture 1>> ${CAP_PATH}/asnqcap.stdout 2>> ${CAP_PATH}/asnqcap.stderr < /dev/null &
      asnqapp apply_server=seg_cit apply_schema=ASN APPLY_PATH=/export/ldbd/var/db2/apply 1>> ${APP_PATH}/asnqapp.stdout 2>> ${APP_PATH}/asnqapp.stderr < /dev/null &
      
    Everything should now be running. To confirm, check the capture and apply stdout and stderr files for error messages.

  5. Start the DMT LDBDServer with
    ldbdd -d -c /export/ldbd/etc/ldbdserver.ini

  6. Start the LSCsegFindServer with
    ldbdd -d -c /export/ldbd/etc/lscsegfindserver.ini

  7. Start the LSCsegFindServerDev with
    ldbdd -d -c /export/ldbd/etc/lscsegfindserverdev.ini

  8. At the observatories, start the trigger LDBDServer with
    ldbdd -d -c /export/ldbd/etc/trgserver.ini
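
Steps 5 to 8 start one ldbdd daemon per configuration file, with the trigger server added only at the observatories. A minimal sketch of that selection is shown below. It only prints the commands; the function name and the lowercase site labels are hypothetical, and it assumes that Hanford and Livingston are the observatory sites while Caltech is not.

```shell
#!/bin/sh
# Sketch: print the ldbdd start commands for a site, in the order of steps 5-8.
# "print_ldbdd_start_commands" and the site labels are illustrative assumptions.
print_ldbdd_start_commands() {
    case "$1" in
        hanford|livingston)
            # Observatories also run the trigger LDBDServer.
            configs="ldbdserver lscsegfindserver lscsegfindserverdev trgserver" ;;
        caltech)
            configs="ldbdserver lscsegfindserver lscsegfindserverdev" ;;
        *)
            echo "unknown site: $1" >&2; return 1 ;;
    esac
    for name in $configs; do
        echo "ldbdd -d -c /export/ldbd/etc/$name.ini"
    done
}
```

Calling print_ldbdd_start_commands caltech lists three commands, and the observatory labels list four, ending with the trgserver.ini daemon.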
