LSC Data Analysis Software Working Groups

LSC Segment List Format Specification

Introduction

One thing learned from analyzing the data from the S1 run was that we needed a standard way of keeping track of data quality information. The "S2 Segment Data Quality Repository" was created to fill this need, and has been continued for subsequent science runs. This "repository" provides lists of segments (time intervals) in a simple ASCII file format which can be parsed fairly easily by data analysis programs and scripts. A number of software tools have been written to generate, parse, and manipulate segment lists in this format, including the LIGOtools 'segments' package (which includes a Tcl library and the 'segwizard' graphical user interface) and the Python 'segment' class in glue.

Although the segment list files that we have been working with have a reasonably self-evident format, there has not been a written format specification up to now (September 2005). The existing parsing codes differ somewhat in the details of which formats they are able to parse successfully. This document is an attempt to establish a baseline format specification that all parsing codes should be able to handle. The intent is to capture the common capabilities of the existing codes (to the extent that they are sufficient for our needs) rather than to require the development of additional code. Any given implementation may be able to handle more general cases.

Basic concepts

A segment is a time interval, possibly with some associated information such as a numerical index, a data quality flag string, etc. The end time of a segment is required to be greater than or equal to the start time of the segment.

A segment list is simply a list of segments. The list has an order to it (it is a list, not a set), but the segments do not have to appear in chronological order in the list. The segments in a list may represent overlapping time intervals.
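The two definitions above can be sketched in Python as follows. This is only an illustration: the `Segment` class here is hypothetical and is not the 'segment' class from glue, and the use of `Decimal` for GPS times is an assumption made to avoid rounding.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Segment:
    """A time interval in GPS seconds (hypothetical class, not from glue)."""
    start: Decimal
    end: Decimal

    def __post_init__(self):
        # The end time must be greater than or equal to the start time.
        if self.end < self.start:
            raise ValueError(f"end {self.end} precedes start {self.start}")


# A segment list is an ordinary ordered list: entries may be out of
# chronological order and may overlap.
seglist = [
    Segment(Decimal("723904205"), Decimal("723905205")),
    Segment(Decimal("723892545"), Decimal("723892560")),
    Segment(Decimal("723904200"), Decimal("723905200")),  # overlaps the first entry
    Segment(Decimal("804350000"), Decimal("804350000")),  # zero length is allowed
]

# Sorting builds a new chronological list and leaves the original order alone.
chronological = sorted(seglist, key=lambda s: (s.start, s.end))
```

Keeping the list order as given, and sorting only on request, matches the statement that a segment list is a list and not a set.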

It is useful to define a few additional concepts even though they have no bearing on the format of a segment list file:

File format specification

A segment list file is an ASCII file which is to be parsed line by line. The hash character ('#') begins a comment, so the parsing code should ignore this character and all subsequent characters on a line. Blank lines - that is, lines consisting only of whitespace (spaces and/or tabs) after removal of the comment, if any - are to be ignored. Each non-blank line contains the information for one segment.
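The comment and blank-line rules above might be implemented along these lines. This is a minimal sketch; the function name `meaningful_lines` is made up for this example and does not come from any existing parsing code.

```python
def meaningful_lines(text):
    """Yield each line that still has content after comment removal."""
    for raw in text.splitlines():
        # Drop the hash character and everything after it on the line.
        without_comment = raw.split("#", 1)[0]
        # Remove leading and trailing whitespace.
        stripped = without_comment.strip()
        # Lines that are now empty were blank or comment-only: skip them.
        if stripped:
            yield stripped


sample = (
    "# This is a segment list file.\n"
    "    # There can be whitespace before the hash character.\n"
    "723892545 723892560   #-- integer times\n"
    "\n"
    " 5 804323335 804323504\n"
)
lines = list(meaningful_lines(sample))
```

Here `lines` holds two entries, one per segment, with comments and surrounding whitespace removed.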

A line with segment information consists of at least two fields separated by whitespace. The line may also have leading and/or trailing whitespace, which is ignored. Two of the fields - either the first and second, or the second and third (see below) - represent the start time and end time, respectively, expressed in GPS seconds. Each GPS time is formatted either as an integer or as a decimal floating-point number. Considering the time span of observations with gravitational wave interferometers, each GPS time should have an integer part which is a positive 9- or 10-digit decimal number.
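One practical point when reading GPS time fields is numerical precision: a time with many fractional digits may not survive conversion to a binary double. The sketch below reads a field with Python's `decimal.Decimal` instead. The helper name `parse_gps` and the exact pattern it accepts (9 or 10 integer digits, optional fractional digits, no exponent) are assumptions for illustration, stricter than the specification strictly requires.

```python
import re
from decimal import Decimal

# Assumed field shape: 9 or 10 integer digits, optionally followed by
# a decimal point and one or more fractional digits.
_GPS_PATTERN = re.compile(r"\d{9,10}(\.\d+)?")


def parse_gps(field):
    """Convert one GPS time field to a Decimal, keeping every digit."""
    if not _GPS_PATTERN.fullmatch(field):
        raise ValueError(f"not a recognizable GPS time: {field!r}")
    return Decimal(field)


exact = parse_gps("724038223.598746221")
# The same text read as a binary double cannot hold all 18 digits.
as_double = Decimal(float("724038223.598746221"))
```

A parser that only needs integer-second resolution could use plain integers or floats; the `Decimal` choice matters only when sub-microsecond digits must be preserved.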

Optionally, the first field on the line can be an integer "index". It should have no more than eight digits, so that it is distinguishable from a GPS time. Other than this, there is no restriction on the value of the index; in particular, the index is not required to be equal to the ordinal number of the segment in this list. If an index is present, then the start time and end time of the segment follow it on the line.

A segment may have one or more fields of associated information following the GPS end time (in addition to the index which may optionally appear at the beginning of the line, as described above). An item of associated information may be a number or a string (with no internal whitespace) and is application-specific. The parsing code should ignore the additional-information field(s) if not relevant for the application.
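Putting the field rules together, a single line could be parsed roughly as shown below. This is a sketch under stated assumptions, not a reference implementation: the names `SegmentRecord` and `parse_segment_line` are hypothetical, the index is assumed to be a non-negative integer written with digits only, and associated information is kept as uninterpreted strings.

```python
from decimal import Decimal
from typing import NamedTuple, Optional, Tuple


class SegmentRecord(NamedTuple):
    index: Optional[int]    # None when the line has no index field
    start: Decimal          # GPS start time
    end: Decimal            # GPS end time
    info: Tuple[str, ...]   # associated information, left as strings


def parse_segment_line(line):
    """Parse one line; return None if it is blank or comment-only."""
    fields = line.split("#", 1)[0].split()
    if not fields:
        return None
    index = None
    # A first field of at most eight digits is taken to be an index,
    # since a GPS time has a 9- or 10-digit integer part.
    if fields[0].isdigit() and len(fields[0]) <= 8:
        index = int(fields[0])
        fields = fields[1:]
    if len(fields) < 2:
        raise ValueError(f"missing start or end time: {line!r}")
    # Decimal raises decimal.InvalidOperation on a non-numeric field.
    start, end = Decimal(fields[0]), Decimal(fields[1])
    if end < start:
        raise ValueError(f"end time precedes start time: {line!r}")
    # Anything after the end time is application-specific information.
    return SegmentRecord(index, start, end, tuple(fields[2:]))


record = parse_segment_line(
    " 23346  792331300.25 792331400.4  HighNoise   # index and info"
)
```

An application that does not care about the associated information can simply ignore the `info` field of each record.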

Example segment list file


# This is a segment list file.  It may contain comments anywhere.
    # There can be whitespace before the hash character.
723892545 723892560   #-- This segment happens to have integer times
723904200 723905200
 723904205  723905205 #-- Extra whitespace is OK, and overlapping is OK
723905303.542 724038223.598746221
103878332 103878544   #-- Remember that GPS times change to 10 digits in 2011 !

# Blank lines and comments in the middle of the file are OK
5 804323335 804323504   #-- This line has an 'index' at the beginning
 12  804350000  804350000
# Now, some segments with associated information:
 792331300 792331400 BAD_TIMING
 792331500 792331600 BAD_TIMING 5 2 ex
 23346  792331300.25 792331400.4  HighNoise   # Has an index AND associated info

Revision history

$Id: seglist_format.html,v 1.9 2005/09/26 06:20:51 pshawhan Exp $