Function: `is_gaussian()`

This is a quick and robust test to see if a collection of values has a probability distribution that is consistent with a Gaussian normal distribution ("normal IFO operation"), or if the collection of values contains "outlier" points, indicating that the set of values contains "pulses", "blips" and other "obvious" exceptional events that "stick out above the noise" (caused by bad cabling, alignment problems, or other short-lived transient events).

The arguments are:

- `array`: Input. The values whose probability distribution is examined are `array[0..n-1]`.
- `n`: Input. The length of the previous array.
- `min`: Input. The minimum value that the input values *might* assume. For example, if `array[]` contains the output of a 12-bit analog-to-digital converter, one might set `min=-2048`. Of course the minimum value in the input array might be considerably larger than this (i.e., closer to zero!) as it should be if the ADC is being operated well within its dynamic range limits. If you're not sure of the smallest value produced in `array[]`, set `min` smaller (i.e., more negative) than needed; the only cost is storage, not computing time.
- `max`: Input. The maximum value that the input values *might* assume. For example, if `array[]` contains the output of a 12-bit analog-to-digital converter, one might set `max=2047`. The previous comments apply here as well: set `max` larger than needed, if you are not sure about the largest value contained in `array[]`.
- `print`: Input. If this is non-zero, then the routine will print some statistical information about the distribution of the points.

The value returned by `is_gaussian()` is 1 if the distribution of
points is consistent with a Gaussian normal distribution with no
outliers, and 0 if the distribution contains outliers.

The way this is determined is as follows (we use $x_i$ to denote the
array element `array[i]`):

- First, the mean value of the distribution is determined using the
standard estimator:

  $$\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i \tag{16.5.327}$$

- Next, the points are binned into a histogram $N[v]$. Here $N[v]$ is the number of points in the array that have value $v$. The sum over the entire histogram is the total number of points: $\sum_{v} N[v] = n$.
- Then the standard deviation $s$ is estimated in the following robust
way. It is the smallest integer $s$ for which

  $$\sum_{v=\bar{x}-s}^{\bar{x}+s} N[v] > n \, \mathrm{erf}\!\left(1/\sqrt{2}\right) \tag{16.5.328}$$

- Next, the number of values in the range from one standard deviation
to three standard deviations is found, and the number of values
in the range from three to five standard deviations is found.
These are compared to the expected
numbers:

  $$n \left[ \mathrm{erf}\!\left(3/\sqrt{2}\right) - \mathrm{erf}\!\left(1/\sqrt{2}\right) \right] \quad \text{and} \quad n \left[ \mathrm{erf}\!\left(5/\sqrt{2}\right) - \mathrm{erf}\!\left(3/\sqrt{2}\right) \right] \tag{16.5.329}$$

- If there are points more than five standard deviations away from
the mean, or significantly more points in the 3 to 5 standard deviation
range than would be expected for a Gaussian normal distribution, then
`is_gaussian()` returns 0. If the number of points in each range is consistent with a Gaussian normal distribution, then `is_gaussian()` returns 1.

- Authors: Bruce Allen, ballen@dirac.phys.uwm.edu
- Comments: This function should be generalized in the obvious way, to look at one-sigma-wide bins in a more systematic way. It can eventually be replaced by a more rigorously characterized test to see if the distribution of sample values is consistent with normal IFO operation.