Rayleigh spectrogram task

The Rayleigh statistic task measures whether the coefficient of variation of the power spectral density (PSD) in the data stretch of interest differs significantly from that of nearby stretches of data. Because the task uses nearby data to estimate the significance of the statistic, no DQ issue will be flagged if the data preceding the signal is as non-stationary as the data stretch containing the candidate.

Requirements

This task requires the following packages that may not be included in the requirements list for dqrtasks:

  • gwpy

  • gwdetchar

Description

The coefficient of variation of a PSD is defined as the ratio of the standard deviation to the mean of the PSD. This coefficient is determined on a frequency-by-frequency basis to create a Rayleigh spectrum. The standard deviation and mean of a PSD are approximately equal in Gaussian noise, meaning the expected value of this statistic per frequency bin is 1.0. The task then records the largest deviation per frequency.
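The definition above can be illustrated with a minimal NumPy sketch. This is not the task's own implementation: the function name `rayleigh_spectrum`, the Hann window, the non-overlapping segments, and the choice to measure the deviation as the absolute distance from 1.0 are all assumptions made for illustration.

```python
import numpy as np


def rayleigh_spectrum(data, sample_rate, fft_length):
    """Return (frequencies, std/mean of the PSD in each frequency bin).

    Hypothetical helper for illustration only.
    """
    nperseg = int(fft_length * sample_rate)
    nseg = len(data) // nperseg
    # Split the stretch into non-overlapping short segments
    segments = np.reshape(data[:nseg * nperseg], (nseg, nperseg))
    window = np.hanning(nperseg)
    # One periodogram (unnormalised PSD estimate) per segment;
    # the normalisation cancels in the std/mean ratio
    psds = np.abs(np.fft.rfft(segments * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / sample_rate)
    # Coefficient of variation across segments, bin by bin
    return freqs, psds.std(axis=0) / psds.mean(axis=0)


rng = np.random.default_rng(0)
noise = rng.normal(size=256 * 64)  # 64 s of white Gaussian noise at 256 Hz
freqs, rayleigh = rayleigh_spectrum(noise, sample_rate=256, fft_length=1)

# Assumed definition of "deviation": distance from the Gaussian expectation.
# The DC and Nyquist bins are skipped because their PSD follows a different
# distribution, so their expected ratio is not 1.0.
largest_deviation = np.max(np.abs(rayleigh[1:-1] - 1.0))
```

For Gaussian noise the values in `rayleigh` scatter around 1.0, and the scatter shrinks as more short segments are averaged.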

The standard deviation and mean are calculated using shorter segments of data within the data stretch of interest. To estimate a p-value, data from before the candidate is broken into chunks of the same size as the original data stretch. A Rayleigh spectrum is calculated for each chunk, and its largest deviation is recorded. The largest deviation during the data stretch of interest is then compared against the largest deviations from the background chunks. The p-value is simply the fraction of background chunks with a larger Rayleigh statistic deviation than the one seen in the data stretch of interest.
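The p-value estimate described here amounts to a simple counting operation. The sketch below assumes one largest-deviation value per background chunk has already been computed; the function name `empirical_pvalue` and the numbers are made up for illustration and are not output from the task.

```python
import numpy as np


def empirical_pvalue(foreground_stat, background_stats):
    """Fraction of background chunks with a larger statistic than the foreground.

    Hypothetical helper for illustration only.
    """
    background_stats = np.asarray(background_stats, dtype=float)
    return float(np.mean(background_stats > foreground_stat))


# Illustrative largest deviations from eight background chunks
background = [0.21, 0.35, 0.18, 0.40, 0.27, 0.52, 0.30, 0.25]

# Two of the eight background values exceed 0.38, so the p-value is 2/8
pvalue = empirical_pvalue(0.38, background)
```

With this definition the smallest non-zero p-value is one over the number of background chunks, so the resolution of the estimate depends on how much background data is available.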

Example command-line

This is the help message:

$ dqr-rayleigh --help
usage: dqr-rayleigh [-h] [-v] [-V] [-o OUTPUT_DIR] --gps-time GPS_TIME
                    --channel CHANNEL [--frame-type FRAME_TYPE]
                    [--total-duration TOTAL_DURATION]
                    [--chunk-duration CHUNK_DURATION]
                    [--after-event-time AFTER_EVENT_TIME]
                    [--pvalue-threshold PVALUE_THRESHOLD]

Task to calculate the significance of measured rayleigh spectrum.

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase verbose output
  -V, --version         show program's version number and exit
  -o OUTPUT_DIR, --output-dir OUTPUT_DIR
                        Directory for all output
  --gps-time GPS_TIME
  --channel CHANNEL
  --frame-type FRAME_TYPE
  --total-duration TOTAL_DURATION
  --chunk-duration CHUNK_DURATION
  --after-event-time AFTER_EVENT_TIME
  --pvalue-threshold PVALUE_THRESHOLD

Example config

[rayleigh]
description = rayleigh spectrogram
librarian = detchar@ligo.org
include_in_dag = True
tier = 1
question = Are known sources of noise without auxiliary witnesses active?
# use the iterator construct to run on each ifo
iterate = l1
executable = dqr-rayleigh
request_memory = 400MB
arguments = "--gps-time ${t_0} -v --output-dir ${outdir}   --channel ${channel} --frame-type ${frame_type}  --total-duration 1024 --chunk-duration 8 --after-event-time 8"

Example results page

Work in progress