================
GSpyNetTree task
================

GSpyNetTree, the **G**\ ravity **Spy** Convolutional Neural **Net**\ work Decision **Tree**,
is a data quality report task that uses machine learning to determine whether a glitch is
present at the time of a candidate event. GSpyNetTree uses a decision tree of multilabel CNN
classifiers, organized by the estimated total mass of the gravitational-wave (GW) candidate
and each trained with morphologically similar glitches. This task is based on
`Alvarez-Lopez et al. 2023 `_, and a new paper on the O4 version of GSpyNetTree is in
preparation.

------------
Requirements
------------

This task requires the following packages, which may not be included in the requirements
list for *dqrtasks*:

* `gwpy`
* `gwdetchar`
* `tensorflow`

-----------
Description
-----------

GSpyNetTree uses the `InceptionV3 architecture `_ to classify GW event candidates.
GSpyNetTree first takes in the total mass of the candidate, as reported in the preferred
event of the associated superevent on GraceDB. Because the total mass :math:`M` of an event
affects its morphological appearance, GSpyNetTree has three different classifiers:
the low-mass (LM) classifier for events with :math:`M < 50 M_\odot`,
the high-mass (HM) classifier for events with :math:`50 M_\odot \leq M < 250 M_\odot`,
and the extremely high-mass (EHM) classifier for events with :math:`M \geq 250 M_\odot`.

All classifiers are trained with the most common glitches in all detectors (namely, light
scattering, fast scattering, and low-frequency lines). In addition, each classifier is
trained with glitches that are morphologically similar to the GW signals in its mass range.
These vary by classifier and are listed below:

- **Low-mass classifier:** Low-frequency blip, blip, scratchy, and koi fish. (Though not
  morphologically similar, the koi fish class is included to account for very loud glitches
  that might overlap with a low-mass GW signal.)
- **High-mass classifier:** Low-frequency blip, blip, tomte, and koi fish.
- **Extremely high-mass classifier:** Low-frequency blip and blip.

For the GW signals, GSpyNetTree generates GW simulations using LALSuite’s inspiral injection
function and the waveform model IMRPhenomPv2. GSpyNetTree’s GW examples are uniformly drawn
from a total merger mass range of :math:`5 M_\odot` to :math:`350 M_\odot`, with individual
masses ranging from :math:`2 M_\odot` to :math:`175 M_\odot`, a signal-to-noise ratio (SNR)
range of 8 to 35, and individual component spins ranging from 0.05 to 0.95.

In addition to these simulations, GSpyNetTree also considers a “No_Glitch” class in all
classifiers. No_Glitch examples are clean detector times in which no data quality issues
were identified. These clean times resemble low-SNR signals (particularly for low-mass GW
events) in a `Q-transform `_, a time-frequency spectrogram used for classification.
If a given superevent is classified as a GW or No_Glitch, no data quality issue is flagged.

GSpyNetTree uses a multilabel architecture for its CNNs, which means it also considers cases
where a GW candidate and a glitch overlap in time (and frequency). With a multilabel
architecture, GSpyNetTree can predict zero or more labels for each candidate by returning a
probability between 0 and 1 for each considered class. The probabilities of all labels
therefore need not sum to 1, as they would for a multiclass classifier whose classes are
mutually exclusive. Instead, the probability of each label can take any value from 0 to 1,
and a label is said to be predicted by GSpyNetTree if its probability is greater than or
equal to 0.5. In the case where no label reaches the 0.5 threshold, no labels are predicted
and a “human input needed” message is displayed.

If GSpyNetTree predicts that a glitch is present (including the case where a GW and/or
No_Glitch label is predicted simultaneously with a glitch), GSpyNetTree needs to determine
if a data quality issue should be flagged.
To do this, GSpyNetTree uses the glitch p-value, which ranges from 0 (data quality issue
identified) to 1 (no data quality issue identified). A data quality issue is flagged
whenever the p-value is below 0.05, the default value of ``--p-value-threshold``. The glitch
p-value is calculated as 1 - max(all glitch probabilities), so that if the probability of a
glitch is very high, the p-value will be near zero and a data quality issue will be flagged.
Similarly, in cases where GSpyNetTree is very confident about a GW/No_Glitch prediction, the
glitch probabilities are generally very low and the glitch p-value will be close to 1. Note
that the GW/No_Glitch probability is not used to calculate the glitch p-value.

--------------------
Example command-line
--------------------

This is the help message shown when running GSpyNetTree:

.. code:: bash

    $ dqr-gspynettree --help
    usage: dqr-gspynettree [-h] [--log-level {DEBUG,INFO,WARNING,ERROR}]
                           [--log-file LOG_FILE] --output-dir OUTPUT_DIR [--id ID]
                           --ifo {H1,L1,V1} --channel CHANNEL --gps GPS
                           --start START --end END --mtotal MTOTAL
                           [--frametype FRAMETYPE]
                           [--p-value-threshold P_VALUE_THRESHOLD] [-V]
                           [--lm-model LM_MODEL] [--hm-model HM_MODEL]
                           [--ehm-model EHM_MODEL]

    GSpyNetTree: Gravity Spy Convolutional Neural Network Decision Tree

    optional arguments:
      -h, --help            show this help message and exit
      --log-level {DEBUG,INFO,WARNING,ERROR}
                            log level
      --log-file LOG_FILE   write logs to file (default: log file in output-dir)
      --output-dir OUTPUT_DIR
                            output directory
      --id ID               identifier for event of interest
      --ifo {H1,L1,V1}      target detector for event of interest (ex. H1)
      --channel CHANNEL     target channel for analysis (ex. H1:GDS-CALIB_STRAIN)
      --gps GPS             GPS time for event of interest
      --start START         GPS start time for event of interest
      --end END             GPS end time for event of interest
      --mtotal MTOTAL       CBC total mass for event of interest
      --frametype FRAMETYPE
                            Data frametype
      --p-value-threshold P_VALUE_THRESHOLD
                            Defined threshold for the p-value (Default: 0.05)
      -V, --version         show the program version number and exit
      --lm-model LM_MODEL   saved TensorFlow model for low mass classifier
      --hm-model HM_MODEL   saved TensorFlow model for high mass classifier
      --ehm-model EHM_MODEL
                            saved TensorFlow model for extreme high mass classifier

    Classifier for glitch-GW discrimination based on strain data qscans.

--------------
Example config
--------------

::

    [gspynettree]
    description = Gravity Spy Convolutional Neural Network Decision Tree
    librarian = sofia.alvarez@ligo.org
    tier = 1
    question = Is the superevent a glitch or does it overlap with a glitch?
    iterate = l1 h1 v1
    executable = gspynettree
    arguments = "--output-dir ${outdir} --id ${graceid} --ifo ${ifo} --channel ${channel} --gps ${t_0} --start ${t_start} --end ${t_end} --mtotal ${mtotal}"

--------------------------------------------------------------------------------
Example results page for a GW with no data quality issues recognized by the task
--------------------------------------------------------------------------------

Full webpage can be accessed `here <./_static/task_examples/gspynettree/GW_sample/index.html>`_.

--------------------------------------------------
Example results page for a fast scattering glitch
--------------------------------------------------

Full webpage can be accessed `here <./_static/task_examples/gspynettree/FS_sample/index.html>`_.
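
---------------------------------------------
Illustrative sketch of the decision logic
---------------------------------------------

The rules given in the Description section (classifier choice by total mass, the 0.5 label
threshold, and the glitch p-value) can be summarized in a few lines of Python. This is a
minimal sketch under stated assumptions, not the task's actual implementation: the function
names, the label strings, and the returned dictionary are hypothetical, and the example
probabilities are made up. No TensorFlow model is loaded; the per-class probabilities are
assumed to be already available as a dictionary.

.. code:: python

    from typing import Dict, List

    # Hypothetical label names for the two non-glitch classes.
    NON_GLITCH_LABELS = {"GW", "No_Glitch"}


    def select_classifier(mtotal: float) -> str:
        """Return the classifier branch for a total mass given in solar masses."""
        if mtotal < 50:
            return "LM"   # low-mass classifier
        if mtotal < 250:
            return "HM"   # high-mass classifier
        return "EHM"      # extremely high-mass classifier


    def predicted_labels(probs: Dict[str, float], label_threshold: float = 0.5) -> List[str]:
        """Multilabel decision: keep every label whose probability reaches the threshold."""
        return [label for label, p in probs.items() if p >= label_threshold]


    def glitch_p_value(probs: Dict[str, float]) -> float:
        """Compute 1 - max(glitch probabilities), ignoring the GW and No_Glitch classes."""
        glitch_probs = [p for label, p in probs.items() if label not in NON_GLITCH_LABELS]
        # default=0.0 only guards against an input with no glitch classes at all.
        return 1.0 - max(glitch_probs, default=0.0)


    def assess(probs: Dict[str, float], p_value_threshold: float = 0.05) -> dict:
        """Combine the label decision and the p-value test into one summary."""
        labels = predicted_labels(probs)
        p_value = glitch_p_value(probs)
        return {
            "labels": labels,
            "p_value": p_value,
            "dq_issue": p_value < p_value_threshold,
            "human_input_needed": not labels,
        }


    if __name__ == "__main__":
        # Made-up probabilities for illustration only.
        example = {"GW": 0.60, "No_Glitch": 0.01, "Blip": 0.98, "Koi_Fish": 0.02}
        print(select_classifier(30.0))
        print(assess(example))

With the default threshold of 0.05, ``dq_issue`` is true only when some glitch class has a
probability above 0.95, so a glitch label that is predicted with a probability between 0.5
and 0.95 does not by itself flag a data quality issue.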