Glitches found in NaN data segments

Hi, I've recently started using machine learning for glitch classification, which requires the glitch time series data.
So I did the following steps:

  1. Download the CSV file ‘trainingset_v1d1_metadata.csv’ (only 5.12 MB) from the Gravity Spy Training Set page. The file contains columns such as:
    event_time, ifo, peak_time, peak_time_ns, start_time, start_time_ns, duration, search, process_id … etc.
  2. Take the start_time column. For example, on line 2 the value is 1134216192, which is a GPS time. Use this GPS time to download the file ‘https://gwosc.org//archive/data/O1/1133510656/L-L1_LOSC_4_V1-1134215168-4096.hdf5’ from the GWOSC site.
  3. Select the glitch time series from this file using start_time=1134216192. The data at that time are all NaN. Why does this happen?
    I also found that many glitches fall in NaN data (about 15% of the 8538 O1 glitch samples).
    Can anyone tell me how I can obtain the complete glitch time series data?
    Thank you very much for your attention. Looking forward to your reply.
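
For reference, the lookup in steps 2 and 3 can be sketched roughly as below. This is only a minimal sketch under some assumptions: that the 4096-second files start on GPS multiples of 4096 (which matches the file name above), and that the "_4_" in the file name means a 4096 Hz sample rate. The function names are made up for illustration.

```python
import math

FILE_DURATION = 4096  # seconds per file, taken from the "-4096" in the file name
SAMPLE_RATE = 4096    # Hz; assumed from the "_4_" (4 kHz) in the file name


def file_start_gps(gps):
    """GPS start of the file containing `gps` (assumes 4096 s alignment)."""
    return int(gps // FILE_DURATION) * FILE_DURATION


def sample_slice(gps_start, duration, sample_rate=SAMPLE_RATE):
    """First and one-past-last sample index of a glitch inside its file."""
    offset = gps_start - file_start_gps(gps_start)
    first = int(round(offset * sample_rate))
    last = int(round((offset + duration) * sample_rate))
    return first, last


def nan_fraction(samples):
    """Fraction of samples that are NaN (0.0 for an empty selection)."""
    samples = list(samples)
    if not samples:
        return 0.0
    return sum(1 for x in samples if math.isnan(x)) / len(samples)


if __name__ == "__main__":
    gps = 1134216192
    print(file_start_gps(gps))     # start GPS used in the file name
    print(sample_slice(gps, 1.0))  # index range for a 1-second glitch
```

With the strain array loaded from the HDF5 file (for example with h5py), `nan_fraction(strain[first:last])` gives a quick check of whether the selected stretch contains usable data.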

Hi @Shealylzu

Thank you for your question!

We only release strain data for times when the instruments are in the ANALYSIS_READY state, meaning the data may be searched for astrophysical signals.

Some of the glitch times in the Gravity Spy data set fall outside these times, so the strain data for them are not available and appear as NaN in the files.

You can get a segment list of times when data are available from the Timeline App.

For example, for O3a, the H1 segment list is here: https://gwosc.org/timeline/segments/O3a_16KHZ_R1/H1_DATA/1238166018/15811200/

You should compare these times against the Gravity Spy list, and pick out glitches in times where data are available.
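
The comparison could be sketched along these lines. This is a rough sketch, not an official tool: it assumes the segment list has been saved as plain text with one segment per line in the form "start stop duration" (GPS seconds), that segments do not overlap, and that each glitch is given as a (start_time, duration) pair from the Gravity Spy CSV. The function names are hypothetical.

```python
from bisect import bisect_right


def parse_segments(text):
    """Parse 'start stop [duration]' lines into sorted (start, stop) tuples."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and anything without start and stop.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        segments.append((int(float(fields[0])), int(float(fields[1]))))
    segments.sort()
    return segments


def available_glitches(segments, glitches):
    """Keep only glitches that lie entirely inside one available segment.

    segments: sorted, non-overlapping (start, stop) GPS tuples.
    glitches: iterable of (start_time, duration) pairs.
    """
    starts = [start for start, _ in segments]
    kept = []
    for gps_start, duration in glitches:
        # Index of the last segment that begins at or before the glitch.
        i = bisect_right(starts, gps_start) - 1
        if i < 0:
            continue
        _, seg_stop = segments[i]
        if gps_start + duration <= seg_stop:
            kept.append((gps_start, duration))
    return kept


if __name__ == "__main__":
    # Made-up segment values, for illustration only.
    text = "1134215000 1134216000 1000\n1134217000 1134218000 1000\n"
    segs = parse_segments(text)
    print(available_glitches(segs, [(1134216192, 0.5), (1134217100, 1.0)]))
```

A glitch that starts inside a segment but runs past its end is dropped here, since part of its time series would be missing.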

Good luck!
