Tutorial Questions - Day 2

Yes, dividing the Fourier transform of your signal by the square root of the PSD is indeed the whitening. The interpolation is done only to match the frequency array of the original PSD with that of the series to be whitened: if your time-domain series has N samples, its Fourier transform is associated with N equally spaced frequencies ranging from -sampling_rate/2 to +sampling_rate/2. Your original PSD is likely to have a different number of points, say K frequencies equally spaced between -sampling_rate/2 and +sampling_rate/2, so you have to interpolate it to obtain an array matching the N equally spaced frequencies.
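
If it helps, here is a minimal numpy/scipy sketch of that interpolation-plus-division (the array and function names are placeholders of mine, not from the tutorial; in PyCBC, pycbc.psd.interpolate does the equivalent grid matching):

import numpy as np
from scipy.interpolate import interp1d

def whiten(data, dt, psd_freqs, psd_vals):
    """Whiten `data` (time series, spacing dt) by a PSD sampled at
    `psd_freqs` (K points), interpolated onto the data's own grid."""
    N = len(data)
    dft = np.fft.rfft(data)              # one-sided FFT of the real data
    freqs = np.fft.rfftfreq(N, d=dt)     # its N//2 + 1 frequencies
    # Interpolate the K-point PSD onto the FFT's frequency grid;
    # inf outside the PSD's range zeroes those bins after division.
    psd_interp = interp1d(psd_freqs, psd_vals, bounds_error=False,
                          fill_value=np.inf)(freqs)
    white_dft = dft / np.sqrt(psd_interp)   # divide by the ASD
    return np.fft.irfft(white_dft, n=N)     # back to the time domain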


Hi @Eileen-Meyer-UMBC, yes, we are collecting the presentations and we will add them to Thinkific.

XLAL Error - XLALFrStreamFileOpen (LALFrStream.c:128): No files in stream file cache
XLAL Error - XLALFrStreamFileOpen (LALFrStream.c:128): Invalid argument
XLAL Error - XLALFrStreamCacheOpen (LALFrStream.c:233): Internal function call failed: Invalid argument

RuntimeError                              Traceback (most recent call last)
in <cell line: 15>()
     14 afile = files
     15 for afile in files:
---> 16     ts = read_frame(afile, channel_name, start, end)
     17     print("File {}".format(afile))
     18

/usr/local/lib/python3.10/dist-packages/pycbc/frame/frame.py in read_frame(location, channels, start_time, end_time, duration, check_integrity, sieve)
    203                                        None, None, None)
    204
--> 205     stream = lalframe.FrStreamCacheOpen(cum_cache)
    206     stream.mode = lalframe.FR_STREAM_VERBOSE_MODE
    207

RuntimeError: Internal function call failed: Invalid argument
I'm getting this error in Tutorial 2.2. Please suggest how to correct this.

Hi,

psd = pycbc.psd.aLIGOZeroDetHighPower(flen, delta_f, flow)

Can anybody explain what aLIGOZeroDetHighPower is doing? What do flow = 10 and flen mean?

"ts = pycbc.noise.noise_from_psd(data_length*sample_rate, delta_t, psd, seed=127)". Here, too, how was seed = 127 chosen?
Can you please tell me how you picked those values, 10 and 127?

Thank you.
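
For reference, here is a sketch annotating those calls; the comments are my own reading of the PyCBC API, and the numerical values are assumptions, not taken from the tutorial:

import pycbc.psd
import pycbc.noise

sample_rate = 4096        # Hz (assumed)
data_length = 128         # seconds of noise to generate (assumed)
delta_f = 1.0 / 16        # frequency spacing of the PSD array (assumed)
flow = 10.0               # low-frequency cutoff: the PSD model is not evaluated below this
flen = int(sample_rate / (2 * delta_f)) + 1  # number of frequency samples up to Nyquist

# Analytic PSD of Advanced LIGO at design sensitivity, in its
# "zero-detuning, high power" configuration; flen, delta_f, and flow
# only define the frequency grid the model is evaluated on.
psd = pycbc.psd.aLIGOZeroDetHighPower(flen, delta_f, flow)

# seed=127 is not physics: it only seeds the random number generator so
# the same noise realization comes out on every run; any integer works.
delta_t = 1.0 / sample_rate
ts = pycbc.noise.noise_from_psd(data_length * sample_rate, delta_t, psd, seed=127)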

@GBRK5 Usually, an error like that means the software can’t find the file or channel that you requested. You should check that the filename and channel name you used don’t have any typos.
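
For example, something like the following should work once the names are right (the file name, channel string, and GPS times below are hypothetical; substitute the ones from your download):

from pycbc.frame import read_frame

# For GWOSC frame files the channel usually looks like
# "H1:GWOSC-4KHZ_R1_STRAIN"; the prefix must match the detector,
# and the string must match the frame contents exactly.
fname = "H-H1_GWOSC_4KHZ_R1-1126257414-4096.gwf"   # hypothetical file
channel = "H1:GWOSC-4KHZ_R1_STRAIN"                # hypothetical channel
ts = read_frame(fname, channel, start_time=1126259446, end_time=1126259478)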

What is the role of

data_whitened = (ts.to_frequencyseries() / psd_td**0.5).to_timeseries()
hp1_whitened = (hp1.to_frequencyseries() / psd_hp1**0.5).to_timeseries() * 1E-21

Hi @AmbicaG, in these lines we take the time-series data, transform it to the frequency domain, whiten it, and transform it back to a time series. This is done because it is easier to perform the whitening in the frequency domain. The second line does the same, and then multiplies the time series by a factor of 10^(-21) to create a realistic strain as it would be seen by our detectors. Multiplying the strain by this factor is equivalent to placing the source at a distance of 10 Mpc from us.

Why is there a division by the square root of the PSD?

Hi @AmbicaG, the square root of the PSD is called the ASD, the amplitude spectral density, and it is the actual quantity used to whiten the data. The division itself mitigates frequency-dependent noise effects. Here are some references: Gravitational-wave sensitivity curves | Christopher Berry, Spectral density - Wikipedia, and https://arxiv.org/pdf/1408.0740.pdf . Hope this helps!

While working on the Tuto_2.3_Signal_consistency_and_significance.ipynb notebook, I noted that here we are using pycbc.filter.matched_filter to do matched filtering of the data with a frequency-domain template (generated via pycbc.waveform.get_fd_waveform). Everything works fine.

I noticed that previously, in the Tuto_2.2_Matched_Filtering_In_action.ipynb notebook, we did the same with pycbc.waveform.get_td_waveform, i.e., with a time-domain template.

Wondering whether pycbc.filter.matched_filter can handle both, I tried using a time-domain template in the third tutorial, Tuto_2.3_Signal_consistency_and_significance.ipynb, but it turns out that there is no SNR peak at the signal.

Look at the code:

hp, _ = get_fd_waveform(approximant="IMRPhenomD",
                         mass1=cmass, mass2=cmass,
                         f_lower=20.0, delta_f=data[ifo].delta_f)
hp.resize(len(psd[ifo]))
#? only 1 template is generated, since delta_f is same for each ifo

# For each observatory use this template to calculate the SNR time series
snr = {}
for ifo in ifos:
    snr[ifo] = matched_filter(hp, data[ifo], psd=psd[ifo], low_frequency_cutoff=20)
    snr[ifo] = snr[ifo].crop(5, 4)

This is what was given in the tutorial notebook file.

Now, what I did was

hp, _ = get_td_waveform(approximant="IMRPhenomD",
                         mass1=cmass, mass2=cmass,
                         f_lower=20.0, delta_t=data[ifo].delta_t)
hp.resize(len(data[ifo]))

# For each observatory use this template to calculate the SNR time series
snr = {}
for ifo in ifos:
    snr[ifo] = matched_filter(hp, data[ifo], psd=psd[ifo], low_frequency_cutoff=20)
    snr[ifo] = snr[ifo].crop(5, 4)

But now the SNR plot showed no peak at the signal.

I'm unsure what is going on here.

Hi, I’m not sure I understand the issue of filter “wrapping around the input” in the context of Tutorial 2.2. Can you explain more clearly?

I went through the documentation but didn’t understand what the method inverse_spectrum_truncation does.

I have two more questions:

  1. “We need to account for both the length of the template and 1 / PSD.” I do not understand this either. I know this has to do with the spikes appearing in the SNR plot but perhaps you could explain more clearly what’s going on?
  2. I don’t understand the point of using aligned = (aligned.to_frequencyseries() * snrp).to_timeseries() after aligned /= sigma(aligned, psd=psd, low_frequency_cutoff=20.0). Why are we scaling the template amplitude twice?

In Tutorial 2.2, there is a line of code for the Q-transform: t, f, p = data.whiten(4, 4).qtransform(.001, logfsteps=100, qrange=(8, 8), frange=(20, 512)). What do the two parameters (.001 and logfsteps) in the qtransform function represent?
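
For what it's worth, here is my reading of those arguments based on the PyCBC API (annotations mine, not an answer from the tutorial team):

# Assumes `data` is the strain TimeSeries from the tutorial.
t, f, p = data.whiten(4, 4).qtransform(
    .001,              # output time spacing in seconds (one spectrogram column per millisecond)
    logfsteps=100,     # number of logarithmically spaced frequency bins in the output
    qrange=(8, 8),     # range of Q values to search; (8, 8) fixes Q = 8
    frange=(20, 512),  # frequency band of the transform, in Hz
)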

Okay, I figured out what the problem was.

I forgot to do

hp = hp.cyclic_time_shift(hp.start_time)

After doing that step (for the td_waveform), I could get the signal at the exact same GPS time; the SNR value was a bit off (a 0.00114% change, checked for the Livingston peak only, since it was the most significant one).

That is okay, but note that after generating the td template, its start time (the hp.start_time value) was -3.8 s, which we shifted to 0 s with cyclic_time_shift.

Since the start time was just some constant number of seconds, the SNR peak should occur at that time delay from the actual signal time, right? I.e., at the actual signal time + (-3.8 s). But somehow that was not the case when I didn't do the cyclic_time_shift step: the peak was totally outside the data containing the signal, which is 27.75 s long with the signal roughly at the center!

I checked with the challenge question section of Tutorial 2.3 that it does happen like that: the peak at 104.00341796875 s got shifted to 100.20361328125 s if I don't use
hp = hp.cyclic_time_shift(hp.start_time), and the time difference is -3.7998 s, while hp.start_time is -3.8 s (also, the SNR changed by 1.249%, which is not what we're interested in now, but given for completeness).
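
For anyone following along, here is a sketch of the fix described above, assuming cmass, data, psd, and ifos are defined as in Tuto 2.3:

from pycbc.waveform import get_td_waveform
from pycbc.filter import matched_filter

# A time-domain template starts at a negative time (hp.start_time is
# about -3.8 s here); matched_filter expects the template's fiducial
# time to sit at index 0, so rotate it there before filtering.
hp, _ = get_td_waveform(approximant="IMRPhenomD",
                        mass1=cmass, mass2=cmass,
                        f_lower=20.0, delta_t=data[ifos[0]].delta_t)
hp.resize(len(data[ifos[0]]))
hp = hp.cyclic_time_shift(hp.start_time)  # rotate the merger to index 0

snr = {}
for ifo in ifos:
    snr[ifo] = matched_filter(hp, data[ifo], psd=psd[ifo],
                              low_frequency_cutoff=20)
    snr[ifo] = snr[ifo].crop(5, 4)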

Thank you for your question. The issue of wrapping data is explained in the paragraph above:

“Note the spike in the data at the boundaries. This is caused by the highpass and resampling stages filtering the data. When the filter is applied to the boundaries, it wraps around to the beginning of the data. Since the data itself has a discontinuity (i.e. it is not cyclic) the filter itself will ring off for a time up to the length of the filter.”

The high pass and resampling filters both leave numerical garbage at the beginning and end of the time series. To remove the garbage, the ends of the data are trimmed after filtering, like this:

# Remove 2 seconds of data from both the beginning and end
conditioned = strain.crop(2, 2)

Trimming two seconds from each end is enough here because the corrupted region extends at most the length of the filter.

These “wrapping” features are a common issue in signal processing. The Fourier transform assumes that time-series data are periodic, so whenever time-series data are transformed into the frequency domain, any mismatch between the beginning and end of the segment acts as a discontinuity and can introduce artifacts.
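
For concreteness, these are the conditioning steps being discussed, roughly as they appear in Tutorial 2.2 (assuming strain holds the raw TimeSeries):

from pycbc.filter import highpass, resample_to_delta_t

strain = highpass(strain, 15.0)                  # remove content below 15 Hz
strain = resample_to_delta_t(strain, 1.0/2048)   # downsample to 2048 Hz
conditioned = strain.crop(2, 2)                  # trim the filter ring-off at both ends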

Great question! The explanation is in the comment above the line of code. The method truncates the inverse PSD so that 1/PSD acts as a filter of limited length (4 seconds here), and the low-frequency cutoff tells it to ignore content below 15 Hz. This is important for LIGO data, because the highpassed data are not meaningful at very low frequencies.

# 1/PSD will now act as a filter with an effective length of 4 seconds
# Since the data has been highpassed above 15 Hz, and will have low values
# below this we need to inform the function to not include frequencies
# below this frequency.
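
Putting it together, the usage looks like this (a sketch assuming `conditioned` is the conditioned strain from the tutorial): estimate the PSD, match it to the data's frequency grid, then truncate 1/PSD to an effective 4 s filter that ignores the band below 15 Hz.

from pycbc.psd import interpolate, inverse_spectrum_truncation

psd = conditioned.psd(4)                     # estimate the PSD from 4 s segments
psd = interpolate(psd, conditioned.delta_f)  # match the data's frequency grid
psd = inverse_spectrum_truncation(psd, int(4 * conditioned.sample_rate),
                                  low_frequency_cutoff=15)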

@AmbicaG For question 2, the answer is that we are scaling the template to the correct amplitude in two steps.

The first step divides out the amplitude of the template, so that it is scaled to SNR = 1.

The second step multiplies by the recovered SNR, so that the template will have the same SNR as the signal in the data.
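
In code, those two steps are exactly the lines quoted in the question, annotated here (aligned is the template time series, snrp the complex SNR at the peak, psd the data's PSD, all as in the tutorial):

from pycbc.filter import sigma

# Step 1: divide out the template's own amplitude so it has SNR = 1.
aligned /= sigma(aligned, psd=psd, low_frequency_cutoff=20.0)
# Step 2: scale (and phase-rotate) by the measured complex SNR so the
# template matches the amplitude and phase of the signal in the data.
aligned = (aligned.to_frequencyseries() * snrp).to_timeseries()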


How is the filter length determined?

@AmbicaG The division by the square root of the PSD flattens the response of the detector across the sensitivity band (hence the name whitening: it means no frequency dependence). You can also think of it another way: the GW detectors are less sensitive at lower and higher frequencies, so the division by sqrt(PSD) downweights these frequencies in the data. I hope this makes sense!
