Missing entries in GWTC-4.0 candidate data zenodo entry

I think there is an error in the Zenodo entry (http://doi.org/10.5281/zenodo.17014083) related to the GWTC-4.0 catalog paper:
https://arxiv.org/pdf/2508.18082v2
While the notebook in the Zenodo entry states there should be ~1300 events in the files, I find only 347. Of the 129 events in Tables 1 and 2 of the paper, only 73 appear in the Zenodo entry; the other 56 are missing.

I think the problem is with the IGWN-GWTC4p0-1a206db3d_721-SearchSummaryTable.hdf5 file, because I do find the trigger files in the individual pipeline subfolders.

I’ll leave a short script to reproduce the problem, and what it prints, in the comments.

import gwosc
from gwosc import api
from gwpy.table import EventTable

# get GWTC-4.0 from gwosc api
catalog = gwosc.api.fetch_catalog_json("GWTC-4.0")
gwtc4_events = [v["commonName"] for v in catalog["events"].values()]

# manually copy-paste the events in tables 1&2 from https://arxiv.org/pdf/2508.18082v2
with open("table1.txt", "r") as f:
    table1 = [line.strip() for line in f.readlines()]
with open("table2.txt", "r") as f:
    table2 = [line.strip() for line in f.readlines()]
tables = sorted(table1 + table2)

# see the tables match the gwosc.api catalog
print(
    f"{len(tables)=}, \n{len(gwtc4_events)=}, \n{len(set(tables).intersection(set(gwtc4_events)))=}"
)

# load the Search Summary from http://doi.org/10.5281/zenodo.17014083
Summary_Path = "candidate_data_release/search_results/IGWN-GWTC4p0-1a206db3d_721-SearchSummaryTable.hdf5"
events_table = EventTable.read(Summary_Path)
search_summary_events = list(events_table["gw_name"])

print(f"{len(search_summary_events)=}")
common_events = sorted(list(set(gwtc4_events).intersection(set(search_summary_events))))
print(f"{len(common_events)=}")

missing_events = list(set(gwtc4_events).difference(set(search_summary_events)))

print(f"{len(missing_events)} GWTC-4.0 events missing from zenodo entry: ")
for ev in missing_events:
    print(ev)

(dot-pe) jonatahm@wismac's-Mac O4a_events % python reproduce_error.py
len(tables)=129, 
len(gwtc4_events)=129, 
len(set(tables).intersection(set(gwtc4_events)))=129
WARNING: path= was not specified but multiple tables are present, reading in first available table (path=CWB) [astropy.io.misc.hdf5]
len(search_summary_events)=347
len(common_events)=73
56 GWTC-4.0 events missing from zenodo entry: 
GW230618_102550
GW231102_232433
GW231005_144455
GW231114_043211
GW230731_215307
GW230823_142524
GW230723_101834
GW240109_050431
GW231113_122623
GW230814_230901
GW231118_090602
GW231204_090648
GW230726_002940
GW230630_234532
GW230531_141100
GW230603_174756
GW230708_053705
GW230920_064709
GW230624_214944
GW230911_195324
GW231002_143916
GW230729_082317
GW230630_070659
GW230717_102139
GW240104_164932
GW230830_064744
GW231104_133418
GW230606_024545
GW230817_212349
GW231113_200417
GW231231_154016
GW231013_135504
GW230904_051013
GW231020_142947
GW230904_152545
GW230704_021211
GW231008_142521
GW230902_224555
GW231120_022103
GW231224_024321
GW240105_151143
GW231223_075055
GW231118_005626
GW230518_125908
GW231029_111508
GW230625_211655
GW231110_040320
GW231026_130704
GW230902_172430
GW230702_162025
GW231018_233037
GW230529_181500
GW230728_083628
GW230706_104333
GW231223_202619
GW230831_134621

In case you haven’t already realized this, the

WARNING: path= was not specified but multiple tables are present, reading in first available table (path=CWB) [astropy.io.misc.hdf5]

message explains why you’re getting unexpected results. As mentioned in the notebook in Zenodo, you need to use

events_table = EventTable.read(Summary_Path, path='search_summary')

to get the full set of 1382 candidates. If you don’t include that, you get the first table in the HDF5 file, which happens to be the one for the minimally modeled cWB search. That search is only sensitive to certain signals, generally higher-mass binary black holes: cWB looks for excess power without using a waveform model, so it is not very sensitive to long waveforms where the power is spread out in time.

If I make that change, then I get the expected results of

len(search_summary_events)=1382
len(common_events)=129
0 GWTC-4.0 events missing from zenodo entry:

from your script.
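
More generally, when astropy warns that multiple tables are present, it can help to list what the file contains before choosing a value for path=. The following is only a minimal sketch: list_hdf5_tables is a hypothetical helper name, and it assumes the tables are stored as top-level entries of the HDF5 file, which is an assumption about the layout of this release rather than something confirmed here.

```python
import h5py


def list_hdf5_tables(filename):
    """Return the sorted names of the top-level entries in an HDF5 file.

    Assumption: each table in the file is stored as a top-level dataset
    or group, so its name can be passed as path= when reading the table.
    """
    with h5py.File(filename, "r") as f:
        return sorted(f.keys())


# Example usage (not executed here; Summary_Path is the variable
# defined in the script above):
#   for name in list_hdf5_tables(Summary_Path):
#       print(name)
```

Any name it returns can then be tried as the path= argument to EventTable.read, as in the search_summary example above.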

Thank you for the reply! This worked.