Jobsub ID 266021.9@dunegpschedd01.fnal.gov
| Jobsub ID | 266021.9@dunegpschedd01.fnal.gov |
| Workflow ID | 11076 |
| Stage ID | 1 |
| User name | ykermaid@fnal.gov |
| HTCondor Group | group_dune.prod_mcsim |
| Requested | Processors | 2 |
| Requested | GPU | No |
| Requested | RSS bytes | 4193255424 (3999 MiB) |
| Requested | Wall seconds limit | 18000 (5 hours) |
| Submitted time | 2025-12-10 12:12:55 |
| Site | NL_SURFsara |
| Entry | DUNE_SurfSARA_arc01 |
| Last heartbeat | 2025-12-10 13:05:30 |
| From worker node | Hostname | wn-da-13.gina.surf.nl |
| From worker node | cpuinfo | AMD EPYC 7702P 64-Core Processor |
| From worker node | OS release | Scientific Linux release 7.9 (Nitrogen) |
| From worker node | Processors | 2 |
| From worker node | RSS bytes | 4194304000 (4000 MiB) |
| From worker node | Wall seconds limit | 129600 (36 hours) |
| From worker node | GPU | |
| From worker node | Inner Apptainer? | True |
| Job state | outputting_failed |
| Started | 2025-12-10 12:13:54 |
| Input files | vd-protodune:np02vd_raw_run041215_7066_df-s04-d1_dw_0_20251207T234530.hdf5 |
| Jobscript | Exit code | 1 |
| Jobscript | Real time | 0m (0s) |
| Jobscript | CPU time | 0m (0s = 0%) |
| Jobscript | Max RSS bytes | 0 (0 MiB) |
| Outputting started | |
| Output files | |
| Finished | 2025-12-10 13:05:30 |
Jobscript log (last 10,000 characters)
3
wclsFrameSaver: no samples within desired window for channel 11444
wclsFrameSaver: no samples within desired window for channel 11444
wclsFrameSaver: no samples within desired window for channel 11445
wclsFrameSaver: no samples within desired window for channel 11445
wclsFrameSaver: no samples within desired window for channel 11446
wclsFrameSaver: no samples within desired window for channel 11446
wclsFrameSaver: no samples within desired window for channel 11490
wclsFrameSaver: no samples within desired window for channel 11804
wclsFrameSaver: no samples within desired window for channel 12127
FrameSaver: q=1.63753e+07 n=1044153 tag=wiener
0 X, 0 U, 0 V bad channels
Finding XUV coincidences...
C:0 T:0 155 XUs and 106 XVs -> 4 XUVs
C:0 T:1 227 XUs and 194 XVs -> 4 XUVs
C:0 T:2 398 XUs and 319 XVs -> 8 XUVs
C:0 T:3 1948 XUs and 1678 XVs -> 72 XUVs
C:0 T:4 473 XUs and 430 XVs -> 22 XUVs
C:0 T:5 264 XUs and 269 XVs -> 55 XUVs
C:0 T:6 654 XUs and 771 XVs -> 28 XUVs
C:0 T:7 608 XUs and 716 XVs -> 56 XUVs
C:0 T:8 26 XUs and 38 XVs -> 4 XUVs
C:0 T:9 75 XUs and 57 XVs -> 1 XUVs
C:0 T:10 79 XUs and 66 XVs -> 0 XUVs
C:0 T:11 18 XUs and 42 XVs -> 0 XUVs
C:0 T:12 1617 XUs and 1798 XVs -> 74 XUVs
C:0 T:13 1976 XUs and 2109 XVs -> 193 XUVs
C:0 T:14 156 XUs and 171 XVs -> 17 XUVs
C:0 T:15 1177 XUs and 1188 XVs -> 114 XUVs
652 XUVs total
535 collection wire objects
652 potential space points
Neighbour search...
6662 tests to find 3948 neighbours
Iterating with no regularization...
Begin: 7.23624e+08
0 7.18841e+08
1 7.18669e+08
Now with regularization...
Begin: 7.15898e+08
0 7.15881e+08
HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 0:
#000: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dio.c line 179 in H5Dread(): can't read data
major: Dataset
minor: Read failed
#001: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLcallback.c line 2011 in H5VL_dataset_read(): dataset read failed
major: Virtual Object Layer
minor: Read failed
#002: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLcallback.c line 1978 in H5VL__dataset_read(): dataset read failed
major: Virtual Object Layer
minor: Read failed
#003: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLnative_dataset.c line 166 in H5VL__native_dataset_read(): can't read data
major: Dataset
minor: Read failed
#004: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dio.c line 545 in H5D__read(): can't read data
major: Dataset
minor: Read failed
#005: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 600 in H5D__contig_read(): contiguous read failed
major: Dataset
minor: Read failed
#006: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dselect.c line 465 in H5D__select_read(): read error
major: Dataspace
minor: Read failed
#007: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dselect.c line 220 in H5D__select_io(): read error
major: Dataspace
minor: Read failed
#008: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 924 in H5D__contig_readvv(): can't perform vectorized sieve buffer read
major: Dataset
minor: Can't operate on object
#009: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VM.c line 1401 in H5VM_opvv(): can't perform operation
major: Internal error (too specific to document in detail)
minor: Can't operate on object
#010: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 729 in H5D__contig_readvv_sieve_cb(): block read failed
major: Dataset
minor: Read failed
#011: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Fio.c line 105 in H5F_shared_block_read(): read through page buffer failed
major: Low-level I/O
minor: Read failed
#012: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5PB.c line 721 in H5PB_read(): read through metadata accumulator failed
major: Page Buffering
minor: Read failed
#013: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Faccum.c line 252 in H5F__accum_read(): driver read request failed
major: Low-level I/O
minor: Read failed
#014: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDint.c line 189 in H5FD_read(): driver read request failed
major: Virtual File Layer
minor: Read failed
#015: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDsec2.c line 755 in H5FD__sec2_read(): file read failed: time = Wed Dec 10 13:58:27 2025
, filename = 'root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/vd-protodune/b6/0a/np02vd_raw_run041215_7066_df-s04-d1_dw_0_20251207T234530.hdf5', file descriptor = 24, errno = 113, error message = 'No route to host', buf = 0x1b28bb40, total read size = 979272, bytes this sub-read = 979272, bytes actually read = 18446744073709551615, offset = 0
major: Low-level I/O
minor: Read failed
==================================================================================================================================
TimeTracker printout (sec) Min Avg Max Median RMS nEvts
==================================================================================================================================
Full event 7.87e-05 252.519 340.625 273.9 89.5999 10
----------------------------------------------------------------------------------------------------------------------------------
source:HDF5RawInput3(read) 5.6136e-05 8.02399e-05 0.000177388 7.1857e-05 3.32863e-05 10
produce:tpcrawdecoder:PDVDTPCReader 78.0949 100.254 125.045 98.3207 14.2983 9
produce:triggerrawdecoder:PDVDTriggerReader4 0.339599 0.404993 0.569281 0.374397 0.0695262 9
produce:pdvddaphne:DAPHNEReaderPDVD 0.000288509 0.000360347 0.000686346 0.000326671 0.000117073 9
produce:ophit:OpHitFinder 5.25e-05 0.000107654 0.000519268 5.6337e-05 0.00014555 9
produce:opflash:OpFlashFinderVerticalDrift 3.7703e-05 7.14239e-05 0.000318767 4.123e-05 8.74637e-05 9
produce:wclsdatavd:WireCellToolkit 71.0136 122.406 157.164 131.53 26.2921 9
produce:gaushit:GausHitFinder 1.04128 1.4405 1.9889 1.33656 0.295572 9
produce:nhitsfilter:NumberOfHitsFilter 0.000225769 0.000344124 0.000485325 0.000328765 8.26657e-05 9
produce:reco3d:SpacePointSolver 5.71899 9.45662 18.1417 9.1644 3.51467 9
produce:hitpdune:DisambigFromSpacePoints 0.106437 0.197539 0.347682 0.183136 0.0693738 9
produce:pandora:StandardPandora 25.8971 41.4849 100.239 32.5338 21.9236 9
produce:pandoraTrack:LArPandoraTrackCreation 0.291993 0.817992 2.24642 0.572783 0.556004 9
produce:pandoraGnocalo:GnocchiCalorimetry 0.0181874 0.0252473 0.0415262 0.0222905 0.00698061 9
[art]:TriggerResults:TriggerResultInserter 1.4778e-05 3.2023e-05 8.413e-05 2.3745e-05 2.1571e-05 9
end_path:out1:RootOutput 4.679e-06 7.50889e-06 2.1742e-05 5.2e-06 5.23949e-06 9
end_path:out1:RootOutput(write) 3.804 4.04072 4.68199 3.97503 0.270207 9
==================================================================================================================================
====================================================================================================
MemoryTracker summary (base-10 MB units used)
Peak virtual memory usage (VmPeak) : 5337.55 MB
Peak resident set size usage (VmHWM): 3316.05 MB
Details saved in: 'mem.db'
====================================================================================================
%MSG-s ArtException: PostEndJob 10-Dec-2025 13:58:28 CET ModuleEndJob
---- EventProcessorFailure BEGIN
EventProcessor: an exception occurred during current event processing
---- ScheduleExecutionFailure BEGIN
Path: ProcessingStopped.
---- StdException BEGIN
An exception was thrown while processing module PDVDTPCReader/tpcrawdecoder run: 41215 subRun: 1 event: 329773
Error during HDF5 Read:
---- StdException END
Exception going through path produce
---- ScheduleExecutionFailure END
---- EventProcessorFailure END
%MSG
Art has completed and will exit with status 1.
Error in reco1