
Jobsub ID 46973.179@dunegpschedd01.fnal.gov

Workflow ID: 2650
Stage ID: 1
User name: ykermaid@fnal.gov
HTCondor Group: group_dune.prod_mcsim
Requested:
  Processors: 1
  GPU: No
  RSS bytes: 4193255424 (3999 MiB)
  Wall seconds limit: 18000 (5 hours)
Submitted time: 2025-09-16 11:56:55
Site: NL_SURFsara
Entry: DUNE_SurfSARA_arc02
Last heartbeat: 2025-09-16 13:28:05
From worker node:
  Hostname: wn-da-05.gina.surf.nl
  cpuinfo: AMD EPYC 7702P 64-Core Processor
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 129600 (36 hours)
  GPU:
  Inner Apptainer?: True
Job state: jobscript_error
Started: 2025-09-16 12:55:13
Input files: vd-protodune:np02vd_raw_run039275_0389_df-s04-d2_dw_0_20250902T032311.hdf5
Jobscript:
  Exit code: 1
  Real time: 0m (0s)
  CPU time: 0m (0s = 0%)
  Max RSS bytes: 0 (0 MiB)
Outputting started:
Output files:
Finished: 2025-09-16 13:28:05
Saved logs: justin-logs:46973.179-dunegpschedd01.fnal.gov.logs.tgz

Jobscript log (last 10,000 characters)

] Timer: WireCell::SigProc::ChannelSelector : 0 sec
[15:22:19.871] I [ timer  ] Timer: wcls::RawFrameSource : 0 sec
[15:22:19.871] I [ timer  ] Timer: wcls::FrameSaver : 0 sec
[15:22:19.871] I [ timer  ] Timer: Total node execution : 89.9800003040582 sec
wclsFrameSaver saving cooked to 10000 ticks
wclsFrameSaver: saving 60683 traces tagged "gauss"
FrameSaver: q=1.07972e+07 n=1320215 tag=gauss
wclsFrameSaver: saving 74124 traces tagged "wiener"
FrameSaver: q=1.14247e+07 n=1275141 tag=wiener
0 X, 0 U, 0 V bad channels
Finding XUV coincidences...
C:0 T:0 294 XUs and 370 XVs -> 13 XUVs
C:0 T:1 780 XUs and 1285 XVs -> 55 XUVs
C:0 T:2 382 XUs and 419 XVs -> 7 XUVs
C:0 T:3 216 XUs and 250 XVs -> 24 XUVs
C:0 T:4 1619 XUs and 2610 XVs -> 112 XUVs
C:0 T:5 222 XUs and 307 XVs -> 12 XUVs
C:0 T:6 1289 XUs and 2300 XVs -> 53 XUVs
C:0 T:7 1582 XUs and 1439 XVs -> 73 XUVs
C:0 T:8 15060 XUs and 33034 XVs -> 3402 XUVs
C:0 T:9 4962 XUs and 5807 XVs -> 737 XUVs
C:0 T:10 1177 XUs and 1783 XVs -> 258 XUVs
C:0 T:11 860 XUs and 1032 XVs -> 52 XUVs
C:0 T:12 8989 XUs and 10369 XVs -> 697 XUVs
C:0 T:13 1806 XUs and 1804 XVs -> 100 XUVs
C:0 T:14 3057 XUs and 3833 XVs -> 336 XUVs
C:0 T:15 11547 XUs and 14805 XVs -> 940 XUVs
6871 XUVs total
2200 collection wire objects
6871 potential space points
Neighbour search...
403677 tests to find 186136 neighbours
Iterating with no regularization...
Begin: 7.25801e+08
0 6.50641e+08
1 6.45312e+08
2 6.44535e+08
3 6.44293e+08
Now with regularization...
Begin: 6.30411e+08
0 6.30239e+08
BdtBeamParticleIdTool::SliceFeatures::GetLeadingCaloHits - empty calo hit list
HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 0:
  #000: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dio.c line 179 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLcallback.c line 2011 in H5VL_dataset_read(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #002: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLcallback.c line 1978 in H5VL__dataset_read(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #003: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLnative_dataset.c line 166 in H5VL__native_dataset_read(): can't read data
    major: Dataset
    minor: Read failed
  #004: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dio.c line 545 in H5D__read(): can't read data
    major: Dataset
    minor: Read failed
  #005: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 600 in H5D__contig_read(): contiguous read failed
    major: Dataset
    minor: Read failed
  #006: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dselect.c line 465 in H5D__select_read(): read error
    major: Dataspace
    minor: Read failed
  #007: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dselect.c line 220 in H5D__select_io(): read error
    major: Dataspace
    minor: Read failed
  #008: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 924 in H5D__contig_readvv(): can't perform vectorized sieve buffer read
    major: Dataset
    minor: Can't operate on object
  #009: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VM.c line 1401 in H5VM_opvv(): can't perform operation
    major: Internal error (too specific to document in detail)
    minor: Can't operate on object
  #010: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 729 in H5D__contig_readvv_sieve_cb(): block read failed
    major: Dataset
    minor: Read failed
  #011: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Fio.c line 105 in H5F_shared_block_read(): read through page buffer failed
    major: Low-level I/O
    minor: Read failed
  #012: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5PB.c line 721 in H5PB_read(): read through metadata accumulator failed
    major: Page Buffering
    minor: Read failed
  #013: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Faccum.c line 252 in H5F__accum_read(): driver read request failed
    major: Low-level I/O
    minor: Read failed
  #014: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDint.c line 189 in H5FD_read(): driver read request failed
    major: Virtual File Layer
    minor: Read failed
  #015: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDsec2.c line 755 in H5FD__sec2_read(): file read failed: time = Tue Sep 16 15:27:10 2025
, filename = 'root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/vd-protodune/11/47/np02vd_raw_run039275_0389_df-s04-d2_dw_0_20250902T032311.hdf5', file descriptor = 24, errno = 113, error message = 'No route to host', buf = 0x66114860, total read size = 1101672, bytes this sub-read = 1101672, bytes actually read = 18446744073709551615, offset = 0
    major: Low-level I/O
    minor: Read failed

==================================================================================================================================
TimeTracker printout (sec)                          Min           Avg           Max         Median          RMS         nEvts   
==================================================================================================================================
Full event                                       7.905e-05      291.282       423.678       316.853       139.715         6     
----------------------------------------------------------------------------------------------------------------------------------
source:HDF5RawInput3(read)                      7.5032e-05    0.000102789   0.000207355   8.2081e-05    4.71236e-05       6     
produce:tpcrawdecoder:PDVDTPCReader               61.2281       91.3841       118.28        95.6808       22.4886         5     
produce:triggerrawdecoder:PDVDTriggerReader4     0.0346147     0.0364222     0.0375325     0.0364482    0.00101946        5     
produce:pdvddaphne:DAPHNEReaderPDVD               13.3636       14.7494       16.4471       14.6644       1.18337         5     
produce:ophit:OpHitFinder                        0.0608352     0.0682446     0.0785672     0.0689228    0.00660319        5     
produce:opflash:OpFlashFinderVerticalDrift       0.0180033     0.0198138     0.0234505     0.0192629    0.00190393        5     
produce:wclsdatavd:WireCellToolkit                76.2807       93.7871       114.173       92.3343       12.1546         5     
produce:gaushit:GausHitFinder                     1.44084       1.99341       2.39751       2.02265      0.309676         5     
produce:nhitsfilter:NumberOfHitsFilter          0.000471317   0.000878113   0.00138723    0.000902558   0.000316066       5     
produce:reco3d:SpacePointSolver                   16.3221       24.7501       31.0798       25.2758       4.76068         5     
produce:hitpdune:DisambigFromSpacePoints         0.240149      0.379147      0.521522      0.365489      0.0904352        5     
produce:pandora:StandardPandora                   49.4402       114.15        154.912       131.18        39.7135         5     
produce:pandoraTrack:LArPandoraTrackCreation      1.04603       2.10204       2.55648       2.3441       0.548061         5     
produce:pandoraGnocalo:GnocchiCalorimetry        0.0370846     0.0492406     0.0636217     0.0487962    0.00867036        5     
[art]:TriggerResults:TriggerResultInserter      2.1702e-05    3.18348e-05    6.777e-05    2.2692e-05    1.79901e-05       5     
end_path:out1:RootOutput                         5.511e-06    9.7508e-06    2.5088e-05     6.161e-06    7.67481e-06       5     
end_path:out1:RootOutput(write)                   5.69899       5.98648       6.2257        5.99113      0.174627         5     
==================================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 4942.2 MB
  Peak resident set size usage (VmHWM): 3053.31 MB
  Details saved in: 'mem.db'
====================================================================================================
%MSG-s ArtException:  PostEndJob 16-Sep-2025 15:27:11 CEST ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- StdException BEGIN
      An exception was thrown while processing module PDVDTPCReader/tpcrawdecoder run: 39275 subRun: 1 event: 140152
      Error during HDF5 Read:  
    ---- StdException END
    Exception going through path produce
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
%MSG
Art has completed and will exit with status 1.
Error in reco1
justIN time: 2025-09-18 22:23:33 UTC       justIN version: 01.05.00