Jobsub ID 46970.153@dunegpschedd01.fnal.gov

Jobsub ID: 46970.153@dunegpschedd01.fnal.gov
Workflow ID: 2650
Stage ID: 1
User name: ykermaid@fnal.gov
HTCondor Group: group_dune.prod_mcsim
Requested:
  Processors: 1
  GPU: No
  RSS bytes: 4193255424 (3999 MiB)
  Wall seconds limit: 18000 (5 hours)
Submitted time: 2025-09-16 11:44:54
Site: NL_SURFsara
Entry: DUNE_SurfSARA_arc03
Last heartbeat: 2025-09-16 13:27:57
From worker node:
  Hostname: wn-lb-02.gina.surf.nl
  cpuinfo: AMD EPYC 9754 128-Core Processor
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 129600 (36 hours)
  GPU:
  Inner Apptainer?: True
Job state: jobscript_error
Started: 2025-09-16 12:12:55
Input files: vd-protodune:np02vd_raw_run039275_0335_df-s04-d0_dw_0_20250902T022412.hdf5
Jobscript:
  Exit code: 1
  Real time: 0m (0s)
  CPU time: 0m (0s = 0%)
  Max RSS bytes: 0 (0 MiB)
Outputting started:
Output files:
Finished: 2025-09-16 13:27:57
Saved logs: justin-logs:46970.153-dunegpschedd01.fnal.gov.logs.tgz

Jobscript log (last 10,000 characters)

red window for channel 10417
wclsFrameSaver: no samples within desired window for channel 10441
wclsFrameSaver: no samples within desired window for channel 10724
wclsFrameSaver: no samples within desired window for channel 10877
wclsFrameSaver: no samples within desired window for channel 10883
wclsFrameSaver: no samples within desired window for channel 10964
wclsFrameSaver: no samples within desired window for channel 10977
wclsFrameSaver: no samples within desired window for channel 11169
FrameSaver: q=7.8752e+06 n=1025721 tag=wiener
0 X, 0 U, 0 V bad channels
Finding XUV coincidences...
C:0 T:0 730 XUs and 803 XVs -> 24 XUVs
C:0 T:1 1143 XUs and 1412 XVs -> 26 XUVs
C:0 T:2 956 XUs and 1158 XVs -> 56 XUVs
C:0 T:3 527 XUs and 741 XVs -> 30 XUVs
C:0 T:4 1473 XUs and 1635 XVs -> 50 XUVs
C:0 T:5 639 XUs and 712 XVs -> 50 XUVs
C:0 T:6 924 XUs and 1256 XVs -> 54 XUVs
C:0 T:7 584 XUs and 742 XVs -> 23 XUVs
C:0 T:8 2018 XUs and 3598 XVs -> 79 XUVs
C:0 T:9 1101 XUs and 1114 XVs -> 48 XUVs
C:0 T:10 2649 XUs and 2488 XVs -> 187 XUVs
C:0 T:11 840 XUs and 894 XVs -> 46 XUVs
C:0 T:12 2855 XUs and 2775 XVs -> 135 XUVs
C:0 T:13 751 XUs and 853 XVs -> 54 XUVs
C:0 T:14 2818 XUs and 2628 XVs -> 140 XUVs
C:0 T:15 687 XUs and 540 XVs -> 44 XUVs
1046 XUVs total
921 collection wire objects
1046 potential space points
Neighbour search...
6874 tests to find 2836 neighbours
Iterating with no regularization...
Begin: 3.68514e+08
0 3.62172e+08
1 3.62072e+08
Now with regularization...
Begin: 3.57935e+08
0 3.57931e+08
BdtBeamParticleIdTool::SliceFeatures::GetLeadingCaloHits - empty calo hit list
HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 0:
  #000: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dio.c line 179 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLcallback.c line 2011 in H5VL_dataset_read(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #002: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLcallback.c line 1978 in H5VL__dataset_read(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #003: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VLnative_dataset.c line 166 in H5VL__native_dataset_read(): can't read data
    major: Dataset
    minor: Read failed
  #004: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dio.c line 545 in H5D__read(): can't read data
    major: Dataset
    minor: Read failed
  #005: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 600 in H5D__contig_read(): contiguous read failed
    major: Dataset
    minor: Read failed
  #006: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dselect.c line 465 in H5D__select_read(): read error
    major: Dataspace
    minor: Read failed
  #007: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dselect.c line 220 in H5D__select_io(): read error
    major: Dataspace
    minor: Read failed
  #008: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 924 in H5D__contig_readvv(): can't perform vectorized sieve buffer read
    major: Dataset
    minor: Can't operate on object
  #009: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5VM.c line 1401 in H5VM_opvv(): can't perform operation
    major: Internal error (too specific to document in detail)
    minor: Can't operate on object
  #010: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Dcontig.c line 729 in H5D__contig_readvv_sieve_cb(): block read failed
    major: Dataset
    minor: Read failed
  #011: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Fio.c line 105 in H5F_shared_block_read(): read through page buffer failed
    major: Low-level I/O
    minor: Read failed
  #012: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5PB.c line 721 in H5PB_read(): read through metadata accumulator failed
    major: Page Buffering
    minor: Read failed
  #013: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5Faccum.c line 252 in H5F__accum_read(): driver read request failed
    major: Low-level I/O
    minor: Read failed
  #014: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDint.c line 189 in H5FD_read(): driver read request failed
    major: Virtual File Layer
    minor: Read failed
  #015: /scratch/workspace/build-single/BUILDTYPE/prof/QUAL/e26/label1/swarm/label2/SLF7/build/hdf5/v1_12_2a/source/hdf5-1.12.2/src/H5FDsec2.c line 755 in H5FD__sec2_read(): file read failed: time = Tue Sep 16 15:27:10 2025
, filename = 'root://eospublic.cern.ch:1094//eos/experiment/neutplatform/protodune/dune/vd-protodune/21/72/np02vd_raw_run039275_0335_df-s04-d0_dw_0_20250902T022412.hdf5', file descriptor = 24, errno = 113, error message = 'No route to host', buf = 0x77f24dd0, total read size = 1130472, bytes this sub-read = 1130472, bytes actually read = 18446744073709551615, offset = 0
    major: Low-level I/O
    minor: Read failed

==================================================================================================================================
TimeTracker printout (sec)                          Min           Avg           Max         Median          RMS         nEvts   
==================================================================================================================================
Full event                                      0.000110725     390.507       806.546       375.117       189.524        11     
----------------------------------------------------------------------------------------------------------------------------------
source:HDF5RawInput3(read)                      8.1954e-05    0.000109231   0.000277404   9.2769e-05    5.38196e-05      11     
produce:tpcrawdecoder:PDVDTPCReader               71.8657       101.785       143.24        98.7471       18.5262        10     
produce:triggerrawdecoder:PDVDTriggerReader4     0.0341637     0.0414206     0.0590722     0.0384134    0.00719102       10     
produce:pdvddaphne:DAPHNEReaderPDVD               12.2173       15.6261       19.0195       15.9332       2.01812        10     
produce:ophit:OpHitFinder                        0.0689266     0.0797108     0.0874699     0.0821716    0.00626757       10     
produce:opflash:OpFlashFinderVerticalDrift       0.0152304     0.018883      0.022752      0.0187835    0.00228062       10     
produce:wclsdatavd:WireCellToolkit                82.2412       112.959       132.423       115.142       16.1078        10     
produce:gaushit:GausHitFinder                     1.38392       1.98964       2.86838       1.94501      0.400384        10     
produce:nhitsfilter:NumberOfHitsFilter          0.000641077   0.00135867    0.00217807    0.00131945    0.000460459      10     
produce:reco3d:SpacePointSolver                   16.8042       28.8482       45.0156       28.9275       8.43235        10     
produce:hitpdune:DisambigFromSpacePoints         0.211316       0.45775      0.725861      0.473203       0.17416        10     
produce:pandora:StandardPandora                   45.0416       154.721       488.496       120.053       123.664        10     
produce:pandoraTrack:LArPandoraTrackCreation      2.77629       5.82981       10.5909       4.58487       2.70403        10     
produce:pandoraGnocalo:GnocchiCalorimetry        0.0360685     0.0569226     0.0733233     0.0625497     0.011655        10     
[art]:TriggerResults:TriggerResultInserter      4.5538e-05    8.29887e-05   0.000233288   6.6238e-05    5.2939e-05       10     
end_path:out1:RootOutput                         8.082e-06    1.40029e-05    3.994e-05    1.0645e-05    9.10435e-06      10     
end_path:out1:RootOutput(write)                   6.68292       7.01766       7.43543       6.98276      0.249871        10     
==================================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 5103.42 MB
  Peak resident set size usage (VmHWM): 3217.31 MB
  Details saved in: 'mem.db'
====================================================================================================
%MSG-s ArtException:  PostEndJob 16-Sep-2025 15:27:11 CEST ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- StdException BEGIN
      An exception was thrown while processing module PDVDTPCReader/tpcrawdecoder run: 39275 subRun: 1 event: 120810
      Error during HDF5 Read:  
    ---- StdException END
    Exception going through path produce
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
%MSG
Art has completed and will exit with status 1.
Error in reco1
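
A note on reading the H5FD__sec2_read() line in the log above: the reported "bytes actually read = 18446744073709551615" equals 2^64 - 1, which is how a return value of -1 appears when printed as an unsigned 64-bit integer. Together with errno = 113 ('No route to host'), this suggests that the read of the remote input file failed outright and returned no data, rather than returning a short read. The Python sketch below illustrates that reinterpretation only; the helper name as_signed64 is hypothetical and is not part of HDF5, art, or justIN.

```python
def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed two's-complement value."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    # Values with the top bit set represent negative numbers in two's complement.
    return value - 2**64 if value >= 2**63 else value


# "bytes actually read" as printed in the HDF5 diagnostic above
reported = 18446744073709551615
print(as_signed64(reported))
```

Ordinary byte counts, such as the requested "total read size = 1130472", are left unchanged by the same conversion.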
justIN time: 2025-09-19 01:29:41 UTC       justIN version: 01.05.00