Jobsub ID 40859.1@dunegpschedd02.fnal.gov
Jobsub ID | 40859.1@dunegpschedd02.fnal.gov |
Workflow ID | 2537 |
Stage ID | 1 |
User name | ykermaid@fnal.gov |
HTCondor Group | group_dune.prod_mcsim |
Requested | Processors | 1 |
Requested | GPU | No |
Requested | RSS bytes | 4193255424 (3999 MiB) |
Requested | Wall seconds limit | 18000 (5 hours) |
Submitted time | 2025-09-17 07:10:54 |
Site | NL_NIKHEF |
Entry | VIRGO_NL_NIKHEF_klomp |
Last heartbeat | 2025-09-17 09:05:38 |
From worker node | Hostname | wn-snel-028.farm.nikhef.nl |
From worker node | cpuinfo | AMD EPYC 7H12 64-Core Processor |
From worker node | OS release | Scientific Linux release 7.9 (Nitrogen) |
From worker node | Processors | 1 |
From worker node | RSS bytes | 4194304000 (4000 MiB) |
From worker node | Wall seconds limit | 129600 (36 hours) |
From worker node | GPU | |
From worker node | Inner Apptainer? | True |
Job state | jobscript_error |
Started | 2025-09-17 07:12:01 |
Input files | vd-protodune:np02vd_raw_run039273_1446_df-s05-d2_dw_0_20250901T055006.hdf5 |
Jobscript | Exit code | 1 |
Jobscript | Real time | 0m (0s) |
Jobscript | CPU time | 0m (0s = 0%) |
Jobscript | Max RSS bytes | 0 (0 MiB) |
Outputting started | |
Output files | |
Finished | 2025-09-17 09:05:38 |
Saved logs | justin-logs:40859.1-dunegpschedd02.fnal.gov.logs.tgz |
Jobscript log (last 10,000 characters)
CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 17-Sep-2025 10:56:30 CEST run: 39273 subRun: 1 event: 520896
Error! unrecognized channel number -1. Ignoring pulse
%MSG
RawFrameSource: got 12288 raw::RawDigit objects
input nticks=10000 keeping as is
[10:56:30.580] D [ main ] executing 1 apps, thread limit 0:
[10:56:30.580] D [ main ] executing 1 apps, thread limit 0:
[10:56:30.580] D [ main ] executing app: "Pgrapher"
[10:56:30.580] D [ pgraph ] <Pgrapher:> executing graph
[10:56:30.580] D [ pgraph ] executing with 26 nodes
[10:56:30.581] D [ glue ] <FrameFanout:nfsp> call=32: input: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig" ] 0 tagged trace sets:[ ] cmm:[ ] output 0: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig0" ] 0 tagged trace sets:[ ] cmm:[ ] output 1: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig1" ] 0 tagged trace sets:[ ] cmm:[ ] output 2: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig2" ] 0 tagged trace sets:[ ] cmm:[ ] output 3: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig3" ] 0 tagged trace sets:[ ] cmm:[ ] output 4: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig4" ] 0 tagged trace sets:[ ] cmm:[ ] output 5: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig5" ] 0 tagged trace sets:[ ] cmm:[ ] output 6: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig6" ] 0 tagged trace sets:[ ] cmm:[ ] output 7: frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig7" ] 0 tagged trace sets:[ ] cmm:[ ]
[10:56:30.581] W [ glue ] <ChannelSelector:chsel7> Untagged summary not supported, summary will be dropped.
[10:56:30.582] D [ glue ] <ChannelSelector:chsel7> input frame: ident=520896 time=41 tick=512 with 12288 traces. frame tags:[ "orig7" ] 0 tagged trace sets:[ ] cmm:[ ] output: frame: ident=520896 time=41 tick=512 with 1536 traces. frame tags:[ "orig7" ] 0 tagged trace sets:[ ] cmm:[ ]
[10:56:30.582] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=32 input frame: frame: ident=520896 time=41 tick=512 with 1536 traces. frame tags:[ "orig7" ] 0 tagged trace sets:[ ] cmm:[ ]
[10:56:30.582] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=32 init nticks=10000 tbinmin=0 tbinmax=10000
[10:56:30.619] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=32 load plane index: 0, ntraces=1536, input bad regions: 0
[10:56:32.291] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=32 load plane index: 1, ntraces=1536, input bad regions: 0
[10:56:34.001] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=32 load plane index: 2, ntraces=1536, input bad regions: 0
==================================================================================================================================
TimeTracker printout (sec) Min Avg Max Median RMS nEvts
==================================================================================================================================
Full event 7.2436e-05 356.037 476.277 348.869 102.691 17
----------------------------------------------------------------------------------------------------------------------------------
source:HDF5RawInput3(read) 6.4231e-05 8.35692e-05 0.000203222 7.2856e-05 3.18891e-05 17
produce:tpcrawdecoder:PDVDTPCReader 135.832 168.52 191.55 168.263 14.2404 17
produce:triggerrawdecoder:PDVDTriggerReader4 0.0342508 0.0443789 0.131771 0.034622 0.0238168 17
produce:pdvddaphne:DAPHNEReaderPDVD 14.0191 17.6697 19.8256 18.0281 1.51105 17
produce:ophit:OpHitFinder 0.0596254 0.0697618 0.0822998 0.0690244 0.00724802 17
produce:opflash:OpFlashFinderVerticalDrift 0.0096931 0.0216657 0.0294973 0.0225762 0.00495854 17
produce:wclsdatavd:WireCellToolkit 61.6916 70.245 78.7575 69.7234 5.77097 16
produce:gaushit:GausHitFinder 1.2381 1.53471 1.96345 1.52627 0.196435 16
produce:nhitsfilter:NumberOfHitsFilter 0.000346861 0.000532879 0.00117251 0.000440231 0.000242417 16
produce:reco3d:SpacePointSolver 15.1781 21.2912 29.0636 21.0081 4.19888 16
produce:hitpdune:DisambigFromSpacePoints 0.211112 0.333935 0.521582 0.331713 0.0927072 16
produce:pandora:StandardPandora 35.4761 91.5882 167.172 76.3556 36.7288 16
produce:pandoraTrack:LArPandoraTrackCreation 1.08623 1.53822 2.69526 1.41617 0.383765 16
produce:pandoraGnocalo:GnocchiCalorimetry 0.0304423 0.037089 0.0485405 0.035459 0.00515275 16
[art]:TriggerResults:TriggerResultInserter 1.4797e-05 1.88861e-05 4.6117e-05 1.67915e-05 7.21068e-06 16
end_path:out1:RootOutput 3.196e-06 4.71763e-06 1.7863e-05 3.827e-06 3.42745e-06 16
end_path:out1:RootOutput(write) 4.71051 4.92962 5.23067 4.9227 0.143591 16
==================================================================================================================================
====================================================================================================
MemoryTracker summary (base-10 MB units used)
Peak virtual memory usage (VmPeak) : 8589.54 MB
Peak resident set size usage (VmHWM): 6635.56 MB
Details saved in: 'mem.db'
====================================================================================================
%MSG-s ArtException: PostEndJob 17-Sep-2025 10:57:12 CEST ModuleEndJob
---- EventProcessorFailure BEGIN
EventProcessor: an exception occurred during current event processing
---- ScheduleExecutionFailure BEGIN
Path: ProcessingStopped.
---- BadAlloc BEGIN
A bad_alloc exception was thrown while processing module WireCellToolkit/wclsdatavd run: 39273 subRun: 1 event: 520896
The job has probably exhausted the virtual memory available to the process.
---- BadAlloc END
Exception going through path produce
---- ScheduleExecutionFailure END
---- EventProcessorFailure END
%MSG
Art has completed and will exit with status 1.
Error in reco1
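
Note: as a rough cross-check of the BadAlloc failure above, the following Python sketch converts the MemoryTracker peaks (reported in base-10 MB) to MiB and compares them with the RSS values listed on this page. All numbers are copied from the page; the variable and function names are illustrative only, and the comparison does not by itself establish which limit the job actually hit.

```python
# Values copied from the job summary and the MemoryTracker block on this page.
REQUESTED_RSS_BYTES = 4193255424   # "Requested" RSS bytes (3999 MiB)
WORKER_RSS_BYTES = 4194304000      # "From worker node" RSS bytes (4000 MiB)
PEAK_VMHWM_MB = 6635.56            # peak resident set size, base-10 MB
PEAK_VMPEAK_MB = 8589.54           # peak virtual memory, base-10 MB

MIB = 1024 * 1024                  # bytes per MiB


def mb_to_mib(megabytes):
    """Convert base-10 megabytes (10**6 bytes) to MiB (2**20 bytes)."""
    return megabytes * 1_000_000 / MIB


requested_mib = REQUESTED_RSS_BYTES / MIB
worker_mib = WORKER_RSS_BYTES / MIB
peak_rss_mib = mb_to_mib(PEAK_VMHWM_MB)
peak_virtual_mib = mb_to_mib(PEAK_VMPEAK_MB)

# Illustrative check only: did the peak resident size exceed the request?
exceeds_request = peak_rss_mib > requested_mib

print(f"requested RSS : {requested_mib:8.1f} MiB")
print(f"worker RSS    : {worker_mib:8.1f} MiB")
print(f"peak VmHWM    : {peak_rss_mib:8.1f} MiB")
print(f"peak VmPeak   : {peak_virtual_mib:8.1f} MiB")
print(f"peak RSS / requested = {peak_rss_mib / requested_mib:.2f}")
print(f"exceeds request: {exceeds_request}")
```

With these inputs the peak resident set size works out to roughly 6328 MiB, about 1.6 times the 3999 MiB requested, and VmPeak sits just below 8192 MiB. This is consistent with the log's own statement that the job probably exhausted the virtual memory available to the process while running WireCellToolkit/wclsdatavd.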