Jobsub ID 47001.0@dunegpschedd01.fnal.gov
Jobsub ID | 47001.0@dunegpschedd01.fnal.gov |
Workflow ID | 2326 |
Stage ID | 1 |
User name | ykermaid@fnal.gov |
HTCondor Group | group_dune.prod.mcsim |
Requested | Processors | 1 |
GPU | No |
RSS bytes | 4193255424 (3999 MiB) |
Wall seconds limit | 18000 (5 hours) |
Submitted time | 2025-09-16 14:37:01 |
Site | NL_NIKHEF |
Entry | VIRGO_NL_NIKHEF_klomp |
Last heartbeat | 2025-09-16 15:13:04 |
From worker node | Hostname | wn-pep-014.farm.nikhef.nl |
cpuinfo | Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz |
OS release | Scientific Linux release 7.9 (Nitrogen) |
Processors | 1 |
RSS bytes | 4194304000 (4000 MiB) |
Wall seconds limit | 129600 (36 hours) |
GPU | |
Inner Apptainer? | True |
Job state | jobscript_error |
Started | 2025-09-16 14:58:13 |
Input files | vd-protodune:np02vd_raw_run039324_2057_df-s02-d3_dw_0_20250907T114720.hdf5 |
Jobscript | Exit code | 1 |
Real time | 0m (0s) |
CPU time | 0m (0s = 0%) |
Max RSS bytes | 0 (0 MiB) |
Outputting started | |
Output files | |
Finished | 2025-09-16 15:13:04 |
Saved logs | justin-logs:47001.0-dunegpschedd01.fnal.gov.logs.tgz |
Jobscript log (last 10,000 characters)
: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
%MSG-e OpHitFinder: OpHitFinder:ophit@BeginModule 16-Sep-2025 17:11:03 CEST run: 39324 subRun: 1 event: 1077725
Error! unrecognized channel number -1. Ignoring pulse
%MSG
RawFrameSource: got 12288 raw::RawDigit objects
input nticks=6400 keeping as is
[17:11:03.619] D [ main ] executing 1 apps, thread limit 0:
[17:11:03.619] D [ main ] executing 1 apps, thread limit 0:
[17:11:03.619] D [ main ] executing app: "Pgrapher"
[17:11:03.619] D [ pgraph ] <Pgrapher:> executing graph
[17:11:03.619] D [ pgraph ] executing with 26 nodes
[17:11:03.621] D [ glue ] <FrameFanout:nfsp> call=8: input: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig" ] 0 tagged trace sets:[ ] cmm:[ ] output 0: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig0" ] 0 tagged trace sets:[ ] cmm:[ ] output 1: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig1" ] 0 tagged trace sets:[ ] cmm:[ ] output 2: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig2" ] 0 tagged trace sets:[ ] cmm:[ ] output 3: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig3" ] 0 tagged trace sets:[ ] cmm:[ ] output 4: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig4" ] 0 tagged trace sets:[ ] cmm:[ ] output 5: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig5" ] 0 tagged trace sets:[ ] cmm:[ ] output 6: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig6" ] 0 tagged trace sets:[ ] cmm:[ ] output 7: frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig7" ] 0 tagged trace sets:[ ] cmm:[ ]
[17:11:03.622] W [ glue ] <ChannelSelector:chsel7> Untagged summary not supported, summary will be dropped.
[17:11:03.623] D [ glue ] <ChannelSelector:chsel7> input frame: ident=1077725 time=12 tick=512 with 12288 traces. frame tags:[ "orig7" ] 0 tagged trace sets:[ ] cmm:[ ] output: frame: ident=1077725 time=12 tick=512 with 1536 traces. frame tags:[ "orig7" ] 0 tagged trace sets:[ ] cmm:[ ]
[17:11:03.623] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=8 input frame: frame: ident=1077725 time=12 tick=512 with 1536 traces. frame tags:[ "orig7" ] 0 tagged trace sets:[ ] cmm:[ ]
[17:11:03.623] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=8 init nticks=6400 tbinmin=0 tbinmax=6400
[17:11:03.658] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=8 load plane index: 0, ntraces=1536, input bad regions: 0
[17:11:04.915] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=8 load plane index: 1, ntraces=1536, input bad regions: 0
[17:11:06.264] D [sigproc ] <OmnibusSigProc:anode7sigproc7> call=8 load plane index: 2, ntraces=1536, input bad regions: 0
====================================================================================================================
TimeTracker printout (sec)                    Min           Avg           Max           Median        RMS           nEvts
====================================================================================================================
Full event                                    6.65e-05      138.568       218.553       146.814       75.346        5
--------------------------------------------------------------------------------------------------------------------
source:HDF5RawInput3(read)                    5.8638e-05    8.17146e-05   0.000155841   6.6302e-05    3.71843e-05   5
produce:tpcrawdecoder:PDVDTPCReader           33.4621       41.3565       51.3055       37.8719       7.3826        5
produce:triggerrawdecoder:PDVDTriggerReader4  0.0341222     0.0343467     0.0348759     0.0342713     0.00027045    5
produce:pdvddaphne:DAPHNEReaderPDVD           5.59562       8.45117       10.6865       8.36767       1.7982        5
produce:ophit:OpHitFinder                     0.0377089     0.0407291     0.0442785     0.040804      0.00231354    5
produce:opflash:OpFlashFinderVerticalDrift    0.00712139    0.0104172     0.0155579     0.00939014    0.00329813    5
produce:wclsdatavd:WireCellToolkit            46.528        55.377        77.7637       48.6081       12.9539       4
produce:gaushit:GausHitFinder                 1.2133        1.45512       1.63061       1.48828       0.158599      4
produce:nhitsfilter:NumberOfHitsFilter        0.00036082    0.000487806   0.000708549   0.000440927   0.000136229   4
produce:reco3d:SpacePointSolver               8.694         11.3586       14.1696       11.2855       2.55084       4
produce:hitpdune:DisambigFromSpacePoints      0.137644      0.203172      0.2634        0.205822      0.0585812     4
produce:pandora:StandardPandora               35.024        48.8804       64.7466       47.8756       12.6725       4
produce:pandoraTrack:LArPandoraTrackCreation  0.696369      1.07736       1.72081       0.946136      0.394961      4
produce:pandoraGnocalo:GnocchiCalorimetry     0.0209555     0.0271623     0.0355751     0.0260593     0.00589104    4
[art]:TriggerResults:TriggerResultInserter    2.8728e-05    4.46227e-05   9.0449e-05    2.9657e-05    2.64634e-05   4
end_path:out1:RootOutput                      6.392e-06     1.22902e-05   2.8441e-05    7.164e-06     9.33e-06      4
end_path:out1:RootOutput(write)               3.47134       4.02057       5.38859       3.61118       0.79728       4
====================================================================================================================
====================================================================================================
MemoryTracker summary (base-10 MB units used)
Peak virtual memory usage (VmPeak) : 8589.81 MB
Peak resident set size usage (VmHWM): 6706.38 MB
Details saved in: 'mem.db'
====================================================================================================
%MSG-s ArtException: PostEndJob 16-Sep-2025 17:12:48 CEST ModuleEndJob
---- EventProcessorFailure BEGIN
EventProcessor: an exception occurred during current event processing
---- ScheduleExecutionFailure BEGIN
Path: ProcessingStopped.
---- BadAlloc BEGIN
A bad_alloc exception was thrown while processing module WireCellToolkit/wclsdatavd run: 39324 subRun: 1 event: 1077725
The job has probably exhausted the virtual memory available to the process.
---- BadAlloc END
Exception going through path produce
---- ScheduleExecutionFailure END
---- EventProcessorFailure END
%MSG
Art has completed and will exit with status 1.
Error in reco1
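
The memory figures above are reported in different units: the requested and worker-node RSS values are given in bytes and MiB, while the MemoryTracker summary uses base-10 MB. The following is a minimal sketch that converts them to a common unit (MiB) so they can be compared side by side. The function names and variable names are hypothetical and are not part of justIN or art. The sketch only restates numbers printed in this page; it does not establish which limit, if any, caused the bad_alloc.

```python
# Hypothetical helpers: put the memory figures from this job page into MiB.

MIB = 2 ** 20  # bytes per MiB


def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB (base-2)."""
    return n_bytes / MIB


def mb_to_mib(megabytes: float) -> float:
    """Convert base-10 MB (as used by MemoryTracker) to MiB."""
    return megabytes * 1_000_000 / MIB


# Values copied from the page above.
requested_rss_mib = bytes_to_mib(4193255424)  # "Requested ... RSS bytes"
worker_rss_mib = bytes_to_mib(4194304000)     # "From worker node ... RSS bytes"
peak_rss_mib = mb_to_mib(6706.38)             # MemoryTracker VmHWM
peak_virtual_mib = mb_to_mib(8589.81)         # MemoryTracker VmPeak

print(f"requested RSS : {requested_rss_mib:8.1f} MiB")
print(f"worker RSS    : {worker_rss_mib:8.1f} MiB")
print(f"peak RSS      : {peak_rss_mib:8.1f} MiB")
print(f"peak virtual  : {peak_virtual_mib:8.1f} MiB")
print(f"peak RSS exceeds request: {peak_rss_mib > requested_rss_mib}")
```

In these units the peak resident set size reported by MemoryTracker is larger than the RSS requested for the job, which is consistent with the BadAlloc message's remark that the process probably ran out of memory.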