
Jobsub ID 14184.3@dunegpschedd02.fnal.gov

Jobsub ID: 14184.3@dunegpschedd02.fnal.gov
Workflow ID: 270
Stage ID: 1
User name: ichong@fnal.gov
HTCondor Group: group_dune
Requested:
  Processors: 1
  GPU: No
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 80000 (22 hours)
Submitted time: 2025-08-03 15:48:42
Site: NL_NIKHEF
Entry: VIRGO_NL_NIKHEF_brug
Last heartbeat: 2025-08-03 15:52:05
From worker node:
  Hostname: wn-pep-010.farm.nikhef.nl
  cpuinfo: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 129600 (36 hours)
  GPU:
  Inner Apptainer?: True
Job state: jobscript_error
Started: 2025-08-03 15:50:03
Input files: fardet-hd:atmnu_max_weighted_randompolicy_dune10kt_1x2x6_6404651_325_20231202T093812Z_gen_g4_detsim_hitreco__20240507T214343Z_reco2.root
Jobscript:
  Exit code: 1
  Real time: 0m (0s)
  CPU time: 0m (0s = 0%)
  Max RSS bytes: 0 (0 MiB)
Outputting started:
Output files:
Finished: 2025-08-03 15:52:05
Saved logs: justin-logs:14184.3-dunegpschedd02.fnal.gov.logs.tgz

Jobscript log (last 10,000 characters)

o track found for track-like PFParticle with ID 4
Begin processing the 90th record. run: 6404651 subRun: 1 event: 32590 at 03-Aug-2025 17:51:47 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 91st record. run: 6404651 subRun: 1 event: 32591 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 92nd record. run: 6404651 subRun: 1 event: 32592 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 93rd record. run: 6404651 subRun: 1 event: 32593 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 94th record. run: 6404651 subRun: 1 event: 32594 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 95th record. run: 6404651 subRun: 1 event: 32595 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 3
Begin processing the 96th record. run: 6404651 subRun: 1 event: 32596 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 3
Begin processing the 97th record. run: 6404651 subRun: 1 event: 32597 at 03-Aug-2025 17:51:48 CEST
Analysing.

Begin processing the 98th record. run: 6404651 subRun: 1 event: 32598 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 99th record. run: 6404651 subRun: 1 event: 32599 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 100th record. run: 6404651 subRun: 1 event: 32600 at 03-Aug-2025 17:51:48 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 5
03-Aug-2025 17:51:48 CEST  Closed input file "root://dune.dcache.nikhef.nl:1094/pnfs/nikhef.nl/data/dune/generic/rucio/fardet-hd/6d/41/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_6404651_325_20231202T093812Z_gen_g4_detsim_hitreco__20240507T214343Z_reco2.root"

========================================================================================================================
TimeTracker printout (sec)                Min           Avg           Max         Median          RMS         nEvts   
========================================================================================================================
Full event                            0.00844359     0.0596289     0.800564      0.0217495     0.122156        100    
------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                0.000429054   0.000710065    0.0013651    0.000664847   0.000194269      100    
end_path:analysistree:AnalysisTree    0.00765732     0.0588158     0.799624      0.0210468     0.122158        100    
========================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 7425.54 MB
  Peak resident set size usage (VmHWM): 1073.36 MB
====================================================================================================
Art has completed and will exit with status 0.
=== End last 100 lines of third lar log file ===
=== Start last 100 lines of lar log file ===
C:0 T:16 31 XUs and 33 XVs -> 23 XUVs
C:0 T:20 1098 XUs and 1432 XVs -> 440 XUVs
C:0 T:21 3 XUs and 3 XVs -> 3 XUVs
466 XUVs total
214 collection wire objects
466 potential space points
Neighbour search...
35978 tests to find 20818 neighbours
Iterating with no regularization...
Begin: 2.1313e+07
0 1.63601e+07
1 1.59616e+07
2 1.58717e+07
3 1.58517e+07
4 1.58423e+07
Now with regularization...
Begin: 1.04966e+07
0 1.04771e+07
1 1.04689e+07
---MC-PARTICLE-MONITORING-----------------------------------------------------------------------

BeamNeutrinos: 

--Primary 0, MCPDG 211, Energy 0.185311, Dist. 7.97116, nMCHits 94 (38, 26, 30)
MCPDG 211, Energy 0.185311, Dist. 7.97116, nMCHits 24 (10, 4, 10)
   \_ MCPDG -11, Energy 0.0472976, Dist. 12.2249, nMCHits 70 (28, 22, 20)

--Primary 1, MCPDG 2212, Energy 1.24161, Dist. 1.3277, nMCHits 59 (26, 21, 12)
MCPDG 2212, Energy 1.24161, Dist. 1.3277, nMCHits 4 (3, 1, 0)
\_ MCPDG 2212, Energy 1.09662, Dist. 17.0642, nMCHits 35 (17, 11, 7)
\_ MCPDG 2212, Energy 1.03697, Dist. 6.94406, nMCHits 20 (6, 9, 5)

--Primary 2, MCPDG -211, Energy 0.275591, Dist. 12.7081, nMCHits 50 (25, 6, 19)
MCPDG -211, Energy 0.275591, Dist. 12.7081, nMCHits 46 (24, 4, 18)
\_ MCPDG 1000170380, Energy 35.359, Dist. 0.00015561, nMCHits 4 (1, 2, 1)
------------------------------------------------------------------------------------------------
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_U_v04_03_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_V_v04_03_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_W_v04_03_00.pt'
Operating in training mode.
The eid is 0
Graph saved to training1_CaloHitListW_graph.data
Size of file training1_CaloHitListW_graph.data is 6652 bytes.
The eid is 0
Graph saved to training1_CaloHitListU_graph.data
Size of file training1_CaloHitListU_graph.data is 6796 bytes.
The eid is 0
Graph saved to training1_CaloHitListV_graph.data
Size of file training1_CaloHitListV_graph.data is 6572 bytes.
Operating in inference mode.
Operating in training mode.
The eid is -1
Graph saved to training2_CaloHitListW_graph.data
Size of file training2_CaloHitListW_graph.data is 6652 bytes.
The eid is -1
Graph saved to training2_CaloHitListU_graph.data
Size of file training2_CaloHitListU_graph.data is 6796 bytes.
The eid is -1
Graph saved to training2_CaloHitListV_graph.data
Size of file training2_CaloHitListV_graph.data is 6572 bytes.
Boundary wire vector sizes: 270, 280, 275
minwire 0: 2135
minwire 1: 113
minwire 2: 2252
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
Used alternate method to get min and max wires due to vertex determination failure: 2188, 2687
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
03-Aug-2025 17:50:48 CEST  Opened output file with pattern "atmnu_max_weighted_randompolicy_dune10kt_1x2x6_6404651_325_20231202T093812Z_gen_g4_detsim_hitreco__20240507T214343Z_reco2_graph_2025-08-03T_155007Z.root"
03-Aug-2025 17:50:55 CEST  Closed input file "root://dune.dcache.nikhef.nl:1094/pnfs/nikhef.nl/data/dune/generic/rucio/fardet-hd/6d/41/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_6404651_325_20231202T093812Z_gen_g4_detsim_hitreco__20240507T214343Z_reco2.root"
Malformed TimeTracker database.  The TimeEvent table is empty, but
the TimeModule table is not.  This can happen if an exception has
been thrown from a module while processing the first event.  Any
saved database file is suspect and should not be used.

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 8581.64 MB
  Peak resident set size usage (VmHWM): 1680.08 MB
====================================================================================================
%MSG-s ArtException:  PostEndJob 03-Aug-2025 17:50:57 CEST ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- BadAlloc BEGIN
      A bad_alloc exception was thrown while processing module CVNEvaluator/cvneva run: 6404651 subRun: 1 event: 32501
      The job has probably exhausted the virtual memory available to the process.
    ---- BadAlloc END
    Exception going through path reco
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
---- FatalRootError BEGIN
  Fatal Root Error: TTree::SetEntries
  Tree branches have different numbers of entries, eg EventAuxiliary has 0 entries while recob::PCAxisrecob::Showervoidart::Assns_pandoraShower__Reco2. has 100 entries.
  ROOT severity: 2000
---- FatalRootError END
%MSG
Art has completed and will exit with status 1.
=== End last 100 lines of lar log file ===
=== Generated output files ===
14184.3_dunegpschedd02.fnal.gov.logs.tgz
RootOutput-6511-e29e-d88d-7dbd.root
ana_tree_hd.root
analysiseid.root
atmnu_max_weighted_randompolicy_dune10kt_1x2x6_6404651_325_20231202T093812Z_gen_g4_detsim_hitreco__20240507T214343Z_reco2_graph_2025-08-03T_155007Z.log
debugprod.log
jobscript.log
reco_hist.root
secondary_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_6404651_325_20231202T093812Z_gen_g4_detsim_hitreco__20240507T214343Z_reco2_graph_2025-08-03T_155007Z.log
third_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_6404651_325_20231202T093812Z_gen_g4_detsim_hitreco__20240507T214343Z_reco2_graph_2025-08-03T_155007Z.log
training1_CaloHitListU.csv
training1_CaloHitListU_graph.data
training1_CaloHitListV.csv
training1_CaloHitListV_graph.data
training1_CaloHitListW.csv
training1_CaloHitListW_graph.data
training2_CaloHitListU.csv
training2_CaloHitListU_graph.data
training2_CaloHitListV.csv
training2_CaloHitListV_graph.data
training2_CaloHitListW.csv
training2_CaloHitListW_graph.data
justIN time: 2025-08-04 16:34:35 UTC       justIN version: 01.04.00
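
The first lar log above reports a bad_alloc in module CVNEvaluator/cvneva, and its MemoryTracker summary can be compared with the requested RSS limit from the job table. The following is a minimal sketch of how those values might be pulled out of a saved log. The function name `summarize` and the `SAMPLE` string are hypothetical; the regular expressions assume the log lines look exactly like the ones quoted on this page. MemoryTracker reports base-10 MB, so the byte limit is divided by 1e6 rather than converted to MiB.

```python
import re

# Patterns written against the log lines shown on this page (an assumption:
# other art versions may format these lines differently).
VM_PEAK = re.compile(r"Peak virtual memory usage \(VmPeak\)\s*:\s*([\d.]+) MB")
VM_HWM = re.compile(r"Peak resident set size usage \(VmHWM\)\s*:\s*([\d.]+) MB")
BAD_ALLOC = re.compile(
    r"bad_alloc exception was thrown while processing module (\S+)"
)


def summarize(log_text, rss_limit_bytes):
    """Collect peak memory figures and bad_alloc modules from a lar log."""
    vm_peaks = [float(v) for v in VM_PEAK.findall(log_text)]
    vm_hwms = [float(v) for v in VM_HWM.findall(log_text)]
    limit_mb = rss_limit_bytes / 1e6  # base-10 MB, matching MemoryTracker
    max_hwm = max(vm_hwms) if vm_hwms else None
    return {
        "max_vm_peak_mb": max(vm_peaks) if vm_peaks else None,
        "max_vm_hwm_mb": max_hwm,
        "rss_limit_mb": limit_mb,
        "rss_over_limit": max_hwm is not None and max_hwm > limit_mb,
        "bad_alloc_modules": BAD_ALLOC.findall(log_text),
    }


# Hypothetical sample built from lines quoted in the lar log on this page.
SAMPLE = """\
  Peak virtual memory usage (VmPeak)  : 8581.64 MB
  Peak resident set size usage (VmHWM): 1680.08 MB
      A bad_alloc exception was thrown while processing module CVNEvaluator/cvneva run: 6404651 subRun: 1 event: 32501
"""

summary = summarize(SAMPLE, 4194304000)
for key, value in summary.items():
    print(f"{key}: {value}")
```

In this sample the peak resident set size is below the requested RSS limit while the failure is still a bad_alloc, which matches the log's own remark that the process probably exhausted its available virtual memory.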