Jobsub ID 20671.4@dunegpschedd01.fnal.gov

Jobsub ID: 20671.4@dunegpschedd01.fnal.gov
Workflow ID: 270
Stage ID: 1
User name: ichong@fnal.gov
HTCondor Group: group_dune
Requested
  Processors: 1
  GPU: No
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 80000 (22 hours)
Submitted time: 2025-08-03 17:10:46
Site: NL_NIKHEF
Entry: VIRGO_NL_NIKHEF_juk
Last heartbeat: 2025-08-03 17:21:16
From worker node
  Hostname: wn-pep-013.farm.nikhef.nl
  cpuinfo: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 129600 (36 hours)
  GPU:
  Inner Apptainer?: True
Job state: jobscript_error
Started: 2025-08-03 17:11:15
Input files: fardet-hd:atmnu_max_weighted_randompolicy_dune10kt_1x2x6_66050399_304_20231207T200529Z_gen_g4_detsim_hitreco__20240510T042543Z_reco2.root
Jobscript
  Exit code: 1
  Real time: 0m (0s)
  CPU time: 0m (0s = 0%)
  Max RSS bytes: 0 (0 MiB)
Outputting started:
Output files:
Finished: 2025-08-03 17:21:16
Saved logs: justin-logs:20671.4-dunegpschedd01.fnal.gov.logs.tgz
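
The unit conversions shown in parentheses in the table, and the interval between the Started and Finished timestamps, can be checked with a few lines of Python. This is only an illustrative sketch: the constant and function names are made up here, the values are copied from the table, and both timestamps are assumed to be in the same time zone.

```python
from datetime import datetime

# Values copied from the job table; the names are illustrative only.
RSS_BYTES = 4194304000
WALL_LIMIT_REQUESTED_S = 80000
WALL_LIMIT_WORKER_S = 129600
STARTED = "2025-08-03 17:11:15"
FINISHED = "2025-08-03 17:21:16"

MIB = 1024 * 1024  # one mebibyte in bytes


def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB (binary units, as the table labels them)."""
    return n_bytes / MIB


def seconds_to_hours(seconds: int) -> float:
    """Convert seconds to hours without rounding."""
    return seconds / 3600


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two 'YYYY-MM-DD HH:MM:SS' timestamps (same zone assumed)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


if __name__ == "__main__":
    print(bytes_to_mib(RSS_BYTES))                   # 4000.0
    print(seconds_to_hours(WALL_LIMIT_REQUESTED_S))  # about 22.2
    print(seconds_to_hours(WALL_LIMIT_WORKER_S))     # 36.0
    print(elapsed_seconds(STARTED, FINISHED))        # 601.0
```

The job therefore ran for roughly ten minutes between Started and Finished, although the Jobscript rows report a real time of 0s; the page itself does not explain that difference.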

Jobscript log (last 10,000 characters)

cessing the 95th record. run: 66050399 subRun: 1 event: 30495 at 03-Aug-2025 19:20:53 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 6
Begin processing the 96th record. run: 66050399 subRun: 1 event: 30496 at 03-Aug-2025 19:20:54 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 97th record. run: 66050399 subRun: 1 event: 30497 at 03-Aug-2025 19:20:55 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 98th record. run: 66050399 subRun: 1 event: 30498 at 03-Aug-2025 19:20:56 CEST
Analysing.

Warning: there was no vertex found for PFParticle with ID 16
Warning: there was no track found for track-like PFParticle with ID 16
Warning: there were 19 reconstructed PFParticle daughters; only the first 10 being stored in tree
Warning: there was no track found for track-like PFParticle with ID 19
%MSG-e AnalysisTree:limits:  AnalysisTree:analysistree@BeginModule  03-Aug-2025 19:20:58 CEST run: 66050399 subRun: 1 event: 30498
the pandoraTrack track #0 has 2043 hits on calorimetry plane #0, only 2000 stored in tree
%MSG
Begin processing the 99th record. run: 66050399 subRun: 1 event: 30499 at 03-Aug-2025 19:20:58 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 3
Begin processing the 100th record. run: 66050399 subRun: 1 event: 30500 at 03-Aug-2025 19:20:59 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
03-Aug-2025 19:21:00 CEST  Closed input file "root://se1.farm.particle.cz:1094//dune/RSE/fardet-hd/fe/0a/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_66050399_304_20231207T200529Z_gen_g4_detsim_hitreco__20240510T042543Z_reco2.root"

========================================================================================================================
TimeTracker printout (sec)                Min           Avg           Max         Median          RMS         nEvts   
========================================================================================================================
Full event                             0.443938      0.816344       1.40272      0.818767      0.159466        100    
------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                 0.0143685     0.0216376     0.0291581     0.0157915    0.00709734       100    
end_path:analysistree:AnalysisTree     0.429302      0.794581       1.37392      0.795121      0.159693        100    
========================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 7411.84 MB
  Peak resident set size usage (VmHWM): 1060 MB
====================================================================================================
Art has completed and will exit with status 0.
=== End last 100 lines of third lar log file ===
=== Start last 100 lines of lar log file ===
1 3.65194e+07
2 3.64778e+07
3 3.64521e+07
---MC-PARTICLE-MONITORING-----------------------------------------------------------------------

BeamNeutrinos: 

--Primary 0, MCPDG 211, Energy 1.56551, Dist. 89.1635, nMCHits 1069 (378, 362, 329)
MCPDG 211, Energy 1.56551, Dist. 89.1635, nMCHits 333 (111, 112, 110)
\_ MCPDG 2212, Energy 1.21909, Dist. 45.869, nMCHits 114 (58, 30, 26)
\_ MCPDG 2212, Energy 1.16212, Dist. 31.0991, nMCHits 64 (37, 19, 8)
   \_ MCPDG 22, Energy 0.444591, Dist. 1.17902, nMCHits 558 (172, 201, 185)

--Primary 1, MCPDG 211, Energy 0.847186, Dist. 114.398, nMCHits 566 (169, 185, 212)
MCPDG 211, Energy 0.847186, Dist. 114.398, nMCHits 545 (162, 177, 206)
\_ MCPDG 1000020030, Energy 2.91687, Dist. 0.984849, nMCHits 2 (1, 1, 0)
\_ MCPDG 2212, Energy 0.956723, Dist. 0.39345, nMCHits 5 (2, 1, 2)
\_ MCPDG -211, Energy 0.178446, Dist. 2.32982, nMCHits 6 (2, 2, 2)
   \_ MCPDG 2212, Energy 0.988331, Dist. 2.35682, nMCHits 8 (2, 4, 2)

--Primary 2, MCPDG -211, Energy 0.598076, Dist. 31.5793, nMCHits 266 (113, 71, 82)
MCPDG -211, Energy 0.598076, Dist. 31.5793, nMCHits 58 (18, 21, 19)
\_ MCPDG 2212, Energy 1.01201, Dist. 4.56662, nMCHits 17 (7, 3, 7)
   \_ MCPDG 22, Energy 0.160384, Dist. 1.17833, nMCHits 191 (88, 47, 56)

--Primary 3, MCPDG 13, Energy 6.4265, Dist. 54.0753, nMCHits 108 (15, 45, 48)
MCPDG 13, Energy 6.4265, Dist. 54.0753, nMCHits 108 (15, 45, 48)

--Primary 4, MCPDG 211, Energy 0.227133, Dist. 21.33, nMCHits 79 (31, 25, 23)
MCPDG 211, Energy 0.227133, Dist. 21.33, nMCHits 39 (16, 15, 8)
\_ MCPDG -13, Energy 0.109778, Dist. 0.153147, nMCHits 1 (0, 1, 0)
   \_ MCPDG -11, Energy 0.0493128, Dist. 2.68873, nMCHits 39 (15, 9, 15)

--Primary 5, MCPDG 2212, Energy 1.07959, Dist. 14.1143, nMCHits 44 (11, 19, 14)
MCPDG 2212, Energy 1.07959, Dist. 14.1143, nMCHits 44 (11, 19, 14)
------------------------------------------------------------------------------------------------
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_U_v04_03_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_V_v04_03_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_W_v04_03_00.pt'
Operating in training mode.
The eid is 0
Graph saved to training1_CaloHitListW_graph.data
Size of file training1_CaloHitListW_graph.data is 25404 bytes.
The eid is 0
Graph saved to training1_CaloHitListU_graph.data
Size of file training1_CaloHitListU_graph.data is 25068 bytes.
The eid is 0
Graph saved to training1_CaloHitListV_graph.data
Size of file training1_CaloHitListV_graph.data is 26188 bytes.
Operating in inference mode.
Operating in training mode.
The eid is -1
Graph saved to training2_CaloHitListW_graph.data
Size of file training2_CaloHitListW_graph.data is 25404 bytes.
The eid is -1
Graph saved to training2_CaloHitListU_graph.data
Size of file training2_CaloHitListU_graph.data is 25068 bytes.
The eid is -1
Graph saved to training2_CaloHitListV_graph.data
Size of file training2_CaloHitListV_graph.data is 26188 bytes.
Boundary wire vector sizes: 1034, 1078, 1048
minwire 0: 522
minwire 1: 1314
minwire 2: 43
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
Used alternate method to get min and max wires due to vertex determination failure: 2379, 2878
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
03-Aug-2025 19:12:03 CEST  Opened output file with pattern "atmnu_max_weighted_randompolicy_dune10kt_1x2x6_66050399_304_20231207T200529Z_gen_g4_detsim_hitreco__20240510T042543Z_reco2_graph_2025-08-03T_171119Z.root"
03-Aug-2025 19:18:24 CEST  Closed input file "root://se1.farm.particle.cz:1094//dune/RSE/fardet-hd/fe/0a/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_66050399_304_20231207T200529Z_gen_g4_detsim_hitreco__20240510T042543Z_reco2.root"
Malformed TimeTracker database.  The TimeEvent table is empty, but
the TimeModule table is not.  This can happen if an exception has
been thrown from a module while processing the first event.  Any
saved database file is suspect and should not be used.

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 8586.62 MB
  Peak resident set size usage (VmHWM): 1684.73 MB
====================================================================================================
%MSG-s ArtException:  PostEndJob 03-Aug-2025 19:18:24 CEST ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- BadAlloc BEGIN
      A bad_alloc exception was thrown while processing module CVNEvaluator/cvneva run: 66050399 subRun: 1 event: 30401
      The job has probably exhausted the virtual memory available to the process.
    ---- BadAlloc END
    Exception going through path reco
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
---- FatalRootError BEGIN
  Fatal Root Error: TTree::SetEntries
  Tree branches have different numbers of entries, eg EventAuxiliary has 0 entries while recob::PCAxisrecob::Showervoidart::Assns_pandoraShower__Reco2. has 100 entries.
  ROOT severity: 2000
---- FatalRootError END
%MSG
Art has completed and will exit with status 1.
=== End last 100 lines of lar log file ===
=== Generated output files ===
20671.4_dunegpschedd01.fnal.gov.logs.tgz
RootOutput-0301-ab6e-4cc6-3a48.root
ana_tree_hd.root
analysiseid.root
atmnu_max_weighted_randompolicy_dune10kt_1x2x6_66050399_304_20231207T200529Z_gen_g4_detsim_hitreco__20240510T042543Z_reco2_graph_2025-08-03T_171119Z.log
debugprod.log
jobscript.log
reco_hist.root
secondary_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_66050399_304_20231207T200529Z_gen_g4_detsim_hitreco__20240510T042543Z_reco2_graph_2025-08-03T_171119Z.log
third_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_66050399_304_20231207T200529Z_gen_g4_detsim_hitreco__20240510T042543Z_reco2_graph_2025-08-03T_171119Z.log
training1_CaloHitListU.csv
training1_CaloHitListU_graph.data
training1_CaloHitListV.csv
training1_CaloHitListV_graph.data
training1_CaloHitListW.csv
training1_CaloHitListW_graph.data
training2_CaloHitListU.csv
training2_CaloHitListU_graph.data
training2_CaloHitListV.csv
training2_CaloHitListV_graph.data
training2_CaloHitListW.csv
training2_CaloHitListW_graph.data
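
The failure recorded in the lar log above (a bad_alloc in module CVNEvaluator/cvneva, followed by exit status 1) can be pulled out of the log text mechanically. The following is a minimal sketch and not part of justIN or art: the excerpt is copied from the log, while the regular expressions and the helper name `summarise_art_failure` are assumptions made for illustration. In practice the text would presumably be read from a file such as jobscript.log.

```python
import re

# Excerpt copied from the lar log section; a real script would read the whole log.
LOG_EXCERPT = """\
    ---- BadAlloc BEGIN
      A bad_alloc exception was thrown while processing module CVNEvaluator/cvneva run: 66050399 subRun: 1 event: 30401
      The job has probably exhausted the virtual memory available to the process.
    ---- BadAlloc END
Art has completed and will exit with status 1.
"""

# Hypothetical patterns matched against the message wording seen in this log.
BAD_ALLOC_RE = re.compile(
    r"bad_alloc exception was thrown while processing module "
    r"(?P<module>\S+) run: (?P<run>\d+) subRun: (?P<subrun>\d+) event: (?P<event>\d+)"
)
EXIT_RE = re.compile(r"Art has completed and will exit with status (\d+)\.")


def summarise_art_failure(log_text: str) -> dict:
    """Return the module and event of the first bad_alloc, plus all art exit statuses."""
    summary = {"bad_alloc": None, "exit_statuses": []}
    match = BAD_ALLOC_RE.search(log_text)
    if match:
        summary["bad_alloc"] = {
            "module": match.group("module"),
            "run": int(match.group("run")),
            "subrun": int(match.group("subrun")),
            "event": int(match.group("event")),
        }
    summary["exit_statuses"] = [int(s) for s in EXIT_RE.findall(log_text)]
    return summary


if __name__ == "__main__":
    print(summarise_art_failure(LOG_EXCERPT))
```

Collecting every exit status rather than only the last one matters here, because the log tail contains more than one lar invocation: one reports status 0 and another status 1.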
justIN time: 2025-08-04 14:18:47 UTC       justIN version: 01.04.00