
Jobsub ID 14199.10@dunegpschedd02.fnal.gov

Jobsub ID: 14199.10@dunegpschedd02.fnal.gov
Workflow ID: 270
Stage ID: 1
User name: ichong@fnal.gov
HTCondor Group: group_dune
Requested:
  Processors: 1
  GPU: No
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 80000 (22 hours)
Submitted time: 2025-08-03 17:16:47
Site: NL_NIKHEF
Entry: VIRGO_NL_NIKHEF_brug
Last heartbeat: 2025-08-03 17:28:55
From worker node:
  Hostname: wn-pep-010.farm.nikhef.nl
  cpuinfo: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 129600 (36 hours)
  GPU:
  Inner Apptainer?: True
Job state: jobscript_error
Started: 2025-08-03 17:18:29
Input files: fardet-hd:atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50572032_1129_20231203T073300Z_gen_g4_detsim_hitreco__20240508T073029Z_reco2.root
Jobscript:
  Exit code: 1
  Real time: 0m (0s)
  CPU time: 0m (0s = 0%)
  Max RSS bytes: 0 (0 MiB)
Outputting started:
Output files:
Finished: 2025-08-03 17:28:55
Saved logs: justin-logs:14199.10-dunegpschedd02.fnal.gov.logs.tgz

Jobscript log (last 10,000 characters)

nd for PFParticle with ID 1
Warning: there was no track found for track-like PFParticle with ID 1
Warning: there was no track found for track-like PFParticle with ID 3
Begin processing the 92nd record. run: 50572032 subRun: 1 event: 112992 at 03-Aug-2025 19:28:28 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 93rd record. run: 50572032 subRun: 1 event: 112993 at 03-Aug-2025 19:28:29 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 94th record. run: 50572032 subRun: 1 event: 112994 at 03-Aug-2025 19:28:30 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 3
Begin processing the 95th record. run: 50572032 subRun: 1 event: 112995 at 03-Aug-2025 19:28:31 CEST
Analysing.

Begin processing the 96th record. run: 50572032 subRun: 1 event: 112996 at 03-Aug-2025 19:28:31 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 3
Begin processing the 97th record. run: 50572032 subRun: 1 event: 112997 at 03-Aug-2025 19:28:32 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 4
Begin processing the 98th record. run: 50572032 subRun: 1 event: 112998 at 03-Aug-2025 19:28:33 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 99th record. run: 50572032 subRun: 1 event: 112999 at 03-Aug-2025 19:28:34 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 100th record. run: 50572032 subRun: 1 event: 113000 at 03-Aug-2025 19:28:35 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 3
03-Aug-2025 19:28:36 CEST  Closed input file "root://se1.farm.particle.cz:1094//dune/RSE/fardet-hd/26/5a/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50572032_1129_20231203T073300Z_gen_g4_detsim_hitreco__20240508T073029Z_reco2.root"

========================================================================================================================
TimeTracker printout (sec)                Min           Avg           Max         Median          RMS         nEvts   
========================================================================================================================
Full event                             0.446681      0.820284       1.46276      0.830182      0.155764        100    
------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                 0.0144549     0.0258273     0.175801      0.0285911     0.0176046       100    
end_path:analysistree:AnalysisTree     0.430324      0.794347       1.44769       0.80661      0.155446        100    
========================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 7348.91 MB
  Peak resident set size usage (VmHWM): 997.65 MB
====================================================================================================
Art has completed and will exit with status 0.
=== End last 100 lines of third lar log file ===
=== Start last 100 lines of lar log file ===
2 2.05927e+08
3 2.05778e+08
Now with regularization...
Begin: 1.71175e+08
0 1.70879e+08
1 1.70812e+08
---MC-PARTICLE-MONITORING-----------------------------------------------------------------------

BeamNeutrinos: 

--Primary 0, MCPDG 22, Energy 2.29203, Dist. 6.18292, nMCHits 2011 (698, 679, 634)
MCPDG 22, Energy 2.29203, Dist. 6.18292, nMCHits 2011 (698, 679, 634)

--Primary 1, MCPDG 22, Energy 0.532465, Dist. 5.2711, nMCHits 476 (163, 172, 141)
MCPDG 22, Energy 0.532465, Dist. 5.2711, nMCHits 476 (163, 172, 141)

--Primary 2, MCPDG 22, Energy 0.43063, Dist. 55.409, nMCHits 473 (125, 214, 134)
MCPDG 22, Energy 0.43063, Dist. 55.409, nMCHits 473 (125, 214, 134)

--Primary 3, MCPDG -211, Energy 0.826797, Dist. 103.187, nMCHits 464 (76, 217, 171)
MCPDG -211, Energy 0.826797, Dist. 103.187, nMCHits 428 (67, 203, 158)
\_ MCPDG 2212, Energy 1.03124, Dist. 6.87523, nMCHits 29 (7, 11, 11)
\_ MCPDG 2212, Energy 0.940661, Dist. 0.012158, nMCHits 7 (2, 3, 2)

--Primary 4, MCPDG 211, Energy 0.505984, Dist. 51.2355, nMCHits 316 (85, 124, 107)
MCPDG 211, Energy 0.505984, Dist. 51.2355, nMCHits 157 (25, 73, 59)
\_ MCPDG 2212, Energy 0.965938, Dist. 0.825244, nMCHits 5 (2, 2, 1)
\_ MCPDG 211, Energy 0.272816, Dist. 40.2058, nMCHits 108 (41, 37, 30)
      \_ MCPDG -11, Energy 0.0319241, Dist. 8.61557, nMCHits 46 (17, 12, 17)

--Primary 5, MCPDG 211, Energy 0.299437, Dist. 52.9923, nMCHits 183 (35, 85, 63)
MCPDG 211, Energy 0.299437, Dist. 52.9923, nMCHits 135 (22, 67, 46)
   \_ MCPDG -11, Energy 0.0226483, Dist. 8.22166, nMCHits 48 (13, 18, 17)

--Primary 6, MCPDG 22, Energy 0.0851989, Dist. 30.3519, nMCHits 159 (39, 69, 51)
MCPDG 22, Energy 0.0851989, Dist. 30.3519, nMCHits 159 (39, 69, 51)
------------------------------------------------------------------------------------------------
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_U_v04_03_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_V_v04_03_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_W_v04_03_00.pt'
Operating in training mode.
The eid is 0
Graph saved to training1_CaloHitListW_graph.data
Size of file training1_CaloHitListW_graph.data is 34636 bytes.
The eid is 0
Graph saved to training1_CaloHitListU_graph.data
Size of file training1_CaloHitListU_graph.data is 33636 bytes.
The eid is 0
Graph saved to training1_CaloHitListV_graph.data
Size of file training1_CaloHitListV_graph.data is 41780 bytes.
Operating in inference mode.
Operating in training mode.
The eid is -1
Graph saved to training2_CaloHitListW_graph.data
Size of file training2_CaloHitListW_graph.data is 34636 bytes.
The eid is -1
Graph saved to training2_CaloHitListU_graph.data
Size of file training2_CaloHitListU_graph.data is 33636 bytes.
The eid is -1
Graph saved to training2_CaloHitListV_graph.data
Size of file training2_CaloHitListV_graph.data is 41780 bytes.
Boundary wire vector sizes: 1714, 1374, 1416
minwire 0: 178
minwire 1: 2233
minwire 2: 0
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
03-Aug-2025 19:19:18 CEST  Opened output file with pattern "atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50572032_1129_20231203T073300Z_gen_g4_detsim_hitreco__20240508T073029Z_reco2_graph_2025-08-03T_171834Z.root"
03-Aug-2025 19:25:59 CEST  Closed input file "root://se1.farm.particle.cz:1094//dune/RSE/fardet-hd/26/5a/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50572032_1129_20231203T073300Z_gen_g4_detsim_hitreco__20240508T073029Z_reco2.root"
Malformed TimeTracker database.  The TimeEvent table is empty, but
the TimeModule table is not.  This can happen if an exception has
been thrown from a module while processing the first event.  Any
saved database file is suspect and should not be used.

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 8588.53 MB
  Peak resident set size usage (VmHWM): 1686.13 MB
====================================================================================================
%MSG-s ArtException:  PostEndJob 03-Aug-2025 19:25:59 CEST ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- BadAlloc BEGIN
      A bad_alloc exception was thrown while processing module CVNEvaluator/cvneva run: 50572032 subRun: 1 event: 112901
      The job has probably exhausted the virtual memory available to the process.
    ---- BadAlloc END
    Exception going through path reco
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
---- FatalRootError BEGIN
  Fatal Root Error: TTree::SetEntries
  Tree branches have different numbers of entries, eg EventAuxiliary has 0 entries while recob::PCAxisrecob::Showervoidart::Assns_pandoraShower__Reco2. has 100 entries.
  ROOT severity: 2000
---- FatalRootError END
%MSG
Art has completed and will exit with status 1.
=== End last 100 lines of lar log file ===
=== Generated output files ===
14199.10_dunegpschedd02.fnal.gov.logs.tgz
RootOutput-bdcf-2753-6847-f5cf.root
ana_tree_hd.root
analysiseid.root
atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50572032_1129_20231203T073300Z_gen_g4_detsim_hitreco__20240508T073029Z_reco2_graph_2025-08-03T_171834Z.log
debugprod.log
jobscript.log
reco_hist.root
secondary_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50572032_1129_20231203T073300Z_gen_g4_detsim_hitreco__20240508T073029Z_reco2_graph_2025-08-03T_171834Z.log
third_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_50572032_1129_20231203T073300Z_gen_g4_detsim_hitreco__20240508T073029Z_reco2_graph_2025-08-03T_171834Z.log
training1_CaloHitListU.csv
training1_CaloHitListU_graph.data
training1_CaloHitListV.csv
training1_CaloHitListV_graph.data
training1_CaloHitListW.csv
training1_CaloHitListW_graph.data
training2_CaloHitListU.csv
training2_CaloHitListU_graph.data
training2_CaloHitListV.csv
training2_CaloHitListV_graph.data
training2_CaloHitListW.csv
training2_CaloHitListW_graph.data
justIN time: 2025-08-04 14:12:07 UTC       justIN version: 01.04.00