
Jobsub ID 20650.7@dunegpschedd01.fnal.gov

Jobsub ID: 20650.7@dunegpschedd01.fnal.gov
Workflow ID: 270
Stage ID: 1
User name: ichong@fnal.gov
HTCondor Group: group_dune
Requested:
  Processors: 1
  GPU: No
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 80000 (22 hours)
Submitted time: 2025-08-03 16:08:43
Site: NL_NIKHEF
Entry: VIRGO_NL_NIKHEF_brug
Last heartbeat: 2025-08-03 16:14:23
From worker node:
  Hostname: wn-pep-010.farm.nikhef.nl
  cpuinfo: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 129600 (36 hours)
  GPU:
  Inner Apptainer?: True
Job state: jobscript_error
Started: 2025-08-03 16:10:46
Input files: fardet-hd:atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74481119_31_20231201T114051Z_gen_g4_detsim_hitreco__20240507T194537Z_reco2.root
Jobscript:
  Exit code: 1
  Real time: 0m (0s)
  CPU time: 0m (0s = 0%)
  Max RSS bytes: 0 (0 MiB)
Outputting started:
Output files:
Finished: 2025-08-03 16:14:23
Saved logs: justin-logs:20650.7-dunegpschedd01.fnal.gov.logs.tgz

Jobscript log (last 10,000 characters)

: 3193 at 03-Aug-2025 18:13:59 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 94th record. run: 74481119 subRun: 1 event: 3194 at 03-Aug-2025 18:14:00 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 0
Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 95th record. run: 74481119 subRun: 1 event: 3195 at 03-Aug-2025 18:14:01 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 3
Begin processing the 96th record. run: 74481119 subRun: 1 event: 3196 at 03-Aug-2025 18:14:02 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 97th record. run: 74481119 subRun: 1 event: 3197 at 03-Aug-2025 18:14:03 CEST
Analysing.

Warning: there was no vertex found for PFParticle with ID 0
Warning: there was no track found for track-like PFParticle with ID 0
Warning: there was no track found for track-like PFParticle with ID 1
Begin processing the 98th record. run: 74481119 subRun: 1 event: 3198 at 03-Aug-2025 18:14:04 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 4
Begin processing the 99th record. run: 74481119 subRun: 1 event: 3199 at 03-Aug-2025 18:14:05 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 2
Begin processing the 100th record. run: 74481119 subRun: 1 event: 3200 at 03-Aug-2025 18:14:06 CEST
Analysing.

Warning: there was no track found for track-like PFParticle with ID 3
03-Aug-2025 18:14:07 CEST  Closed input file "root://se1.farm.particle.cz:1094//dune/RSE/fardet-hd/ee/36/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74481119_31_20231201T114051Z_gen_g4_detsim_hitreco__20240507T194537Z_reco2.root"

========================================================================================================================
TimeTracker printout (sec)                Min           Avg           Max         Median          RMS         nEvts   
========================================================================================================================
Full event                             0.445387      0.791368       1.62667      0.817345      0.156476        100    
------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                 0.0144026     0.0222728     0.0455003     0.0217776    0.00808549       100    
end_path:analysistree:AnalysisTree     0.427892      0.768989       1.61198      0.790766      0.157832        100    
========================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 7495.5 MB
  Peak resident set size usage (VmHWM): 1145.97 MB
====================================================================================================
Art has completed and will exit with status 0.
=== End last 100 lines of third lar log file ===
=== Start last 100 lines of lar log file ===
Now with regularization...
Begin: 1.54988e+09
0 1.54917e+09
---MC-PARTICLE-MONITORING-----------------------------------------------------------------------

BeamNeutrinos: 

--Primary 0, MCPDG 11, Energy 6.59114, Dist. 94.1262, nMCHits 4356 (1513, 1676, 1167)
MCPDG 11, Energy 6.59114, Dist. 94.1262, nMCHits 4356 (1513, 1676, 1167)

--Primary 1, MCPDG -211, Energy 0.91005, Dist. 232.904, nMCHits 1220 (469, 300, 451)
MCPDG -211, Energy 0.91005, Dist. 232.904, nMCHits 1172 (451, 281, 440)
\_ MCPDG 2212, Energy 1.00465, Dist. 3.76922, nMCHits 14 (5, 4, 5)
\_ MCPDG -211, Energy 0.196831, Dist. 11.4718, nMCHits 32 (12, 14, 6)
   \_ MCPDG 2212, Energy 0.96771, Dist. 0.914114, nMCHits 2 (1, 1, 0)

--Primary 2, MCPDG 211, Energy 1.34955, Dist. 121.307, nMCHits 1078 (383, 393, 302)
MCPDG 211, Energy 1.34955, Dist. 121.307, nMCHits 280 (153, 118, 9)
\_ MCPDG 2212, Energy 1.08972, Dist. 2.15127, nMCHits 5 (2, 2, 1)
   \_ MCPDG 2212, Energy 1.07399, Dist. 13.3088, nMCHits 53 (12, 23, 18)
\_ MCPDG 211, Energy 0.923777, Dist. 141.431, nMCHits 680 (193, 230, 257)
   \_ MCPDG 2212, Energy 1.05068, Dist. 9.64642, nMCHits 35 (13, 13, 9)
   \_ MCPDG 2212, Energy 1.0294, Dist. 6.69405, nMCHits 21 (10, 4, 7)
   \_ MCPDG 2212, Energy 0.977736, Dist. 1.51534, nMCHits 4 (0, 3, 1)

--Primary 3, MCPDG 2212, Energy 1.31158, Dist. 72.4873, nMCHits 176 (10, 77, 89)
MCPDG 2212, Energy 1.31158, Dist. 72.4873, nMCHits 176 (10, 77, 89)

--Primary 4, MCPDG 22, Energy 0.102341, Dist. 12.6399, nMCHits 160 (41, 64, 55)
MCPDG 22, Energy 0.102341, Dist. 12.6399, nMCHits 160 (41, 64, 55)

--Primary 5, MCPDG 2212, Energy 1.22918, Dist. 48.2043, nMCHits 139 (79, 13, 47)
MCPDG 2212, Energy 1.22918, Dist. 48.2043, nMCHits 139 (79, 13, 47)

--Primary 6, MCPDG 22, Energy 0.0782495, Dist. 34.0688, nMCHits 92 (31, 34, 27)
MCPDG 22, Energy 0.0782495, Dist. 34.0688, nMCHits 92 (31, 34, 27)

--Primary 7, MCPDG -211, Energy 1.24318, Dist. 16.3, nMCHits 67 (18, 20, 29)
MCPDG -211, Energy 1.24318, Dist. 16.3, nMCHits 44 (11, 13, 20)
\_ MCPDG 1000180360, Energy 33.4975, Dist. 0.000306698, nMCHits 6 (2, 2, 2)
\_ MCPDG -211, Energy 0.167009, Dist. 3.48756, nMCHits 17 (5, 5, 7)
------------------------------------------------------------------------------------------------
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_U_v04_03_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_V_v04_03_00.pt'
Loaded the TorchScript model '/cvmfs/dune.osgstorage.org/pnfs/fnal.gov/usr/dune/persistent/stash//PandoraNetworkData/PandoraNet_Vertex_DUNEFD_HD_Atmos_1_W_v04_03_00.pt'
Operating in training mode.
The eid is 0
Graph saved to training1_CaloHitListW_graph.data
Size of file training1_CaloHitListW_graph.data is 58508 bytes.
The eid is 0
Graph saved to training1_CaloHitListU_graph.data
Size of file training1_CaloHitListU_graph.data is 70228 bytes.
The eid is 0
Graph saved to training1_CaloHitListV_graph.data
Size of file training1_CaloHitListV_graph.data is 73308 bytes.
Operating in inference mode.
Operating in training mode.
The eid is -1
Graph saved to training2_CaloHitListW_graph.data
Size of file training2_CaloHitListW_graph.data is 58508 bytes.
The eid is -1
Graph saved to training2_CaloHitListU_graph.data
Size of file training2_CaloHitListU_graph.data is 70228 bytes.
The eid is -1
Graph saved to training2_CaloHitListV_graph.data
Size of file training2_CaloHitListV_graph.data is 73308 bytes.
Boundary wire vector sizes: 2999, 2877, 2409
minwire 0: 1825
minwire 1: 233
minwire 2: 1940
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
Used alternate method to get min and max tdcs due to vertex determination failure: 0, 499
03-Aug-2025 18:11:35 CEST  Opened output file with pattern "atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74481119_31_20231201T114051Z_gen_g4_detsim_hitreco__20240507T194537Z_reco2_graph_2025-08-03T_161050Z.root"
03-Aug-2025 18:11:40 CEST  Closed input file "root://se1.farm.particle.cz:1094//dune/RSE/fardet-hd/ee/36/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74481119_31_20231201T114051Z_gen_g4_detsim_hitreco__20240507T194537Z_reco2.root"
Malformed TimeTracker database.  The TimeEvent table is empty, but
the TimeModule table is not.  This can happen if an exception has
been thrown from a module while processing the first event.  Any
saved database file is suspect and should not be used.

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 8589.5 MB
  Peak resident set size usage (VmHWM): 1686.48 MB
====================================================================================================
%MSG-s ArtException:  PostEndJob 03-Aug-2025 18:11:40 CEST ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- BadAlloc BEGIN
      A bad_alloc exception was thrown while processing module CVNEvaluator/cvneva run: 74481119 subRun: 1 event: 3101
      The job has probably exhausted the virtual memory available to the process.
    ---- BadAlloc END
    Exception going through path reco
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
%MSG
Art has completed and will exit with status 1.
=== End last 100 lines of lar log file ===
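
The lar log excerpt above ends with a non-zero art exit status and a nested exception block. As a minimal sketch for triaging such excerpts, the following Python scans log text for art exit statuses and exception categories. It is not part of justIN or art: the function name summarize_art_log and both regular expressions are assumptions based only on the line formats visible in this log.

```python
import re

# Final status line printed by art, in the form seen in the log above.
EXIT_RE = re.compile(r"Art has completed and will exit with status (\d+)\.")
# Opening line of a nested exception block, e.g. "---- BadAlloc BEGIN".
BEGIN_RE = re.compile(r"^\s*---- (\w+) BEGIN\s*$")


def summarize_art_log(text):
    """Return (exit_statuses, exception_categories) found in a log excerpt.

    Hypothetical helper: it only pattern-matches on line formats and
    makes no use of any art or justIN API.
    """
    statuses = []
    categories = []
    for line in text.splitlines():
        m = EXIT_RE.search(line)
        if m:
            statuses.append(int(m.group(1)))
            continue
        m = BEGIN_RE.match(line)
        if m:
            categories.append(m.group(1))
    return statuses, categories


# Sample lines copied from the lar log excerpt above.
SAMPLE = """\
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- BadAlloc BEGIN
      A bad_alloc exception was thrown while processing module CVNEvaluator/cvneva run: 74481119 subRun: 1 event: 3101
    ---- BadAlloc END
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
Art has completed and will exit with status 1.
"""

statuses, categories = summarize_art_log(SAMPLE)
print("exit statuses:", statuses)
print("exception categories (outermost first):", categories)
```

Run against a full jobscript.log, the same function would collect one exit status per art invocation, so a job that chains several lar steps can be checked for the first step that failed.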
=== Generated output files ===
20650.7_dunegpschedd01.fnal.gov.logs.tgz
RootOutput-8a0a-dfc5-e6ae-2ade.root
ana_tree_hd.root
analysiseid.root
atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74481119_31_20231201T114051Z_gen_g4_detsim_hitreco__20240507T194537Z_reco2_graph_2025-08-03T_161050Z.log
debugprod.log
jobscript.log
reco_hist.root
secondary_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74481119_31_20231201T114051Z_gen_g4_detsim_hitreco__20240507T194537Z_reco2_graph_2025-08-03T_161050Z.log
third_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74481119_31_20231201T114051Z_gen_g4_detsim_hitreco__20240507T194537Z_reco2_graph_2025-08-03T_161050Z.log
training1_CaloHitListU.csv
training1_CaloHitListU_graph.data
training1_CaloHitListV.csv
training1_CaloHitListV_graph.data
training1_CaloHitListW.csv
training1_CaloHitListW_graph.data
training2_CaloHitListU.csv
training2_CaloHitListU_graph.data
training2_CaloHitListV.csv
training2_CaloHitListV_graph.data
training2_CaloHitListW.csv
training2_CaloHitListW_graph.data
justIN time: 2025-08-04 16:40:30 UTC       justIN version: 01.04.00