
Jobsub ID 14065.0@dunegpschedd02.fnal.gov

Jobsub ID: 14065.0@dunegpschedd02.fnal.gov
Workflow ID: 253
Stage ID: 1
User name: ichong@fnal.gov
HTCondor Group: group_dune
Requested:
  Processors: 1
  GPU: No
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 80000 (22 hours)
Submitted time: 2025-08-02 22:37:47
Site: NL_SURFsara
Entry: DUNE_SurfSARA_arc02
Last heartbeat: 2025-08-02 22:46:33
From worker node:
  Hostname: wn-da-12.gina.surf.nl
  cpuinfo: AMD EPYC 7702P 64-Core Processor
  OS release: Scientific Linux release 7.9 (Nitrogen)
  Processors: 1
  RSS bytes: 4194304000 (4000 MiB)
  Wall seconds limit: 129600 (36 hours)
  GPU:
  Inner Apptainer?: True
Job state: finished
Started: 2025-08-02 22:38:54
Input files: fardet-hd:atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74577407_63_20231208T053307Z_gen_g4_detsim_hitreco__20240510T055611Z_reco2.root
Jobscript:
  Exit code: 0
  Real time: 6m (401s)
  CPU time: 6m (384s = 95%)
  Max RSS bytes: 1981657088 (1889 MiB)
Outputting started: 2025-08-02 22:45:37
Output files:
  https://fndcadoor.fnal.gov:2880/dune/scratch/users/ichong/fnal/00253/1/001/graph_output_2025-08-02T_223902Z_1_training1_CaloHitListU_graph.data
  https://fndcadoor.fnal.gov:2880/dune/scratch/users/ichong/fnal/00253/1/001/graph_output_2025-08-02T_223902Z_1_training1_CaloHitListV_graph.data
  https://fndcadoor.fnal.gov:2880/dune/scratch/users/ichong/fnal/00253/1/001/graph_output_2025-08-02T_223902Z_1_training1_CaloHitListW_graph.data
  https://fndcadoor.fnal.gov:2880/dune/scratch/users/ichong/fnal/00253/1/001/graph_output_2025-08-02T_223902Z_1_training2_CaloHitListU_graph.data
  https://fndcadoor.fnal.gov:2880/dune/scratch/users/ichong/fnal/00253/1/001/graph_output_2025-08-02T_223902Z_1_training2_CaloHitListV_graph.data
  https://fndcadoor.fnal.gov:2880/dune/scratch/users/ichong/fnal/00253/1/001/graph_output_2025-08-02T_223902Z_1_training2_CaloHitListW_graph.data
  https://fndcadoor.fnal.gov:2880/dune/scratch/users/ichong/fnal/00253/1/001/graph_output_2025-08-02T_223902Z_1_analysiseid.root
  https://fndcadoor.fnal.gov:2880/dune/scratch/users/ichong/fnal/00253/1/001/graph_output_2025-08-02T_223902Z_1_ana_tree_hd.root
Finished: 2025-08-02 22:46:33
Saved logs: justin-logs:14065.0-dunegpschedd02.fnal.gov.logs.tgz
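
The derived figures in the summary above (MiB, hours, CPU percentage) can be cross-checked with a little arithmetic. The sketch below is not part of justIN: the helper names are hypothetical, and rounding down with integer division is an assumption that happens to reproduce the values shown on this page.

```python
# Hypothetical helpers for cross-checking the job summary figures.
# Integer (floor) division is an assumption; it matches the values displayed.

BYTES_PER_MIB = 1024 * 1024


def to_mib(n_bytes: int) -> int:
    """Convert a byte count to whole MiB, rounding down."""
    return n_bytes // BYTES_PER_MIB


def to_hours(seconds: int) -> int:
    """Convert seconds to whole hours, rounding down."""
    return seconds // 3600


def cpu_efficiency_percent(cpu_seconds: int, real_seconds: int) -> int:
    """CPU time as a whole percentage of real (wall-clock) time, rounding down."""
    return (100 * cpu_seconds) // real_seconds


if __name__ == "__main__":
    print("Requested RSS:", to_mib(4194304000), "MiB")
    print("Max RSS:", to_mib(1981657088), "MiB")
    print("Requested wall limit:", to_hours(80000), "hours")
    print("Worker node wall limit:", to_hours(129600), "hours")
    print("CPU efficiency:", cpu_efficiency_percent(384, 401), "%")
```

The same byte count explains the MemoryTracker figure in the log: 1981657088 bytes is about 1981.66 MB in base-10 units, which is the VmHWM value reported there.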

Jobscript log (last 10,000 characters)

h_output_2025-08-02T_223902Z_1_training2_CaloHitListV_graph.data
Renamed training2_CaloHitListW_graph.data -> graph_output_2025-08-02T_223902Z_1_training2_CaloHitListW_graph.data
Renamed analysiseid.root -> graph_output_2025-08-02T_223902Z_1_analysiseid.root
Renamed ana_tree_hd.root -> graph_output_2025-08-02T_223902Z_1_ana_tree_hd.root
=== Start last 100 lines of lar log file ===
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0002, LArDLVertexing, STATUS_CODE_NOT_FOUND
Operating in inference mode.
Operating in training mode.
this->CompleteMCHierarchy(mcToHitsMap, hierarchy) return STATUS_CODE_NOT_FOUND
    in function: PrepareTrainingSample
    in file:     /exp/dune/app/users/ichong/larsoft_graph_V1_2025/srcs/larpandoracontent/larpandoradlcontent/LArVertex/DlVertexingAlgorithm.cc line#: 81
iter->second->Run() throw STATUS_CODE_NOT_FOUND
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArDLVertexing, STATUS_CODE_NOT_FOUND

Running Ophitfinder with InputDigiType = 'recob'
Found hits: 146!
Begin processing the 100th record. run: 74577407 subRun: 1 event: 6400 at 03-Aug-2025 00:44:36 CEST
0 X, 0 U, 0 V bad channels
Finding XUV coincidences...
C:0 T:6 4 XUs and 5 XVs -> 3 XUVs
C:0 T:9 43 XUs and 46 XVs -> 29 XUVs
32 XUVs total
18 collection wire objects
32 potential space points
Neighbour search...
138 tests to find 106 neighbours
Iterating with no regularization...
Begin: 241280
0 83663.9
1 80126.3
2 79380.3
3 79254.4
4 79230.5
Now with regularization...
Begin: -12755.6
0 -12828.6
1 -12842.5
2 -12845.3
---MC-PARTICLE-MONITORING-----------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Operating in training mode.
this->CompleteMCHierarchy(mcToHitsMap, hierarchy) return STATUS_CODE_NOT_FOUND
    in function: PrepareTrainingSample
    in file:     /exp/dune/app/users/ichong/larsoft_graph_V1_2025/srcs/larpandoracontent/larpandoradlcontent/LArVertex/DlVertexingAlgorithm.cc line#: 81
iter->second->Run() throw STATUS_CODE_NOT_FOUND
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0002, LArDLVertexing, STATUS_CODE_NOT_FOUND
Operating in inference mode.
Operating in training mode.
this->CompleteMCHierarchy(mcToHitsMap, hierarchy) return STATUS_CODE_NOT_FOUND
    in function: PrepareTrainingSample
    in file:     /exp/dune/app/users/ichong/larsoft_graph_V1_2025/srcs/larpandoracontent/larpandoradlcontent/LArVertex/DlVertexingAlgorithm.cc line#: 81
iter->second->Run() throw STATUS_CODE_NOT_FOUND
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArDLVertexing, STATUS_CODE_NOT_FOUND

Running Ophitfinder with InputDigiType = 'recob'
Found hits: 156!
03-Aug-2025 00:44:38 CEST  Closed output file "atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74577407_63_20231208T053307Z_gen_g4_detsim_hitreco__20240510T055611Z_reco2_graph_2025-08-02T_223902Z.root"
03-Aug-2025 00:44:38 CEST  Closed input file "root://dune.dcache.nikhef.nl:1094/pnfs/nikhef.nl/data/dune/generic/rucio/fardet-hd/ae/cc/atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74577407_63_20231208T053307Z_gen_g4_detsim_hitreco__20240510T055611Z_reco2.root"

========================================================================================================================================
TimeTracker printout (sec)                                Min           Avg           Max         Median          RMS         nEvts   
========================================================================================================================================
Full event                                             0.0799712      2.4861        12.1724       2.53418       1.57156        100    
----------------------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                                0.00176846    0.00280904     0.0042591    0.00280553    0.00068634       100    
reco:gaushit:GausHitFinder                             0.0171242     0.0647121     0.632577      0.026445      0.0961112       100    
reco:spsolve:SpacePointSolver                         0.000103177    0.054141       1.23308     0.00242934     0.193531        100    
reco:hitfd:DisambigFromSpacePoints                    0.000187207    0.0153271     0.461289     0.000756652    0.0579831       100    
reco:rns:RandomNumberSaver                            1.7854e-05    2.89309e-05   0.000420612   2.27485e-05   3.98193e-05      100    
reco:pandora:StandardPandora                           0.0107513      1.17675       7.82531      0.965507      0.827712        100    
reco:pandoraTrack:LArPandoraTrackCreation             0.000119999   0.000192566   0.00266178    0.00013875    0.000310634      100    
reco:pandoraShower:LArPandoraModularShowerCreation    0.000162519   0.00021358    0.00295745    0.000183751   0.000276122      100    
reco:pandoracalo:Calorimetry                           9.99e-05     0.00013264    0.00152039    0.000111006   0.000140584      100    
reco:pandorapid:Chi2ParticleID                        3.5288e-05    4.7324e-05    0.000767222   3.8508e-05    7.28366e-05      100    
reco:cvnmap:CVNMapper                                 2.0037e-05     0.0197186     0.0612147     0.0211198     0.0157435       100    
reco:cvneva:CVNEvaluator                              1.6151e-05     0.653929       2.82999      0.948212      0.505078        100    
reco:energyrecnumu:EnergyReco                         0.00202451    0.00498937     0.0180374    0.00501354    0.00281988       100    
reco:energyrecnue:EnergyReco                          0.00200497    0.00295338    0.00989356    0.00256016    0.00127312       100    
reco:energyrecnc:EnergyReco                           0.00194275    0.00285845    0.00988879    0.00239653    0.00126608       100    
reco:energyrecnumurange:EnergyReco                    0.00197502    0.00286243    0.00987565    0.00240237    0.00126239       100    
reco:energyrecnumumcs:EnergyReco                      0.00206428    0.00296484     0.0163288    0.00236829    0.00183392       100    
reco:opdec:Deconvolution                               0.0145588     0.155968      0.309788      0.151402      0.0706023       100    
reco:ophitspe:OpHitFinderDeco                          0.011039      0.0157247     0.0455714     0.0147685    0.00410804       100    
reco:opflash:OpFlashFinder                            0.000107184   0.000331061   0.00340258    0.000266533   0.000332003      100    
reco:opslicer:OpSlicer                                3.3504e-05     0.0114406     0.0967713    0.00417684     0.016456        100    
[art]:TriggerResults:TriggerResultInserter            1.0349e-05    1.40272e-05   7.2158e-05    1.2589e-05    6.37283e-06      100    
end_path:out1:RootOutput                               2.926e-06    4.12703e-06   2.0128e-05     3.762e-06    2.17101e-06      100    
end_path:out1:RootOutput(write)                       0.00833858     0.297062      0.646017      0.385842      0.190265        100    
========================================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 11576.5 MB
  Peak resident set size usage (VmHWM): 1981.66 MB
====================================================================================================
Art has completed and will exit with status 0.
=== End last 100 lines of lar log file ===
=== Generated output files ===
14065.0_dunegpschedd02.fnal.gov.logs.tgz
atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74577407_63_20231208T053307Z_gen_g4_detsim_hitreco__20240510T055611Z_reco2_graph_2025-08-02T_223902Z.log
atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74577407_63_20231208T053307Z_gen_g4_detsim_hitreco__20240510T055611Z_reco2_graph_2025-08-02T_223902Z.root
debugprod.log
graph_output_2025-08-02T_223902Z_1_ana_tree_hd.root
graph_output_2025-08-02T_223902Z_1_analysiseid.root
graph_output_2025-08-02T_223902Z_1_training1_CaloHitListU_graph.data
graph_output_2025-08-02T_223902Z_1_training1_CaloHitListV_graph.data
graph_output_2025-08-02T_223902Z_1_training1_CaloHitListW_graph.data
graph_output_2025-08-02T_223902Z_1_training2_CaloHitListU_graph.data
graph_output_2025-08-02T_223902Z_1_training2_CaloHitListV_graph.data
graph_output_2025-08-02T_223902Z_1_training2_CaloHitListW_graph.data
jobscript.log
justin-processed-pfns.txt
reco_hist.root
secondary_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74577407_63_20231208T053307Z_gen_g4_detsim_hitreco__20240510T055611Z_reco2_graph_2025-08-02T_223902Z.log
third_atmnu_max_weighted_randompolicy_dune10kt_1x2x6_74577407_63_20231208T053307Z_gen_g4_detsim_hitreco__20240510T055611Z_reco2_graph_2025-08-02T_223902Z.log
training1_CaloHitListU.csv
training1_CaloHitListV.csv
training1_CaloHitListW.csv
training2_CaloHitListU.csv
training2_CaloHitListV.csv
training2_CaloHitListW.csv
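
The TimeTracker printout in the log above lists per-module timings in fixed columns (Min, Avg, Max, Median, RMS, nEvts). As a rough illustration of how those rows could be ranked by average time, here is a minimal sketch. The function names and the `SAMPLE` excerpt are hypothetical; the parsing assumes each data row ends with five floating-point numbers followed by an integer event count, as in this log, and may not hold for other art versions.

```python
# Hypothetical parser for the TimeTracker rows shown in the lar log.
# Assumption: each data row is "<label> <min> <avg> <max> <median> <rms> <nEvts>".

SAMPLE = """\
TimeTracker printout (sec)       Min      Avg      Max     Median    RMS    nEvts
=================================================================================
Full event                    0.0799712  2.4861  12.1724  2.53418  1.57156  100
---------------------------------------------------------------------------------
reco:pandora:StandardPandora  0.0107513  1.17675  7.82531  0.965507  0.827712  100
reco:cvneva:CVNEvaluator      1.6151e-05  0.653929  2.82999  0.948212  0.505078  100
end_path:out1:RootOutput(write)  0.00833858  0.297062  0.646017  0.385842  0.190265  100
"""


def parse_timetracker(text):
    """Return a list of dicts, one per data row; header and rule lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.rstrip().rsplit(None, 6)  # label may contain spaces
        if len(parts) != 7:
            continue
        label, *numbers = parts
        try:
            t_min, t_avg, t_max, t_med, t_rms = (float(x) for x in numbers[:5])
            n_evts = int(numbers[5])
        except ValueError:
            continue  # e.g. the column-header line
        rows.append({"label": label.strip(), "min": t_min, "avg": t_avg,
                     "max": t_max, "median": t_med, "rms": t_rms, "nevts": n_evts})
    return rows


def slowest_modules(rows, n=3):
    """Rank modules by average time, excluding the 'Full event' summary row."""
    modules = [r for r in rows if r["label"] != "Full event"]
    return sorted(modules, key=lambda r: r["avg"], reverse=True)[:n]


if __name__ == "__main__":
    for row in slowest_modules(parse_timetracker(SAMPLE)):
        print(f'{row["label"]}: avg {row["avg"]} s over {row["nevts"]} events')
```

Applied to the full table in this log, the largest average belongs to reco:pandora:StandardPandora (1.17675 s), followed by reco:cvneva:CVNEvaluator (0.653929 s).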
justIN time: 2025-08-04 14:17:06 UTC       justIN version: 01.04.00