
Jobsub ID 19778.80@dunegpschedd01.fnal.gov

Jobsub ID: 19778.80@dunegpschedd01.fnal.gov
Workflow ID: 168
Stage ID: 1
User name: lwhite86@fnal.gov
HTCondor Group: group_dune
Requested:
    Processors: 1
    GPU: No
    RSS bytes: 4194304000 (4000 MiB)
    Wall seconds limit: 3600 (1 hours)
Submitted time: 2025-07-31 10:18:46
Site: FR_CCIN2P3
Entry: DUNE_FR_CCIN2P3_cccondorce03
Last heartbeat: 2025-07-31 10:27:26
From worker node:
    Hostname: ccwcondor0041
    cpuinfo: AMD EPYC 9334 32-Core Processor
    OS release: Scientific Linux release 7.9 (Nitrogen)
    Processors: 1
    RSS bytes: 4194304000 (4000 MiB)
    Wall seconds limit: 106200 (29 hours)
    GPU:
    Inner Apptainer?: True
Job state: finished
Started: 2025-07-31 10:19:56
Input files: fardet-hd:anutau_dune10kt_1x2x6_1070_719_20230824T062240Z_gen_g4_detsim_hitreco__20240406T115733Z_reco2.root
Jobscript:
    Exit code: 0
    Real time: 7m (423s)
    CPU time: 4m (295s = 69%)
    Max RSS bytes: 1193037824 (1137 MiB)
Outputting started: 2025-07-31 10:27:00
Output files: https://fndcadoor.fnal.gov:2880/dune/scratch/users/lwhite86/fnal/00168/1/001/trainingFile_anutau_dune10kt_1x2x6_1070_719_20230824T062240Z_gen_g4_detsim_hitreco__20240406T115733Z_reco2.root
Finished: 2025-07-31 10:27:26
Saved logs: justin-logs:19778.80-dunegpschedd01.fnal.gov.logs.tgz
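
The derived figures in the job summary (MiB values and the CPU percentage) can be reproduced from the raw numbers. The following is a minimal sketch, assuming the page truncates rather than rounds; the helper names `bytes_to_mib` and `cpu_efficiency_percent` are hypothetical and not part of justIN.

```python
# Hypothetical helpers for cross-checking the summary figures.
# Assumption: the page truncates (floors) derived values instead of rounding.

MIB = 1024 * 1024  # bytes per mebibyte


def bytes_to_mib(n_bytes: int) -> int:
    """Convert a byte count to whole MiB, discarding any remainder."""
    return n_bytes // MIB


def cpu_efficiency_percent(cpu_seconds: int, real_seconds: int) -> int:
    """CPU time as a whole percentage of real (wall-clock) time, floored."""
    return (100 * cpu_seconds) // real_seconds


# Values copied from the job summary above.
print(bytes_to_mib(4194304000))          # requested RSS limit
print(bytes_to_mib(1193037824))          # Max RSS of the jobscript
print(cpu_efficiency_percent(295, 423))  # CPU time vs real time
```

With the numbers from this job the three results are 4000, 1137 and 69, which match the values shown in parentheses in the summary. The same Max RSS expressed in base-10 MB is 1193.04, which agrees with the VmHWM line of the MemoryTracker summary in the log.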

Jobscript log (last 10,000 characters)

e#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 61st record. run: 1070 subRun: 1 event: 71961 at 31-Jul-2025 12:25:19 CEST
Begin processing the 62nd record. run: 1070 subRun: 1 event: 71962 at 31-Jul-2025 12:25:21 CEST
Begin processing the 63rd record. run: 1070 subRun: 1 event: 71963 at 31-Jul-2025 12:25:23 CEST
Begin processing the 64th record. run: 1070 subRun: 1 event: 71964 at 31-Jul-2025 12:25:26 CEST
Begin processing the 65th record. run: 1070 subRun: 1 event: 71965 at 31-Jul-2025 12:25:34 CEST
Begin processing the 66th record. run: 1070 subRun: 1 event: 71966 at 31-Jul-2025 12:25:35 CEST
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 67th record. run: 1070 subRun: 1 event: 71967 at 31-Jul-2025 12:25:37 CEST
Begin processing the 68th record. run: 1070 subRun: 1 event: 71968 at 31-Jul-2025 12:25:38 CEST
Begin processing the 69th record. run: 1070 subRun: 1 event: 71969 at 31-Jul-2025 12:25:41 CEST
Begin processing the 70th record. run: 1070 subRun: 1 event: 71970 at 31-Jul-2025 12:25:44 CEST
Begin processing the 71st record. run: 1070 subRun: 1 event: 71971 at 31-Jul-2025 12:25:45 CEST
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 72nd record. run: 1070 subRun: 1 event: 71972 at 31-Jul-2025 12:25:47 CEST
Begin processing the 73rd record. run: 1070 subRun: 1 event: 71973 at 31-Jul-2025 12:25:49 CEST
PcaShowerParticleBuildingAlgorithm::OpeningAngle - principal eigenvalue less than or equal to 0.
PcaShowerParticleBuildingAlgorithm::OpeningAngle - principal eigenvalue less than or equal to 0.
Begin processing the 74th record. run: 1070 subRun: 1 event: 71974 at 31-Jul-2025 12:25:54 CEST
Begin processing the 75th record. run: 1070 subRun: 1 event: 71975 at 31-Jul-2025 12:25:55 CEST
Begin processing the 76th record. run: 1070 subRun: 1 event: 71976 at 31-Jul-2025 12:25:57 CEST
Begin processing the 77th record. run: 1070 subRun: 1 event: 71977 at 31-Jul-2025 12:25:58 CEST
Begin processing the 78th record. run: 1070 subRun: 1 event: 71978 at 31-Jul-2025 12:26:00 CEST
Begin processing the 79th record. run: 1070 subRun: 1 event: 71979 at 31-Jul-2025 12:26:02 CEST
Begin processing the 80th record. run: 1070 subRun: 1 event: 71980 at 31-Jul-2025 12:26:04 CEST
Begin processing the 81st record. run: 1070 subRun: 1 event: 71981 at 31-Jul-2025 12:26:07 CEST
Begin processing the 82nd record. run: 1070 subRun: 1 event: 71982 at 31-Jul-2025 12:26:26 CEST
Begin processing the 83rd record. run: 1070 subRun: 1 event: 71983 at 31-Jul-2025 12:26:27 CEST
Begin processing the 84th record. run: 1070 subRun: 1 event: 71984 at 31-Jul-2025 12:26:29 CEST
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 85th record. run: 1070 subRun: 1 event: 71985 at 31-Jul-2025 12:26:30 CEST
Begin processing the 86th record. run: 1070 subRun: 1 event: 71986 at 31-Jul-2025 12:26:32 CEST
Begin processing the 87th record. run: 1070 subRun: 1 event: 71987 at 31-Jul-2025 12:26:33 CEST
CNNTrackShowerCountingAlgorithm::GetCrop1D: reconstructed vertex outside cropped region! 362.459, 106.207, 362.207
CNNTrackShowerCountingAlgorithm::GetCrop1D: reconstructed vertex outside cropped region! 362.459, 106.185, 362.185
CNNTrackShowerCountingAlgorithm::GetCrop1D: reconstructed vertex outside cropped region! 362.459, 106.191, 362.191
Begin processing the 88th record. run: 1070 subRun: 1 event: 71988 at 31-Jul-2025 12:26:36 CEST
Begin processing the 89th record. run: 1070 subRun: 1 event: 71989 at 31-Jul-2025 12:26:38 CEST
Begin processing the 90th record. run: 1070 subRun: 1 event: 71990 at 31-Jul-2025 12:26:40 CEST
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 91st record. run: 1070 subRun: 1 event: 71991 at 31-Jul-2025 12:26:41 CEST
Begin processing the 92nd record. run: 1070 subRun: 1 event: 71992 at 31-Jul-2025 12:26:43 CEST
Begin processing the 93rd record. run: 1070 subRun: 1 event: 71993 at 31-Jul-2025 12:26:45 CEST
Begin processing the 94th record. run: 1070 subRun: 1 event: 71994 at 31-Jul-2025 12:26:47 CEST
Begin processing the 95th record. run: 1070 subRun: 1 event: 71995 at 31-Jul-2025 12:26:48 CEST
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_NOT_INITIALIZED
Begin processing the 96th record. run: 1070 subRun: 1 event: 71996 at 31-Jul-2025 12:26:51 CEST
Begin processing the 97th record. run: 1070 subRun: 1 event: 71997 at 31-Jul-2025 12:26:53 CEST
Begin processing the 98th record. run: 1070 subRun: 1 event: 71998 at 31-Jul-2025 12:26:54 CEST
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 99th record. run: 1070 subRun: 1 event: 71999 at 31-Jul-2025 12:26:56 CEST
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_NOT_INITIALIZED
Begin processing the 100th record. run: 1070 subRun: 1 event: 72000 at 31-Jul-2025 12:26:57 CEST
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_NOT_INITIALIZED
31-Jul-2025 12:26:59 CEST  Closed output file "anutau_dune10kt_1x2x6_1070_719_20230824T062240Z_gen_g4_detsim_hitreco__20240406T115733Z_reco2_reco2.root"
31-Jul-2025 12:26:59 CEST  Closed input file "root://ccxrootdegee.in2p3.fr:1094/pnfs/in2p3.fr/data/dune/disk/fardet-hd/bb/11/anutau_dune10kt_1x2x6_1070_719_20230824T062240Z_gen_g4_detsim_hitreco__20240406T115733Z_reco2.root"

================================================================================================================================
TimeTracker printout (sec)                        Min           Avg           Max         Median          RMS         nEvts   
================================================================================================================================
Full event                                      1.05714       2.52404       24.0991       1.48476       3.39044        100    
--------------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                        0.00220253     0.0165189     0.0377754     0.018161     0.00660801       100    
reco:pandora2:StandardPandora                   1.05382       2.50663       24.0795       1.46439       3.39076        100    
[art]:TriggerResults:TriggerResultInserter    1.2289e-05    2.61801e-05   6.9105e-05    2.2885e-05    1.15478e-05      100    
end_path:out1:RootOutput                       2.854e-06    4.46294e-06   2.2224e-05    3.7505e-06    2.47731e-06      100    
end_path:out1:RootOutput(write)               0.000236029   0.00061863    0.00299646    0.000475081   0.000448777      100    
================================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 2226.24 MB
  Peak resident set size usage (VmHWM): 1193.04 MB
====================================================================================================
Art has completed and will exit with status 0.
lar exit code 0
total 1688
-rw-r--r-- 1 dune001 lbno     218 Jul 31 12:19 all-input-dids.txt
-rw-r--r-- 1 dune001 lbno  354759 Jul 31 12:26 anutau_dune10kt_1x2x6_1070_719_20230824T062240Z_gen_g4_detsim_hitreco__20240406T115733Z_reco2_reco2.root
-rw-r--r-- 1 dune001 lbno       0 Jul 31 12:22 debugprod.log
-rw-r--r-- 1 dune001 lbno   45890 Jul 31 12:27 jobscript.log
-rw-r--r-- 1 dune001 lbno     178 Jul 31 12:27 justin-processed-pfns.txt
drwxr-xr-x 4 dune001 lbno      60 Jul 31 12:21 larpandoracontent
-rw-r--r-- 1 dune001 lbno     519 Jul 31 12:26 reco2_hist.root
-rw-r--r-- 1 dune001 lbno 1306986 Jul 31 12:26 trainingFile_anutau_dune10kt_1x2x6_1070_719_20230824T062240Z_gen_g4_detsim_hitreco__20240406T115733Z_reco2.root
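
The log excerpt reports per-event `Failure in algorithm ...` lines even though `lar` exits with status 0, so it can be useful to tally them by status code. The following is a minimal sketch, assuming the line format seen in this excerpt; `tally_failures` and `FAILURE_RE` are hypothetical names, and the sample text is a few lines copied from the excerpt rather than the full log.

```python
import re
from collections import Counter

# Assumption: failure lines look exactly like those in the excerpt, e.g.
# "Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE"
FAILURE_RE = re.compile(
    r"^Failure in algorithm (\w+), (\w+), (STATUS_CODE_\w+)$"
)


def tally_failures(log_text: str) -> Counter:
    """Count failure lines, keyed by (algorithm name, status code)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILURE_RE.match(line.strip())
        if match:
            _alg_id, alg_name, status = match.groups()
            counts[(alg_name, status)] += 1
    return counts


# A few lines taken from the end of the excerpt above.
sample = """\
Begin processing the 98th record. run: 1070 subRun: 1 event: 71998 at 31-Jul-2025 12:26:54 CEST
Skipping event as it does not have enough hits or associated primary particles to make a training sample
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 99th record. run: 1070 subRun: 1 event: 71999 at 31-Jul-2025 12:26:56 CEST
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_NOT_INITIALIZED
Begin processing the 100th record. run: 1070 subRun: 1 event: 72000 at 31-Jul-2025 12:26:57 CEST
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_NOT_INITIALIZED
"""

for (alg, status), n in sorted(tally_failures(sample).items()):
    print(f"{alg} {status}: {n}")
```

To run it on a real job, read the `jobscript.log` file from the saved logs archive and pass its contents to `tally_failures`. Because the page only shows the last 10,000 characters of the log, counts taken from the page itself cover only the tail of the job.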
justIN time: 2025-08-04 16:40:09 UTC       justIN version: 01.04.00