Jobsub ID 13473.168@dunegpschedd02.fnal.gov

Jobsub ID: 13473.168@dunegpschedd02.fnal.gov
Workflow ID: 177
Stage ID: 1
User name: lwhite86@fnal.gov
HTCondor Group: group_dune
Requested:
    Processors: 1
    GPU: No
    RSS bytes: 4194304000 (4000 MiB)
    Wall seconds limit: 3600 (1 hours)
Submitted time: 2025-07-31 21:15:24
Site: US_UCSD
Entry: CMSHTPC_T2_US_UCSD_gw6
Last heartbeat: 2025-07-31 21:31:53
From worker node:
    Hostname: mh-7662-1.t2.ucsd.edu
    cpuinfo: AMD EPYC 7662 64-Core Processor
    OS release: Scientific Linux release 7.9 (Nitrogen)
    Processors: 1
    RSS bytes: 4194304000 (4000 MiB)
    Wall seconds limit: 171000 (47 hours)
    GPU:
    Inner Apptainer? True
Job state: stalled
Started: 2025-07-31 21:16:11
Input files: fardet-hd:anu_dune10kt_1x2x6_1083_67_20230824T232046Z_gen_g4_detsim_hitreco__20240223T224441Z_reco2.root
Outputting started: 2025-07-31 21:31:53
Output files:
Finished: 2025-07-31 22:25:57

Jobscript log (last 10,000 characters)

PDT
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 61st record. run: 1083 subRun: 1 event: 6761 at 31-Jul-2025 14:21:55 PDT
Begin processing the 62nd record. run: 1083 subRun: 1 event: 6762 at 31-Jul-2025 14:21:58 PDT
PandoraContentApi::GetList(*this, m_inputHitListName, pCaloHitList) return STATUS_CODE_NOT_INITIALIZED
    in function: GetVolumeIdToHitListMap
    in file:     /exp/dune/app/users/lwhite86/DUNE-FD/pandoraEventClassification/srcs/larpandoracontent/larpandoracontent/LArControlFlow/MasterAlgorithm.cc line#: 271
this->GetVolumeIdToHitListMap(volumeIdToHitListMap) return STATUS_CODE_NOT_INITIALIZED
    in function: Run
    in file:     /exp/dune/app/users/lwhite86/DUNE-FD/pandoraEventClassification/srcs/larpandoracontent/larpandoracontent/LArControlFlow/MasterAlgorithm.cc line#: 165
iter->second->Run() throw STATUS_CODE_NOT_INITIALIZED
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0002, LArDLMaster, STATUS_CODE_NOT_INITIALIZED
Begin processing the 63rd record. run: 1083 subRun: 1 event: 6763 at 31-Jul-2025 14:21:59 PDT
Begin processing the 64th record. run: 1083 subRun: 1 event: 6764 at 31-Jul-2025 14:22:02 PDT
Begin processing the 65th record. run: 1083 subRun: 1 event: 6765 at 31-Jul-2025 14:22:05 PDT
Begin processing the 66th record. run: 1083 subRun: 1 event: 6766 at 31-Jul-2025 14:22:10 PDT
Begin processing the 67th record. run: 1083 subRun: 1 event: 6767 at 31-Jul-2025 14:22:15 PDT
Begin processing the 68th record. run: 1083 subRun: 1 event: 6768 at 31-Jul-2025 14:22:18 PDT
Begin processing the 69th record. run: 1083 subRun: 1 event: 6769 at 31-Jul-2025 14:22:38 PDT
Begin processing the 70th record. run: 1083 subRun: 1 event: 6770 at 31-Jul-2025 14:22:42 PDT
Begin processing the 71st record. run: 1083 subRun: 1 event: 6771 at 31-Jul-2025 14:22:45 PDT
Begin processing the 72nd record. run: 1083 subRun: 1 event: 6772 at 31-Jul-2025 14:22:48 PDT
Begin processing the 73rd record. run: 1083 subRun: 1 event: 6773 at 31-Jul-2025 14:22:52 PDT
Begin processing the 74th record. run: 1083 subRun: 1 event: 6774 at 31-Jul-2025 14:22:56 PDT
Begin processing the 75th record. run: 1083 subRun: 1 event: 6775 at 31-Jul-2025 14:22:59 PDT
Begin processing the 76th record. run: 1083 subRun: 1 event: 6776 at 31-Jul-2025 14:23:02 PDT
Begin processing the 77th record. run: 1083 subRun: 1 event: 6777 at 31-Jul-2025 14:23:06 PDT
Begin processing the 78th record. run: 1083 subRun: 1 event: 6778 at 31-Jul-2025 14:23:09 PDT
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 79th record. run: 1083 subRun: 1 event: 6779 at 31-Jul-2025 14:23:12 PDT
PandoraContentApi::GetList(*this, m_inputHitListName, pCaloHitList) return STATUS_CODE_NOT_INITIALIZED
    in function: GetVolumeIdToHitListMap
    in file:     /exp/dune/app/users/lwhite86/DUNE-FD/pandoraEventClassification/srcs/larpandoracontent/larpandoracontent/LArControlFlow/MasterAlgorithm.cc line#: 271
this->GetVolumeIdToHitListMap(volumeIdToHitListMap) return STATUS_CODE_NOT_INITIALIZED
    in function: Run
    in file:     /exp/dune/app/users/lwhite86/DUNE-FD/pandoraEventClassification/srcs/larpandoracontent/larpandoracontent/LArControlFlow/MasterAlgorithm.cc line#: 165
iter->second->Run() throw STATUS_CODE_NOT_INITIALIZED
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0002, LArDLMaster, STATUS_CODE_NOT_INITIALIZED
Begin processing the 80th record. run: 1083 subRun: 1 event: 6780 at 31-Jul-2025 14:23:14 PDT
Begin processing the 81st record. run: 1083 subRun: 1 event: 6781 at 31-Jul-2025 14:23:43 PDT
Begin processing the 82nd record. run: 1083 subRun: 1 event: 6782 at 31-Jul-2025 14:23:47 PDT
Begin processing the 83rd record. run: 1083 subRun: 1 event: 6783 at 31-Jul-2025 14:23:59 PDT
Begin processing the 84th record. run: 1083 subRun: 1 event: 6784 at 31-Jul-2025 14:24:03 PDT
Begin processing the 85th record. run: 1083 subRun: 1 event: 6785 at 31-Jul-2025 14:24:16 PDT
Begin processing the 86th record. run: 1083 subRun: 1 event: 6786 at 31-Jul-2025 14:24:20 PDT
Begin processing the 87th record. run: 1083 subRun: 1 event: 6787 at 31-Jul-2025 14:24:24 PDT
Begin processing the 88th record. run: 1083 subRun: 1 event: 6788 at 31-Jul-2025 14:24:28 PDT
Begin processing the 89th record. run: 1083 subRun: 1 event: 6789 at 31-Jul-2025 14:25:09 PDT
Begin processing the 90th record. run: 1083 subRun: 1 event: 6790 at 31-Jul-2025 14:25:12 PDT
Begin processing the 91st record. run: 1083 subRun: 1 event: 6791 at 31-Jul-2025 14:25:15 PDT
Begin processing the 92nd record. run: 1083 subRun: 1 event: 6792 at 31-Jul-2025 14:25:18 PDT
Begin processing the 93rd record. run: 1083 subRun: 1 event: 6793 at 31-Jul-2025 14:25:22 PDT
Begin processing the 94th record. run: 1083 subRun: 1 event: 6794 at 31-Jul-2025 14:25:25 PDT
Begin processing the 95th record. run: 1083 subRun: 1 event: 6795 at 31-Jul-2025 14:25:30 PDT
Begin processing the 96th record. run: 1083 subRun: 1 event: 6796 at 31-Jul-2025 14:25:34 PDT
Skipping event as it does not have enough hits or associated primary particles to make a training sample
iter->second->Run() throw STATUS_CODE_FAILURE
    in function: RunAlgorithm
    in file:     /scratch/workspace/build-larbase/BUILDTYPE/prof/QUAL/s131-e26/label1/swarm/label2/SLF7/build/pandora/v03_16_00l/src/pandora-v03-16-00/PandoraSDK-v03-04-01/src/Api/PandoraContentApiImpl.cc line#: 235
Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
Begin processing the 97th record. run: 1083 subRun: 1 event: 6797 at 31-Jul-2025 14:25:36 PDT
Begin processing the 98th record. run: 1083 subRun: 1 event: 6798 at 31-Jul-2025 14:25:40 PDT
Begin processing the 99th record. run: 1083 subRun: 1 event: 6799 at 31-Jul-2025 14:25:43 PDT
Begin processing the 100th record. run: 1083 subRun: 1 event: 6800 at 31-Jul-2025 14:25:47 PDT
31-Jul-2025 14:25:52 PDT  Closed output file "anu_dune10kt_1x2x6_1083_67_20230824T232046Z_gen_g4_detsim_hitreco__20240223T224441Z_reco2_reco2.root"
31-Jul-2025 14:25:52 PDT  Closed input file "root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/staging/fardet-hd/84/43/anu_dune10kt_1x2x6_1083_67_20230824T232046Z_gen_g4_detsim_hitreco__20240223T224441Z_reco2.root"

================================================================================================================================
TimeTracker printout (sec)                        Min           Avg           Max         Median          RMS         nEvts   
================================================================================================================================
Full event                                     0.399791       3.81053       39.9083       2.49679       5.26481        100    
--------------------------------------------------------------------------------------------------------------------------------
source:RootInput(read)                         0.0626926     0.119335      0.559428      0.113709      0.0464238       100    
reco:pandora2:StandardPandora                  0.300534       3.69016       39.7666       2.36286        5.265         100    
[art]:TriggerResults:TriggerResultInserter     1.282e-05    3.1334e-05    0.000123063   2.41005e-05   2.19137e-05      100    
end_path:out1:RootOutput                       3.131e-06    7.04196e-06    2.066e-05     6.775e-06    2.21469e-06      100    
end_path:out1:RootOutput(write)               0.000243396   0.00067402    0.00614068    0.000542683   0.000629267      100    
================================================================================================================================

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 2229.64 MB
  Peak resident set size usage (VmHWM): 1191.24 MB
====================================================================================================
Art has completed and will exit with status 0.
lar exit code 0
total 1636
-rw-r--r--. 1 cuser cuser     210 Jul 31 14:16 all-input-dids.txt
-rw-r--r--. 1 cuser cuser  350780 Jul 31 14:25 anu_dune10kt_1x2x6_1083_67_20230824T232046Z_gen_g4_detsim_hitreco__20240223T224441Z_reco2_reco2.root
-rw-r--r--. 1 cuser cuser       0 Jul 31 14:16 debugprod.log
-rw-r--r--. 1 cuser cuser   48470 Jul 31 14:25 jobscript.log
-rw-r--r--. 1 cuser cuser     181 Jul 31 14:25 justin-processed-pfns.txt
drwxr-xr-x. 4 cuser cuser      48 Jul 31 14:16 larpandoracontent
-rw-r--r--. 1 cuser cuser     519 Jul 31 14:25 reco2_hist.root
-rw-r--r--. 1 cuser cuser 1261246 Jul 31 14:25 trainingFile_anu_dune10kt_1x2x6_1083_67_20230824T232046Z_gen_g4_detsim_hitreco__20240223T224441Z_reco2.root
justIN time: 2025-08-04 14:23:07 UTC       justIN version: 01.04.00
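
The jobscript log above interleaves Pandora "Failure in algorithm" messages with the normal per-event progress lines, even though lar exits with code 0. A minimal sketch of how such messages could be tallied from a saved copy of the log follows. The function name `count_failures` and the `sample` text are illustrative assumptions; the regular expression only assumes the "Failure in algorithm <id>, <name>, <status>" line shape seen in this log, not any documented Pandora or justIN format.

```python
import re
from collections import Counter

# Matches lines such as:
#   Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE
# The pattern is inferred from the log tail on this page (an assumption).
FAILURE_RE = re.compile(
    r"^Failure in algorithm (\w+), (\w+), (STATUS_CODE_\w+)$"
)


def count_failures(log_text):
    """Return a Counter keyed by (algorithm name, status code)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILURE_RE.match(line.strip())
        if match:
            _alg_id, name, status = match.groups()
            counts[(name, status)] += 1
    return counts


# Hypothetical three-line excerpt in the same shape as the log above.
sample = "\n".join([
    "Failure in algorithm Alg0004, LArCNNTrackShowerCounting, STATUS_CODE_FAILURE",
    "Begin processing the 61st record. run: 1083 subRun: 1 event: 6761 at 31-Jul-2025 14:21:55 PDT",
    "Failure in algorithm Alg0002, LArDLMaster, STATUS_CODE_NOT_INITIALIZED",
])

for (name, status), n in sorted(count_failures(sample).items()):
    print(f"{name}: {status} x{n}")
```

Keying on the algorithm name rather than the `AlgNNNN` identifier is a design choice for readability; the identifier is still captured by the pattern and could be used instead if the numbering is meaningful for a given configuration.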