Each job submitted to the gpu partition is given exclusive access to an entire GPU node, so to fully utilize the system a job should attempt to use all eight H100 cards. It is also important to keep the characteristics of the dual EPYC 9534 processors in mind: the NUMA topology of each node maps four GPUs to each processor. For most jobs this means running eight tasks, with four processes bound to Processor 0 and four to Processor 1, which helps reduce communication time between the GPUs and their host processes. Within an interactive session, OpenMPI should distribute the processes appropriately by default. For hybrid programs that also use OpenMP, it is also important to map each of the eight tasks to an L3 cache on the appropriate processor.
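As a sketch of the mapping described above, the following batch script requests one full node with eight tasks split evenly across the two sockets, then binds each MPI rank to an L3 cache. The partition name, walltime, application name (`./my_app`), and thread count are placeholders; adjust them to your site and code, and note that the `--map-by`/`--bind-to` syntax shown is for OpenMPI 4.x (newer releases use `package` in place of `socket`).

```shell
#!/bin/bash
#SBATCH --partition=gpu          # partition name assumed from this guide
#SBATCH --nodes=1                # one full GPU node (exclusive access)
#SBATCH --ntasks=8               # one task per H100 card
#SBATCH --ntasks-per-socket=4    # four tasks on each EPYC 9534 processor
#SBATCH --gpus-per-node=8        # request all eight GPUs
#SBATCH --time=01:00:00          # example walltime; adjust as needed

# Threads per task for a hybrid MPI+OpenMP run; tune for your application.
export OMP_NUM_THREADS=8

# Place four ranks on each socket and bind each rank to an L3 cache,
# so its OpenMP threads stay within one cache domain.
mpirun -np 8 --map-by ppr:4:socket --bind-to l3cache ./my_app
```

Pure-MPI codes can omit the OpenMP settings and rely on the default binding, but the explicit `--map-by`/`--bind-to` flags make the placement reproducible in batch jobs.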