Next-Generation Cluster Management Software
Abstract: Over the last six decades, Los Alamos National Laboratory (LANL) has acquired, accepted, and integrated over 100 new high performance computing (HPC) systems, from MANIAC in 1952 to Trinity in 2017. These systems range from small clusters to large supercomputers. HPC system architecture has changed progressively over this time as well, from single system images to complex, interdependent service infrastructures within a large HPC system. The authors propose a redesign of the current HPC system architecture to reduce downtime and provide a more resilient design.
Workshop: HPC Systems Professionals Workshop (HPCSYSPROS18)