The Max Series is slated to launch in January 2023, and the new processors have already begun shipping to initial customers, including Argonne. Intel is building a supercomputer for Argonne National Laboratory with CPU Max and GPU Max processors that, when it goes online in 2023, will exceed 2 exaflops of performance. That's double the speed of Frontier, the current leader in the supercomputer race. McVeigh said it's the combination of the two processors that makes it happen.

“You'd argue, well, we've offloaded everything to the GPU, we don't need that highest-end CPU, right? We don't need the HBM memory, right? Wrong. We can gain significant performance improvement by turning on the HBM that's integrated within the CPU, because there's still a lot of code that runs on the CPU, even if we've offloaded some of our larger kernels off to the GPU,” he said.

GPU Max comes in three configurations: the 1100, 1350, and 1550 models. The 1100 is a 300-watt, double-wide PCIe card with 56 Xe cores and 48GB of HBM2e memory; multiple cards can be connected via Intel Xe Link bridges. The other two configurations use the Open Compute Project (OCP) accelerator module, known as OAM, a faster alternative interface to PCIe cards. The 1350 is a 450-watt OAM module with 112 Xe cores and 96GB of HBM, and the 1550 is a 600-watt OAM module with 128 Xe cores and 128GB of HBM.

The PCI Express card is well suited to standard servers and even workstation systems, but the OAM modules are oriented toward higher-density environments, said McVeigh. He said Intel has a number of system designs in development with OEMs and system solution providers, who will bring out servers with OAM starting in 2023.
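The three GPU Max SKUs above can be restated as a short sketch. The figures come from the article; the dictionary layout and the derived HBM-per-core ratio are purely illustrative, not Intel terminology:

```python
# GPU Max SKU figures as stated in the article (power in watts, memory in GB).
GPU_MAX_SKUS = {
    "1100": {"form_factor": "PCIe", "power_w": 300, "xe_cores": 56,  "hbm_gb": 48},
    "1350": {"form_factor": "OAM",  "power_w": 450, "xe_cores": 112, "hbm_gb": 96},
    "1550": {"form_factor": "OAM",  "power_w": 600, "xe_cores": 128, "hbm_gb": 128},
}

def hbm_per_xe_core(model):
    """Derived ratio: GB of HBM available per Xe core for a given SKU."""
    sku = GPU_MAX_SKUS[model]
    return sku["hbm_gb"] / sku["xe_cores"]

for model in GPU_MAX_SKUS:
    print(f"GPU Max {model}: {hbm_per_xe_core(model):.2f} GB HBM per Xe core")
```

One thing the ratio makes visible: the top-end 1550 pairs a full gigabyte of HBM with each Xe core, while the two smaller SKUs keep roughly the same (lower) memory-to-core balance.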
“If you look at the overall workloads in the HPC and AI domain, there's a wide diversity of workloads,” said Jeff McVeigh, vice president and general manager of supercomputing at Intel. “Traditionally, there's been two routes up this summit.” One is the CPU route and the other is the GPU route, and each has its own obstacles. “And our goal is to really go forward and address them holistically.”

The focus of CPU Max and GPU Max is on maximizing the bandwidth, maximizing the compute, and maximizing the capabilities and possibilities they offer for addressing that breadth of workloads, said McVeigh.

CPU Max

CPU Max comes in three server configurations. The first is without DRAM, so the only memory in the system is the 64GB of HBM on the CPU Max chip. This is how Japan's Fugaku supercomputer, for some time one of the fastest supercomputers in the world, operates. In this scenario, applications can run unchanged. In a two-socket system, that's 128GB of memory, which McVeigh said “for many applications and workloads is sufficient.”

The second configuration is HBM flat mode, which combines the HBM in the CPU package with standard DDR5 memory sticks in the system. With both HBM and DDR present, software needs to be optimized to move data between the different memory regions.

The third configuration is HBM caching mode, where the HBM acts as a cache for the DDR memory in the system. In this mode, no software code changes are required. “You might want to do some tuning to utilize that very large cache that you now have, but you don't have to; you get immediate benefits,” said McVeigh.
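The three CPU Max memory configurations can be captured in a minimal sketch. The 64GB-per-socket capacity and the "code changes required" behavior are as described above; the mode labels and function names are my own shorthand, not Intel's:

```python
# Sketch of the three CPU Max server configurations described above.
# 64GB of HBM per socket is from the article; names are illustrative.
HBM_PER_SOCKET_GB = 64

MEMORY_MODES = {
    # HBM-only: no DRAM; applications run unchanged but must fit in HBM.
    "hbm_only": {"has_ddr": False, "code_changes_required": False},
    # Flat mode: HBM plus DDR5; software must place data explicitly.
    "flat":     {"has_ddr": True,  "code_changes_required": True},
    # Caching mode: HBM acts as a cache for DDR; no code changes needed,
    # though tuning for the large cache can help.
    "cache":    {"has_ddr": True,  "code_changes_required": False},
}

def hbm_capacity_gb(sockets=2):
    """Total HBM in the system, e.g. 128GB for a two-socket server."""
    return sockets * HBM_PER_SOCKET_GB

def runs_unchanged(mode):
    """Whether existing software runs without modification in this mode."""
    return not MEMORY_MODES[mode]["code_changes_required"]
```

The sketch makes the trade-off explicit: flat mode is the only configuration that demands code changes, in exchange for letting software address HBM and DDR capacity side by side.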