Mar 21, 2023
NVIDIA Announces H100 NVL — Max Memory Server Card for Large Language Models
Posted by Eric Klien in category: robotics/AI
ChatGPT is currently deployed on A100 chips that have 80 GB of memory each. Nvidia decided this was a bit wimpy, so it developed much faster H100 chips (the H100 is roughly twice as fast as the A100) with 94 GB of memory each, and then found a way to put two of them on one card with high-speed NVLink connections between them, for a total of 188 GB of memory per card.
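The per-card memory math works out as follows (a minimal sketch; the variable names are illustrative, and the figures are the ones the post itself cites):

```python
# Figures quoted in the post (hedged sketch, not an official spec sheet):
A100_GB = 80            # memory per A100 GPU
H100_NVL_GPU_GB = 94    # memory per H100 GPU in the NVL variant
GPUS_PER_NVL_CARD = 2   # two H100s bridged together on one NVL card

# Total memory available on one H100 NVL card is simply the sum
# of both GPUs' memory pools.
card_total_gb = GPUS_PER_NVL_CARD * H100_NVL_GPU_GB
print(card_total_gb)  # 188
```

That 188 GB figure is why the card is pitched at large language models, whose weights and activations are often constrained by per-card memory rather than raw compute.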
So hardware is getting more and more impressive!
While this year’s Spring GTC event doesn’t feature any new GPUs or GPU architectures from NVIDIA, the company is still in the process of rolling out new products based on the Hopper and Ada Lovelace GPUs it introduced in the past year. At the high end of the market, the company today is announcing a new H100 accelerator variant specifically aimed at large language model users: the H100 NVL.