11 minutes to finish training GPT-3! Nvidia H100 sweeps 8 MLPerf benchmark tests; the next generation of graphics cards arrives in 2025

**Source:** Xinzhiyuan

Introduction: Boss Huang has won again! In the latest MLPerf benchmarks, the H100 set records in all eight tests. According to foreign media, the next generation of consumer-grade graphics cards may not be released until 2025.

In the latest MLPerf training benchmark test, the H100 GPU set new records in all eight tests!

Today, the NVIDIA H100 pretty much dominates all categories and is the only GPU used in the new LLM benchmark.

A cluster of 3,584 H100 GPUs completed a large-scale benchmark based on GPT-3 in just 11 minutes.

The MLPerf LLM benchmark is based on OpenAI's GPT-3 model, which contains 175 billion parameters.

Lambda Labs estimates that fully training a model of this size requires about 3.14E23 FLOPs of computation.
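That figure lines up with the common 6·N·D rule of thumb (roughly 6 FLOPs per parameter per training token); the sketch below is a back-of-envelope check, with the ~300 billion token count an assumption taken from the original GPT-3 paper rather than from this article:

```python
# Back-of-envelope check of the ~3.14e23 FLOPs estimate using the common
# 6 * N * D approximation (about 6 FLOPs per parameter per training token).
# The 300B-token figure is an assumption from the original GPT-3 paper.
params = 175e9   # GPT-3 parameter count
tokens = 300e9   # training tokens (assumed)
total_flops = 6 * params * tokens
print(f"{total_flops:.2e} FLOPs")  # -> 3.15e+23, in line with the estimate above
```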

Training GPT-3 in 11 minutes: how was this monster built?

The highest-ranking system on the LLM and BERT natural language processing (NLP) benchmarks was jointly developed by NVIDIA and Inflection AI.

The run was hosted by CoreWeave, a cloud service provider specializing in enterprise-grade GPU-accelerated workloads.

The system combines 3,584 NVIDIA H100 accelerators with 896 Intel Xeon Platinum 8462Y+ processors.

A key factor is the new Transformer Engine that Nvidia introduced in the H100, which is purpose-built to accelerate Transformer training and inference and can speed up training by as much as 6x.
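For a sense of how this looks in practice, here is a minimal sketch of FP8 training with Nvidia's open-source Transformer Engine library; the layer sizes are arbitrary, and the exact recipe arguments may differ across library versions:

```python
# Minimal sketch: running a layer under FP8 with NVIDIA Transformer Engine.
# Assumes an H100-class GPU and the transformer_engine package; recipe
# arguments are illustrative and may vary by library version.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# HYBRID = E4M3 for forward activations/weights, E5M2 for backward gradients.
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(32, 4096, device="cuda", requires_grad=True)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)          # GEMM executes in FP8 on H100 Tensor Cores
y.sum().backward()        # backward pass also runs under the FP8 recipe
```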

The performance that CoreWeave can deliver from the cloud is very close to what Nvidia can deliver from an AI supercomputer running in an on-premises data center.

This is thanks to the low-latency networking of the NVIDIA Quantum-2 InfiniBand network used by CoreWeave.

As the number of H100 GPUs involved in training scaled from hundreds to more than 3,000, careful optimization allowed the entire technology stack to achieve near-linear performance scaling on the demanding LLM test.

If the number of GPUs is cut in half, the time to train the same model rises to 24 minutes, more than double the 11-minute figure, which suggests the overall system's scaling efficiency actually becomes superlinear as GPUs are added.
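The arithmetic behind that claim is simple: halving the GPUs would ideally double the time from 11 to 22 minutes, so the reported 24 minutes means doubling the GPU count buys slightly more than a 2x speedup. A quick check, using only the figures from the article:

```python
# Scaling check from the reported numbers: 3,584 GPUs finish in ~11 minutes,
# half as many take ~24 minutes (both figures from the article).
t_large, t_small = 11.0, 24.0        # minutes
speedup = t_small / t_large          # speedup gained by doubling the GPU count
efficiency = speedup / 2.0           # 1.0 (100%) would be perfectly linear
print(f"speedup {speedup:.2f}x, scaling efficiency {efficiency:.0%}")
# -> speedup 2.18x, scaling efficiency 109% (slightly superlinear)
```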

The main reason is that Nvidia accounted for this problem from the very start of GPU design, using NVLink technology to make communication between GPUs efficient.
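The communication pattern at stake is the gradient all-reduce that data-parallel training runs every step. Below is a minimal PyTorch sketch of that collective (the script name and two-process launch are hypothetical); with the NCCL backend, the transfer rides NVLink within a node and InfiniBand across nodes where available:

```python
# Minimal sketch of the per-step gradient all-reduce in data-parallel training.
# With the NCCL backend, NCCL routes traffic over NVLink/InfiniBand itself.
# Example launch (hypothetical): torchrun --nproc_per_node=2 allreduce_demo.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)
    # Each rank contributes its local "gradient"; all_reduce sums across GPUs.
    grad = torch.full((4,), float(rank + 1), device="cuda")
    dist.all_reduce(grad, op=dist.ReduceOp.SUM)
    print(f"rank {rank}: {grad.tolist()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```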

Of the 90 systems tested, 82 were accelerated using NVIDIA GPUs.

[Figure: single-card training efficiency]

[Figure: system cluster training-time comparison]

Intel's submitted systems used anywhere from 64 to 96 Intel Xeon Platinum 8380 processors and 256 to 389 Intel Habana Gaudi2 accelerators.

Intel's GPT-3 submission, however, posted a training time of 311 minutes.

Next to Nvidia's numbers, that result looks rather bleak.

Analysts: Nvidia's advantage is overwhelming

Industry analysts believe Nvidia's technical advantage in GPUs is unmistakable.

As an AI infrastructure provider, Nvidia's dominance is also reflected in the stickiness of the ecosystem it has built up over the years.

The AI community is also very dependent on Nvidia's software.

Almost all AI frameworks are based on the underlying CUDA libraries and tools provided by Nvidia.
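As a trivial illustration of that dependence, even a plain matrix multiply in PyTorch is dispatched to Nvidia's cuBLAS kernels when it runs on a CUDA device, with no CUDA code written by the user:

```python
# A plain PyTorch matmul: on a CUDA device this dispatches to Nvidia's
# cuBLAS GEMM kernels under the hood (falls back to CPU if no GPU is present).
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(1024, 1024, device=device)
b = torch.randn(1024, 1024, device=device)
c = a @ b
print(c.device, c.shape)  # e.g. cuda:0 torch.Size([1024, 1024])
```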

On top of CUDA, Nvidia also offers full-stack AI tools and solutions.

In addition to supporting AI developers, Nvidia continues to invest in enterprise-grade tools for managing workloads and models.

For the foreseeable future, Nvidia's leading position in the industry looks secure.

Analysts further note that the capability and efficiency of Nvidia's systems for cloud AI training, as demonstrated by the MLPerf results, are Nvidia's biggest asset in the "war for the future."

Ada Lovelace's next-generation successor, coming in 2025

Zhiye Liu, a freelance writer at Tom's Hardware, also recently published an article outlining Nvidia's plans for the generation after its Ada Lovelace graphics cards.

There is no doubt about the ability of H100 to train large models.

With only 3,584 H100s, a GPT-3 model can be trained in just 11 minutes.

At a recent press conference, Nvidia shared a new roadmap detailing next-generation products, including the successor to the Ada Lovelace GPUs behind the GeForce RTX 40 series, which includes some of the best gaming graphics cards available today.

According to the roadmap, Nvidia plans to launch the "Ada Lovelace-Next" graphics card in 2025.

If the current naming scheme continues, the next generation of GeForce products should arrive as the GeForce RTX 50 series.

According to information leaked by the South American hacker group LAPSU$, Hopper's successor ("Hopper Next") is likely to be named Blackwell.

On consumer-grade graphics cards, Nvidia maintains a two-year update rhythm.

They launched Pascal in 2016, Turing in 2018, Ampere in 2020, and Ada Lovelace in 2022.

If Ada Lovelace's successor does not launch until 2025, Nvidia will undoubtedly be breaking that rhythm.

The recent AI explosion has created a huge demand for NVIDIA GPUs, whether it is the latest H100 or the previous generation A100.

According to reports, a major manufacturer has ordered Nvidia GPUs worth $1 billion this year.

Despite export restrictions, China remains one of Nvidia's largest markets in the world.

(At Shenzhen's Huaqiangbei electronics market, small numbers of Nvidia A100s are reportedly available for $20,000 each, twice the usual price.)

In response, Nvidia has tweaked some of its AI products and released export-compliant SKUs such as the H800 and A800.

Zhiye Liu's analysis: viewed from another angle, the export regulations actually benefit Nvidia, because customers must buy more of the cut-down variants to obtain the same performance as the original GPUs.

This also explains why Nvidia is prioritizing compute GPUs over gaming GPUs.

Recent reports indicate that Nvidia has ramped up production of compute-grade GPUs.

With no serious competition from AMD's RDNA 3 product stack, and with Intel posing no real threat to the GPU duopoly either, Nvidia can afford to stall on the consumer side.

More recently, Nvidia has expanded its GeForce RTX 40-series product stack with the GeForce RTX 4060 and GeForce RTX 4060 Ti.

There's potential for a GeForce RTX 4050, along with an RTX 4080 Ti or GeForce RTX 4090 Ti on top, etc.

If pushed, Nvidia could also dust off the old Turing playbook and give Ada Lovelace a "Super" refresh, further expanding the Ada lineup.

Finally, Zhiye Liu concluded that the Lovelace architecture will not see a true successor this year or next.
