GPT-4 will have 100 trillion parameters

Tram Ho

OpenAI was founded with the goal of achieving Artificial General Intelligence (AGI), a form of artificial intelligence capable of doing anything a human can do. This technology has the potential to change the world as we know it. Used correctly, AGI could benefit everyone; in the wrong hands, it could also become the most feared of weapons. OpenAI has therefore taken on the mandate of ensuring that it benefits everyone equally.

“Our goal is to advance digital intelligence in a way that can most benefit humanity.”

Although there have been many advances in computer science and artificial intelligence, achieving artificial general intelligence (AGI) remains one of the largest scientific projects humanity has ever attempted. Stuart Russell, a computer science professor at Berkeley and a pioneer in the field of artificial intelligence, argues that “focusing on raw computing power completely misses the point […] We do not know how to make a machine really intelligent, even if it were the size of the universe.”

However, OpenAI believes that training large neural networks on massive datasets with massive compute is the most promising way to achieve AGI. OpenAI believes in the “scaling hypothesis”: given an algorithm that scales well, such as the Transformer (the architecture underlying GPT), there could be a straightforward path to AGI by training increasingly larger models based on that algorithm.

And that is what OpenAI did. They started training larger and larger models to exploit the latent potential of deep learning. The first clear steps in this direction were the releases of GPT and GPT-2. These massive language models laid the groundwork for the star of the show: GPT-3, a language model roughly 100 times larger than GPT-2, with 175 billion parameters.
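As a rough sanity check on those numbers, the non-embedding parameter count of a GPT-style decoder is often approximated with the rule of thumb 12 · n_layers · d_model². Below is a minimal sketch using the published configurations of GPT-2 (48 layers, width 1,600) and GPT-3 (96 layers, width 12,288); the real counts also include embeddings and biases, so this is only an approximation.

```python
# Back-of-envelope parameter count for a GPT-style decoder.
# Rule of thumb: ~12 * n_layers * d_model**2 weights in the Transformer
# blocks (attention + MLP), ignoring embeddings and biases.

def approx_params(n_layers: int, d_model: int) -> int:
    """Approximate non-embedding parameter count of a decoder-only Transformer."""
    return 12 * n_layers * d_model ** 2

gpt2 = approx_params(n_layers=48, d_model=1600)    # GPT-2 (1.5B) published config
gpt3 = approx_params(n_layers=96, d_model=12288)   # GPT-3 (175B) published config

print(f"GPT-2 ~ {gpt2 / 1e9:.1f}B parameters")     # ~1.5B
print(f"GPT-3 ~ {gpt3 / 1e9:.0f}B parameters")     # ~174B
print(f"GPT-3 / GPT-2 ~ {gpt3 / gpt2:.0f}x")       # ~118x, i.e. "about 100 times larger"
```

The estimate lands close to the published 1.5 billion and 175 billion figures, which is why the “100 times larger” comparison holds.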

GPT-3 was the largest neural network ever created, and it remains the largest dense neural network. Its command of language and its innumerable abilities surprised most people. And although some experts remain skeptical, large language models already feel remarkably human. For OpenAI’s researchers, it was a big step toward strengthening their belief, and convincing the rest of us, that AGI is a problem deep learning can solve.

The three most important factors – algorithms, data and compute.

OpenAI believes in the scaling hypothesis: if there is an algorithm capable of scaling, in this case the Transformer, the architecture underlying the GPT family, there could be a direct path to AGI by training increasingly large models based on it.

However, large models are only one part of the AGI puzzle. To train them, large data sets and large computing power are required.

Data stopped being an obstacle once the machine learning community began to explore the potential of unsupervised learning. That, combined with generative language models capable of few-shot task transfer, solved the “large dataset” problem for OpenAI.
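To illustrate what few-shot task transfer means in practice, here is a minimal, self-contained sketch (hypothetical examples, no API call): a handful of labeled examples are written directly into the prompt, and the model is expected to continue the pattern for a new input.

```python
# Minimal sketch of a few-shot prompt: the "training data" for the task
# is just a handful of examples embedded in the text itself.

examples = [
    ("The movie was wonderful", "positive"),
    ("I wasted two hours of my life", "negative"),
    ("An instant classic", "positive"),
]

def build_few_shot_prompt(new_input: str) -> str:
    """Turn a few labeled examples plus a new input into a single prompt string."""
    lines = [f"Review: {text}\nSentiment: {label}\n" for text, label in examples]
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n".join(lines)

print(build_few_shot_prompt("The plot made no sense"))
# A large language model completing this prompt is expected to answer "negative"
# without any task-specific fine-tuning; that is few-shot task transfer.
```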

OpenAI needed massive computational resources to train its models, so it partnered with Microsoft to gain access to their powerful GPUs and cloud computing infrastructure. However, GPUs alone are not enough to train OpenAI’s increasingly large models, so the company turned to specialized third-party AI chips, one of which comes from Cerebras Systems. In 2019 Cerebras built the largest chip ever made for training large neural networks, and OpenAI uses it to take full advantage of this engineering marvel.

One chip and one model – WSE-2 & GPT-4.

Two important pieces of news were revealed in Wired magazine two weeks ago. First, Cerebras has built the newest chip on the market, the Wafer Scale Engine Two (WSE-2), with 2.6 trillion transistors and 850,000 compute cores. The company solved the problems of packing that much compute efficiently, cooling it, and keeping I/O data flowing efficiently. The chip’s applications are very limited, but training large neural networks is one of them. And so Cerebras approached OpenAI for a partnership.

Second, Andrew Feldman, CEO of Cerebras, revealed that GPT-4 will have about 100 trillion parameters and will be released within the next few years. That shows OpenAI intends to keep moving forward with extremely large language models. GPT-4 would be more than 500 times larger than the GPT-3 that shocked the world last year.
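The “500 times larger” figure follows directly from the two parameter counts; a quick check:

```python
# Quick check of the headline ratio.
gpt3_params = 175e9    # 175 billion (GPT-3)
gpt4_params = 100e12   # 100 trillion (the figure attributed to Andrew Feldman)

print(gpt4_params / gpt3_params)  # ~571, i.e. more than 500 times larger
```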

What can we expect from GPT-4?

GPT-4’s 100 trillion parameters is a genuinely enormous number. To put it in perspective, we can compare it with the human brain, which contains roughly 80-100 billion neurons (on the same order as GPT-3’s parameter count) and about 100 trillion synapses.

GPT-4 would have roughly as many parameters as the brain has synapses.

Such an increase in the size of a neural network could yield advances over GPT-3 that we cannot yet imagine. However, comparing an artificial neural network with the brain is a complicated business, because artificial neurons are loosely modeled on biological neurons but are not equivalent to them. A recent study in Neuron showed that a neural network of at least 5 layers is needed to simulate the behavior of a single biological neuron, which works out to roughly 1,000 artificial neurons per biological neuron.
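To make that caveat concrete, here is a hedged back-of-envelope comparison using only the rough figures quoted above (about 86 billion neurons, within the 80-100 billion range, 100 trillion synapses, and the roughly 1,000-artificial-neurons-per-biological-neuron estimate):

```python
# Back-of-envelope comparison between GPT-4's rumored size and the brain,
# using only the rough figures quoted in this article.

biological_neurons = 86e9        # ~86 billion neurons in the human brain
synapses = 100e12                # ~100 trillion synapses
gpt4_params = 100e12             # rumored GPT-4 parameter count

# Naive comparison: one parameter per synapse.
print(gpt4_params / synapses)            # = 1.0 -> "as many parameters as synapses"

# But if ~1,000 artificial neurons are needed to mimic one biological neuron
# (the Neuron-study estimate), the picture looks very different:
artificial_neurons_needed = biological_neurons * 1000
print(artificial_neurons_needed)         # ~8.6e13 artificial *neurons*, before even
                                         # counting the parameters connecting them
```

In other words, parameters map loosely onto synapses, not neurons, and even that mapping understates the complexity of biological neurons; the comparison is a rough yardstick, not a claim of equivalence.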

Either way, GPT-4 will bring surprises. Unlike GPT-3, it may be more than just a language model. Ilya Sutskever, chief scientist at OpenAI, hinted at this when he wrote about multimodality in December 2020:

“In 2021, language models will begin to perceive the visual world. Text alone can express a lot of information about the world, but it is incomplete, because we live in a world of images.”

GPT-4 promises to be a major breakthrough in artificial intelligence. With as many as 100 trillion parameters, it would have roughly the same capacity as the brain has synapses, an astonishing step up from GPT-3’s 175 billion parameters. GPT-4 may be not just a language model but one capable of manipulating visual concepts through language and even of programming. However, whether GPT-4 will exhibit human-like traits such as reasoning and common sense remains an open question. Even so, OpenAI has been relentless in exploiting the hidden capabilities of GPT-3 through specialized systems such as DALL·E and Codex. GPT-4 promises to combine the breadth of a general system with the depth of those specialized ones, and it would be a remarkable breakthrough in the field of artificial intelligence.

Source: https://congdongchatgpt.com/d/43-gpt-4-se-co-100-nghin-ty-tham-so-gap-500-lan-kich-thuoc-cua-gpt-3


Source: Viblo