
Meta confirms that the open-source large model LLaMA 3 will debut next month; by year-end it will build a compute reserve equivalent to 600,000 H100 GPUs | Big Model World

楚一帆

Nearly a year after launching the open-source large model LLaMA 2, Meta is about to release its next-generation large model, LLaMA 3.
At an event held in London on April 9th, Meta confirmed for the first time that it plans to release LLaMA 3 next month. The model is reported to come in multiple versions with different capabilities.
Meta did not, however, disclose LLaMA 3's parameter count. "Over time, our goal is to make the LLaMA-powered Meta AI the most useful assistant in the world," said Joelle Pineau, Vice President of AI Research at Meta. "There is still quite a lot of work to be done to get there."
According to a report published on April 8th by the tech outlet The Information, the large version of LLaMA 3, positioned as a benchmark against GPT-4, may exceed 140 billion parameters, whereas the largest LLaMA 2 version has 70 billion. LLaMA 3 is also expected to support multimodal processing, meaning it can understand and generate both text and images.
Notably, LLaMA 3 will continue Meta's long-standing open-source approach. Competition among open-source models is intensifying, and open-source large models keep growing more capable. To date, companies including Google, Musk's xAI, Mistral AI, and Stability AI have all released open-source large models.
As a leader in open-source models, Meta has invested heavily in AI infrastructure; currently, only Microsoft holds a comparable reserve of computing power. According to a Meta technical blog post, by the end of 2024 the company will have purchased an additional 350,000 Nvidia H100 GPUs which, together with its other GPUs, will give it compute equivalent to nearly 600,000 H100s.
Just one month away: LLaMA 3 set to debut
Parameter count may reach 140 billion
At the London event on April 9th, Meta confirmed for the first time its plan to release LLaMA 3 next month. "Within the next month, we hope to begin rolling out LLaMA 3, our new suite of next-generation foundation models, possibly even sooner," said Nick Clegg, Meta's President of Global Affairs.
Judging from Clegg's remarks, LLaMA 3 will come in multiple versions with different capabilities: "Over the course of this year, we will release a series of models with different capabilities and degrees of versatility, and we will begin releasing them very soon."
Meanwhile, Meta's Chief Product Officer Chris Cox added that Meta plans to use LLaMA 3 to power multiple Meta products.
It is worth noting that LLaMA 3 will continue Meta's long-standing open-source approach.
Unlike OpenAI, which has stuck to closed source and ever-larger models, Meta opted from the start for an open-source strategy and smaller LLMs.
In February 2023, Meta publicly released the LLaMA large model on its official website. Like the GPT-series models, LLaMA is an autoregressive language model built on the Transformer architecture.
LLaMA came in four parameter scales: 7 billion, 13 billion, 33 billion, and 65 billion, with the aim of promoting smaller-scale, more democratized LLM research. By contrast, GPT-3 topped out at 175 billion parameters. Meta noted in the accompanying paper that, despite being more than 10 times smaller, the 13-billion-parameter LLaMA outperformed GPT-3 on most benchmarks.
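To make "autoregressive" concrete: the model emits one token at a time, each conditioned on all the tokens before it. Below is a minimal illustrative sketch using the Hugging Face transformers library; the checkpoint name and prompt are stand-in assumptions (LLaMA weights are gated and require an access request), not Meta's own tooling.

```python
# Minimal sketch: greedy autoregressive decoding with a LLaMA-family model.
# Assumes `transformers` and `torch` are installed and that access to the
# gated "meta-llama/Llama-2-7b-hf" checkpoint (an illustrative choice)
# has been granted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # 7B: the smallest LLaMA 2 size
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Open-source language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# generate() runs the autoregressive loop: predict the next token,
# append it to the context, and repeat up to max_new_tokens.
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```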
Generally speaking, smaller models cost less, run faster, and are easier to fine-tune. As Meta CEO Mark Zuckerberg put it on an earlier earnings call, open-source models are often safer, more efficient, and more cost-effective to run, because they are under constant community scrutiny and development.
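On "easier to fine-tune": a popular low-cost route is parameter-efficient fine-tuning, where small adapter matrices are trained while the base weights stay frozen. The sketch below uses LoRA via the peft library; the checkpoint and every hyperparameter are illustrative assumptions, not Meta's recipe.

```python
# Sketch: parameter-efficient fine-tuning of a small LLaMA checkpoint with
# LoRA adapters (peft). All hyperparameters below are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections in LLaMA
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
# From here, train with any standard causal-LM loop or transformers.Trainer.
```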
On the question of open source, Zuckerberg also told The Verge: "I tend to believe that one of the biggest challenges is that if what you build is truly valuable, it ends up becoming very concentrated and narrow. If you make it more open, you address a broad class of problems that unequal access to opportunity and value can bring. So this is an important part of the whole open-source vision."
In addition, small models make it easier for developers to build AI software for mobile devices, which is why the LLaMA family has drawn such wide attention from developers since it was open-sourced. Many models on GitHub today are built on the LLaMA series.
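As one concrete example of this on-device use, a common route for running LLaMA-family models on laptops and phones is quantized inference through the community project llama.cpp. A brief sketch with its llama-cpp-python bindings follows; the GGUF file path is a placeholder for any 4-bit quantized checkpoint.

```python
# Sketch: local CPU inference on a quantized LLaMA-family model via
# llama-cpp-python. The model path is a placeholder; a 4-bit 7B GGUF
# checkpoint typically occupies only a few gigabytes.
from llama_cpp import Llama

llm = Llama(model_path="./llama-2-7b.Q4_K_M.gguf", n_ctx=2048)
out = llm("Q: Why do developers like small models? A:", max_tokens=64)
print(out["choices"][0]["text"])
```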
In July last year, Meta released LLaMA 2, once again adopting a small-models-first strategy: before releasing the large 70-billion-parameter version of LLaMA 2, Meta first released 13-billion- and 7-billion-parameter versions.
However, in testing, LLaMA 2 refused to answer some questions that were not especially controversial, such as how to prank a friend or how to "kill" a car engine. In recent months, Meta has been working to make LLaMA 3 more open and more accurate when answering contested questions.
Although Meta has not disclosed LLaMA 3's parameter count, The Information reports that the large version, benchmarked against GPT-4, exceeds 140 billion parameters, twice the size of the largest LLaMA 2.
Across the open-source model industry, competition is intensifying and open-source large models keep getting stronger.
In February this year, Google, in a rare departure from last year's closed-source stance on large models, launched the open-source model Gemma; in March, Musk open-sourced Grok-1 from his company xAI. According to the benchmark results published for Gemma and Grok-1, both outperform similarly sized LLaMA 2 models on multiple tests covering mathematics, reasoning, and code.
To date, technology companies including Google, xAI, Mistral AI, Databricks, and Stability AI have released open-source large models. An industry insider previously told the Daily Economic News: "Open source is the trend, and I believe Meta is leading that trend, followed by smaller companies such as Mistral AI and Hugging Face."
Chasing AGI: spending over $10 billion to stockpile chips
By year-end, compute equivalent to roughly 600,000 H100s
As a leader in open-source models, Meta has invested heavily in AI infrastructure.
Last month, in fact, Meta published a technical blog post detailing its computing resources and its AI-infrastructure roadmap. The company said its long-term vision is to build open, responsibly developed artificial general intelligence (AGI) that can be widely available and broadly beneficial.
Meta wrote in the post: "By the end of 2024, our goal is to continue growing our (AI) infrastructure build-out, including 350,000 Nvidia H100 GPUs, as part of a portfolio with compute power equivalent to nearly 600,000 H100s." Reportedly, only Microsoft currently holds a comparable reserve of computing power. At the roughly $30,000 per unit listed on Amazon, 350,000 H100s come to about $10.5 billion (roughly 76 billion RMB).
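The arithmetic behind those figures is easy to verify; here is a back-of-envelope check under the article's stated assumptions.

```python
# Back-of-envelope check using the article's stated assumptions:
# ~$30,000 per H100 and 350,000 units purchased.
h100_price_usd = 30_000
h100_units = 350_000
spend = h100_price_usd * h100_units
print(f"H100 spend: ${spend / 1e9:.1f}B")  # -> H100 spend: $10.5B
# The remaining ~250,000 H100-equivalents come from Meta's other GPUs.
```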
In the same post, Meta also revealed details of a cluster used to train LLaMA 3, consisting of 24,576 Nvidia H100 GPUs.
According to a report released last year by the market-tracking firm Omdia, Meta and Microsoft are the largest buyers of Nvidia H100 GPUs: it estimates each company bought up to 150,000 H100s in 2023, more than three times as many as technology companies such as Google, Amazon, and Oracle.
In the same document, Meta reiterated its long-standing commitment to open source: "Meta has always been committed to open innovation in AI software and hardware. We believe open-source hardware and software will always be valuable tools to help the industry solve problems at scale."
It is worth mentioning that, amid this AI investment, Zuckerberg ranks fourth on Forbes' latest 2024 (38th annual) World's Billionaires List, with a net worth of $177 billion, his highest ranking ever. His net worth also grew the most of anyone over the past year in dollar terms, up $112.6 billion, a gain of 174.8%.