Meta releases its strongest open-source model to catch up with GPT-4; Zuckerberg: we will overtake next year
m337283
Posted on 2024-7-24 14:01:02
On July 23, Pacific Time, Meta (formerly Facebook) officially released its Llama 3.1 models in three sizes, 8B, 70B, and 405B, with the context length increased to 128K. According to benchmark data provided by Meta, the most anticipated model, the 405B (405 billion parameters), is already comparable in performance to OpenAI's GPT-4 and to Claude 3 from the AI startup Anthropic. This suggests that top open-source models have caught up with top closed-source models in performance, and that the open-versus-closed debate may be nearing an end.
Alongside the release, Zuckerberg published an open-source manifesto titled "Open Source AI Is the Path Forward." In it he wrote: "Today, several tech companies are developing leading closed models. But open source is quickly closing the gap."
Open-source Llama 3.1 405B rivals closed-source GPT-4 in performance
According to Meta, Llama 3.1 was trained on more than 15 trillion tokens using 16,000 H100 GPUs. The pretraining data has a cutoff of December 2023. To keep training stable, Meta made only minor adjustments to a standard Transformer architecture rather than adopting the popular mixture-of-experts (MoE) architecture.
Llama 3.1 currently supports conversation in multiple languages. Sima Huapeng, founder of Silicon-Based Intelligence, commented that the longer context window greatly improves the model's ability to process information: "To give an analogy, it is like going from being able to remember only 4,000 Chinese characters to being able to remember 64,000."
The industry has long debated open source versus closed source. At the World Artificial Intelligence Conference this month, Robin Li, founder, chairman, and CEO of Baidu, again declared on stage that "the commercial closed-source model is the best." Li said that open-source models are valuable in academic research and teaching, where they can be used to study how large models work and to build theory. In a fiercely competitive business environment, however, he argued that commercial closed-source models are the most effective way to achieve higher efficiency and lower costs than one's peers.
Meta's benchmark data, however, shows that open-source models can be just as capable. The most closely watched Llama 3.1 model, the 405B (405 billion parameters), is already comparable in performance to GPT-4 and Claude 3, which means that top open-source models have caught up with flagship closed-source models.
This release is also more thoroughly open. When Meta launched Llama 3 8B and Llama 3 70B in April this year, it still prohibited developers from using those models to train other generative models. The new license released this time no longer prohibits using the new models to improve other models.
Alongside Meta's launch, Nvidia announced the new NVIDIA AI Foundry service and NVIDIA NIM inference microservices, which support the newly released Llama 3.1 open-source models and are aimed at generative AI for enterprises worldwide. According to Nvidia, with NVIDIA AI Foundry, enterprises and countries can now use Llama 3.1 together with NVIDIA software, computing, and expertise to create custom "supermodels" for domain-specific industry use cases.
In his open letter, "Open Source AI Is the Path Forward," Zuckerberg uses the early history of Linux, the open-source operating system kernel, as an example. In the early days of high-performance computing, he writes, the major technology companies each invested heavily in developing their own closed-source versions of Unix, and it was hard to imagine any other way to develop such advanced software. Eventually, though, open-source Linux gained popularity, initially because it let developers modify the code freely and was more affordable, and over time because it became more advanced, more secure, and had a broader ecosystem supporting more features than any closed-source Unix. Today, Linux is the industry-standard foundation for both cloud computing and the operating systems that run most mobile devices.
Zuckerberg said he believes artificial intelligence will develop in a similar way. Today, several technology companies are developing leading closed-source models, but open source is rapidly narrowing the gap. Last year, Meta released Llama 2, which was only comparable to an older generation of models. This year, Llama 3 can rival the most advanced models and leads in certain areas. Starting next year, Meta expects future Llama models to become the most advanced in the industry.
We are further developing the image, video, and voice functions of Llama 3
On the question of why open source is better for developers, Zuckerberg listed what he has observed in his discussions with developers, CEOs, and government officials around the world. They need to train, fine-tune, and distill their own models. They need some control over the model and do not want to be locked into a closed vendor. They want to protect their data and do not want to send it to closed-source models through cloud APIs. They would rather invest in the ecosystem that will become the long-term standard, and many of them believe that open-source models are advancing faster than closed-source ones.
Zuckerberg also said that, for Meta, choosing open source better serves the company's vision of continuing to build the best user experiences. As to whether open-sourcing will cost the Llama series its technological advantage, he answered in terms of the openness and completeness of the ecosystem and Meta's path to commercializing large models.
First, to ensure that Meta can use the best technology and is not locked into a closed ecosystem over the long term, Llama needs to develop into a complete ecosystem of tools, efficiency improvements, chip optimizations, and other integrations. If Meta were the only company using Llama, this ecosystem would not develop. Second, he expects AI development to remain highly competitive, which means that open-sourcing any given model does not give away a large advantage over the next-best models at that point. The path for Llama to become the industry standard is to remain competitive, efficient, and open generation after generation. Third, a key difference between Meta and closed-source model providers is that selling access to AI models is not Meta's business model. Releasing Llama openly therefore does not undercut Meta's revenue, sustainability, or ability to invest in research, as it would for closed-source providers. This, he added, is one reason some closed-source providers keep lobbying governments against open source.
A scientist on the Llama team, @astonzhangAZ, also revealed on social media that the research team is considering integrating image, video, and voice capabilities into Llama 3, so that the model can recognize images and videos and support voice interaction.