Meta releases its strongest open-source model to catch up with GPT-4; Zuckerberg: we will overtake next year

m337283

On July 23, Pacific Time, Meta (formerly Facebook) officially released its Llama 3.1 models in three sizes, 8B, 70B, and 405B, with the context length increased to 128K. Notably, according to benchmark data provided by Meta, the highly anticipated 405B model (405 billion parameters) is already comparable in performance to OpenAI's GPT-4 and to Claude 3 from the AI startup Anthropic. This suggests that top open-source models have caught up with top closed-source models in performance, and the battle between open and closed source may be coming to an end.
Alongside the release, Zuckerberg also published an "open-source manifesto" titled "Open Source AI Is the Path Forward". In it he wrote: "Today, several tech companies are developing leading closed models. But open source is rapidly narrowing the gap."
Open-source Llama 3.1 405B outperforms closed-source GPT-4
According to Meta, Llama 3.1 was trained on more than 15 trillion tokens using over 16,000 H100 GPUs. The pre-training data runs up to December 2023. To ensure training stability, Meta chose a standard Transformer architecture with only minor adjustments rather than the popular mixture-of-experts (MoE) architecture.
Llama 3.1 currently supports multilingual dialogue, and this release covers three sizes, 8B, 70B, and 405B, with the context length increased to 128K. Sima Huapeng, founder of Silicon-Based Intelligence, commented that the information processing capability of Llama 3.1 has improved greatly: "To give an analogy, it's like going from being able to remember only 4,000 Chinese characters to being able to remember 64,000."
The industry has long debated open versus closed source. At the World Artificial Intelligence Conference this month, Robin Li, founder, chairman, and CEO of Baidu, reiterated that "the commercial closed source model is the best". Li said that open-source models are valuable in academic research and teaching, where they can be used to study how large models work and to develop theory. But in a fiercely competitive business environment, he argued, a commercial closed-source model is the most effective way to achieve higher efficiency and lower costs than one's peers.
However, according to the benchmark data provided by Meta, the open-source model is also very capable this time. The most closely watched model in the Llama 3.1 family, the 405B (405 billion parameters), is already comparable in performance to GPT-4 and Claude 3, which means that top open-source models have caught up with flagship closed-source models.
Notably, this release is also more thoroughly open. When it launched Llama 3 8B and Llama 3 70B in April this year, Meta still prohibited developers from using the models to train other generative models. Under the new license released this time, Meta no longer prohibits using the new models to improve other models.
Alongside Meta's launch, Nvidia announced a new NVIDIA AI Foundry service and NVIDIA NIM inference microservices, which together with the newly released open-source Llama 3.1 models are meant to support generative AI for enterprises worldwide. According to Nvidia, enterprises and countries can now use NVIDIA AI Foundry, combining Llama 3.1 with NVIDIA software, computing, and expertise, to create custom "super models" for use cases in their own industries.
Along with the release, Zuckerberg published an open letter titled "Open Source AI Is the Path Forward". He used the early development of Linux (the operating system kernel) as an example. In the early days of high-performance computing, he wrote, major technology companies each invested heavily in developing their own closed-source versions of Unix, and it was hard to imagine that such advanced software could be developed any other way. But open-source Linux eventually became popular, initially because it allowed developers to modify its code freely and was more affordable, and over time because it became more advanced, more secure, and gained a broader ecosystem supporting more capabilities than any closed-source Unix. Today, Linux is the industry-standard foundation for both cloud computing and the operating systems that run most mobile devices.
Zuckerberg said he believes artificial intelligence will develop in a similar way: "Today, several tech companies are developing leading closed models. But open source is rapidly narrowing the gap. Last year, Llama 2 was only comparable to an older generation of models behind the frontier. This year, Llama 3 is competitive with the most advanced models and leading in some areas. Starting next year, we expect future Llama models to become the most advanced in the industry."
We are further developing the image, video, and voice functions of Llama 3
On why open source is better for developers, Zuckerberg listed needs he has heard from developers, CEOs, and government officials around the world. They need to train, fine-tune, and distill their own models. They need some control over the model and do not want to be locked in to a closed vendor. They want to protect their data and do not want to send it through cloud APIs to closed-source models. And they want to invest in the ecosystem that will become the long-term standard, with many believing that open-source models are advancing faster than closed-source ones.
Zuckerberg also said that, for Meta, open source better serves its vision of continuing to build the best user experiences. On whether open-sourcing the Llama models will cost Meta its technical advantage, he answered in terms of the openness and completeness of the ecosystem and Meta's commercialization path for large models.
"First, to ensure that we have access to the best technology and aren't locked into a closed ecosystem over the long term, Llama needs to develop into a full ecosystem of tools, efficiency improvements, silicon optimizations, and other integrations. If we were the only company using Llama, this ecosystem wouldn't develop. Second, I expect AI development will continue to be very competitive, which means that open sourcing any given model isn't giving away a massive advantage over the next best models at that point. The path for Llama to become the industry standard is to be consistently competitive, efficient, and open generation after generation. Third, a key difference between Meta and closed model providers is that selling access to AI models isn't our business model. That means openly releasing Llama doesn't undercut our revenue, sustainability, or ability to invest in research as it does for closed providers. This is one reason several closed providers consistently lobby governments against open source."
Meta Llama scientist @astonzhangAZ also revealed on social media that the research team is considering integrating image, video, and voice capabilities into Llama 3, so that the model can recognize images and videos and support voice interaction.