
Heavyweight! Meta launches open-source large model Llama 3 with performance approaching GPT-4

万小盈

On April 18 local time, Meta officially released Llama 3, its long-awaited open-source large model.
At the same time, Meta CEO Mark Zuckerberg announced that Meta's AI assistant, now built on Llama 3, is available across Instagram, WhatsApp, and Facebook, and has also launched as a standalone website. In addition, there is an image generator that creates images from natural-language prompts.
In a video, Zuckerberg said the assistant can answer questions, create animations, and generate images.
Zuckerberg posted on Facebook: "Big AI news today."

In the video, Zuckerberg said: "We believe Meta AI is now the smartest AI assistant that you can use for free." Meta AI is built into the search boxes of the WhatsApp, Instagram, Facebook, and Messenger apps, so users can easily ask questions and have them answered by the new tools.
Zuckerberg said Meta's generative AI capabilities are the strongest among free products.
On Facebook, Instagram, WhatsApp, and Messenger, users can now search with Meta AI without switching between apps. While browsing the feed, users can also ask Meta AI directly about a post to get more information.
The image generator is even more striking: the image feature creates pictures from text in real time. A beta of this feature launches today on WhatsApp and the Meta AI web experience in the United States.
As you start typing, an image appears and updates with each additional letter you enter.
Meta says Llama 3 outperforms leading industry models on several key benchmarks and takes a clear lead in tasks such as code generation. It is capable of complex reasoning, follows instructions more faithfully, can visualize ideas, and solves many subtle problems.
The main highlights of Llama 3 include:
Trained on more than 15T tokens, over 7 times the size of the Llama 2 dataset;
Supports an 8K context length; an improved tokenizer with a 128K-token vocabulary delivers better performance;
State-of-the-art performance on a large number of important benchmarks;
New capability categories, including stronger reasoning and coding;
Training is 3 times more efficient than for Llama 2;
A new generation of trust and safety tools: Llama Guard 2, Code Shield, and CyberSec Eval 2.
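To make the 8K context length concrete, here is a minimal sketch of splitting a long document into context-sized chunks. The 4-characters-per-token estimate and the reserved-token budget are illustrative assumptions, not Meta's actual tooling.

```python
# Hypothetical sketch: splitting a long document into chunks that fit an
# 8K-token context window, using a rough 4-characters-per-token estimate.
# All constants here are assumptions for illustration.

CONTEXT_TOKENS = 8192   # Llama 3's reported context length
CHARS_PER_TOKEN = 4     # rough heuristic for English text

def chunk_for_context(text: str, reserve_tokens: int = 512) -> list[str]:
    """Split `text` into pieces that approximately fit the context
    window, reserving some tokens for the prompt and the reply."""
    budget = (CONTEXT_TOKENS - reserve_tokens) * CHARS_PER_TOKEN
    return [text[i:i + budget] for i in range(0, len(text), budget)]

doc = "x" * 100_000
chunks = chunk_for_context(doc)
```

In a real application the model's own tokenizer would be used to count tokens exactly; the character heuristic only approximates the budget.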
The newly released 8B and 70B versions of Llama 3 already power the Meta AI assistant and are also open to developers, in both pre-trained and instruction-fine-tuned variants.
The 8B- and 70B-parameter Llama 3 models are a major leap over Llama 2. Thanks to improvements in pre-training and post-training, the pre-trained and instruction-fine-tuned models released this time are the best models available at the 8B and 70B scales. Improvements to the post-training process have also substantially reduced false refusal rates, improved alignment, and increased the diversity of model responses.
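For developers picking up the instruction-fine-tuned checkpoints, the Llama 3 instruct models use a header-based chat template with special tokens. A minimal single-turn prompt builder, based on the publicly documented token names (in practice you would rely on the tokenizer's built-in chat template rather than assembling strings by hand):

```python
# Sketch of the chat-prompt layout used by the instruction-tuned Llama 3
# checkpoints. Token names follow the published format; this is for
# illustration only — prefer the tokenizer's built-in chat template.

def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in Llama 3's header-based format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "What is Llama 3?")
```

The trailing assistant header leaves the model positioned to generate its reply; generation stops when the model emits `<|eot_id|>`.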
Llama 3 takes data and scale to new heights. Meta says Llama 3 was trained on more than 15T tokens using two custom-built 24K-GPU clusters; the dataset is over 7 times the size of Llama 2's and contains 4 times as much code. The result is the most powerful Llama model to date, and Llama 3 supports a context length of 8K, double that of Llama 2.
In addition, Meta has developed a new high-quality human-evaluation set. It contains 1,800 prompts covering 12 key use cases: asking for advice, brainstorming, classification, closed-ended Q&A, coding, creative writing, extraction, role playing, open-ended Q&A, reasoning, rewriting, and summarization. To prevent Llama 3 from overfitting to this evaluation set, Meta says even its own modeling team does not have access to it. The figure below summarizes the human evaluations against Claude Sonnet, Mistral Medium, and GPT-3.5 across these categories and prompts.
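Pairwise human evaluations like these are typically summarized as win/tie/loss percentages against each opponent model. A minimal sketch of that aggregation (the vote data below is invented for illustration):

```python
# Hedged sketch: summarizing per-prompt human preference judgments into
# win/tie/loss percentages per opponent. The demo votes are made up.
from collections import Counter

def summarize(outcomes: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """outcomes maps an opponent model to per-prompt results ('win',
    'tie', or 'loss' from the candidate's perspective); returns the
    win/tie/loss percentages for each opponent."""
    summary = {}
    for opponent, results in outcomes.items():
        counts = Counter(results)
        total = len(results)
        summary[opponent] = {
            k: 100 * counts[k] / total for k in ("win", "tie", "loss")
        }
    return summary

# Made-up illustration: 10 judged prompts against one opponent.
demo = {"model_x": ["win"] * 5 + ["tie"] * 3 + ["loss"] * 2}
result = summarize(demo)
```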
The figure below compares the Llama 3 pre-trained model with other models of the same scale, with the former achieving SOTA-level results.
Curating a large, high-quality training dataset is crucial for training the best language models, and Meta invested heavily in pre-training data. Llama 3 was pre-trained on more than 15T tokens, all collected from publicly available sources. Overall, its training dataset is more than 7 times larger than the one used for Llama 2 and contains more than 4 times as much code. In preparation for upcoming multilingual use cases, over 5% of the Llama 3 pre-training data is high-quality non-English data covering more than 30 languages, although Llama 3 is not expected to perform as well in these languages as in English.
To ensure Llama 3 was trained on the highest-quality data, the research team developed a series of data-filtering pipelines, including heuristic filters, NSFW filters, semantic deduplication, and text classifiers that predict data quality.
The team found that earlier generations of Llama were surprisingly good at identifying high-quality data, so Meta used Llama 2 to generate the training data for the text-quality classifier powering Llama 3.
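The stages described above can be sketched as a toy pipeline. Every threshold, the hash-based stand-in for semantic deduplication, and the lexical-diversity quality scorer are assumptions for demonstration; Meta's actual pipeline is not public.

```python
# Illustrative data-filtering pipeline with the stages named in the
# article: heuristic filter, deduplication, and a quality scorer.
# All thresholds and scoring logic here are invented for the sketch.
import hashlib

def heuristic_filter(doc: str) -> bool:
    """Drop documents that are too short or mostly non-alphabetic."""
    if len(doc) < 20:
        return False
    alpha = sum(c.isalpha() or c.isspace() for c in doc)
    return alpha / len(doc) > 0.8

def dedup_key(doc: str) -> str:
    """Crude stand-in for semantic dedup: hash of normalized text."""
    normalized = " ".join(doc.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()

def quality_score(doc: str) -> float:
    """Placeholder for a learned quality classifier (the article says
    Llama 2 was used to create training data for such a model)."""
    return min(1.0, len(set(doc.split())) / 50)  # toy diversity proxy

def filter_corpus(docs: list[str], min_quality: float = 0.1) -> list[str]:
    seen, kept = set(), []
    for doc in docs:
        if not heuristic_filter(doc):
            continue
        key = dedup_key(doc)
        if key in seen:
            continue
        seen.add(key)
        if quality_score(doc) >= min_quality:
            kept.append(doc)
    return kept

docs = [
    "a clean sentence about machine learning models and data quality",
    "A  clean   sentence about machine learning models and data quality",
    "#### $$$$ %%%% &&&& @@@@",
    "too short",
]
kept = filter_corpus(docs)
```

Only the first document survives: the second is a near-duplicate, the third is mostly symbols, and the fourth is too short.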
The team also ran extensive experiments to find the best way of mixing data from different sources in the final pre-training dataset, ensuring that Llama 3 performs well across use cases including everyday questions, STEM, coding, and historical knowledge.
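Mixing data from multiple sources usually means sampling each source in proportion to a tuned weight. A minimal sketch of weighted sampling; the source names and weights below are invented, since the article only says Meta experimented to find the best mix.

```python
# Hedged sketch: drawing a pre-training batch from several data sources
# in proportion to per-source weights. Names and weights are illustrative.
import random

def sample_mix(sources: dict[str, list[str]], weights: dict[str, float],
               n: int, seed: int = 0) -> list[str]:
    """Draw n documents: pick a source for each draw in proportion to
    its weight, then a random document from that source."""
    rng = random.Random(seed)
    names = list(sources)
    w = [weights[name] for name in names]
    out = []
    for _ in range(n):
        src = rng.choices(names, weights=w)[0]
        out.append(rng.choice(sources[src]))
    return out

sources = {"web": ["w1", "w2"], "code": ["c1", "c2"], "non_english": ["m1"]}
weights = {"web": 0.7, "code": 0.25, "non_english": 0.05}
batch = sample_mix(sources, weights, n=1000)
```

In practice the mixing ratios themselves are what gets tuned, by training small models on candidate mixes and comparing downstream performance.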
Meta says the largest Llama 3 model has more than 400B parameters; although it is still training, it will be released in the coming months. New features will include multimodality, multilingual conversation, longer context windows, and stronger overall capability.
Meta hopes Llama 3 can catch up with OpenAI's GPT-4. However, people familiar with the matter say that because researchers have not yet begun fine-tuning Llama 3, no decision has been made on whether it will be multimodal. Fine-tuning is the process by which developers feed additional data to an existing model so it learns new information or tasks. Larger models typically give higher-quality responses, while smaller models tend to respond faster. The official version of Llama 3 is reportedly due in July this year.
Meta also announced a new partnership with Alphabet's Google to include real-time search results in the assistant's responses, supplementing its existing partnership with Microsoft's Bing. With this update, the Meta AI assistant is expanding to more than ten markets outside the United States, including Australia, Canada, Singapore, Nigeria, and Pakistan. Meta Chief Product Officer Chris Cox said the company is "still working on doing this the right way in Europe," where privacy rules are becoming stricter and the upcoming AI Act is set to impose requirements such as disclosing model training data.