
Alibaba Cloud Releases Tongyi Qianwen 2.0, Surpassing GPT-3.5 and Closing In on GPT-4

白云追月素

On October 31, Alibaba Cloud officially released Tongyi Qianwen 2.0, a large model with parameters on the order of 100 billion. Across 10 authoritative benchmarks, its overall performance exceeds that of GPT-3.5 and is quickly closing the gap with GPT-4. The same day, the Tongyi Qianwen app launched in major mobile app stores, letting anyone try the latest model capabilities directly.
Image caption: Tongyi Qianwen 72B will soon be open-sourced

Over the past six months, Tongyi Qianwen has made a major leap in performance. Compared with version 1.0, released in April, Tongyi Qianwen 2.0 is significantly better at understanding complex instructions, literary writing, general mathematics, knowledge retention, and resisting hallucination. Its overall performance now exceeds that of GPT-3.5 and is quickly closing the gap with GPT-4.
Image caption: The overall performance of Tongyi Qianwen 2.0 exceeds that of GPT-3.5 and is closing in on GPT-4

On 10 mainstream benchmarks, including MMLU, C-Eval, GSM8K, HumanEval, and MATH, Tongyi Qianwen 2.0 scores higher overall than Meta's Llama-2-70B. Against OpenAI's GPT-3.5 it wins on nine benchmarks and loses on one; against GPT-4 it wins on four and loses on six, further narrowing the gap with GPT-4.
Understanding both Chinese and English is a fundamental capability of a large language model. On English tasks, Tongyi Qianwen 2.0 scores 82.5 on the MMLU benchmark, second only to GPT-4; the substantially larger parameter count helps the model understand and handle complex language structures and concepts. On Chinese tasks, Tongyi Qianwen 2.0 achieves the highest score on the C-Eval benchmark by a clear margin, because the model learned from more Chinese-language material during training, which further strengthened its Chinese comprehension and expression.
Tongyi Qianwen 2.0 has also made significant progress in mathematical reasoning and code understanding. On the reasoning benchmark GSM8K, Tongyi Qianwen ranks second, demonstrating strong calculation and logical reasoning abilities. On HumanEval, it scores just behind GPT-4 and GPT-3.5. This test mainly measures how well a large model understands programming tasks and writes working code, the foundation for applications such as programming assistance and automatic code repair.
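For readers unfamiliar with how a HumanEval-style test scores a model, the sketch below shows the general idea: a generated function counts as correct only if it passes the task's unit tests. This is a simplified illustration, not the official evaluation harness. The toy `add` task, the two candidate solutions, and the helper `passes_tests` are all hypothetical names made up for this example.

```python
# Minimal sketch of HumanEval-style functional-correctness checking.
# Everything here (task, candidates, helper name) is a hypothetical example.

GOOD_CANDIDATE = """
def add(a, b):
    return a + b
"""

BAD_CANDIDATE = """
def add(a, b):
    return a - b
"""

# Unit tests for the toy task, written as a `check` function that
# receives the candidate function and asserts on its behavior.
TESTS = """
def check(candidate):
    assert candidate(1, 2) == 3
    assert candidate(-1, 1) == 0
"""


def passes_tests(candidate_source: str, test_source: str, entry_point: str) -> bool:
    """Return True if the candidate code passes every assertion in the tests."""
    namespace = {}
    try:
        exec(candidate_source, namespace)   # define the generated function
        exec(test_source, namespace)        # define check()
        namespace["check"](namespace[entry_point])
    except Exception:
        # Any failed assertion or runtime error counts as a failed sample.
        return False
    return True


print(passes_tests(GOOD_CANDIDATE, TESTS, "add"))
print(passes_tests(BAD_CANDIDATE, TESTS, "add"))
```

Running model-generated code with `exec` is unsafe outside a sandbox; a real evaluation would isolate each sample and enforce a timeout.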
Image caption: The Tongyi Qianwen 2.0 release

Tongyi Qianwen has also become more mature and more useful. Tongyi Qianwen 2.0 has been optimized for instruction following, tool use, and fine-grained content creation, so it integrates better into downstream applications. The official Tongyi large model website has launched multimodal and plugin features, supporting specialized tasks such as image input and document parsing.
At the same time, eight industry-specific models trained on the Tongyi large model have been launched: Tongyi Lingma (intelligent coding assistant), Tongyi Zhiwen (AI reading assistant), Tongyi Tingwu (AI assistant for audio and video content), Tongyi Xingchen (personalized character creation platform), Tongyi Dianjin (intelligent investment research assistant), Tongyi Xiaomi (intelligent customer service), Tongyi Renxin (personal health assistant), and Tongyi Farui (AI legal advisor). These eight models target today's most popular vertical scenarios and are specially trained on domain data. Users can try the models directly on the official website, and developers can integrate their capabilities into their own large model applications and services through web page embedding, API/SDK calls, and other methods.
Image caption: The Tongyi large model family has been fully upgraded, and eight industry models have been launched

As of October, Alibaba Cloud has been working closely with more than 60 industry leaders to deploy Tongyi Qianwen in fields such as office work, culture and tourism, electric power, government services, medical insurance, transportation, manufacturing, finance, and software development.
Zhou Jingren revealed that Alibaba Cloud plans to open-source the 72B version of Tongyi Qianwen in the near future. Alibaba Cloud has already open-sourced the 7B and 14B versions of the model, which have been downloaded more than 1 million times in total. Alibaba Cloud will continue to support developers across all industries in building models and applications on the open-source Tongyi Qianwen models.