At the end of 2022, the release of ChatGPT, a chatbot capable of conversing freely and in depth with humans, ignited a worldwide wave of generative AI. As early as 2013, Baidu established a deep learning laboratory, making it the earliest company in China to enter the AI field, ahead of its peers.
On November 12, in a speech at the Baidu World 2024 conference, Baidu founder Robin Li raised a pragmatic question: "Objectively speaking, the super application that everyone expected has not yet appeared, and some people have even begun to ask whether the global large-model craze of the past 24 months is a new technological revolution or a new bubble."
As the standard-bearer of Chinese artificial intelligence, Baidu answers that the demand for AI is real. Robin Li mentioned that in May this year, Baidu executives discussed what would count as "success" for large AI models. The key indicator he proposed was a tenfold increase in large-model API call volume within one year, from 200 million to 2 billion. Just six months later, average daily calls to Baidu's Wenxin (ERNIE) large model have exceeded 1.5 billion, 7.5 times the 200 million baseline. Compared with the 50 million first disclosed a year ago, that is a roughly 30-fold increase.
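The growth multiples quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures cited in this article and assumes they are all comparable call counts; the variable and function names are illustrative.

```python
# Call volumes cited in the article (assumed to be comparable counts).
first_disclosed = 50_000_000    # figure first disclosed a year earlier
may_baseline = 200_000_000      # starting point of the one-year target
target = 2_000_000_000          # tenfold goal within one year
current = 1_500_000_000         # "exceeded 1.5 billion", treated as a lower bound


def multiple(new: int, old: int) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


print(multiple(target, may_baseline))      # 10.0
print(multiple(current, may_baseline))     # 7.5
print(multiple(current, first_disclosed))  # 30.0
```

Because the article says usage "exceeded" 1.5 billion, the last two multiples are best read as minimums.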
In Robin Li's view, this steep growth curve is also a microcosm of the explosion of large-model applications in China over the past two years. "Today, with foundation model capabilities ready, we are about to usher in a shining moment for AI applications. Every application is a star, and every application will become a force that changes the world," he said.
Robin Li's "non-mainstream" view of AI: solve hallucinations first and focus on agents
Compared with other players in the AI industry, Robin Li and Baidu often look somewhat "non-mainstream". While many companies were caught up in the "hundred-model war" and fixated on large-model performance and parameter counts, Robin Li repeatedly stressed in public that companies should "compete on applications, not on models" and that "the AI model is worthless without application". In the dispute between open source and closed source, Robin Li took a clear stand in the closed-source camp, saying that only closed-source models can support a real business model.
In February this year, OpenAI unveiled its text-to-video model Sora, setting the global Internet alight again, and many Chinese companies followed it onto this track. However, Robin Li said in an internal speech that no matter how popular Sora becomes, Baidu will not build a Sora-type video generation model, because the investment cycle is too long and it may not produce commercial returns for 10 or 20 years.
Robin Li explained this further at the Baidu World 2024 conference. He said that rather than pursue a Sora-type model, Baidu prefers to solve the hallucination problem in image generation, that is, the tendency of large models to generate content that does not match the facts or has no basis. Robin Li said, "This problem seems simpler, or even more boring, but if it is not solved, there will be no application." He believes that the biggest change for the industry in the past 24 months is that large models have largely eliminated hallucinations and the accuracy of their answers has improved greatly, taking AI from "confidently talking nonsense" to being usable and trustworthy.
The key to solving the hallucination problem of large models lies in RAG (retrieval-augmented generation), a technique that lets a large model use retrieved information to guide the text or answers it generates. At the "AI Round Table Talk" with Luo Yihang, the founder of Silicon Star, and Zhang Yijia, the founder of Jiazi Guangnian, Robin Li said that over the past two years RAG has given large language models practical value: it effectively keeps hallucinations in check, so the models can be used in a wide range of scenarios, especially business-facing (To B) ones. He emphasized that Baidu's decision not to build a Sora-type model does not mean that Baidu is not working on multimodality. "In order for multimodality to enter the truly practical stage, it also needs accuracy and controllability. Only then can the application space be opened up."
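The retrieve-then-generate idea behind RAG can be illustrated with a toy example. The sketch below is not Baidu's implementation: the three-passage knowledge base, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration, and the final call to a large model is left out.

```python
import re
from collections import Counter

# Illustrative knowledge base; a real system would query a search index.
DOCUMENTS = [
    "The Great Wall is a series of fortifications in northern China.",
    "Retrieval-augmented generation grounds model output in retrieved text.",
    "Baidu World 2024 was held on November 12.",
]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k passages sharing the most tokens with the query."""
    q = tokenize(query)
    ranked = sorted(
        docs,
        key=lambda d: sum((q & tokenize(d)).values()),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Place the retrieved passages ahead of the question to ground the answer."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


question = "When was Baidu World 2024 held?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
print(prompt)  # this prompt would then be sent to a large model
```

Because the model is asked to answer from retrieved passages rather than from memory alone, its output can be traced back to a source, which is the property that limits hallucination.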
Baidu has therefore focused on combining RAG with multimodal content such as images, and at the event it released iRAG (image-based RAG), a retrieval-augmented technology for image generation. iRAG combines Baidu's billion-scale image resources with its powerful foundation model capabilities to generate a wide variety of ultra-realistic images.
Robin Li showed a picture, generated by the Wenxin model, of a Volkswagen cruising over the Great Wall. With Wenxin iRAG technology, neither the badge of this specific car model nor the Great Wall in the background shows any hallucinated errors or distortions. With the help of this technology, a poster that used to cost hundreds of thousands of yuan to create now costs almost nothing. Its commercial value lies in being free of hallucinations, hyper-realistic, costless, and available immediately.
As global technology giants shift their focus to finding practical application scenarios for AI, intelligent agents, which can autonomously perceive their environment, take actions, and achieve set goals, have also received more attention. However, Robin Li stressed that few companies regard agents as their most important strategic direction in the way Baidu does. Baidu believes that intelligent agents, as the most mainstream form of AI application, are about to reach a tipping point. At present, Baidu's Wenxin agent platform has attracted 150,000 enterprises and 800,000 developers, covering a wide range of application scenarios.
Baidu divides intelligent agents into four main categories: company agents, role agents, tool agents, and industry agents. Specifically, company agents can provide proactive recommendations and other services on top of a traditional company website, and may in the future become the first interface that directly faces consumers. Role agents are highly anthropomorphic digital humans, while industry agents focus on vertical fields such as law. Baidu Wenku and Baidu Netdisk have jointly released a tool agent called "Free Canvas", which fully breaks down the barriers between public-domain and private-domain data, giving users freedom in input, editing, creation, and sharing.
[Image: Free Canvas generating a multimodal research report]
[Image: Faxingbao, an industry agent]
Code generation is one of the core capabilities of large models. Robin Li has said many times that generative AI will give everyone the abilities of a programmer. At the Baidu World 2024 conference, he announced the official launch of the no-code tool "Miaoda". Unlike tools that assist with code generation, Miaoda is a new kind of software built from large models and intelligent agents: users no longer need to understand code to build all kinds of applications, "ushering in an unprecedented era where money can be made solely through ideas".
[Image: Miaoda]
At the end of the speech, Robin Li said that he is a software engineer. There is a saying abroad that "software is eating the world". But in his view, the world should not be eaten but created. "In the era of AI, applications create the world," said the technology leader.
The Strategy and Vision of an AI Long-Termist
In the more than 20 years since he founded Baidu, Robin Li has acquired many labels, including "AI long-termist". The label reflects not only his early positioning in AI but also his firm determination to keep investing in the future.
In 2023, Robin Li was named to Time magazine's first list of the 100 most influential people in AI. Also named as global AI leaders were Tesla CEO Elon Musk, Nvidia founder and CEO Jensen Huang, OpenAI CEO Sam Altman, and others. Time recognized Robin Li's long-term investment in AI and Baidu's achievements in the field, and described him as "the most outstanding futurist in China, who has long been involved in the wave of AI development".
Yann LeCun, a "godfather of AI" and Turing Award winner, also noted in the preface to the Chinese edition of his book, "The Road to Science", that Baidu was one of the earliest large companies to deploy commercial deep learning systems, ahead of Google and Microsoft. As early as 2012, after seeing how deep learning improved search, Robin Li encouraged all Baidu product managers to keep up with the latest trends in AI technology. One year later, Baidu set up its deep learning laboratory, headed by Robin Li himself, and officially began its exploration of AI.
In fact, Baidu's history is intertwined with that of many of today's AI leaders. Geoffrey Hinton, the "father of neural networks" who won the 2024 Nobel Prize in Physics, once turned down Baidu's top bid of $44 million and ultimately chose to take his team to Google. At the time, Hinton's team had just won the 2012 ImageNet Challenge with its new neural network, AlexNet, and gained instant fame. The team's members included Ilya Sutskever, who later co-founded OpenAI.
In addition, Andrew Ng, co-founder and head of Google Brain, a Google AI project, joined Baidu in 2014 as Chief Scientist. During his three-year tenure, he led the Baidu AI team as it grew to 1,300 people, including 300 members of the Baidu Research Institute, and helped Baidu cultivate a large number of AI talents. Dario Amodei, co-founder and CEO of Anthropic, the AI startup that is OpenAI's biggest competitor, took his first job as a researcher at Baidu's Silicon Valley AI Laboratory after completing a postdoctoral fellowship at Stanford University.
After ten years of exploring AI, Baidu's achievements and influence in the field have been widely recognized. The "2024 Global AI Ecosystem Overview" report recently released by consulting firm Sullivan shows that since the generative AI wave broke, chip makers represented by NVIDIA and cloud vendors such as Microsoft have begun to reap dividends, while open-source models, startups, and software application vendors are also racing to seize market share. Within the global AI ecosystem, AI-native giants have performed outstandingly, and Baidu is placed in the same quadrant as OpenAI and Google.
In 2021, Robin Li wrote in his letter to shareholders that over the past ten years Baidu had been committed to building its foundation and ecosystem and had invested heavily in AI research and development, hoping to make a complex world simpler through technology. As a result, when the AI wave surged, Baidu had become a leading AI ecosystem company with a strong base of Internet users. Baidu has determination and patience, because it deeply understands that the most cutting-edge technological wave cannot simply be waited for: a company must position itself 10 or 20 years in advance.
This firm belief in AI also profoundly shapes Baidu's outlook on the future. Having witnessed and lived through the dot-com bubble of the late 1990s and early 2000s, Robin Li has a clear-eyed view of the large-model boom. Speaking with the editor-in-chief of Harvard Business Review in October, he argued that generative AI, like many technological waves in history, will inevitably go through a bubble after the initial excitement, and that pseudo-innovations that cannot meet market demand will be cleared out.
However, the Internet pioneer remains optimistic about the long-term prospects of large models, believing that they represent a disruptive technological revolution. "In the face of the new technology cycle, entrepreneurs who adhere to long-termism will stand out," Robin Li once said.
At the "AI Round Table Talk", Robin Li recalled the industry's anxieties over the past year, such as where the super applications are and what opportunities entrepreneurs have. In response, he hopes that Baidu can share its development tools and the scenarios it has explored, and show which paths and applications built on large models can generate practical value.
As large-model technology returns to rationality and pragmatism after a period of rapid development, Baidu's position has become increasingly clear: the goal is not to launch one "super application" but to keep helping more people and enterprises create millions of "super useful" applications. Today, its large-model applications have been deployed in dozens of industries, including energy, electric power, and manufacturing, and in hundreds of scenarios. At Baidu World 2024, Baidu also made clear that the Wenxin model is still being trained continuously and that a more powerful new version is worth looking forward to.