The 2024 World Artificial Intelligence Conference (WAIC), held July 4-6, showcased the latest advancements and opportunities in AI. Themed "Intelligent Connectivity, Infinite Possibilities," the event brought together global experts, industry leaders, academics, and investors to discuss the future of AI.
As an AIGC (Artificial Intelligence Generated Content) expert, I’m excited to share a deep dive into the key takeaways from this year's conference, uncovering new technologies, applications, and products worth watching.
Q1: What are the New Highlights in General AI Models? How are Domestic AI Models Progressing?
At the conference, I had the chance to try the fifth-generation model from a leading domestic AI software company. The model integrates multimodal information, offering a new interactive experience across voice, text, image, and video. For instance, it can infer a person's surroundings from a badge captured by the camera, or summarize a book page from a photo. Overall, its real-time interaction capabilities are comparable to those of some of the latest overseas AI models. I’m looking forward to its applications in consumer and educational scenarios.
Q2: How are Domestic Text-to-Video Models Progressing Compared to Their International Counterparts? What Impact Will AI Have on Short Video Creation?
The conference showcased several text-to-video models, each demonstrating breakthroughs in areas such as layer segmentation, frame control, motion control, and camera movement. Using image segmentation to split a video into layers looks like a feasible route to controllable video generation. Combined with existing video generation models, these segmented, vectorized representations could make the generated physical world more accurate. For now, though, conventional large models appear to remain the main architecture for AI video generation, with vectorization possibly becoming an embedded feature.
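To make the idea of layer segmentation plus motion control concrete, here is a toy sketch in plain Python. It is not how any model shown at the conference works: the brightness threshold stands in for a real segmentation model, and the names `segment`, `shift_layer`, and `animate` are hypothetical, chosen only for illustration. The point is simply that once a frame is split into layers, motion can be applied to one layer while the rest stays fixed.

```python
from typing import List

Frame = List[List[int]]   # toy grayscale image: rows of pixel values
Mask = List[List[bool]]   # True where a pixel belongs to the foreground layer


def segment(frame: Frame, threshold: int) -> Mask:
    """Toy stand-in for a segmentation model: bright pixels are foreground."""
    return [[pixel > threshold for pixel in row] for row in frame]


def shift_layer(frame: Frame, mask: Mask, dx: int, fill: int = 0) -> Frame:
    """Move only the masked (foreground) pixels dx columns to the right."""
    height, width = len(frame), len(frame[0])
    # Start from the background layer: foreground pixels are replaced by `fill`.
    out = [[fill if mask[y][x] else frame[y][x] for x in range(width)]
           for y in range(height)]
    # Paste foreground pixels at their new positions; pixels leaving the frame are dropped.
    for y in range(height):
        for x in range(width):
            if mask[y][x] and 0 <= x + dx < width:
                out[y][x + dx] = frame[y][x]
    return out


def animate(frame: Frame, threshold: int, dx_per_frame: int, n_frames: int) -> List[Frame]:
    """Segment once, then generate frames by moving the foreground layer step by step."""
    mask = segment(frame, threshold)
    return [shift_layer(frame, mask, dx_per_frame * t) for t in range(n_frames)]


if __name__ == "__main__":
    still = [[1, 9, 1, 1],
             [1, 9, 1, 1]]
    for generated in animate(still, threshold=5, dx_per_frame=1, n_frames=4):
        print(generated)
```

In a real system the mask would come from a learned segmentation model and the motion from a generative model. The sketch only shows why a per-layer representation makes the result controllable.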
Q3: What New Applications Can We Focus On in the AI Field?
Based on what was shown at the conference, tool-based AI applications currently outperform entertainment-based ones. In the short term, applications targeting businesses and governments (B2B/B2G) are developing faster than those for consumers (B2C). Within the consumer segment, educational applications are particularly noteworthy. Here are some notable advancements across different application scenarios:
- Office Applications: An online education company has incubated a UI company to help startups develop apps.
- Educational Applications: An education tech company introduced the seventh generation of a dictionary pen featuring an industry-first built-in camera for rapid scanning, voice interaction, and photo input, providing a seamless dictionary and translation experience. Another company showcased a "Socratic AI Q&A" feature that engages learners in deep interaction, promoting critical thinking and enhancing expression, comprehension, and writing skills.
- Entertainment Applications: A video streaming company demonstrated a dynamic comic model trained on its proprietary comic data. Based on the DiT (Diffusion Transformer) architecture, the model takes comics and dialogue as input and generates videos with audio and subtitles, including eye, mouth, and hand movement recognition and scene transitions. However, the model is still used only internally and has not yet been commercialized.
- Digital Humans: Numerous companies are producing digital humans, but there is little differentiation among their products.
- Sports Applications: A media company announced the use of AI technology for analysis in the broadcast of this year’s Olympic Games.
The 2024 WAIC provided a wealth of insights into the future of AI, highlighting both immediate applications and long-term potential in various sectors. Keeping an eye on these developments will be crucial for anyone interested in the evolving landscape of artificial intelligence.