OpenAI's new model can generate videos from written text instructions.

A still from a video generated by OpenAI's Sora model, depicting woolly mammoths walking through snowy terrain.

OpenAI, a leading artificial intelligence company, has unveiled a new AI model named Sora. According to the company, Sora can produce videos up to 60 seconds long, described as both "realistic" and "imaginative," from short text prompts.

In a blog post announcing the model, OpenAI said Sora can create videos of up to one minute in length from text instructions, depicting scenes with multiple characters, specific types of motion, and detailed background elements.

According to the blog post, the model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.

OpenAI says its goal is to train AI models that help people solve problems requiring real-world interaction. Sora is the company's latest effort to advance generative AI following the success of ChatGPT, its widely used chatbot. Text-to-video and other "multi-modal" models already exist, but what distinguishes this effort, according to Reece Hayden, a senior analyst at ABI Research, is the claimed length and accuracy of Sora's output.

Hayden suggests that such AI models could significantly affect digital entertainment markets by enabling personalized content across platforms. One potential application is television, where short AI-generated scenes could complement a narrative. Still, Hayden notes, the model has clear limitations, a sign of how early the market remains.

OpenAI acknowledges that Sora is a work in progress with noticeable "weaknesses," such as inaccuracies in spatial details and cause-and-effect relationships. For instance, it may fail to show the consequences of an action, such as the bite mark left after someone takes a bite out of a cookie.

Emphasizing safety, OpenAI plans to work with red teamers, experts who will probe Sora for potential harms and risks, particularly misinformation, hateful content, and bias. The company is also building tools to help detect misleading AI-generated content. Initially, Sora will be available to these red teamers and to a number of visual artists, designers, and filmmakers, who will provide feedback on how the model could be useful to creative professionals.

The announcement coincides with OpenAI's ongoing work on ChatGPT. The company recently said it is testing a feature that lets users control ChatGPT's memory, allowing the chatbot to personalize future conversations or forget previous ones.
