Sora: OpenAI launches a program that quickly generates video from text

The ChatGPT maker’s model can ‘simulate the physical world in motion’ for up to a minute, based on users’ subject and style directions.

On Thursday, OpenAI unveiled a tool that can generate videos from text prompts.

The new model, named Sora after the Japanese word for “sky,” can create realistic footage up to a minute long while adhering to a user’s directions on subject matter and style. According to a company blog post, the model can also generate a video from a still image or extend existing footage with additional material.

“We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction,” the blog post says.

One video, among several early examples from the company, was based on the prompt: “A movie trailer featuring the adventures of the 30-year-old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.”

The company said that a limited number of researchers and media artists now have access to Sora. According to the company’s blog post, the experts will “red team” the product, testing whether it can be used to circumvent OpenAI’s terms of service, which forbid “extreme violence, sexual content, hateful imagery, celebrity likeness, or the IP of others”. Although access is restricted to researchers, visual artists, and filmmakers, CEO Sam Altman responded to user requests on Twitter following the launch with video clips he said were created by Sora. The videos carry a watermark indicating that they were created by AI.

OpenAI launched Dall-E, a still image generator, in 2021, and ChatGPT, a generative AI chatbot, in November 2022; the chatbot swiftly grew to 100 million users. Other AI companies have released video generation tools, but these models can only produce a few seconds of footage that often bears little resemblance to the prompt. Google and Meta have said that they are developing generative video tools, but have not yet made them available to the public. On Wednesday, OpenAI unveiled an effort to add deeper memory to ChatGPT, allowing it to recall more of its users’ conversations.

OpenAI has not disclosed how much footage was used to train Sora or where the training videos may have originated, other than telling the New York Times that the corpus included both publicly available videos and videos licensed from copyright owners. The company has been sued several times for alleged copyright infringement in the training of its generative AI tools, which ingest massive amounts of data scraped from the internet and replicate the images or text contained in those datasets.