
India.com News Desk
New Delhi: OpenAI gained wide popularity with ChatGPT, its text-based generative AI, which made its debut on November 30, 2022. The San Francisco-based company had already ventured into text-to-image generation with the launch of DALL-E in January 2021, and it has now unveiled Sora, a video generation AI whose output looks high-quality and impressive. Here’s a look at its features, its capabilities and how to use the AI.

A still from an OpenAI-generated video for the prompt: A litter of golden retriever puppies playing in the snow. Their heads pop out of the snow, covered in. (Source: OpenAI)
Sora is a text-to-video generative AI model developed by OpenAI with the goal of creating videos from text descriptions. Based on text prompts from users, the model can create videos up to a minute long. It can generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. It understands not only what the user has asked for in the prompt, but also how those things exist in the physical world. The model is being developed to help people solve problems that require real-world interaction. As of the most recent information available, Sora is still in development and has not been made publicly available.

A still from a video generated by Sora from the prompt: A beautiful silhouette animation shows a wolf howling at the moon, feeling lonely, until it finds its pack. (Source: OpenAI)
While the new text-to-video tool is rumoured to launch soon, here is what the AI can do, based on the information available so far:
Sora interprets the semantics of a prompt closely enough to capture the narrative and theme of the text. One of the key technical challenges it addresses is simulating motion within the generated videos: the model predicts and renders movement so that it looks authentic and seamless.
Sora combines a diffusion model with a type of neural network called a transformer. The transformer inside Sora processes chunks of video data, which OpenAI calls patches, so it can be trained on many kinds of video, including different resolutions, durations, aspect ratios, and orientations. This approach allows Sora to generate high-definition, detailed videos of up to a minute that handle occlusion well.
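To make the idea of processing video in chunks more concrete, here is a minimal Python sketch of how a clip could be divided into blocks that span a few frames and a small pixel region. It is an illustration only, not OpenAI's implementation: the function names and the patch sizes (4 frames by 16 by 16 pixels) are assumptions made for this example, and OpenAI has described the real model as working on a compressed representation of the video rather than on raw pixels.

```python
from math import ceil


def spacetime_patch_count(frames, height, width,
                          patch_t=4, patch_h=16, patch_w=16):
    """Return how many patches (tokens) a clip of the given size yields.

    Patch sizes are illustrative assumptions, not Sora's real values.
    A dimension that is not a multiple of the patch size still gets a
    final, partially filled patch, hence the ceiling division.
    """
    n_t = ceil(frames / patch_t)
    n_h = ceil(height / patch_h)
    n_w = ceil(width / patch_w)
    return n_t * n_h * n_w


def patch_origins(frames, height, width,
                  patch_t=4, patch_h=16, patch_w=16):
    """Yield the (frame, row, column) origin of each patch in raster order."""
    for t in range(0, frames, patch_t):
        for y in range(0, height, patch_h):
            for x in range(0, width, patch_w):
                yield (t, y, x)


if __name__ == "__main__":
    # Clips with different shapes simply become token sequences of
    # different lengths; the clip sizes here are hypothetical.
    clips = {
        "landscape": (16, 32, 64),
        "portrait": (16, 64, 32),
        "square, shorter": (8, 32, 32),
    }
    for name, (f, h, w) in clips.items():
        print(name, spacetime_patch_count(f, h, w))
```

Because every clip is reduced to a sequence of patches, a landscape clip, a portrait clip and a short square clip differ only in how many patches they produce, which is one way to see why a single model can be trained on videos of varying resolution, duration and orientation.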
Sora can also simulate artificial processes, such as video games. It can simultaneously control the player in a game like Minecraft while rendering the world and its dynamics in high fidelity. These capabilities can be elicited zero-shot, that is, without any game-specific training step, by prompting Sora with captions that mention the specific game.