
The latest video creation tool from OpenAI could gain valuable insights from infants.

Mashable recently reported that OpenAI has developed a new model called Sora, which generates videos from text prompts. The news sparked reactions ranging from excitement about the technology’s potential to outright skepticism, along with plenty of speculation about its likely impact on a wide range of activities.

Sora, which means “sky” in Japanese, is a text-to-video (T2V) tool that is considerably more advanced than earlier efforts such as Meta’s Make-a-Video AI. It can turn a short written description into a detailed, high-quality video clip of up to a minute in length. Given the prompt “A cat wakes up its sleeping owner and demands breakfast. The owner tries to ignore the cat, but the persistent feline tries new tactics until the owner finally pulls out a secret stash of treats hidden under the pillow to hold the cat off a little longer”, Sora produces a polished video that would quickly go viral on any social media platform.

Isn’t it adorable? There are, however, some caveats. To its credit, OpenAI is surprisingly honest about the tool’s shortcomings, admitting that it may struggle to simulate the physics of a complex scene accurately.

One of the videos in its showcase collection neatly illustrates the model’s limitations. The prompt was for a “Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee”. At first glance it looks impressive, but closer inspection reveals that one of the ships moves in a way that makes no nautical sense: Sora may excel at simulating light reflecting off liquid, but it evidently has no grasp of the physical laws governing how ships actually move.

Sora has other limitations too, including trouble with cause and effect: a person might take a bite out of a cookie, yet afterwards the cookie shows no bite mark. It can also muddle spatial details, such as confusing left and right. And these are just a few examples.

Still, it is a start, and it will doubtless improve with another billion teraflops of computing power. So while Hollywood executives can continue to slumber in their oversized beds, Sora will soon be good enough to substitute for certain kinds of stock footage, just as image generators like Midjourney and Dall-E are replacing Shutterstock-style photography.

Although OpenAI acknowledges the tool’s limitations, it claims that Sora can serve as a foundation for models capable of understanding and simulating the real world, which it believes will be a significant step towards artificial general intelligence (AGI).

This is where things get interesting. OpenAI’s ultimate goal is AGI, and the company believes that generative AIs are a crucial step towards it. But getting there requires building machines with an understanding of the real world comparable to our own, including the physics of objects in motion. The underlying bet of the OpenAI project is therefore that, given enough computing power, machines that can predict how pixels move on a screen will also come to know how physical objects behave in the real world. In other words, it is a belief that the machine-learning route will eventually take us all the way to superintelligent machines.

But AIs that can successfully navigate the physical world will need more than a knowledge of its laws of physics; they will also need an understanding of human behavior. And that, as the research of Alison Gopnik suggests, is a tall order for the machines we currently describe as “AI”.

Gopnik is renowned for her studies of how children learn. Watching her TED talk, “What Do Babies Think?”, would be a salutary experience for anyone in the tech industry who believes technology holds all the answers to intelligence. After decades of research into the sophisticated cognitive abilities and decision-making of infants at play, she has concluded that “babies and young children are like the research and development department of humanity”. This columnist, who has spent a year observing their granddaughter’s development, and particularly her understanding of causality, is inclined to agree. Perhaps, if Sam Altman and the team at OpenAI are really interested in artificial general intelligence, they should spend some time with babies.

What I’ve been reading

Algorithmic politics

Henry Farrell has written a groundbreaking essay on the economics of artificial intelligence.

Bot habits

A recent article in the Atlantic, written by Albert Fox Cahn and Bruce Schneier, discusses the impact of chatbots on communication.

No call-up

The author Charlie Stross has published a blog post explaining why conscription is not feasible in Britain, even if it were desired.

Source: theguardian.com