OpenAI has finally released the real-time video capabilities for ChatGPT that it has been showcasing for nearly seven months.
On Thursday, during a live stream, the company said that Advanced Voice Mode, ChatGPT’s human-like conversation feature, is gaining vision. Using the ChatGPT app, users subscribed to ChatGPT Plus or Pro can point their smartphones at objects and have ChatGPT respond in near real time.
Advanced Voice Mode with Vision can also understand what’s on a device’s screen via screen sharing. It can explain the various settings menus, for example, or offer suggestions on a math problem.
OpenAI says the rollout of Advanced Voice Mode with Vision will begin today and wrap up next week.
In a recent demo on 60 Minutes, OpenAI President Greg Brockman had Advanced Voice Mode with Vision quiz Anderson Cooper on his anatomy skills. As Cooper drew body parts on a blackboard, ChatGPT was able to “understand” what he was drawing.
“Location marked,” the assistant said. “The brain is right there in the head. As for the shape, it’s a good start. The brain is more of an oval.”
In the same demo, Advanced Voice Mode with Vision made a mistake on an engineering problem, suggesting that it is prone to hallucinations.
Advanced Voice Mode with Vision was delayed several times, reportedly in part because OpenAI announced the feature long before it was ready for production. In April, OpenAI promised that Advanced Voice Mode would roll out to users “within a few weeks.” Months later, the company said it needed more time.
When Advanced Voice Mode finally arrived in early fall for some ChatGPT users, it lacked the visual analysis component. In the lead-up to today’s launch, OpenAI has focused most of its attention on bringing the audio-only Advanced Voice Mode experience to additional platforms and to users in the European Union.