Day 6 of OpenAI’s 12 Days event — all the big news as it happens

We’re now halfway through OpenAI’s 12 Days extravaganza. Every weekday until December 20 the AI lab is making at least one product, service or feature announcement. It is safe to say it has been a mixed bag so far.

There are likely still surprises in store and we’ve yet to see any updates to the GPT family of models, Advanced Voice or image creation in ChatGPT.

Google’s Gemini 2 announcement ushered in the “agent era” along with a new version of its flagship model capable of matching OpenAI’s o1 for reasoning and GPT-4o for multimodal capabilities, so the pressure is on.

In CEO Sam Altman’s own words, the company will “have a livestream with a launch or demo, some big ones and some stocking stuffers.” As yesterday was just an update on ChatGPT coming to Apple Intelligence, I’m hoping for something big.

Most rumors point to either an update to ChatGPT itself or the launch of Advanced Voice with Vision. The latter is similar to Google’s Project Astra and would add camera access to the Advanced Voice assistant.

12 Days of OpenAI: The biggest announcements


[Day Five] ChatGPT with Apple Intelligence: Apple Intelligence got a huge update today with the release of iOS 18.2, which adds ChatGPT integration. This brings with it enhanced vision and text capabilities right from the Siri window.
[Day Four] ChatGPT Canvas launch: OpenAI has finally unleashed ChatGPT Canvas, its text and code editor, to all users. It is also being made available for use with Custom GPTs and gets the ability to run Python code.
[Day Three] OpenAI launches Sora: OpenAI’s artificial intelligence video generation tool, Sora, is official and enables you to generate videos and images in nearly any style from realistic to abstract. It’s a whole new product for the company on a separate page from ChatGPT.
[Day Two] Fine-tuning AI models: In a roundtable, OpenAI devs focused on the power behind OpenAI’s models and on reinforcement fine-tuning, which tailors AI models to complex, domain-specific tasks in fields like science, finance and medicine.
[Day One] ChatGPT Pro Tier: Sam Altman and his roundtable continued the 12 Days by announcing a Pro tier of ChatGPT meant for scientific research and complex mathematical problem solving that you can get for $200 a month (this also comes with unlimited o1 use and unlimited Advanced Voice).
[Day One] ChatGPT o1 model: OpenAI’s 12 Days of AI kicked off with a rather awkward roundtable live session where Altman and his team announced that the o1 reasoning model is now fully released and no longer in public preview.

What can we expect on Day 6?

12 days. 12 livestreams. A bunch of new things, big and small. 12 Days of OpenAI starts tomorrow. (December 4, 2024)

Possible announcements

ChatGPT video analysis

GPT-4o image generation

Advanced Voice with Vision

AI Agents (Operator)

ChatGPT web browser

Until yesterday I thought we had a pattern going with OpenAI’s announcements where every other day was a ‘small’ announcement — or “stocking stuffer” as Sam Altman puts it. A day where it’s interesting but not for everyone.

The first day was big with o1, while the second was small with fine-tuning. We then had Sora on day three and a ChatGPT Canvas update on day four. This should have pointed to a relatively big update on day five, especially given Altman’s presence.

For some people, the news that ChatGPT is coming to Apple Intelligence is huge, but it was first announced in June at WWDC and has been in developer beta for weeks at this point. There wasn’t much shown in the 12-minute livestream that we didn’t already know.

Assuming that yesterday was a blip because Apple launched iOS 18.2, all but requiring an OpenAI livestream on the topic, we may be getting something bigger today and my radar is pointing towards vision coming to Advanced Voice.

This is something first teased during the Spring Update and would allow OpenAI’s flagship voice assistant to also see you or what you’re looking at through your phone’s camera (or, in the future, your laptop’s webcam).

If it isn’t Advanced Voice with Vision, there are a multitude of other ideas floating around on social media, including GPT-4.5, native image generation for GPT-4o (which Altman hinted at yesterday) or even something related to AI agents.

We’ll find out at 1pm ET (6pm GMT, 5am ACT).

More from Tom’s Guide
