Google’s AI assistant, Gemini, is set to gain new features that give Android users more intuitive ways to interact with their devices. Gemini will soon let users ask questions about content displayed on their screens, much like the screen-sharing feature currently available in Gemini 2.0 on desktop.
Google recently unveiled these Gemini capabilities, which focus on real-time interaction and on-screen queries. The features are part of Google’s Project Astra.
New functionalities
The screen-sharing function lets users share their screens with Gemini and ask questions about the displayed content. For instance, while viewing an image of a jacket, a user might ask for shoe recommendations to complete the outfit.
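For developers curious what this kind of image-grounded question looks like programmatically, here is a minimal sketch using Google’s publicly available generative AI Python SDK. To be clear, this is an illustrative analogue under assumptions: the consumer assistant feature described above is not exposed through this API, and the model name, API key, and image file below are placeholders.

```python
# Minimal sketch of an "ask about this image" query using Google's
# generative AI Python SDK (pip install google-generativeai).
# NOTE: an illustrative analogue of the assistant feature described above,
# not the mechanism the Gemini app itself uses. Model name, API key, and
# image path are placeholders.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential

model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
jacket = Image.open("jacket.png")  # stand-in for the on-screen image

# Ask a question grounded in the image, like the jacket/shoes example above.
response = model.generate_content(
    [jacket, "What shoes would go well with this jacket?"]
)
print(response.text)
```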
Live video interactions, an apparent answer to ChatGPT’s Voice and Vision option, let users hold real-time conversations about their surroundings by enabling the camera within the Gemini app. This allows Gemini to offer insights based on a live video feed, much like a video call.
These enhancements position Gemini as a versatile AI assistant that can understand and interact with visual content, delivering more personalized, context-aware support.
Integration with existing applications
Gemini’s new features are designed to integrate seamlessly with existing applications such as YouTube. While watching a video, users will be able to activate Gemini and ask questions about the content.
For example, a user could inquire about a specific muscle or fitness technique during an exercise tutorial.
In addition, when viewing a PDF, an “Ask about this PDF” option lets users request summaries or clarifications, streamlining research without switching to a desktop.
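As with the screen-sharing example, a comparable PDF query can be sketched through the same SDK’s File API. Again, this is a hedged illustration with placeholder names, not the in-app feature’s actual implementation.

```python
# Sketch of an "ask about this PDF" style query via the SDK's File API.
# An illustrative analogue with placeholder names; the assistant's in-app
# feature does not use this path.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential

doc = genai.upload_file("report.pdf")  # hypothetical local PDF
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name

# Ask for a summary grounded in the uploaded document.
response = model.generate_content([doc, "Summarize the key findings."])
print(response.text)
```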
These features aim to make on-the-go information retrieval more efficient, reducing the need for manual searches and enhancing user productivity.
Project Astra
The development of these features falls under Google’s Project Astra, a multimodal AI assistant initiative. Project Astra aims to create an assistant that can perceive and understand its environment, enabling more natural interactions.
By enabling Gemini to interpret visual inputs and engage in contextual conversations, Google is advancing toward a more immersive AI experience.
Availability
Google plans to roll out these features to Gemini Advanced subscribers on Android devices later this month.
Google’s introduction of screen-aware capabilities in Gemini marks a pivotal moment in AI assistant development. By letting users ask questions about on-screen content, Gemini moves beyond passive responses toward interactive experiences, enhancing AI’s utility in everyday life.
As these features become widely available, they hold the potential to redefine user expectations and set new benchmarks for what AI assistants can achieve.