Google unveiled a range of new AI-powered products, including new video and image creation tools, and began rebuilding its search engine around generative AI, powered by the firm’s Gemini AI model.
The tech giant used the opening of its annual developer conference, Google I/O, to preview Project Astra, which Google calls “the future of AI assistants”: an AI agent that can see, understand and react to the world around it through a smartphone camera or smart glasses.
Demonstrated on stage by the British founder of Google-owned AI firm DeepMind, Sir Demis Hassabis, the so-called AI agent was able to recognize objects it saw when someone scanned an office space with a phone camera, recommend creative ideas based on what it could see on a computer screen, and remind the user where their glasses had been left on a desk.
We’re sharing Project Astra: our new project focused on building a future AI assistant that could be really helpful in everyday life. 🤝
See it in action, with two parts – each captured in one take, in real time. ↓ #GoogleIO pic.twitter.com/x40OOVODdv
— Google DeepMind (@GoogleDeepMind) May 14, 2024
Sir Demis said the aim was to create a “universal AI agent” that was “helpful in everyday life” by being able to “see and respond and understand”.
Elsewhere, Google confirmed that it was expanding a test it had been running in the UK to bring AI-generated answers and recommendations to search results, and would now roll out the tool widely in the US, saying it would use AI to take more of the legwork out of search.
The tool would be able to break down longer, multi-part queries and display all the different parts in a single search result, and the company confirmed that it would soon allow people to submit search queries using video.
The company said it had ushered in a “new era of search”, powered by generative AI, and chief executive Sundar Pichai called it the “most exciting era of search yet”.
Coming soon, we’ll bring new multi-step reasoning capabilities to Google Search. It breaks your big question down into its parts and figures out which problems to solve and in what order, so research that could take minutes or even hours can be done in seconds. #GoogleIO pic.twitter.com/Op8Iu7K21m
— Google (@Google) May 14, 2024
The company also announced the launch of a new video creation tool called Veo, which turns text prompts into longer videos, as well as Imagen 3, an AI image generator that also responds to text prompts.
As part of its development of creative tools, Google said it was working with musicians including Wyclef Jean and songwriter Justin Tranter, who have created new sounds with the help of Google’s Music AI Sandbox, and said filmmaker Donald Glover was using the firm’s text-to-video AI tools.
The new creation tools come amid ongoing concerns about AI-generated content, particularly deepfake images and videos that have been used to spread misinformation.
In a blog post about the new tools, Google said: “We are careful not only to promote innovation, but to do so responsibly. We are therefore taking steps to address the challenges posed by generative technologies and to help people and organizations work responsibly with AI-generated content.
“For each of these technologies, we’ve been working with the creative community and other external stakeholders, gathering insights and listening to feedback to help us improve and use our technologies in safe and responsible ways.
“We’ve been carrying out safety tests, implementing filters, laying guardrails and putting our safety teams at the heart of the development.
“Our teams are also pioneering tools, such as SynthID, that can embed imperceptible digital watermarks into AI-generated images, audio, text and video. And starting today, all videos generated by Veo on VideoFX will be watermarked by SynthID.
“The creative potential for generative AI is huge and we can’t wait to see how people around the world will bring their ideas to life with our new models and tools.”
Introducing Veo: our most capable video generation model. 🎥
It can create high-quality 1080p clips that can exceed 60 seconds.
From photorealism to surrealism and animation, it can tackle a range of cinematic styles. 🧵 #GoogleIO pic.twitter.com/6zEuYRAHpH
— Google DeepMind (@GoogleDeepMind) May 14, 2024
The conference keynote also saw updates to the tech giant’s flagship Gemini AI model, including a “lightweight” model called 1.5 Flash, and improvements to its 1.5 Pro model, which Google said could now follow more complex instructions and understand context across a longer conversation window.
Mr Pichai described the developer conference as “Google’s version of (Taylor Swift’s Eras Tour)”, adding that the tech giant was in its “Gemini era”.
Mr Pichai showed several new ways Google is integrating Gemini into its popular apps, including a new Ask Photos tool for the Google Photos app, which uses text prompts from users to find specific images or create photo collections, and the ability to ask Gemini to summarize all recent emails from a specific sender or on a specific topic if a user needs to catch up.
The announcements come as a new wave of innovation around generative AI is expected, particularly from the world’s biggest tech firms.
On Monday, ChatGPT maker OpenAI revealed updates it was making to the popular chatbot, including making the assistant more capable of understanding a mix of text, audio and video input, and having more human-like conversations.
Microsoft has its own developer conference next week and Apple follows in early June, and both are expected to focus heavily on their development and integration of generative AI tools.
It comes as questions and concerns remain about how best to regulate the rapidly developing technology, with governments around the world debating how to oversee the emerging market and critics warning they risk falling behind due to the pace of change within the sector.