Google releases its most powerful multimodal video model, "Gemini Omni"! Supports seamless conversational video editing, free on YouTube Shorts starting this week


Edit videos just by talking! At the Google I/O 2026 developer conference, Google announced its brand-new multimodal video model, "Gemini Omni." Billed as a killer application, the model generates videos with high physical fidelity from text, images, and audio, and it also offers "conversational editing" for precise changes to a video's camera perspective and action. The Gemini Omni Flash version is available to paid subscribers starting today and will be integrated into YouTube Shorts for free starting this week.

(Previous coverage: Google launches new AI laptop Googlebook: Deeply integrated with Gemini, in partnership with Acer, ASUS, Dell, HP, and Lenovo, launching this autumn)

(Background supplement: Google launches the most powerful "autonomous agent and programming" model, Gemini 3.5 Flash! Writes an operating system in 12 hours for less than $1,000)

At today's Google I/O 2026 developer conference, Gemini Omni, the multimodal model that had sparked heated discussion through various leaks, finally made its official debut before a global audience. Focused on "video generation and editing," this next-generation model is seen by observers as the culmination of Google's effort to integrate its top-tier AI media generation systems, and it is expected to have a major impact on the existing video creation ecosystem.

Gemini Omni Flash is rolling out starting today. Here’s where you can find it:
🔹 Today: Google AI Plus, Pro and Ultra subscribers globally in the @GeminiApp and @FlowbyGoogle.
🔹 Rolling out starting this week, for no cost: @YouTube Shorts and the YouTube Create app.… pic.twitter.com/07lAavqy2G
Google (@Google), May 19, 2026

Three Core Highlights: From Creation to Conversational Editing

According to the official demonstration, Gemini Omni showed strong "world understanding" and physical fidelity.
Its key functional highlights include:

- All-encompassing Generation and Remix: Going beyond single-input generation, users can start from plain text, images, audio, existing videos, or even "hand-drawn sketches," letting the AI "create any content from any input."
- "Conversational Editing": Users issue modification commands in natural language directly within the chat interface, for example asking the AI to "change the camera angle," "adjust to twilight lighting," or "replace objects in the scene." The AI iterates over multiple turns, building on its previous results while maintaining character consistency and obeying physical laws.
- High-Fidelity Physical Simulation: In early demos, whether a professor writing a mathematical proof on a blackboard or the complex natural interaction of two people eating spaghetti, Gemini Omni rendered on-screen text consistently and with a high degree of realism.

Edit your own videos with Gemini Omni with just a conversation. 🎥 Prompt the changes you want to see to reimagine the action, change the point of view, or adjust the lighting over multiple turns. Every instruction builds on the last, so your characters stay consistent, the… pic.twitter.com/irsFXVAk54
Google (@Google), May 19, 2026

Launch Schedule: Paid Users Get Access Today, Developer API to Follow

To get the technology into creators' hands as soon as possible, Google also announced a phased release plan for Gemini Omni:

- Available Starting Today: Google AI Plus, Pro, and Ultra subscribers can now access the Gemini Omni Flash version in the Gemini App and Flow by Google.
- Free Access This Week: For general users and creators, Google will integrate the feature into YouTube Shorts and the YouTube Create App at no cost, rolling out starting this week.
- Future Plans: The model will later be opened to global developers and enterprise users via API.
Industry analysis suggests that Gemini Omni may build on Veo (such as Veo 3.1), Google's most powerful video generation model. It is no longer a single video pipeline, however; it emphasizes a unified multimodal experience in which images, text, video, and audio are "seamlessly integrated." For safety, videos generated through Gemini currently carry security watermarks and are subject to strict content restriction guidelines.

Original source: 動區 BlockTempo

🔍 Historical Similar Events · Keyword + Asset Matching · 6 items

2026-05-19

Google Unveils Gemini Omni—A Next-Gen AI Video Builder That Can 'Simulate the World'

Similarity 180% · Keywords: google/gemini/omni

2026-05-22

Google launches two new AI-native ad formats: Rewriting 30 years of search ad rules with Gemini

Similarity 170% · Keywords: google/gemini · Same category: zh

2026-05-22

Google admits fault after community backlash: Antigravity Gemini rate limits increased by 3x, weekly quota reset

Similarity 170% · Keywords: google/gemini · Same category: zh

2026-05-20