Musk's xAI launches "rapid voice cloning": one minute of natural speech is enough to create a personal Grok voice
動區 BlockTempo · 2026-05-02 05:09:27

Full original text, automatically extracted by trafilatura and translated by Gemini (1,453 characters)
Elon Musk's xAI has evolved once again! On April 30, the company officially launched its "Custom Voices" and "Voice Library" features. Users speak into a microphone for less than 1 minute, and the system clones a highly realistic, personalized voice within 2 minutes that can be applied directly to the Grok AI assistant. To prevent Deepfake fraud, xAI strictly prohibits uploading pre-recorded audio files and mandates real-time recording by the user together with a two-part voiceprint verification.

In the generative AI voice race, xAI, led by Elon Musk, has launched a strong offensive against competitors such as OpenAI. On April 30, 2026, xAI announced a major update to its AI platform: the full rollout of "Custom Voices" and a new "Voice Library" feature, which let individuals and enterprises bring "their own voices" into AI applications with a very low barrier to entry.

According to xAI, creating a personalized AI voice model has become easier than ever. Users record a natural speech sample of "a few seconds to one minute" in the xAI console, and the entire model creation process completes in under 2 minutes. Once generated, the voice can be used immediately in Grok's Text-to-Speech (TTS) service and Voice Agent API.

xAI highlighted five core application scenarios for this technology:

- Brand Customer Service Agents: Enterprises can give their AI customer service a consistent, brand-exclusive voice to strengthen corporate image.
- Content Creators and Podcasts: Creators can use their own voices to narrate videos or generate audiobooks at scale, without entering a recording studio every time.
- Cross-lingual Speeches: CEOs of multinational corporations can deliver key speeches in multiple languages (such as Chinese, English, Japanese, and French) in "their own voices."
- Gaming and Entertainment: NPC characters in games or the metaverse can be voiced rapidly.
- Accessibility Support: The original voice characteristics of patients with rare diseases such as ALS, who are losing their ability to speak, can be preserved permanently.

As voice cloning technology spreads, celebrity voice impersonation and telecommunications fraud using Deepfake technology have emerged one after another. To prevent malicious abuse, xAI has put strict safeguards in place. xAI emphasizes that the system "absolutely cannot use existing audio files for voice cloning." Users must record in real time themselves, and the system requires them to read a randomly generated "Passphrase." The AI then confirms the spoken content via speech-to-text and compares speaker embedding vectors (Speaker Similarity) to ensure that the person reading the passphrase is the same person who recorded the original sample. This dual verification is meant to stop attackers from "stealing voices" with other people's audio files.

Alongside the customization features, xAI also launched the "Voice Library," which lets development teams manage all custom and built-in voices in one place. The Voice Library currently includes over 80 high-quality voices and supports up to 28 languages, all of which users can preview freely.
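The passphrase check described above combines two independent tests: the transcript must match the prompted words, and the speaker embeddings must be close enough. The following is a minimal sketch of that idea, not xAI's implementation. The word list, the function names, the cosine-similarity measure, and the 0.75 threshold are all assumptions made for illustration, and the transcript and embeddings are assumed to come from separate speech-to-text and speaker-embedding models.

```python
import math
import re
import secrets

# Hypothetical word list; the article does not say how passphrases are built.
WORDS = ["amber", "river", "falcon", "marble", "orbit", "cedar", "lantern", "harbor"]


def make_passphrase(n_words: int = 4) -> str:
    """Pick random words so a pre-recorded clip is unlikely to match."""
    return " ".join(secrets.choice(WORDS) for _ in range(n_words))


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so transcripts compare loosely."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify_recording(passphrase, transcript, passphrase_emb, sample_emb, threshold=0.75):
    """Both checks must pass: the words spoken and the identity of the speaker."""
    content_ok = normalize(transcript) == normalize(passphrase)
    speaker_ok = cosine_similarity(passphrase_emb, sample_emb) >= threshold
    return content_ok and speaker_ok
```

Requiring both checks matters: a replayed clip of the right speaker fails the content test, while a different person reading the correct passphrase fails the similarity test.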
What excites developers and enterprises most is xAI's announcement that the Custom Voices feature will be "completely free of charge" and will support all advanced features of the existing TTS system (such as voice tags and real-time streaming). Users only need to specify the unique voice_id in the API to invoke the voice, which should significantly lower the cost barrier for enterprises adopting proprietary voice AI.
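As a rough illustration of selecting a custom voice by voice_id, the sketch below only assembles a request and sends nothing. Only the voice_id field comes from the article; the placeholder URL, the other field names (text, stream), and the helper build_tts_request are hypothetical and should be checked against xAI's API documentation.

```python
import json

# Placeholder endpoint, NOT xAI's real URL; only `voice_id` is named in the article.
TTS_URL = "https://api.example.com/v1/tts"


def build_tts_request(text: str, voice_id: str, stream: bool = False) -> dict:
    """Assemble a hypothetical TTS request that selects a voice by its ID."""
    if not voice_id:
        raise ValueError("voice_id is required to select a custom voice")
    return {
        "url": TTS_URL,
        "headers": {"Content-Type": "application/json"},
        # Field names other than voice_id are assumptions for illustration.
        "body": json.dumps({"text": text, "voice_id": voice_id, "stream": stream}),
    }


request = build_tts_request("Welcome back.", voice_id="my-custom-voice")
```

Under this assumed request shape, switching between a built-in and a custom voice would change only the voice_id value.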