Tuesday, January 21, 2025

Hume AI launches custom artificial voices with Voice Control


Hume AI, the startup specializing in emotionally intelligent voice interfaces, has launched Voice Control, an experimental feature that lets developers and users create custom AI voices through precise modulation of vocal characteristics, with no coding, AI prompt engineering, or sound design skills required.

This launch builds on the foundation laid by the company’s earlier Empathic Voice Interface 2 (EVI 2), which introduced advanced capabilities in naturalness, emotional responsiveness, and customization.

Both EVI 2 and Voice Control avoid the risks of voice cloning, a practice that Hume co-founder Alan Cowen has acknowledged carries ethical and practical challenges.

Instead, Hume focuses on providing tools for creating unique, expressive voices that align with user needs, such as customer service chatbots, digital assistants, tutors, guides, or accessibility features.

Moving beyond preset AI voices toward custom bespoke solutions

Voice Control offers developers the ability to adjust voices along 10 distinct dimensions, including:

“Masculine/Feminine: The vocalization of gender, ranging between more masculine and more feminine.

Assertiveness: The firmness of the voice, ranging between timid and bold.

Buoyancy: The density of the voice, ranging between deflated and buoyant.

Confidence: The assuredness of the voice, ranging between shy and confident.

Enthusiasm: The excitement within the voice, ranging between calm and enthusiastic.

Nasality: The openness of the voice, ranging between clear and nasal.

Relaxedness: The stress within the voice, ranging between tense and relaxed.

Smoothness: The texture of the voice, ranging between smooth and staccato.

Tepidity: The liveliness behind the voice, ranging between tepid and vigorous.

Tightness: The containment of the voice, ranging between tight and breathy.”

This no-code tool allows users to fine-tune voice attributes in real time through virtual onscreen sliders. It’s currently available in Hume’s virtual playground, which requires a free user sign-up to access.
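To make the slider idea concrete, here is a minimal Python sketch of how a set of ten slider values might be represented and kept within bounds. It is not Hume's API: the class name `VoiceSettings`, the field names, and the range of -100 to 100 are all assumptions chosen for illustration, since the article does not state the actual slider range or any programmatic interface.

```python
from dataclasses import dataclass, fields

# Hypothetical slider range; the article does not state the real one.
SLIDER_MIN, SLIDER_MAX = -100, 100


@dataclass
class VoiceSettings:
    """Illustrative container for the ten dimensions named in the article."""
    masculine_feminine: int = 0
    assertiveness: int = 0
    buoyancy: int = 0
    confidence: int = 0
    enthusiasm: int = 0
    nasality: int = 0
    relaxedness: int = 0
    smoothness: int = 0
    tepidity: int = 0
    tightness: int = 0

    def __post_init__(self):
        # Clamp every value into the assumed range, as a UI slider would.
        for f in fields(self):
            value = getattr(self, f.name)
            setattr(self, f.name, max(SLIDER_MIN, min(SLIDER_MAX, value)))


# Out-of-range input is clamped rather than rejected.
settings = VoiceSettings(assertiveness=40, enthusiasm=250, nasality=-20)
print(settings.enthusiasm)  # 100, clamped to SLIDER_MAX
```

Clamping mirrors what an onscreen slider does physically: the handle cannot move past either end of the track, so every stored value is valid by construction.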

The release addresses a key pain point in the AI industry: the reliance on preset voices, which often fail to meet the specific needs of brands or applications, or the risks associated with voice cloning.

This focus on customization aligns with Hume’s broader goal of developing emotionally nuanced voice AI.

The company’s efforts to advance voice AI were highlighted in September 2024 with the launch of EVI 2, which the company described as a significant upgrade to its predecessor.

EVI 2 improved latency by 40%, reduced costs by 30%, and expanded voice modulation features, offering developers a safer alternative to voice cloning.

Sliders > text prompts

Hume’s research-driven approach plays a central role in its product development. The company, co-founded by former Google DeepMinder Alan Cowen, uses a proprietary model based on cross-cultural voice recordings paired with emotional survey data.

This methodology, rooted in emotion science, forms the backbone of both EVI 2 and the newly launched Voice Control.

Voice Control extends these principles by addressing the granular, often ineffable ways humans perceive voices.

The tool’s slider-based interface reflects common perceptual qualities of voice, such as buoyancy or assertiveness, without attempting to oversimplify these attributes through text-based prompts.

Voice Control is immediately available in beta and integrates with Hume’s Empathic Voice Interface (EVI), making it accessible for a wide range of applications.

Developers can select a base voice, adjust its characteristics, and preview the results in real time. This process ensures reproducibility and stability across sessions, key features for real-time applications like customer service bots or virtual assistants.
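One way to picture reproducibility across sessions is to treat a voice as a small, serializable configuration: a base voice plus a set of adjustments. The Python sketch below is purely illustrative and does not reflect how Hume stores voices; the dictionary layout, the placeholder voice name, and the helper names `save_config` and `load_config` are assumptions.

```python
import json

# Hypothetical configuration: a base voice name plus slider adjustments.
config = {
    "base_voice": "example-base-voice",  # placeholder, not a real voice name
    "adjustments": {"assertiveness": 40, "buoyancy": -15},
}


def save_config(cfg: dict) -> str:
    """Serialize with sorted keys so equal configs produce identical text."""
    return json.dumps(cfg, sort_keys=True)


def load_config(text: str) -> dict:
    """Restore a configuration saved by save_config."""
    return json.loads(text)


saved = save_config(config)
restored = load_config(saved)
print(restored == config)  # True
```

Because the saved text fully determines the settings, reloading it in a later session yields the same base voice and the same adjustments, which is the property a customer service bot or virtual assistant would rely on.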

EVI 2’s influence is evident in Voice Control’s capabilities. The earlier model introduced features like in-conversation prompts and multilingual capabilities, which have broadened the scope of voice AI applications.

For example, EVI 2 supports sub-second response times, enabling natural and immediate conversations. It also allows dynamic adjustments to speaking style during interactions, making it a versatile tool for businesses.

Differentiating in a competitive market

Hume’s focus on voice customization and emotional intelligence positions it as a strong competitor in the voice AI space, even against well-funded rivals such as OpenAI, with its Advanced Voice Mode, and ElevenLabs, both of which offer libraries of preset voices.

Hume continues to build on its innovative approach to voice AI. Plans for expanding Voice Control include introducing more modifiable dimensions, refining voice quality under extreme adjustments, and increasing the range of base voices available.

With the launch of Voice Control, Hume reinforces its position as a leader in voice AI innovation, offering tools that prioritize customization, emotional intelligence, and real-time adaptability. Developers can access Voice Control today via Hume’s platform, marking another step forward in the evolution of AI-driven voice solutions.

