This New Technology Connects AI to Its Emotions—and Yours

A new “empathic voice interface” launched today by Hume AI, a New York-based startup, makes it possible to add a range of emotionally expressive voices, along with an emotionally attuned ear, to large language models from Anthropic, Google, Meta, Mistral, and OpenAI—hinting at an era in which AI assistants may routinely engage with our emotions.

“We specialize in building empathic personalities that speak in ways people would speak, rather than stereotypes of AI assistants,” said Hume AI cofounder Alan Cowen, a psychologist who has coauthored several research papers on AI and emotion, and who previously worked on emotional technologies at Google and Facebook.

WIRED tested Hume’s latest voice technology, called EVI 2, and found its output similar to what OpenAI has developed for ChatGPT. (When OpenAI gave ChatGPT a flirtatious voice in May, company CEO Sam Altman described the interface as feeling “like AI from the movies.” Later, after a real movie star, Scarlett Johansson, objected that the voice resembled hers, OpenAI pulled it.)

Like ChatGPT, Hume is far more emotionally expressive than most conventional voice interfaces. If you tell it that your pet has died, for example, it will adopt a suitably somber and sympathetic tone. (Also as with ChatGPT, you can interrupt Hume mid-flow, and it will pause and adjust with a new response.)

OpenAI has not said how much its voice interface tries to gauge users’ emotions, but Hume’s is explicitly designed to do just that. During interactions, Hume’s developer interface displays values that measure things like “determination,” “anxiety,” and “happiness” in a user’s voice. If you talk to Hume in a sad tone, it will pick up on that and respond accordingly, something ChatGPT does not appear to do.

Hume also makes it easy to deploy a voice with a specific emotional tone by adding a prompt in its UI. Here’s what it produced when I asked it to be “sexy and flirtatious”:

Hume AI’s “sexy and flirtatious” message

And when told to be “sad”:

Hume AI’s “sad” message

And here’s its particularly harsh response when asked to be “angry and rude”:

Hume AI’s “angry and rude” message

The technology did not always seem as polished and smooth as OpenAI’s, and it occasionally behaved in strange ways. At one point, for example, the voice suddenly sped up and spewed gibberish. But if the voice can be refined and made more reliable, it has the potential to help make humanlike voice interfaces more common and more varied.

The idea of recognizing, measuring, and simulating human emotion in technological systems goes back decades and is studied in a field known as “affective computing,” a term introduced in the 1990s by Rosalind Picard, a professor at the MIT Media Lab.

Albert Salah, a professor at Utrecht University in the Netherlands who studies affective computing, said he was impressed by Hume AI’s technology and recently demonstrated it to his students. “What EVI seems to do is assign emotional valence and arousal values [to the user], and then modulate the agent’s speech accordingly,” he said. “It’s a very interesting twist on LLMs.”
