From Edison’s Phonograph to Siri in Your Pocket
IBM Shoebox (1961): The first voice-activated calculator, capable of recognizing 16 words and performing basic arithmetic from spoken input.
We’ve explored how humans first clicked, tapped, and waved at their machines. But what about the most human input of all: our voice?
Before Siri was scheduling your meetings, before Alexa was playing your favorite playlist, and way before anyone said “Hey Google,” there was Edison talking to a metal horn. This month on Input Origins, we rewind to the surprising roots of voice tech and trace its journey into your pocket.
Meet AUDREY: Bell Labs’ 1952 prototype that could understand spoken digits, voice tech’s first real listener.
AUDREY - 1952
The First Voice-Controlled Machine.
1877. For the first time, a machine captured the human voice - its sound vibrations etched into tinfoil wrapped around a rotating cylinder. Edison’s phonograph changed the course of history. But what if machines could do more than echo us? What if they could understand us?
Seventy-five years after Edison’s invention, that question took its first mechanical step toward an answer. In 1952, Bell Labs researchers Stephen Balashek, R. Biddulph, and K.H. Davis introduced Audrey (Automatic Digit Recognizer), the world’s first system capable of recognizing spoken digits from 0 to 9. Audrey wasn’t versatile, but for her time, she was revolutionary.
Audrey marked a turning point: proof that a machine could begin to understand us.
How Did They Do It?
Using a microphone and a system of filters and relays, Audrey analyzed spoken digits by isolating key frequencies and matching them against preset acoustic patterns. The system required users to pause between numbers and performed best with the voices it was trained on.
Large and confined to the lab, Audrey could light up a bulb to indicate recognition - primitive by today’s standards, but a genuine leap at the time, and the first hint that a machine could do more than just hear us.
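If you’re curious what that principle looks like in modern terms, here’s a tiny, purely illustrative Python sketch: measure how a recording’s energy spreads across a couple of frequency bands, then pick the stored digit pattern it most resembles. Audrey did the equivalent with analog filters and relays rather than software, and the band edges, template values, and names below are invented for illustration only.

```python
# A toy, software-only analogy for Audrey's approach. The real Audrey was
# analog hardware (filters and relays); nothing here is its actual design.
import numpy as np

SAMPLE_RATE = 8000                    # assumed sampling rate for this sketch
BANDS = [(100, 900), (900, 2500)]     # two rough, made-up frequency bands (Hz)

def band_energies(audio):
    """Return the fraction of the signal's energy that falls in each band."""
    spectrum = np.abs(np.fft.rfft(audio)) ** 2
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / SAMPLE_RATE)
    energies = np.array([spectrum[(freqs >= lo) & (freqs < hi)].sum()
                         for lo, hi in BANDS])
    total = energies.sum()
    return energies / total if total > 0 else energies

def recognize(audio, templates):
    """Pick the digit whose stored pattern is closest to this recording's pattern."""
    features = band_energies(audio)
    return min(templates, key=lambda digit: np.linalg.norm(features - templates[digit]))

# Hypothetical usage: 'templates' holds one reference pattern per digit,
# measured in advance from the speaker the system was tuned to.
# templates = {"one": np.array([0.72, 0.28]), "two": np.array([0.41, 0.59]), ...}
# print(recognize(recorded_digit, templates))
```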
[1990s coverage of the first consumer voice recognition breakthrough]
1990 - Voice Control Goes Commercial
Jump to 1990, and voice control finally stepped out of the lab and into the real world. Enter Dragon Dictate - the first commercially available speech recognition software for consumers. It ran on personal computers and required you to speak. Slowly. And. Clearly. One. Word. At. A. Time. But for those dreaming of hands-free computing, it was a revelation. Dragon Dictate paved the way for its successor, Dragon NaturallySpeaking, which let users dictate fluidly in full sentences. Voice control was no longer science fiction - it was a tool.
[2008 - Google brought voice to the iPhone with Google Voice Search]
Siri, Are We There Yet?
The 2010s brought a seismic shift. Voice moved from desktop software into the ambient fabric of our daily lives. With the launch of Siri in 2011, followed by Google Now and Amazon Alexa, voice wasn’t just about control - it was about conversation. You didn’t have to speak like a robot anymore. You could ask questions, crack jokes, set timers, send texts, and control your smart home, often without lifting a finger.
These assistants learned your preferences, adapted to your quirks, and became part of your routine. And while the dream of natural, intelligent conversation is still evolving, one thing’s clear: Our voices have become the way we command the world around us.
Voice is, at its core, a deeply human way to interact. It created a new kind of relationship between people and their technology, one that felt less like operating a machine and more like having a conversation. But voice isn’t the only path to human-like control.
Control That Feels Human
Just as voice made it possible to command devices through speech, Mudra brings that same human-first philosophy to your hands. By translating subtle neural signals from your wrist into digital actions, Mudra makes interaction feel as effortless as thought itself. No screens, no buttons - just you, your intent, and the world responding.
Voice showed us we could talk to our machines. Mudra shows us we can be the interface.