In a few years, the early 1990s controversy over whether or not the pop duo Milli Vanilli sang on their own Grammy Award-winning record may seem like small potatoes. Get ready for synthesized lead and backup vocals to eventually replace actual humans singing on studio recordings.
As I wrote in TCS last March, "In some respects, vocals are the last bastion of expected honesty in recording," which is why the electronic pitch correction used since the late 1990s has revived some of the flavor of the Milli Vanilli controversy. No matter how many layers of guitars or keyboards are overdubbed by the same player, or how radically a drum kit is processed to sound more cannon-like, we expect that at the end of the day, a real live vocalist sang on top of the track, with minimal electronic processing.
But that may change radically in a few years. In 2003, Yamaha created a product that they call "Vocaloid". Its programming can be input by any musician; its basic graphical user interface is much like that of any software-based synthesizer. After that, though, words are typed onto the screen, and the user enters phonemes for any words that aren't in Vocaloid's vocabulary. A phoneme is the smallest unit of sound that distinguishes one word from another, such as the "mmm" of mat and the "buh" of bat. Finally, performance characteristics such as the singer's tone, breathiness and dynamics are programmed in.
When all the programming is complete, press play, and it sings. No, this isn't purely computer-generated singing: it sings with a human voice that has been digitally sampled, deconstructed, and re-assembled according to your input. And Vocaloid can do its own harmonies and backup vocals.
Currently, Yamaha, through Zero-G, their UK distributor, markets Vocaloid with three different voices, each available separately: two female and one male. They're named Leon, Lola, and Miriam, after the actual humans whose voices have been sampled to create the synthetic singers.
The initial programming typically does sound like a synthesized android's voice; it takes a lot of programming to make Vocaloid sound like a real voice, and not a synthesizer.
But synthesized instruments have gotten increasingly realistic sounding over the past twenty years, and there's every reason to believe that within a much shorter period of time, Vocaloid and its inevitable competitors will become indistinguishable to most listeners from the human voice.
The Making of a 21st Century Pop Star
Since the late 1950s, the history of pop music has been one of exponentially increasing studio complexity. Over ten years ago, a critic named Ted Friedman wrote:
"Pop music-making in the 1990s has more to do with filmmaking than jamming in a garage: every song is a collection of tracks laid down by assorted musicians, edited together by producers, and fronted by charismatic performers. But while most viewers recognize the complex division of labor in moviemaking -- nobody gets upset that actors don't do their own stunts -- pop music hangs on to the folk-era image of the individual artist communicating directly to her or his listeners. Milli Vanilli became martyrs to this myth of authenticity. They were the recording industry's sacrifice meant to prove the integrity of the rest of their product -- as if the music marketed under the names U2 or Janet Jackson WEREN'T every bit as constructed and mediated, just because the voices on the records matched the faces in the videos."
There's no reason to believe that the trend toward ever more complex pop studio recording won't continue, which means that programs like Vocaloid could have significant implications for the future of recorded music. Max Headroom was an amusing mid-1980s look at what an entirely electronic newscaster of the future would be like. Eventually technology caught up with the fantasy, and in the late 1990s, Websites such as Ananova began to use a combination of digital animation and speech synthesis to have their own virtual newsreaders.
There will always be humans making music, but just as flesh and blood anchormen have been joined by Max and Ananova, human singers may very well eventually be joined by synthetic counterparts. It's entirely possible that within ten or twenty years, teenagers will be worshiping entirely computerized pop stars: digital video animation will create their looks, programs such as Vocaloid will create their vocals, and a combination of pre-recorded loops of sound, crack studio musicians and software synthesizer programming will create their backing tracks. There have been plenty of rock videos shot for MTV that have been built around digital animation -- building them around entirely digital singers seems like only the next logical step.
(One suspects that even with entirely digital video artists, wardrobe malfunctions will not remain a thing of the past, of course...)
For decades, Mick Jagger sang "Time Is On My Side" before, eventually, it began to catch up with him. For the virtual pop star of the future, aging will be much less of a concern. And if she ever gets into a contract dispute, there are always the control, alt and delete keys.
At least for the moment.