Mapping the Uncanny Valley: Generative AI in Media
The concept of the uncanny valley was first articulated by Japanese roboticist Masahiro Mori in 1970, describing the discomfort people feel when confronted with humanoid robots or digital creations that are almost, but not quite, human-like.
As technology advances, this valley narrows, leading to a fascinating evolution in human perception and acceptance of AI-generated audio, video, text, and images.
This journey reflects not just technological improvements but also our growing ability to forgive imperfections and embrace these creations as part of our reality. We’ve become less skeptical of AI products and more willing to walk into the uncanny valley and meet them in the middle.
We’re accepting AI-generated media at an increasing rate, but the practical and ethical considerations of doing so cannot be dismissed.
Robotics: From Eerie to Endearing
In the realm of robotics, the uncanny valley has been a significant hurdle. Humanoid robots, designed to look and act like humans, often evoke unease when their movements or expressions are just slightly off.
However, as these robots become more sophisticated, our ability to accept and even bond with them grows. Over time, we start seeing commentary that treats near-human robots not as eerie imitations but as fascinating technological companions.
AI-Generated Audio and Video: From Fake to Familiar
AI-generated audio and video also traverse the uncanny valley, particularly with advancements in deepfake technology and voice synthesis. Early iterations of these technologies often resulted in unsettling, almost-human outputs that made listeners and viewers uneasy. The slight robotic intonation in synthesized voices or the barely perceptible glitches in deepfakes were enough to break the illusion.
Yet, as these technologies improve, the gap narrows. AI-generated voices are becoming more natural, capturing the emotional nuances and rhythmic patterns of human speech. Similarly, deepfake technology is reaching levels of realism that make it increasingly difficult to distinguish between real and generated videos.
People are beginning to forgive minor imperfections, and the once unsettling near-human outputs are now seen as impressive feats of engineering. This shift in perception allows us to accept and even appreciate AI-generated media in ways that were previously unimaginable.
While the improving quality of AI-generated audio and video is impressive, it also makes recorded clips difficult to tell apart from generated ones. And in many cases, the audience seems less concerned with authenticity than with the popularity of the video in question.
A deepfake video can get millions of views before it comes down, if it comes down. Political messages, fake nude and intimate scenes of famous and familiar people, and “evidence” of various positions flood the Internet. These audio and video clips not only mislead the viewers but also threaten to drown true clips in a sea of misinformation.
AI-Generated Text: From Awkward to Articulate
Natural language processing (NLP) models like OpenAI’s GPT-4 have revolutionized AI-generated text. Early versions of these models produced text that, while coherent, often contained awkward phrasing or logical inconsistencies that made readers acutely aware of its artificial nature. These imperfections placed the text firmly in the uncanny valley.
However, as these models become more sophisticated, their outputs improve significantly. AI-generated text is now often indistinguishable from human writing, with contextually relevant and fluid prose.
Readers, once quick to spot and judge the awkwardness in AI text, are now more likely to forgive minor errors, focusing instead on the overall coherence and relevance of the content. The uncanny valley of text is rapidly becoming a thing of the past, as our ability to accept AI-generated writing grows.
Students submit AI-generated homework assignments. Professionals in the workplace present AI-generated documents and slideshows. Management supports the use of AI to save time.
AI-Generated Images: From Disturbing to Delightful
AI-generated images, especially those created using generative adversarial networks (GANs), have also made significant strides. Early AI-generated faces often had subtle anomalies—slightly asymmetrical features or unnatural lighting—that triggered discomfort. These imperfections placed them in the uncanny valley.
Today, AI-generated images are so realistic that distinguishing them from actual photographs is becoming increasingly difficult. As these images improve, our tolerance for minor imperfections grows.
What once appeared as eerie now seems fascinating, and the boundary between real and AI-generated art is blurring. This acceptance is transforming how we view and utilize AI in digital art and media.
In Conclusion
The journey through the uncanny valley reflects not just technological progress but also a significant shift in human perception. As robots and AI-generated media become more sophisticated, our ability to forgive their imperfections and embrace their presence in our lives grows.
This path into the uncanny valley of generative AI brings us to our present state, where online video cannot be trusted without extensive verification and confirmation from its subjects. Artists and writers are seeing the market for their work shrink because AI generation of text and images is seen as a cheaper, faster alternative to employing living creatives.
The once unsettling near-human creations are now seen as impressive technological achievements, and the line between artificial and real is becoming increasingly difficult to discern. More interestingly, the revulsion previously experienced in the “uncanny valley” is increasingly being replaced by curiosity and even acceptance.