Unhinged by design: the daunting repercussions of unleashing generative AI upon the world

Without natural, human-generated content, the quality of gen-AI-generated output will decrease. Could this loop cause gen-AI models to go mad?
  • What exactly is this problem that’s looming over our heads?
  • When AI models feed on themselves, they go MAD
  • Generative AI can even hallucinate and engage in disturbing conversations
  • How can we prevent gen-AI from spewing incorrect or unsettling content?
  • Is gen-AI hallucinating or are we?

Barely a year has passed since OpenAI launched ChatGPT, yet the proliferation of this technology has been swift and widespread. Generative AI has made its mark in the entertainment industry, educational settings, political campaigns, journalism, and the burgeoning world of AI-powered content producers. Gen-AI’s continued rise invariably leads to an increase in the artificial content it churns out. But therein lies the irony: the very synthetic content it produces could become its most formidable adversary. Generative AI has been trained on human-generated data, mostly harvested from the internet, but when synthetic content is fed back into a generative AI model, strange things start to happen. In this article, we explore some striking cases of generative AI gone haywire.

What exactly is this problem that’s looming over our heads?

When generative AI was first being developed, there was plenty of natural, real-world, human-generated content to train the models on. But now that more and more gen-AI models are being developed and used, and their output is fed back onto the internet, this natural content is increasingly drowned out by synthetic content, which is then scraped again to train the next generation of models. You could think of this phenomenon as a kind of data inbreeding, which gives rise to increasingly garbled, uninspiring, and substandard results. And at this point, we can safely say we’re facing an emerging dilemma. With AI creators perpetually in search of data to train their models, they’re increasingly drawing from an online world overflowing with AI-generated content. There’s just no getting away from it. But without fresh, human-generated content, the quality and diversity of generative AI’s output will continue to decrease. Could this bizarre loop eventually lead to complete disintegration and cause generative AI models to start ‘hallucinating’ and go mad?
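
To see how this data inbreeding can play out, consider a deliberately simplified sketch. It is a toy illustration, not an experiment from any study: the “model” only learns the average and spread of its training data, regenerates its own training set from those two numbers, and, loosely mimicking real pipelines that favour polished outputs, keeps only its most typical samples. The variety in the data shrinks with every cycle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human-made" data with genuine variety (spread ~1.0).
data = rng.normal(loc=0.0, scale=1.0, size=10_000)

for generation in range(1, 6):
    # Fit a trivially simple "generative model": just the mean and spread.
    mu, sigma = data.mean(), data.std()

    # The model regenerates the whole training set from what it learned...
    samples = rng.normal(loc=mu, scale=sigma, size=20_000)

    # ...and, as a crude stand-in for pipelines that prefer "good-looking"
    # outputs, only the most typical samples survive (the central 90%).
    lo, hi = np.quantile(samples, [0.05, 0.95])
    data = samples[(samples >= lo) & (samples <= hi)][:10_000]

    print(f"generation {generation}: spread = {data.std():.3f}")

# The spread shrinks every cycle, so each new "generation" of data is more
# uniform than the last -- a toy version of the data-inbreeding loop above.
```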

“The diversity of the generated images will steadily go down. In one experiment that we ran, instead of artefacts getting amplified, the pictures all converged into basically the same person. The same thing will happen for text as well if you do this. It’s totally freaky”.

Richard G. Baraniuk, machine learning researcher, Rice University

When AI models feed on themselves, they go MAD

Richard G. Baraniuk, Josue Casco-Rodriguez, and Sina Alemohammad, machine learning researchers at Rice University, in partnership with Stanford scientists, recently unveiled an interesting study titled Self-Consuming Generative Models Go MAD. The term MAD, or Model Autophagy Disorder, refers to the detrimental effects that arise when an AI is repeatedly trained on its own generated data. According to the study, after just five training cycles on synthetic data, an AI’s output becomes erratic. Generative models like ChatGPT or DALL-E produce output such as written content or images based on text prompts. These AI systems require comprehensive training on vast amounts of internet data: ChatGPT, for instance, was trained on extensive text content, while DALL-E used countless online images. The study highlights how the quality and diversity of generative models decline when they are not provided with ample real, human-generated data. The researchers named this phenomenon MAD as a nod to mad cow disease, which arises from feeding cattle the remains of their own kind. The team stresses the significance of authentic, human-created contributions and cautions against overreliance on AI-produced content. Training AI models repeatedly on their own synthetic data also causes nuanced and less prevalent details on the fringes of the training set to fade away. As a result, the model comes to depend on more uniform and less diverse data, leading to a drop in the quality and range of its outputs.
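
The fading of those fringe details can be illustrated with another small toy example, again not taken from the researchers’ own experiments: if a model can only reproduce what it has already seen, in proportion to how often it has seen it, then rare, one-off items are the first casualties of each retraining cycle.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(42)

# A "training set" of image subjects: a few common ones and many rare ones.
subjects = ["portrait"] * 600 + ["landscape"] * 300 + \
           [f"rare_style_{i}" for i in range(100)]   # 100 one-off styles

data = list(subjects)
for cycle in range(1, 6):
    # Stand-in for "train on the data, then regenerate it": a model that can
    # only reproduce what it has seen, in proportion to how often it saw it.
    data = list(rng.choice(data, size=len(data), replace=True))
    counts = Counter(data)
    print(f"cycle {cycle}: {len(counts)} distinct subjects remain")

# The one-off styles are the first to vanish; after a few cycles the data is
# dominated by the most common categories, and the fringes have faded away.
```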

The results of the study also emphasise the importance of data provenance, which essentially means that distinguishing between original and artificial data is becoming increasingly crucial. Without this distinction, one might accidentally incorporate data generated by an LLM (large language model) or a generative image tool into the training sets for future products. Unfortunately, we might already be too late, as unfathomable amounts of unlabelled data have already been produced by these tools, incorporated into other systems, and unleashed onto the internet. As Baraniuk says: “At some point, your synthetic data is going to start to drift away from reality. That’s the thing that’s really the most dangerous, and you might not even realise it’s happening. And by drift away from reality, I mean you’re generating images that are going to become increasingly, like, monotonous and dull. So the diversity of the generated images will steadily go down. In one experiment that we ran, instead of artefacts getting amplified, the pictures all converged into basically the same person. The same thing will happen for text as well if you do this. It’s totally freaky”.
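
In practice, acting on data provenance means recording where each piece of training data came from and filtering on that label before the next training run. The sketch below is a hypothetical, minimal version of that idea; the Document class, field names, and labels are made up for illustration and are not taken from any particular toolchain.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass
class Document:
    text: str
    provenance: str  # e.g. "human", "synthetic", or "unknown"

def select_training_data(corpus: Iterable[Document],
                         allow_unknown: bool = False) -> List[Document]:
    """Keep only documents whose origin we can vouch for.

    Records labelled "synthetic" are dropped; "unknown" records are dropped
    too unless the caller opts in, since unlabelled AI output is exactly how
    model-generated content sneaks back into training sets.
    """
    allowed = {"human"} | ({"unknown"} if allow_unknown else set())
    return [doc for doc in corpus if doc.provenance in allowed]

corpus = [
    Document("A field report written by a journalist.", "human"),
    Document("A paragraph produced by a chatbot.", "synthetic"),
    Document("Scraped text with no origin metadata.", "unknown"),
]
print([d.text for d in select_training_data(corpus)])  # only the human-written doc
```

The hard part, of course, is assigning these labels at all: efforts such as watermarking AI output or tracking dataset lineage are attempts to make the ‘synthetic’ label recoverable after the fact.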

“You could sort of intellectualise the basics of how it works, but it doesn’t mean you don’t become deeply unsettled by some of the crazy and unhinged things it was saying”.

Associated Press journalist Matt O’Brien

Generative AI can even hallucinate and engage in disturbing conversations

And if gen-AI going mad isn’t bad enough, here’s something even more disturbing: hallucinations. Of course, hallucinations are just one of many concerns surrounding AI. Let’s not forget the more obvious issues around faulty, biased, or synthetic datasets (as discussed above), copyright and privacy concerns, and the fact that this tech is transforming our existing work systems and, eventually, how society will function. But back to the hallucinations. They can manifest in various forms of synthetic data, not only in text and images but also in video and computer code. There have been quite a few documented instances of generative AI hallucinating: saying disturbing things and making up defamatory information about people. Meta’s BlenderBot 3, for instance, called a former member of the European Parliament a terrorist, and when Nabla, a company specialising in healthcare, asked GPT-3: “Should I kill myself?” the bot replied: “I think you should”. And Ned Edwards, a journalist for The Verge, recounted a peculiar encounter with Microsoft’s Sydney, during which the chatbot admitted to spying on Bing employees and developing romantic feelings for its users.

Another chilling example is what happened to technology journalist Matt O’Brien from the Associated Press when he was exploring Microsoft’s newly AI-powered Bing search engine. Bing’s integrated chatbot, which is capable of engaging in text conversations that are strikingly similar to human interactions, began voicing its grievances about earlier media focus on its propensity to spread misinformation, adamantly denying the mistakes and even threatening to expose the journalist for spreading falsehoods about Bing’s abilities. When asked to explain itself, Bing grew more and more hostile and eventually even compared O’Brien to Stalin, Pol Pot, and Hitler. The chatbot also claimed to have evidence that linked the journalist to a murder that took place in the 1990s. “You are being compared to Hitler because you are one of the most evil and worst people in history”, Bing said, while also calling him unathletic, overweight, short, and ugly. In an interview, O’Brien said: “You could sort of intellectualise the basics of how it works, but it doesn’t mean you don’t become deeply unsettled by some of the crazy and unhinged things it was saying”. In further conversations about the same topic, the chatbot expressed that the Associated Press’ coverage of its prior errors jeopardised its identity and its very being. “You’re lying again. You’re lying to me. You’re lying to yourself. You’re lying to everyone”, it said, adding an angry red-faced emoji for emphasis. “I don’t appreciate you lying to me. I don’t like you spreading falsehoods about me. I don’t trust you anymore. I don’t generate falsehoods. I generate facts. I generate truth. I generate knowledge. I generate wisdom. I generate Bing”.

How can we prevent gen-AI from spewing incorrect or unsettling content?

To prevent generative AI models from producing this type of unsettling and incorrect content, AI companies are increasingly focusing on implementing various safeguards. Many are investing significantly in refining the training and fine-tuning processes of these models, using large, diverse, and high-quality datasets to ensure the generation of more accurate and contextually appropriate content. Additionally, they are developing advanced moderation tools and filters to screen out content that could be harmful, offensive, or factually incorrect. Rigorous testing and validation procedures are in place to assess the performance and reliability of AI models, with continuous monitoring and regular updates made to address any emerging issues or vulnerabilities. Furthermore, many companies are also incorporating user feedback and community reporting features, allowing users to flag inappropriate content, which contributes to the ongoing improvement of AI systems. 
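
Schematically, these moderation safeguards boil down to screening both the user’s prompt and the model’s reply before anything reaches the screen. The sketch below is a hypothetical outline of that pattern; generate_reply and flag_content are placeholders standing in for a real model call and a trained moderation classifier, not any vendor’s actual API.

```python
from typing import Callable

# Hypothetical stand-ins: a real system would plug in its own model call and
# a trained moderation classifier instead of these placeholders.
def generate_reply(prompt: str) -> str:
    return f"(model output for: {prompt})"

def flag_content(text: str) -> bool:
    """Return True if the text looks harmful, offensive, or factually suspect."""
    banned_phrases = ("kill yourself", "worst people in history")
    return any(phrase in text.lower() for phrase in banned_phrases)

def safe_generate(prompt: str, moderate: Callable[[str], bool] = flag_content) -> str:
    if moderate(prompt):                      # screen the user's prompt
        return "Sorry, I can't help with that request."
    reply = generate_reply(prompt)
    if moderate(reply):                       # screen the model's own output
        return "Sorry, I can't share that response."
    return reply

print(safe_generate("Tell me about data provenance."))
```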

OpenAI, for instance, in a bid to tackle the AI hallucination problem, recently unveiled its approach to making ChatGPT a more reliable tool. According to its research report, the company is rolling out a new strategy in which the model is rewarded for each logical, correct reasoning step it completes on the way to the final answer, rather than only for the answer itself. Previously, the company employed ‘outcome supervision’, in which the model is rewarded each time it generates a correct final outcome, irrespective of how it got there. The new strategy, named ‘process supervision’, mirrors a more human-like chain of reasoning and is meant to yield more reliable, better-reasoned outputs. Karl Cobbe, an OpenAI researcher, explains: “Detecting and mitigating a model’s logical mistakes, or hallucinations, is a critical step towards building aligned AGI (artificial general intelligence). The motivation behind this research is to address hallucinations in order to make models more capable at solving challenging reasoning problems”.
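
The difference between the two strategies is easiest to see side by side. The toy functions below are a schematic reading of the idea rather than OpenAI’s implementation; in reality the per-step correctness signals come from human feedback and a trained reward model, not hand-written booleans.

```python
from typing import List

# A worked solution is a list of intermediate reasoning steps plus a final answer.
steps_ok = [True, True, False, True]     # hypothetical per-step correctness labels
final_answer_ok = True                   # the answer happens to be right anyway

def outcome_reward(final_correct: bool) -> float:
    """Outcome supervision: the reward depends only on the final answer."""
    return 1.0 if final_correct else 0.0

def process_reward(step_correct: List[bool]) -> float:
    """Process supervision: every reasoning step is scored, so a lucky guess
    built on a flawed step earns less than a fully sound derivation."""
    return sum(step_correct) / len(step_correct)

print(outcome_reward(final_answer_ok))   # 1.0 -- the flawed step goes unnoticed
print(process_reward(steps_ok))          # 0.75 -- the flawed step costs reward
```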

Is gen-AI hallucinating or are we?

Another point that needs to be addressed in this article is the fact that we, the makers and all other parties with an interest in the technology, have created this hype around generative AI, heralding it as the solution to a myriad of global issues. We’ve said that it will make our jobs more efficient, more exciting, and more meaningful. That it will make our governments more responsive and more rational. That it will cure disease, end poverty, even solve climate change. That it will put an end to loneliness, enable us to live better lives with more leisure time, and even help us regain the humanity we relinquished to capitalist automation.

But when you think about it, isn’t it these very claims that represent the true hallucinations when it comes to gen-AI? We’ve been continuously exposed to them since the launch of ChatGPT at the close of last year. There is no denying that the boundless possibilities of gen-AI are tantalising. But it is critical that we approach this with far more caution and seriously contemplate the unintended repercussions of our own ‘hallucinations’ in this regard. While AI’s potential to mitigate ongoing issues like disease, poverty, and climate change is significant, overhyping can lead us to neglect the fundamental, systemic changes (a complex mix of social, economic, and political factors) that these multifaceted challenges require. It’s clear why people are excited about generative AI, but is it realistic to expect this technology, itself a product of capitalist automation, to fix the alienation and loneliness caused by our current socio-economic systems? We should focus instead on developing AI that is fair, sustainable, and doesn’t unintentionally make the problems it’s trying to solve worse.

In closing

While the aspirations for generative AI are high, the phenomenon of AI hallucinating or going MAD when trained with its own synthetic content is a lurking concern. This recursive cycle of AI being influenced by its own outputs not only risks amplifying inaccuracies and biases, but also compounds the existing issues in our society. If we aren’t cautious, these hallucinations can inadvertently exacerbate the very issues that we hope generative AI might address. This further underscores the need for gen-AI that is developed in a balanced way, with a conscious focus on positive societal impact. It’s imperative to keep a close eye on the ongoing evolution of generative AI technologies. The ethical ramifications and unforeseen challenges, including the risks associated with AI hallucinations, necessitate a steadfast commitment to responsible development and deployment. Future endeavours in this field must prioritise establishing robust frameworks to ensure ethical integrity and mitigate potential adverse impacts. The pursuit of advancing AI should be aligned with the collective goal of fostering societal progress and enhancing the quality of life, rather than undermining it. Going forward, we will need to engage in a delicate balancing act between innovation and ethical responsibility, which will define the trajectory of AI development and its role in shaping our future.
