Three Easy Ways to Make AI Chatbots Safer
We have entered the brave new world of AI chatbots. This means everything from reenvisioning how students learn in school to protecting ourselves from mass-produced misinformation. It also means heeding the mounting calls to regulate AI to help us navigate an era in which computers write as fluently as people. Or even better.
So far, there is more agreement on the need for AI regulation than on what this would entail. Mira Murati, head of the team that created the chatbot app ChatGPT—the fastest growing consumer-Internet app in history—said governments and regulators should be involved, but she didn’t suggest how. At a corporate event in March, Elon Musk similarly spoke with less than exacting precision: “We need some kind of, like, regulatory authority or something overseeing AI development.” Meanwhile, ChatGPT’s wide range of uses upended European efforts to regulate single-purpose AI applications.
To break the impasse, I propose transparency and detection requirements tailored specifically to chatbots, which are computer programs that rely on artificial intelligence to converse with users and produce fluent text in response to typed requests. Chatbot apps like ChatGPT are an enormously important corner of AI poised to reshape many daily activities—from how we write to how we learn. Reining in chatbots poses trouble enough without getting bogged down in wider AI legislation created for autonomous weapons, facial recognition, self-driving cars, discriminatory algorithms, the economic impacts of widespread automation and the slim but nonzero chance of catastrophic disaster some fear AI may eventually unleash. The tech industry is rushing headlong into the chatbot gold rush; we need prompt, focused legislation that keeps pace.
The new rules should track the two stages AI firms use to build chatbots. First, an algorithm trains on a massive amount of text to predict missing words. If you see enough sentences beginning “It’s cloudy today, it might…,” you’ll figure out the most likely conclusion is “rain”—and the algorithm learns this too. The trained algorithm can then generate words one at a time, just like the autocomplete feature on your phone. Next, human evaluators painstakingly score the algorithm’s output on a handful of measures such as accuracy and relevance to the user’s query.
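The first training stage can be illustrated with a toy sketch. The snippet below (a deliberately simplified stand-in, not any real chatbot's code) counts which word tends to follow which in a tiny corpus and then "autocompletes" one word at a time, just as described above:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the massive text real chatbots train on.
corpus = [
    "it is cloudy today it might rain",
    "it is cloudy now it might rain",
    "it is sunny today it might clear",
]

# Count how often each word follows each other word.
counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1

def predict(prev_word):
    """Return the word most often seen after prev_word."""
    return counts[prev_word].most_common(1)[0][0]

print(predict("might"))  # "rain" follows "might" twice; "clear" only once
```

Real systems use neural networks over billions of sentences rather than simple word counts, but the principle is the same: the training text determines what the model considers a likely continuation.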
The first regulatory requirement I propose is that all consumer-facing apps involving chatbot technology make public the text that the AI was first trained on. This text is immensely influential: train a chatbot on Reddit posts, and it will learn to speak like a Redditor. Train it on the Flintstones, and it will talk like Barney Rubble. A person concerned about toxicity on the Web might want to avoid chatbots trained on text from unseemly sites. Public pressure could even dissuade companies from training chatbots on things like conspiracy theory “news” sites—but only if the public knows what text the companies train on. Mary Shelley’s 1818 novel Frankenstein provided a glimpse into the monster’s mind by listing the books read by this literary forebear to artificial intelligence. It’s time for tech companies to do the same for their own unearthly chatbot creations.
The human evaluators also hugely shape a chatbot’s behavior, which points to a second transparency requirement. One of ChatGPT’s engineers recently described the principles the team used to guide this second training stage: “You want it to be helpful, you want it to be truthful, you want it to be—you know—nontoxic.… It should also clarify that it’s an AI system. It should not assume an identity that it doesn’t have, it shouldn’t claim to have abilities that it doesn’t possess, and when a user asks it to do tasks that it’s not supposed to do, it has to write a refusal message.” I suspect the guidelines provided to the evaluators, which included low-wage contract workers in Kenya, were more detailed. But there is currently no legal pressure to disclose anything about the training process.
As Google, Meta and others race to embed chatbots in their products to keep up with Microsoft’s embrace of ChatGPT, people deserve to know the guiding principles that shape them. Elon Musk is reportedly recruiting a team to build a chatbot to compete with what he sees as ChatGPT’s excessive “wokeness”; without more transparency into the training process, we are left wondering what this means and what previously off-limits (and potentially dangerous) ideologies his chatbot will espouse.
The second requirement therefore is that the guidelines used in the second stage of chatbot development should be carefully articulated and publicly available. This will prevent companies from training chatbots in a slapdash manner, and it will reveal what political slant a chatbot might have, what topics it won’t touch and what toxicity the developers chose to tolerate.
Just as consumers have a right to know the ingredients in their food, they should know the ingredients in their chatbots. The two transparency requirements proposed here give people the chatbot ingredient lists they deserve. This will help people make healthy choices regarding their information diet.
Detection drives the third needed requirement. Many teachers and organizations are considering imposing bans on content produced by chatbots (some have already done so, including Wired and a popular coding Q&A site), but a ban isn’t worth much if there’s no way to detect chatbot text. OpenAI, the company behind ChatGPT, released an experimental tool to detect ChatGPT’s output, but it was terribly unreliable. Luckily, there’s a better way—one that OpenAI may soon implement: watermarking. This is a technical method that subtly skews a chatbot’s word choices in a way that is unnoticeable to readers but embeds a hidden statistical stamp identifying the text with its chatbot author.
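To make the idea concrete, here is a minimal sketch of how such a statistical watermark could work in principle. This is an illustration inspired by published "green list" watermarking proposals, not OpenAI's actual method; the secret key and 50/50 split are assumptions for the demo:

```python
import hashlib

# Hypothetical secret key a chatbot developer would hold (and could
# register with a regulator, as proposed below).
SECRET_KEY = "demo-key"

def is_green(word: str) -> bool:
    """Pseudo-randomly mark roughly half the vocabulary as 'green'."""
    digest = hashlib.sha256((SECRET_KEY + word).encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str) -> float:
    """Fraction of words in the text that fall on the green list."""
    words = text.lower().split()
    return sum(is_green(w) for w in words) / len(words)

# A watermarked generator would quietly favor green words as it writes.
# Ordinary human text scores near 0.5 on green_fraction; machine text
# from such a generator scores well above it, flagging its origin.
```

Because the split looks random without the key, readers notice nothing, yet anyone holding the key can test a passage for an improbably high share of green words.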
Rather than merely hoping OpenAI and other chatbot producers implement watermarking, we should mandate it. And we should require chatbot developers to register their chatbots and unique watermarking signatures with a federal agency like the Federal Trade Commission or the AI oversight agency that Representative Ted Lieu is proposing. The federal agency could provide a public interface allowing anyone to plug in a passage of text and see which, if any, chatbots likely produced it.
The transparency and detection measures proposed here would not slow down AI progress or lessen the ability of chatbots to serve society in positive ways. They would simply make it easier for consumers to make informed decisions and for people to identify AI-generated content. While some aspects of AI regulation are quite delicate and difficult, these chatbot regulations are clear and urgently needed steps in the right direction.
This is an opinion and analysis article, and the views expressed by the author or authors are not necessarily those of Scientific American.