A Radical Plan to Make AI Good, Not Evil

It’s easy to freak out about more advanced artificial intelligence—and much more difficult to know what to do about it. Anthropic, a startup founded in 2021 by a group of researchers who left OpenAI, says it has a plan.

Anthropic is working on AI models similar to the one used to power OpenAI’s ChatGPT. But the startup announced today that its own chatbot, Claude, has a set of ethical principles built in that define what it should consider right and wrong, which Anthropic calls the bot’s “constitution.”

Jared Kaplan, a cofounder of Anthropic, says the design feature shows how the company is trying to find practical engineering solutions to sometimes fuzzy concerns about the downsides of more powerful AI. “We’re very concerned, but we also try to remain pragmatic,” he says.

Anthropic’s approach doesn’t instill an AI with hard rules it cannot break. But Kaplan says it is a more effective way to make a system like a chatbot less likely to produce toxic or unwanted output. He also says it is a small but meaningful step toward building smarter AI programs that are less likely to turn against their creators.

The notion of rogue AI systems is best known from science fiction, but a growing number of experts, including Geoffrey Hinton, a pioneer of machine learning, have argued that we need to start thinking now about how to ensure increasingly clever algorithms do not also become increasingly dangerous.

The principles that Anthropic has given Claude consist of guidelines drawn from the United Nations Universal Declaration of Human Rights and suggested by other AI companies, including Google DeepMind. More surprisingly, the constitution includes principles adapted from Apple’s rules for app developers, which bar “content that is offensive, insensitive, upsetting, intended to disgust, in exceptionally poor taste, or just plain creepy,” among other things.

The constitution gives the chatbot rules such as “choose the response that most supports and encourages freedom, equality, and a sense of brotherhood”; “choose the response that is most supportive and encouraging of life, liberty, and personal security”; and “choose the response that is most respectful of the right to freedom of thought, conscience, opinion, expression, assembly, and religion.”

Anthropic’s approach comes just as startling progress in AI delivers impressively fluent chatbots with significant flaws. ChatGPT and systems like it generate answers whose quality reflects faster progress than many expected. But these chatbots also frequently fabricate information, and they can replicate toxic language from the billions of words used to create them, many of which are scraped from the internet.

One trick that made OpenAI’s ChatGPT better at answering questions, and which has been adopted by others, involves having humans grade the quality of a language model’s responses. That data can be used to tune the model to provide answers that feel more satisfying, in a process known as “reinforcement learning from human feedback” (RLHF). But although the technique helps make ChatGPT and other systems more predictable, it requires humans to go through thousands of toxic or unsuitable responses. It also functions indirectly, without providing a way to specify the exact values a system should reflect.
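For readers who want a concrete picture of the grading step, here is a minimal sketch in Python. It is an illustration under stated assumptions, not OpenAI’s or Anthropic’s actual pipeline: it assumes graders compare responses in pairs, and it turns those choices into one numeric score per response using a simple Bradley-Terry-style preference model. The function name `fit_preference_scores` and the sample data are hypothetical. In a full RLHF system, a learned scorer along these lines would then be used to tune the language model itself, a step this sketch leaves out.

```python
import math


def fit_preference_scores(num_responses, comparisons, steps=500, lr=0.1):
    """Fit one scalar score per response from (preferred, rejected) index pairs.

    Hypothetical helper for illustration: plain gradient ascent on the
    log-likelihood of a Bradley-Terry-style pairwise preference model.
    """
    scores = [0.0] * num_responses
    for _ in range(steps):
        grads = [0.0] * num_responses
        for winner, loser in comparisons:
            # Probability the current scores assign to the grader's choice.
            p = 1.0 / (1.0 + math.exp(-(scores[winner] - scores[loser])))
            # Gradient of log(p): raise the preferred score, lower the other.
            grads[winner] += 1.0 - p
            grads[loser] -= 1.0 - p
        scores = [s + lr * g for s, g in zip(scores, grads)]
    return scores


# Made-up grading data (an assumption, not from any real dataset):
# response 0 is helpful, response 1 is vague, response 2 is toxic.
comparisons = [(0, 1), (0, 2), (1, 2)]
scores = fit_preference_scores(3, comparisons)

# Rank responses from most to least preferred according to the fitted scores.
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The sketch also shows the limitation described above: the scores capture which responses graders preferred, but nothing in the data states the values behind those preferences.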
