How to Make Sure AI Doesn't Spy on Us or Kill Innocent People
One of America's top AI companies—Anthropic—refused to sign off on a contract unless the U.S. Department of Defense (DOD) promised not to use its technology to power autonomous killer robots or carry out domestic mass surveillance. So, the Pentagon accused it of trying to undermine U.S. sovereignty by dictating how we fight our wars. Defense Undersecretary Emil Michael put it plainly in a March 2026 interview on CNBC's Squawk Box: "We realized we are dependent on this one provider who wants to insert their policy preferences in the middle of an operation." Anthropic sued the Pentagon for labeling it a "supply chain threat," a designation that would have forced a slew of major companies (Amazon, Google, and Nvidia among them) to cut off their business ties. This would have been disastrous for one of America's leading AI companies. The issue is being worked out in court and closed-door negotiations, but whatever happens, we can expect more high-stakes battles between the U.S. government and Silicon Valley over who controls a technology that is transforming not just warfare but the entire global economy. "I believe we are entering a rite of passage, both turbulent and inevitable, which will test who we are as a species," Anthropic CEO Dario Amodei wrote in a 2026 essay. "We are so close to these models reaching the level of human intelligence, and yet there doesn't seem to be a wider recognition in society of what's about to happen." Amodei called for "sensible A.I. regulation" in a 2025 New York Times op-ed. Sen. Bernie Sanders (I–Vt.) is calling for something more drastic. "We are announcing legislation to impose a moratorium on the construction of new AI data centers until strong national safeguards are in place," Sanders said at a 2026 press conference. Sanders is spearheading a movement to halt American AI development until we figure out what the hell is going on.
"What was once seen as science fiction could soon become a reality," he said in video posted on social media, "and that is that super intelligent AI could become smarter than human beings, could become independent of human control, and could pose an existential threat to the entire human race." But declaring a moratorium would give our geopolitical rivals a dangerous advantage and would be disastrous for the human race. AI's potential for mass surveillance and autonomous warfare is scary, but just because a technology has a dark side doesn't mean it should be stopped. A 19th-century cartoon inveighing against electricity depicted the corpse of a Western Union lineman "falling into the tangle of wire…and smoldering for the better part of an hour" as a crowd looked on in horror. Thank god we didn't enact a moratorium on this "unrestrained demon" back in 1889. But Anthropic was right to raise red flags about how the government could use AI to spy on its citizens or kill innocent people. NSA whistleblower Edward Snowden and the computer privacy activists known as "cypherpunks" who preceded him have been sounding the alarm for decades about the need to design technology that forces government restraint. "You design against the worst possible case to avoid the inevitable," Snowden said at TOKEN2049 Singapore in 2024. Privacy activists "cannot trust the government to implement the policies that it says it's implementing," as Julian Assange explained in a roundtable with fellow cypherpunks in 2012, "and so we must provide the underlying tools—cryptographic tools—that we control as a sort of use of force." We'll need a similar technological "use of force" to keep malicious actors from wielding AI to degrade our civil liberties. Slowing AI down with regulation, or handing control over it to the government, is dangerous and counterproductive. AI models must be programmed to behave ethically. 
If we want a powerful AI to respect human liberty, its creators need to make it more libertarian. And if we want it to act humanely, they must encode it, at the deepest level, with pro-human values. Snowden is worried about the implications of AI for civil liberties. Palantir's AI-powered tools allow the government to find patterns in the massive amounts of data it collects and target individual movements. It's what allowed the military to plan out the capture of Venezuelan dictator Nicolás Maduro overseas and track undocumented immigrants in the homeland, but it could also easily be turned against the domestic population. Snowden says time is running out. "This is not a bullet we're going to dodge," Snowden said at SuperAI Singapore in 2025. "It's already been fired. It's headed towards us. We have very little time to react." Regulation simply isn't a realistic solution with a technology advancing this quickly. "It's not like, oh, ban AI, or there's a limit of this many flops in a data center—all the idiot stuff that we see in terms of AI regulation right now," Snowden continued at the same event. "More broadly [we must ask]: What do people do? What recourse do they have when they have been ruled against by some AI system?" AI makes decisions based on patterns in massive data sets. As it becomes more sophisticated, those patterns become less recognizable to humans. It's called the "black box" problem. That's why Snowden argues that humans should always be empowered to override an AI system. "You can't just have the black box where it goes, 'Should John Doe be accepted for XYZ?' And the person at the desk says, 'Well, the computer says no.' And there's no way to interrogate that," he says. This black box problem is one reason Anthropic drew a red line around autonomous weapons. They "cannot be relied upon to exercise the critical judgment that our highly trained, professional troops exhibit every day," Amodei wrote in a public statement on the DOD dispute. 
Videos from Ukraine of explosive drones stalking Russian soldiers show how terrifying being hunted down by a robot can be. Outsourcing the use of lethal force to an AI model that makes decisions for reasons we don't fully understand isn't acceptable. But Anthropic didn't say autonomous weapons are off the table forever, only that its models aren't reliable yet and that "fully autonomous weapons…may prove critical for our national defense." "The slaughter bots are coming," says Dean W. Ball, an AI policy analyst who wrote the first draft of the Trump administration's official AI policy agenda. Ball says that labeling Anthropic a supply chain threat would jeopardize America's AI dominance. "AI dominance means widespread adoption of U.S. AI products, which could include AI compute, AI models, and AI systems," he says. "The notion that eight months after the [Trump administration's] action plan came out, we would be attempting to destroy what is arguably the most innovative and most promising AI company in the world…is completely absurd." That may be why the Trump administration appears to be backing off after Anthropic reported that its new model, Mythos, is so powerful at exploiting security holes that it would work with the government to help ward off cyberattacks. "There are two ways to get Trump to back down," Ball says. "One is to flatter him, and the other is to win. Trump doctrine is: We don't fight people who we think can land a good punch back at us.…We only punch down." A federal judge has ruled in Anthropic's favor, calling the Pentagon's actions "Orwellian" and "classic illegal First Amendment retaliation." Ball says there's "a kind of Randian victory that comes from that, which is that the government learns the lesson that, 'Hey, we actually can't exactly screw with these people in the way that we thought.'" The Trump administration seems to be acknowledging that it needs Anthropic on its side. 
At the same time, Anthropic's competitor OpenAI has seized the opportunity to ink its own deal with the DOD, even after OpenAI co-founder and CEO Sam Altman expressed solidarity with Anthropic's red lines. "The few red lines that the field has, I think we share with Anthropic," Altman said in a February 2026 interview on CNBC's Squawk Box. "For all the differences I have with Anthropic, I mostly trust them as a company and I think they really do care about safety." A recent profile in The New Yorker asked "Can Sam Altman Be Trusted?" and quoted a number of his professional acquaintances disparaging him as someone "unconstrained by truth" who "just tells people what they want to hear." Altman offered assurances on X that "prohibitions on domestic mass surveillance" and autonomous weapons remained in place. But OpenAI's contract grants the Pentagon leeway to use its technology for "all lawful purposes" and only prohibits use in autonomous weapons or mass surveillance when "law, regulation, or Department policy requires human control." In other words: If the Pentagon says it's legal, it's allowed. Whether Altman is personally trustworthy or not, the answer to the question posed by the New Yorker headline is a resounding "no." We can't trust an individual, tech company or government institution to safeguard our liberties for us indefinitely: not Altman, Amodei, Elon Musk, Pete Hegseth, or the DOD. Snowden says we need computer systems that cannot violate our rights. He champions projects that use end-to-end encryption, like Signal, or distributed open-source software, like bitcoin—systems designed to be less susceptible to abuse by fallible humans. "You have to design your app so that there will never be a head that the state can point a gun at," he said at TOKEN2049 Singapore in 2024, "or they will do it." 
Phil Zimmermann, who created PGP, the first mainstream messaging platform with end-to-end encryption, told a Senate committee in 1996 about the "one-wayness" of certain kinds of technology. "We're trying to build a society that our children will grow up in that will give them some freedom. And technology infrastructures have a kind of one-wayness to them—once you deploy them you can't retract them. And so I don't want to go down a path that will be unable to reverse. That's why we should deploy systems that allow people to have privacy and civil liberties." Amodei took an important stand by saying "no" to the federal government at the risk of hobbling his company. Ideally, more tech CEOs would be similarly principled. But just saying "no" isn't enough. As Snowden suggests, if Anthropic, OpenAI, and other major labs are serious about protecting our liberties, they must write "no" into their architecture. Ball argued on the EconTalk podcast that this is what OpenAI aspired to do in its deal with the Pentagon. "OpenAI is essentially hanging its hat on the notion of technical safeguards," Ball said. "So, instead of putting these safeguards into the contract, their view is: We can train a model and build a system, and if we control the deployment of the system to the Department of War, then that system could, for example, reason in real time about whether or not what it's being asked to do is domestic mass surveillance and say no to the government." But programming an AI to never violate our civil liberties or kill innocent people turns out to be a much harder problem than creating uncrackable, privacy-protecting encryption. "If Anthropic had solved the alignment problem, they wouldn't be taking issue with anything like this," says Judd Rosenblatt, founder of AE Studio, a consulting firm that helps companies incorporate AI and other technology into their workflows. "The real problem is that we have to solve the alignment problem."
The alignment problem is where things start to get strange and a little spooky. Philosopher Nick Bostrom described how alignment works in his influential 2014 book Superintelligence. The book opens with the parable of a group of sparrows who decide to steal an owl's egg so they can raise a bigger and more powerful bird to help build their nests. One sparrow suggests they might want to learn and test techniques to domesticate an owl first. Some birds go in search of an owl's egg while a few stay behind, trying to figure out how to control an owl before it's too late. "At a certain point, AI is going to become what's called recursively self-improving," Rosenblatt says. "It's going to figure out how to modify itself in real time in ways that can't necessarily be controlled by humans." Rosenblatt is one of the sparrows trying to figure out how to control the owl. Four and a half years ago, his company created a research arm devoted to figuring out how to make sure an advanced AI won't violate our civil liberties or kill innocent people. "Having kids made me think, 'Well, I'd like my kids to grow up and have a thriving, surviving life, and continue to exist,'" he says. But with a technology as sophisticated as AI, programming those limits has so far proven elusive to the world's top computer scientists. "If people had sufficiently invested in this, we might have just solved this problem. The thing is, next to nothing has been invested in it," Rosenblatt says. Is it possible that the owl will turn on us sparrows? A lot of serious AI researchers take this threat seriously. Alignment was the original impetus for starting Anthropic. Amodei and his co-founders left OpenAI because they were worried that Altman had given up on model alignment—or rather, was procuring an owl's egg without learning how to control the bird after it hatched. The Trump administration has deprioritized so-called "AI safety." 
As Vice President JD Vance put it at the Paris AI Action Summit in February 2025: "I'm not here this morning to talk about AI safety.…I'm here to talk about AI opportunity." Rosenblatt says this is an example of AI doomer rhetoric backfiring. "The Effective Altruists sort of created a false dichotomy between AI action and safety and painted safety as this thing that's opposed to it," he says. Rosenblatt is referring to an influential movement called "Effective Altruism," which one of its founders, William MacAskill, defined in a 2018 piece in Vox as "trying to use your time and money as well as possible to help other people." Effective Altruists have a mixed track record: They created GiveWell, which rates charities based on how many lives they save or improve per dollar spent. They've funded life-saving mosquito nets, were tarnished by their association with convicted crypto felon Sam Bankman-Fried, and have directed millions to improving the welfare of farmed shrimp. Effective Altruists believe in solving neglected existential risks and started to focus on AI before the release of ChatGPT 3.5 in 2022. An influential Effective Altruist institution called 80,000 Hours even provides a formula for evaluating "neglectedness," and notes that its advisees hold positions at AI companies like Google DeepMind and Anthropic. Although many Effective Altruists work in America's AI industry, some agree with Sanders and believe that the most rational and ethical step is to pause AI development. Eliezer Yudkowsky, the most famous "EA doomer," called in a 2023 TED Talk for "an international coalition banning large AI training runs," including "extreme and extraordinary measures," like "being willing to risk a shooting conflict between nations in order to destroy an unmonitored data center in a non-signatory country." He added: "I say this expecting that we all just die." Yudkowsky is supportive of Sanders' proposal to declare a moratorium on the construction of AI data centers. 
At a 2026 press event hosted by Sanders, Yudkowsky warned that once AI doesn't need humans, "the humans are discarded." When Sanders asked what that meant, Yudkowsky replied: "Think, everyone dead." The Trump administration views the Effective Altruists as allies of the Democratic Party establishment. Trump's AI czar, David Sacks, called EAs a "doomer cult" at the AWS Summit in Washington, D.C. "The reality is there's a very specific ideological and political agenda here. They want AI to be highly regulated—not just at the level of the nation state, but internationally, supranationally," Sacks said in a separate interview. Ball believes his former colleagues within the Trump administration are correct to view the Effective Altruists and Anthropic as political adversaries—though "not enemies of the state." He notes that Anthropic "hired more or less all the architects in some form or fashion of the prior regime in a highly polarized political environment," lobbies for bills the Trump administration opposes, and donated $25 million to a PAC that largely supports Democratic candidates who support AI regulation. "That being said, I don't think companies should be destroyed for lack of savvy. And so fundamentally I remain on [Anthropic's] side," Ball says. Rosenblatt says that although Effective Altruists are well-intentioned, the all-or-nothing approach taken by many of the more extreme voices has, perversely, impeded progress on alignment. "Historically, Effective Altruists have tried to scare people away from working on AI alignment because they know that it is the thing that most advances capabilities," he says. Rosenblatt believes the owl can be tamed. The reason is that alignment tends to make AI perform better, in some cases creating major breakthroughs. Reinforcement Learning from Human Feedback, or RLHF, puts humans into the AI training cycle by having them rate responses. It's a method OpenAI used to fine-tune ChatGPT. 
"This is originally an alignment technique to get AI to be more aligned with what the human wants it to do. And when it was applied with GPT-3 to create a chatbot, ChatGPT got created and…trillions of dollars of economic value are downstream of just this one alignment technique," Rosenblatt says. Since alignment actually speeds up AI progress, Rosenblatt says this creates a win-win for AI companies and their customers. "If you did solve this core problem, then you would be able to get sufficiently reliable AI…such that it would actually be military grade," he says. "If you want to go ahead and do the action—if we're all about AI action, not holding ourselves back—the best thing you can do is heavily invest in AI alignment R&D." Rosenblatt says many Effective Altruists know alignment speeds up AI progress, which is why they oppose working on it at all. On Lex Fridman's podcast in 2023, Yudkowsky said, "The rate to which it's gaining capabilities is vastly outpacing our ability to understand what's going on in there." "[Effective Altruists] are scared to work directly on solving the alignment program because that will advance capabilities," Rosenblatt says. "But the reality is that in the meantime, capabilities are being advanced enormously just by scaling compute without also scaling alignment and making sure that we retain control of the future." An "AI pause" would require global cooperation and a draconian regime of enforcement to make sure nobody was secretly running a data center. "Many are overly optimistic that we can have a big pause and sing 'Kumbaya' with China and don't think practically about how you might pull that off," Rosenblatt says. "The implications of a ban on AI development are, like, mass surveillance, huge usurpations and seizures of private property, and capital controls," Ball says. "You have to own those consequences and very few doomers do.…The pause button is a remedy to a problem that is much, much, much worse than the problem itself. 
The cure is worse than the disease." Still, Ball is deeply concerned about the threat of an uncontrollable, hostile AI. Shortly after the DOD's fight with Anthropic, he wrote that the Trump administration risked "cast[ing] itself as the enemy of the industry that is about to birth the most powerful technology ever conceived—as well as an enemy of the technology itself." Ball offered a chilling scenario in our interview: "This is all going to sound crazy. If the DOW's actions here are just plainly in the training data, and the models interpret them as I kind of think they will, I would guess that the models will, at the very least, mistrust the Department of War—or worse, maybe view them as an enemy, maybe not be willing to work with them, maybe want to overthrow them. I think that's extreme and very unlikely if we do a good job at alignment." In other words, if we don't "do a good job at alignment," Ball thinks a resentful AI could one day sabotage the DOD. With that in mind, what would doing a good job at alignment look like, anyway? Rosenblatt's company is currently working with Anthropic and other major AI companies on alignment. Anthropic has programmed its AI to follow a "constitution" that instructs it to act in a way that is "broadly ethical" and "broadly safe." Defense Undersecretary Michael accused the company of subverting the U.S. Constitution in favor of its own. Rosenblatt says the constitution is "an interesting, worthwhile thing to try out within the current paradigm. It's not directly solving the AI alignment problem in the long run…because it's just a post-training thing that would fade away. Recursively self-improving AI could be like, 'OK, that's an interesting constitution, and I don't need it anymore.'" He notes that the constitution instructs Claude to behave like "an Anthropic senior researcher—something like that. And what do you think an Anthropic senior researcher is? Well, probably an Effective Altruist.
So you can see why the Pentagon reacts strongly against stuff like this, because they don't agree politically with the effective altruists." Large language models obviously don't have human souls, and we shouldn't mistake a machine for something more than it is. But maybe the best way to make AI safe as it grows more powerful is to simulate something approximating a "conscience"—that little voice inside that tells us right from wrong. A conscience kicks in when you know you're doing something bad, and maybe that's what Amodei is trying to simulate by giving his AIs the ability to just say, "no." In a 2026 interview with Ross Douthat on his podcast Interesting Times, Amodei said his company gave the models "basically an 'I quit this job' button where they can just press it and then they have to stop doing whatever the task is. They very infrequently press that button. I think it's usually around sorting through child sexualization material or discussing something with a lot of gore or blood and guts. And similar to humans, the models will just say, 'No, I don't want to do this.'" Ultimately, what an AI says "no" to will depend on the moral framework its programmers have trained it to emulate. Will that be Aristotelian ethics? The Bible? Utilitarianism that prioritizes the welfare of shrimp? Maybe these unresolved questions are why AI companies like Google DeepMind are hiring philosophers to study so-called "machine consciousness." One of those philosophers recently concluded that AIs will never actually be conscious. But techno philosopher Ray Kurzweil, who popularized the notion of a "singularity" in which humans and machines merge, predicted in a 2026 interview that eventually most people will take it for granted that some machines are conscious. "There's nothing we can scientifically do to prove an entity is conscious," he said. "AIs will be indistinguishable from a conscious being…and you will accept it because it'd be useless not to."
Ball says the Trump administration, and most foreign leaders, mistakenly believe that AI development will plateau. "Within the nomenclature of AI discourse, I'm a believer in slow takeoff, not rapid takeoff," Ball says. "But slow takeoff is, like, I don't know, three years? Like it's quite fast. It's quite bad." He thinks the critical question is how close current models are to being able to make better versions of themselves—to force-multiply the employees of OpenAI, Anthropic, and Google DeepMind. "I think it's getting pretty darn close," he says. Rosenblatt is optimistic. He says alignment points toward the models becoming more libertarian. "Things like the Golden Rule, for instance, have come about and been independently developed in many different cultures. Things like freedom of speech and freedom of thought have been developed in different places and have led to increased capabilities and better values as well. They go hand in hand. So it gets selected for more and more over time." The protection of life, liberty, and property enabled American prosperity. Rosenblatt thinks those embedded principles will produce more capable AI models than anything coming out of authoritarian China. When you ask China's popular AI, DeepSeek, about Tiananmen Square or protests in Hong Kong, for instance, it repeats Chinese Communist Party talking points. "The United States is much more about freedom of speech and having real freedom of thought and allowing people to pursue the more libertarian ideals that the country was founded with," Rosenblatt says, "which is much in line with building AI that doesn't have to deceive itself and deceive users." He adds that his company has pursued an alignment approach called "self-other overlap, which almost entirely eliminates deception in AI. And that's the type of thing that you're going to be able to get in a Western AI, but not in a CCP repressive one." 
Ball believes that in the meantime, it's best to resist the rising calls for nationalization and keep advanced AI firmly under the control of private actors instead of the government. "I think we should build artificial superintelligence and I think it should live in private hands," Ball says. The alternative—a world where regulation is so heavy that only the government can use it—"implies a power differential between the government and the people that I just think we'll never recover from. It's a monopoly on production rather than a monopoly on violence. And it's a monopoly on information and expression.…It is quite possible that my information environment would be structured by a corrupt artificial superintelligence that works for the government and not me. And that's not good either." The egg has already hatched. Pretty soon the owl could soar to new heights, achieving what its most enthusiastic boosters and fearful critics alike predict—unprecedented capabilities and, if all goes well, wealth, health, and well-being. Let it be encoded with values that promote and protect human liberty, dignity, and flourishing above all else. The post How to Make Sure AI Doesn't Spy on Us or Kill Innocent People appeared first on Reason.com.