AI Safety and the Age of Dislightenment

Metadata

  • Author: Jeremy Howard
  • Full Title: AI Safety and the Age of Dislightenment
  • Document Note: The article argues that proposals to regulate AI through model licensing and surveillance are likely to be ineffective and may concentrate power in unsustainable ways. Instead, the author suggests an approach that advocates for openness, humility, and broad consultation to develop better responses that align with principles and values. This could include supporting open-source model development to ensure that the most powerful AI models are available to everyone and that society can work together to understand and counter potential harms. The article concludes by warning that restricting access to open-source models could endanger public knowledge about the function of AI.
  • URL: https://www.fast.ai/posts/2023-11-07-dislightenment.html

Highlights

  • society and empowering society to defend itself is delicate. We should advocate for openness, humility and broad consultation to develop better responses aligned with our principles and values — (View Highlight)
  • Other experts, however, counter that “There is so much attention flooded onto x-risk (existential risk)… that it ‘takes the air out of more pressing issues’ and insidiously puts social pressure on researchers focused on other current risks.” (View Highlight)
  • if AI turns out to be powerful enough to be a catastrophic threat, the proposal may not actually help. In fact it could make things much worse, by creating a power imbalance so severe that it leads to the destruction of society. These concerns apply to all regulations that try to ensure the models themselves (“development”) are safe, rather than just how they’re used. The effects of these regulations may turn out to be impossible to undo, and therefore we should be extremely careful before we legislate them. (View Highlight)
  • the only way to ensure that AI models can’t be misused is to ensure that no one can use them directly. Instead, they must be limited to a tightly controlled narrow service interface (like ChatGPT, an interface to GPT-4) (View Highlight)
  • But those with full access to AI models (such as those inside the companies that host the service) have enormous advantages over those limited to “safe” interfaces. If AI becomes extremely powerful, then full access to models will be critical to those who need to remain competitive, as well as to those who wish to cause harm (View Highlight)
  • This could lead to a society where only groups with the massive resources to train foundation models, or the moral disregard to steal them, have access to humanity’s most powerful technology. These groups could become more powerful than any state. Historically, large power differentials have led to violence and subservience of whole societies. (View Highlight)
  • If we regulate now in a way that increases centralisation of power in the name of “safety”, we risk rolling back the gains made from the Age of Enlightenment, and instead entering a new age: the Age of Dislightenment. Instead, we could maintain the Enlightenment ideas of openness and trust, such as by supporting open-source model development. Open source has enabled huge technological progress through broad participation and sharing. Perhaps open AI models could do the same. Broad participation could allow more people with a wider variety of expertise to help identify and counter threats, thus increasing overall safety — as we’ve previously seen in fields like cyber-security. (View Highlight)
  • By regulating applications we focus on real harms and can make those most responsible directly liable. Another useful approach in the AI Act is to regulate disclosure, to ensure that those using models have the information they need to use them appropriately (View Highlight)
  • The rapid development of increasingly capable AI has many people asking to be protected, and many offering that protection. The latest is a white paper titled “Frontier AI Regulation: Managing Emerging Risks to Public Safety” (FAR). Many authors of the paper are connected to OpenAI and Google, and to various organizations funded by investors of OpenAI and Google. FAR claims that “government involvement will be required to ensure that such ‘frontier AI models’ are harnessed in the public interest”. But can we really ensure such a thing? At what cost? (View Highlight)
  • While superficially seeming to check off various safety boxes, the regulatory regime being advanced in FAR ultimately leads to a vast amount of power being placed in the hands of the entrenched companies (by virtue of them having access to the raw models), giving them an information asymmetry against all other actors, including governments seeking to regulate or constrain them. It may lead to the destruction of society. (View Highlight)
  • Here’s why: because these models are general-purpose computing devices, it is impossible to guarantee they can’t be used for harmful applications. That would be like trying to make a computer that can’t be misused (such as for emailing a blackmail threat). The full original model is vastly more powerful than any “ensured safe” service based on it can ever be. The full original model is general-purpose: it can be used for anything. But if you give someone a general-purpose computing device, you can’t be sure they won’t use it to cause harm. (View Highlight)
  • you give them access to a service which provides a small window into the full model. For instance, OpenAI provides public access to a tightly controlled and tuned text-based conversational interface to GPT-4, but does not provide full access to the GPT-4 model itself. (View Highlight)
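
    To make the “small window” concrete, here is a minimal illustrative sketch (not from the article): the first call goes through OpenAI’s hosted chat completions endpoint and exchanges only text, while the second loads an open model’s weights locally, using gpt2 purely as a small stand-in since GPT-4’s weights are not available. The OPENAI_API_KEY environment variable and the choice of stand-in model are assumptions for illustration.

    ```python
    # Illustrative contrast between a narrow service interface and full model access.
    import os

    import requests  # pip install requests

    # (1) The "small window": a hosted chat endpoint. You send text, you get text back.
    #     You never see weights, gradients, or internal activations.
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]},
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])

    # (2) Full access: loading an open model's weights locally (gpt2 as a small stand-in).
    #     Every parameter is inspectable and modifiable; nothing constrains how it is used.
    from transformers import AutoModelForCausalLM  # pip install transformers torch

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    print(sum(p.numel() for p in model.parameters()), "parameters, all directly accessible")
    ```
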
  • Those who crave power and wealth, but fail to get access to model weights, now have a new goal: get themselves into positions of power at organizations that have big models, or get themselves into positions of power at the government departments that make these decisions. Organizations that started out as well-meaning attempts to develop AI for societal benefit will soon find themselves part of the corporate profit-chasing machinery that all companies join as they grow, run by people that are experts at chasing profits. (View Highlight)
  • The truth is that this entire endeavor, this attempt to control the use of AI, is pointless and ineffective. Not only is “proliferation” of models impossible to control (because digital information is so easy to exfiltrate and copy), it turns out that restrictions on the amount of compute for training models are also impossible to enforce. That’s because it’s now possible for people all over the world to virtually join up and train a model together. For instance, Together Computer has created a fully decentralized, open, scalable cloud for AI, and recent research has shown it is possible to go a long way with this kind of approach. (View Highlight)
  • There is more compute capacity in the world currently deployed for playing games than for AI. Gamers around the world can simply install a small piece of software on their computers to opt into helping train these open-source models. Organizing such a large-scale campaign would be difficult, but not without precedent, as seen in the success of projects such as Folding@Home and SETI@Home (View Highlight)
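
    A toy, single-process sketch of the data-parallel idea behind volunteer training (my illustration, not the article’s or Together’s actual system): each simulated “volunteer” computes gradients on its own slice of data, and the averaged gradients update a shared model. Real decentralized systems add gradient compression, fault tolerance, and scheduling that this deliberately ignores.

    ```python
    # Toy simulation of volunteer data-parallel training on one machine.
    import torch

    torch.manual_seed(0)
    model = torch.nn.Linear(10, 1)            # tiny stand-in for a large model
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.MSELoss()

    n_volunteers = 4
    data = torch.randn(n_volunteers * 8, 10)
    target = data.sum(dim=1, keepdim=True)    # synthetic task: predict the row sum

    for step in range(100):
        grads = [torch.zeros_like(p) for p in model.parameters()]
        # Each "volunteer" computes gradients on its own shard of the data.
        for shard_x, shard_y in zip(data.chunk(n_volunteers), target.chunk(n_volunteers)):
            model.zero_grad()
            loss_fn(model(shard_x), shard_y).backward()
            for g, p in zip(grads, model.parameters()):
                g += p.grad / n_volunteers    # average the local gradients
        # Apply the averaged gradient to the shared model.
        for g, p in zip(grads, model.parameters()):
            p.grad = g
        opt.step()

    print("final loss:", loss_fn(model(data), target).item())
    ```
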
  • in a recent interview with Lex Fridman, Comma.ai founder George Hotz explained how his new company, Tiny Corp, is working on the “Tiny Rack”, which he explains is designed around the premise: “What’s the most power you can get into your house without arousing suspicion? And one of the answers is an electric car charger.” So he’s building an AI model training system that uses the same amount of power as a car charger. (View Highlight)
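
    A rough back-of-envelope (illustrative assumptions only, not Tiny Corp’s actual specifications) of what a home-EV-charger power budget might mean in GPU terms:

    ```python
    # Rough back-of-envelope: how many GPUs fit in a home EV-charger power budget?
    charger_kw = 11.0   # a common home EV charger rating (assumption)
    gpu_watts = 450.0   # rough draw of one high-end consumer GPU (assumption)
    overhead = 1.2      # CPUs, cooling, power-supply losses (assumption)

    gpus = int(charger_kw * 1000 / (gpu_watts * overhead))
    print(f"~{gpus} GPUs within an {charger_kw} kW budget")  # roughly 20
    ```
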
  • When the self-described pioneer of the AI Safety movement, Eliezer Yudkowsky, proposed airstrikes on unauthorized data centers and the threat of nuclear war to ensure compliance from states failing to control unauthorized use of computation capability, many were shocked. But bombing data centers and global surveillance of all computers is the only way to ensure the kind of safety compliance that FAR proposes. (View Highlight)
  • Alex Engler points out an alternative approach to enforced safety standards or licensing of models, which is to “regulate risky and harmful applications, not open-source AI models”. This is how most regulations work: through liability. If someone does something bad, then they get in trouble. If someone creates a general-purpose tool that someone else uses to do something bad, the tool-maker doesn’t get in trouble (View Highlight)
  • This is a critical distinction: the distinction between regulating usage (that is, actually putting a model into use by making it part of a system — especially a high risk system like medicine), vs development (that is, the process of training the model). (View Highlight)
  • Improvements in AI capabilities can be unpredictable, and are often difficult to fully understand without intensive testing. Regulation that does not require models to go through sufficient testing before deployment may therefore fail to reliably prevent deployed models from posing severe risks.” This is a non-sequitur. Because models cannot cause harm without being used, developing a model cannot be a harmful activity. Furthermore (View Highlight)
  • This leads us to another useful regulatory path: deployment disclosure. If you’re considering connecting an automated system which uses AI to any kind of sensitive infrastructure, then we should require disclosure of this fact. Furthermore, certain types of connection and infrastructure should require careful safety checks and auditing in advance. (View Highlight)
  • Better AI can be used to improve AI. This has already been seen many times, even in the earlier era of less capable and less well-resourced algorithms. Google has used AI to improve how data centers use energy, to create better neural network architectures, and to create better methods for optimizing the parameters in those networks. Model outputs have been used to create the prompts used to train new models, and to create the model answers for these prompts, and to explain the reasoning for answers. (View Highlight)
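
    A minimal sketch of the “model outputs as training data” pattern described here (my illustration; gpt2 and the seed topics are placeholders for whatever stronger model and domain you actually have access to): generate prompt/answer pairs with an existing model and save them as a fine-tuning dataset.

    ```python
    # Use an existing model to generate prompt/answer pairs for training a new model.
    import json

    from transformers import pipeline  # pip install transformers torch

    generate = pipeline("text-generation", model="gpt2")  # placeholder model

    seed_topics = ["photosynthesis", "binary search", "the water cycle"]
    with open("synthetic_train.jsonl", "w") as f:
        for topic in seed_topics:
            prompt = f"Explain {topic} in one sentence:"
            text = generate(prompt, max_new_tokens=40)[0]["generated_text"]
            answer = text[len(prompt):].strip()  # keep only the model's continuation
            f.write(json.dumps({"prompt": prompt, "response": answer}) + "\n")
    ```
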
  • Those with access to the full models can build new models faster and better than those without. One reason is that they can fully utilize powerful features like fine-tuning, activations, and the ability to directly study and modify weights. One recent paper, for instance, found that fine-tuning allows models to solve challenging problems with orders of magnitude fewer parameters than foundation models. (View Highlight)
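
    A sketch of what full model access enables, using the small open model gpt2 as a stand-in (my illustration, not the cited paper’s experiment): direct weight inspection, per-layer activations, and a single fine-tuning step, none of which a text-in/text-out service interface exposes.

    ```python
    # What "full model access" makes possible, sketched with a small open model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # 1. Weights: every parameter can be read (or edited) directly.
    emb = model.get_input_embeddings().weight
    print("embedding matrix:", tuple(emb.shape))

    # 2. Activations: hidden states from every layer, useful for research and safety probes.
    inputs = tok("Open models can be studied directly.", return_tensors="pt")
    out = model(**inputs, labels=inputs["input_ids"], output_hidden_states=True)
    print("layers of hidden states:", len(out.hidden_states))

    # 3. Fine-tuning: a single gradient step on new data (a real run needs far more).
    opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
    out.loss.backward()
    opt.step()
    print("loss before step:", out.loss.item())
    ```
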
  • This kind of feedback loop results in centralization: the big companies get bigger, and other players can’t compete. The result is less competition, and in turn higher prices, less innovation, and lower safety (since there’s a single point of failure, and a larger profit motive which encourages risky behavior). (View Highlight)
  • There are other powerful forces towards centralization. Consider Google, for instance. Google has more data than anyone else on the planet. More data leads directly to better foundation models. Furthermore, as people use their AI services, they are getting more and more data about these interactions. They use AI to improve their products, making them more “sticky” for their users and encouraging more people to use them, resulting in still more data, which further improves their models and the products built on them. Also, they are increasingly vertically integrated, so they have few powerful suppliers. They create their own AI chips (TPUs), run their own data centers, and develop their own software. (View Highlight)
  • The alternative to craving the safety and certainty of control and centralization is to once again take the risk we took hundreds of years ago: the risk of believing in the power and good of humanity and society. Just as thinkers of the Enlightenment asked difficult questions like “What if everyone got an education? What if everyone got the vote?”, we should ask the question “What if everyone got access to the full power of AI?” (View Highlight)
  • To be clear: asking such questions may not be popular. The counter-enlightenment was a powerful movement for a hundred years, pushing back against “the belief in progress, the rationality of all humans, liberal democracy, and the increasing secularization of society”. It relied on a key assumption, as expounded by French philosopher Joseph de Maistre, that “Man in general, if reduced to himself, is too wicked to be free.” (View Highlight)
  • What does it look like to embrace the belief in progress and the rationality of all humans when we respond to the threat of AI mis-use? One idea which many experts are now studying is that open source models may be the key. (View Highlight)
  • What if the most powerful AI models were open source? There will still be Bad Guys looking to use them to hurt others or unjustly enrich themselves. But most people are not Bad Guys. Most people will use these models to create, and to protect. How better to be safe than to have the massive diversity and expertise of human society at large doing their best to identify and respond to threats, with the full power of AI behind them? How much safer would you feel if the world’s top cyber-security, bio-weapons, and social engineering academics were working with the benefits of AI to study AI safety, and you could access and use all of their work yourself, than if only a handful of people at a for-profit company had full access to AI models? (View Highlight)
  • To gain the benefits of full model access, and to reduce the level of commercial control of what has previously been an open research community with a culture of sharing, the open-source community has recently stepped in and trained a number of quite capable language models. As of July 2023, the best of these are at a similar level to the second-tier, cheaper commercial models, but not as good as GPT-4 or Claude. They are rapidly increasing in capability, and are attracting increasing investment from wealthy donors, governments, universities, and companies that are seeking to avoid concentration of power and ensure access to high quality AI models. (View Highlight)
  • However, the proposals for safety guarantees in FAR are incompatible with open source frontier models. FAR proposes “it may be prudent to avoid potentially dangerous capabilities of frontier AI models being open sourced until safe deployment is demonstrably feasible”. But even if an open-source model is trained in the exact same way from the exact same data as a regulatorily-approved closed commercial model, it can still never provide the same safety guarantees. That’s because, as a general-purpose computing device, anybody could use it for anything they want — including fine-tuning it using new datasets and for new tasks. (View Highlight)
  • For foundation models to advance the public interest, their development and deployment should ensure transparency, support innovation, distribute power, and minimize harm… We argue open-source foundation models can achieve all four of these objectives, in part due to inherent merits of open-source (pro-transparency, pro-innovation, anti-concentration)” (View Highlight)
  • If closed-source models cannot be examined by researchers and technologists, security vulnerabilities might not be identified before they cause harm… On the other hand, experts across domains can examine and analyze open-source models, which makes security vulnerabilities easier to find and address. In addition, restricting who can create FMs would reduce the diversity of capable FMs and may result in single points of failure in complex systems.” (View Highlight)
  • Access to open source models is at grave risk today. The European AI Act may effectively ban open source foundation models, based on similar principles to those in FAR. (View Highlight)
  • The fears around new technologies follow a predictable trajectory called “the Tech Panic Cycle.” Fears increase, peak, then decline over time as the public becomes familiar with the technology and its benefits. Indeed, other previous “generative” technologies in the creative sector such as the printing press, the phonograph, and the Cinématographe followed this same course. But unlike today, policymakers were unlikely to do much to regulate and restrict these technologies. As the panic over generative AI enters its most volatile stage, policymakers should take a deep breath, recognize the predictable cycle we are in, and put any regulation efforts directly aimed at generative AI temporarily on hold. (View Highlight)
  • Instead, perhaps regulators should consider the medical guidance of Hippocrates: “do no harm”. Medical interventions can have side effects, and the cure can sometimes be worse than the disease. Some medicines may even damage immune response, leaving a body too weakened to be able to fight off infection. (View Highlight)
  • So too with regulatory interventions. Not only can the centralisation and regulatory capture impacts of “ensuring safety” cause direct harm to society, but they can even result in decreased safety. If just one big organization holds the keys to vast technological power, we find ourselves in a fragile situation where the rest of society does not have access to the same power to protect itself (View Highlight)
  • “The Malicious Use of Artificial Intelligence” was written by 26 authors from 14 institutions, spanning academia, civil society, and industry. The lead author is today the Head of Policy at OpenAI. It’s interesting to see how far OpenAI, as a co-creator of FAR, has moved from these original ideas. The four recommendations from the Malicious Use paper are full of humility — they recognise that effective responses to risks involve “proactively reaching out to relevant actors”, learning from “research areas with more mature methods for addressing dual-use concerns, such as computer security”, and working to “expand the range of stakeholders and domain experts involved in discussions”. The focus was not on centralization and control, but on outreach and cooperation. (View Highlight)
  • The ancient Greeks taught us about the dangers of Hubris: excessive pride, arrogance, or overconfidence. When we are over-confident that we know what the future has in store for us, we may well over-react and create the very future we try to avoid. What if, in our attempts to avoid an AI apocalypse, we centralize control of the world’s most powerful technology, dooming future society to a return to a feudal state in which the most valuable commodity, compute, is owned by an elite few? We would be like King Oedipus, prophesied to kill his father and marry his mother, who ends up doing exactly that as a result of actions designed to avoid that fate. Or Phaethon, so confident in his ability to control the chariot of the sun that he avoids the middle path laid out by Helios, his father, and in the process nearly destroys Earth. (View Highlight)
  • “The Malicious Use of Artificial Intelligence” points towards a different approach, based on humility: one of consultation with experts across many fields, cooperation with those impacted by technology, in an iterative process that learns from experience. (View Highlight)
  • The AI community has also developed effective mechanisms for sharing important information, such as Datasheets for Datasets, Model Cards for Model Reporting, and Ecosystem Graphs. Regulation could require that datasets and models include information about how they were built or trained, to help users deploy them more effectively and safely. This is analogous to nutrition labels: whilst we don’t ban people from eating too much junk food, we endeavor to give them the information they need to make good choices. The proposed EU AI Act already includes requirements for exactly this kind of information. (View Highlight)
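
    An illustrative sketch of the kind of structured disclosure such a “nutrition label” could carry. The field names below are loosely inspired by Model Cards and Datasheets for Datasets but are not a standard or mandated schema:

    ```python
    # Illustrative disclosure record for a model; field names are hypothetical.
    from dataclasses import dataclass, field, asdict
    import json


    @dataclass
    class ModelDisclosure:
        name: str
        training_data_sources: list[str]
        intended_uses: list[str]
        known_limitations: list[str]
        evaluation_summary: dict[str, float] = field(default_factory=dict)


    card = ModelDisclosure(
        name="example-7b-chat",
        training_data_sources=["filtered web crawl", "public-domain books"],
        intended_uses=["general chat", "summarisation"],
        known_limitations=["may produce incorrect statements", "English-centric"],
        evaluation_summary={"toxicity_rate": 0.012, "helpfulness_score": 0.81},
    )
    print(json.dumps(asdict(card), indent=2))
    ```
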
  • Whilst there is a lot of good work we can build on, there’s still much more to be done. The world of AI is moving fast, and we’re learning every day. Therefore, it’s important that we ensure the choices we make preserve optionality in the future. It’s far too early for us to pick a single path and decide to hurtle down it with unstoppable momentum. Instead, we need to be able, as a society, to respond rapidly and in an informed way to new opportunities and threats as they arise. That means involving a broad cross-section of experts from all relevant domains, along with members of impacted communities. (View Highlight)
  • The more we can build capacity in our policy making bodies, the better. Without a deep understanding of AI amongst decision makers, they have little choice but to defer to industry. But as Marietje Schaake, international policy director at Stanford University’s Cyber Policy Center, said, “We need to keep CEOs away from AI regulation”: (View Highlight)
  • Imagine the chief executive of JPMorgan explaining to Congress that because financial products are too complex for lawmakers to understand, banks should decide for themselves how to prevent money laundering, enable fraud detection and set liquidity to loan ratios. He would be laughed out of the room. Angry constituents would point out how well self-regulation panned out in the global financial crisis. From big tobacco to big oil, we have learnt the hard way that businesses cannot set disinterested regulations. They are neither independent nor capable of creating countervailing powers to their own.” (View Highlight)
  • The push for commercial control of AI capability is dangerous. Naomi Klein, who coined the term “shock doctrine” as “the brutal tactic of using the public’s disorientation following a collective shock… to push through radical pro-corporate measures”, is now warning that AI is “likely to become a fearsome tool of further dispossession and despoliation”. (View Highlight)
  • The rapid deployment of AI-based tools has strong parallels with that of leaded gasoline. Lead in gasoline solved a genuine problem—engine knocking. Thomas Midgley, the inventor of leaded gasoline, was aware of lead poisoning because he suffered from the disease. There were other, less harmful ways to solve the problem, which were developed only when legislators eventually stepped in to create the right incentives to counteract the enormous profits earned from selling leaded gasoline.” (View Highlight)