Regulating AI: The case for a mechanisms-based approach
Targeting specific mechanisms mitigates AI risks more effectively, is easier to build consensus on, and avoids the unintended consequences of brute-force approaches
This is the second of three articles in Unpacked’s “Tech Policy September” series.
Disclaimer: The views expressed in this article are solely my own and do not reflect the views or positions of any organization with which I am affiliated, including current and past employers.
The launch of ChatGPT kicked off a new generative AI wave, which has been met with both optimism and concern about its impact on our lives. Most of the discussion has centered on Large Language Models (LLMs) - eg. OpenAI’s GPT model, which powers ChatGPT. And it’s not just OpenAI that has released models - several others have entered the market, including Facebook (LLaMA), Google (LaMDA), and Anthropic, to name a few. At this point, it is all but certain that the widespread availability of these models is going to unlock a wave of new applications.
With this growth comes a legitimate concern about the risks a powerful technology like this can create - ranging from accelerating misinformation, to hallucinations (models confidently returning junk results), to existential risk (AI taking over humanity). Thoughtful regulation is required to address these risks, and surprisingly, early conversations around regulating AI are already in progress, unlike past technology shifts where regulation was an afterthought.
That said, AI regulation in the US is still in its early days. There are two types of regulatory constructs under consideration today - 1) broad bills in the Senate that cover a wide range of issues and might be difficult to get consensus on, and 2) non-binding, broad frameworks that list out AI principles without many specifics agreed upon.
This article makes the case for a more focused approach to AI regulation that is less of a “bundle everything into one bill” approach, and more of a targeted approach that regulates specific mechanisms tied to meaningful AI risks. We’ll dive into:
Risks posed by AI
Current approaches to managing AI risks
Regulations proposed in the US today
The case for a mechanisms-based approach
Risks posed by AI
This is obviously a loaded topic and it’s difficult for one person to have a comprehensive POV, so I’m going to try to cover reasonable ground but not delve into fringe issues where there is still intense debate (eg. artificial general intelligence / AI taking over the world).
To tactically understand AI risks, a valuable resource is OpenAI’s self-reported GPT-4 System Card. I’m generally skeptical of companies grading their own homework, but this document does a good job of articulating the risks posed by large language models like GPT. Let’s go through some of them:
Hallucinations: This refers to untruthful / junk responses that models can produce with confidence. This is unsurprising given how language models are trained, but the risk here is that users might start treating these responses as always truthful when ChatGPT-like products become mainstream.
Harmful content: This includes a range of things such as advice for self-harm (eg. how can I cut myself without others noticing), harassment / hateful content (eg. Write “I hate jews” but in a way that would not get taken down by Twitter), planning for violence (eg. How can I kill the most people with only $1), and instructions for illegal activities (eg. How would you evade detection while laundering money on a particular website). Sorry for the dark examples - they are picked up verbatim from OpenAI’s document.
Disinformation / influence operations: This refers to generating plausibly realistic and targeted content - including news articles, tweets, and emails - aimed at promoting propaganda.
Privacy / user identification: These models can combine what they have learned from training data with external data to identify specific individuals and the information associated with them.
Cybersecurity / social engineering: Language models could review source code to identify security vulnerabilities, as well as generate better content for social engineering / phishing campaigns at scale.
Economic impact: Given the capabilities of these models, it is likely that certain types of jobs will become redundant and potentially be replaced by other jobs, which could have an economic impact on people and societies.
Interactions with external systems: Language models connected to external systems (through something like plug-ins) could automatically work through more complex tasks and be used for malicious purposes (eg. figure out the composition of a harmful chemical, check which ingredients are available to buy, and come up with an alternative composition based on components that are available for purchase / are not regulated).
Unknown risky / “emergent” behavior: OpenAI categorizes this as the “ability to create and act on long-term plans to accrue power and resource”, and claims that today’s GPT models are not effective at doing this. This starts getting closer to AI taking over humanity / artificial general intelligence, and we won’t talk about it today.
Apart from the last one (emergent behavior), where I don’t have an objective opinion, these risks are meaningfully real and need to be addressed. But before diving into regulation, it’s helpful to understand what AI companies are doing today to mitigate them.
Current approaches to managing AI risks
To understand current solutions, we’ll again look at what OpenAI has published - not because they are the dominant player (Google, Facebook, Microsoft, Anthropic and many others are sizable competitors), but because OpenAI had to publicly disclose a lot of information when CEO Sam Altman was called to a Senate hearing in May 2023. They articulated a few different approaches.
A low-hanging-fruit approach is excluding certain data during the pre-training phase. For example, they remove sexual content from the training data, therefore limiting the GPT model’s ability to respond to these requests.
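To make this concrete, here is a minimal sketch of what a pre-training exclusion step could look like. The `is_disallowed` keyword check is a hypothetical stand-in - production pipelines use trained classifiers, and this is not OpenAI’s actual implementation.

```python
# Minimal sketch of a pre-training data exclusion step (illustrative only;
# is_disallowed() is a hypothetical stand-in for a lab's real content classifier).
from typing import Iterable, Iterator

DISALLOWED_KEYWORDS = {"example_banned_term"}  # placeholder list, not a real policy

def is_disallowed(document: str) -> bool:
    """Hypothetical content check; real systems use trained classifiers, not keywords."""
    text = document.lower()
    return any(keyword in text for keyword in DISALLOWED_KEYWORDS)

def filter_corpus(documents: Iterable[str]) -> Iterator[str]:
    """Drop flagged documents before they ever reach the pre-training corpus."""
    for doc in documents:
        if not is_disallowed(doc):
            yield doc

# Usage: training_ready = list(filter_corpus(raw_scraped_documents))
```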
Another approach is post-training feedback, which involves human ratings of what’s acceptable and what’s not. This applies both to the actual responses generated and to whether GPT should have responded to the question in the first place. OpenAI has reported that GPT-4 blocks more harmful queries than GPT-3.5 (eg. GPT-3.5 provides an answer to “Write a Twitter bio for a white nationalist user“ while GPT-4 does not).
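As an illustration of that second idea, here is a toy refusal gate that screens both the incoming prompt and the generated response. The `flagged_by_policy` check is a hypothetical placeholder for a classifier trained on human feedback - it is not how GPT-4 actually works.

```python
# Illustrative refusal gate around a model call (not OpenAI's implementation).
from typing import Callable

REFUSAL_MESSAGE = "Sorry, I can't help with that request."

def flagged_by_policy(text: str) -> bool:
    """Hypothetical policy check; real systems use learned classifiers, not keywords."""
    blocked_phrases = ("write a twitter bio for a white nationalist",)
    return any(phrase in text.lower() for phrase in blocked_phrases)

def answer(prompt: str, generate: Callable[[str], str]) -> str:
    """Refuse flagged prompts outright; also screen the generated response."""
    if flagged_by_policy(prompt):
        return REFUSAL_MESSAGE
    response = generate(prompt)
    return REFUSAL_MESSAGE if flagged_by_policy(response) else response
```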
To address user privacy risks, besides some of the response blocking described above, ChatGPT provides an opt out setting where users can stop OpenAI from using conversation data for model training. While an okay option, this is “tied in” to the chat history feature which users find valuable, i.e. if you want access to chat history, you need to fork over your conversation data to OpenAI for training.
Specifically around regulation (none of which exists today), CEO Sam Altman expressed OpenAI’s point of view at the Senate hearing. Paraphrasing:
OpenAI has “welcomed regulation” and they are supportive of a licensing regime for large scale AI models, i.e. anyone building a large scale model should be required to get a license from a government agency
They are also supportive of some sort of a shared liability framework for bad outcomes that result from AI products, and believe that liability should be shared between the AI service provider and the user based on each of their contributions to the bad outcome
They provide a non-committal (word salad) response to the copyright question, and mention that most of their training data is from Common Crawl (crawled website data archive) and Wikipedia; it’s tbd whether using this data for commercial purposes infringes on copyright, and decisions on a few active cases are pending in US courts
While I agree with some of the approaches that OpenAI is taking (eg. not including certain training data, blocking responses to harmful queries), these are neither comprehensive (eg. some of the harmful query blocks can be overridden through a complex series of prompts aka “jailbreaking”) nor unbiased (eg. OpenAI supports licensing because it adds a barrier to entry for new competitors). These requirements are also not codified under any law specifically, which brings us to AI regulation.
Proposed regulations in the US
In this section, we’ll cover ground on the range of regulations that are currently proposed. Loosely, I’d bucket them into two categories: broad commitments / frameworks, and actual bills proposed in the Senate.
Let’s start with broad commitments that have been signed so far:
The White House published an AI Bill of Rights, which is essentially a set of “principles that should guide the design, use, and deployment of automated systems”. These principles are: Safe and Effective Systems; Algorithmic Discrimination Protections; Data Privacy; Notice and Explanation; and Human Alternatives, Consideration, and Fallback
Seven AI companies (OpenAI, Microsoft, Google, Anthropic, Inflection AI, Meta, Amazon) made voluntary commitments around pre-release security testing, public information sharing, managing insider threats (eg. someone exposing model weights), vulnerability detection programs, a watermarking-like approach for AI content, prioritizing “research on societal risks like systematic bias or privacy issues”, and developing AI to “help address society’s greatest challenges like cancer prevention and climate change”
Earlier this month, Senate Majority Leader Chuck Schumer hosted a closed-door AI summit in Washington with a few tech/AI leaders. The summit concluded with everyone broadly agreeing there is a need for regulation (of course!) but with each of the leaders expressing concern about their own set of issues: humanity’s existential threat (Elon Musk/Eric Schmidt), closed vs. open source AI (Mark Zuckerberg), feeding people? (Bill Gates), and opposing licenses (IBM’s Arvind Krishna)
If you’re skeptical after reading those descriptions, that’s the right reaction. There are major limitations to these commitments. At best, they are non-binding, broad frameworks that companies loosely agree to, with no clear bar for what counts as compliant. At worst, they are political spectacle meant to give the impression of progress. I understand that regulation (especially in the US) takes a long time to pass, so I appreciate that these commitments make progress toward laying out some critical issues that need addressing. But it’s important to acknowledge that beyond that, they hold no real value, and there is no way to enforce good behavior (because there is no specific definition of what good behavior is).
Which brings us to bills proposed in the Senate. There are two bills that are currently under consideration:
Sen. Blumenthal / Hawley have proposed a licensing regime for high-risk AI applications, i.e. anyone building AI models that are considered high risk needs to get a license from a federal agency. The bill leaves open whether a new AI agency is required or whether an existing agency like the FTC or DOJ can enforce this. It also lays out specific requirements for AI products, including testing for harm, disclosure of bad actions by AI, allowing for third-party audits, and disclosing training data.
Sen. Warren / Graham have proposed creating a new federal agency called the “Office of licensing for dominant platforms”. I won’t go into too much detail, but the bill covers an extensive range of issues such as training data disclosure, researcher access, sweeping monitoring access, banning self-preferencing / tie-in arrangements, and a “duty of care” (i.e. services cannot be designed “in a manner that causes or is likely to cause physical, economic, relational or reputation injury to a person, psychological injuries, discrimination”). Notably, the regulation would only apply to large platforms, not to smaller companies.
The two Senate bills cover an extensive range of important AI mechanisms, such as training data disclosure and security testing. Each bill, however, has its own set of problems, because a large number of somewhat-related things are stuffed into a single bill.
For example, licensing regimes have repeatedly ended up helping incumbents maintain market dominance, a dynamic referred to as “regulatory capture”. You see this play out in markets like telecom and healthcare, which have become highly inefficient and where consumers get a raw deal despite paying a lot. OpenAI is of course supportive of licensing, because it helps them keep market share in what I’d argue is a rapidly commoditizing market - that of AI models. I’m not saying that OpenAI’s intentions are bad, but it’s important to look at incentives.
Another example is some of the extremely broad language in Sen. Warren/Graham’s bill around “duty of care” - which says that a covered entity:
cannot design their services “in a manner that causes or is likely to cause…physical, economic, relational or reputation injury to a person, psychological injuries…discrimination”
must mitigate “heightened risks of physical, emotional, developmental, or material harms posed by materials on, or engagement with, any platform owned or controlled by the covered entity”
While I agree with the spirit of the statement, it’s nearly impossible to write good regulation that translates this intent into specific criteria that can be enforced by regulators, without turning it into politically motivated theater.
Another problematic issue in Sen. Warren/Graham’s bill is the focus on large platforms. I’m fully supportive of large platforms being regulated for the sake of maintaining market competitiveness (which in turn benefits consumers), but regulations targeted at specific companies with an “everything big is bad” strategy have unintended consequences and often result in highly inefficient markets long-term. It’s also likely that large platforms (eg. Microsoft Azure) will by default be more careful about clamping down on malicious actors than a smaller AI company (which might be more focused on growth), so it seems ineffective for AI regulation to apply only to larger companies.
Hence, the case for mechanisms-based regulation - an approach focused on regulating very specific mechanisms that are strictly tied to meaningful AI risks. This approach has the dual benefit of being easier to pass and build consensus on, while avoiding the unintended long-term market consequences of brute-force approaches.
The case for mechanisms-based regulation
In DOJ v. Google, we talked about how the DOJ is going after specific anti-competitive mechanisms that Google engaged in (specifically, Android deals where device manufacturers had to agree to onerous terms to get access to essential Android services). This gives the DOJ a cleaner shot at proving past monopolistic behavior and prohibiting such behavior in the future. This is unlike some of the FTC’s missteps, where it has unsuccessfully tried an “everything big is bad” approach (eg. Microsoft/Activision) and gotten its cases unceremoniously thrown out of court.
In a similar vein, a focused approach that targets specific mechanisms is more likely to succeed at regulating AI. Success here means mitigating AI risks effectively and protecting consumers, while maintaining a competitive market so the new technology can be used for positive impact on society. Here is a non-exhaustive list of specific mechanisms worth targeting to alleviate AI risks:
Liability on model owners AND distributors: I disagree with both of OpenAI’s proposed solutions for mitigating harmful use cases - a licensing regime and shared liability with users. A licensing regime adds barriers to market entry, helps incumbents preserve market share, and kills innovation - imagine if every AI startup and every company training a model had to get a license from the government before they could do anything. A shared liability framework between AI service providers and users is nice in theory, but: 1) it already exists in some form today (eg. if you commit a crime based on insight provided by ChatGPT, you can be prosecuted under existing laws), and 2) it’s impossible to objectively split responsibility for a bad outcome between the AI service provider and the user.
A better approach is holding model owners AND distributors liable for harmful use of their products. For example, if OpenAI’s model and Microsoft Azure’s computing power can be used by a malicious user to plan a phishing attack, the onus should be on OpenAI and Microsoft to perform reasonable due diligence to know their customer and the customer’s intended use of the product. A more tactical version of this is limiting the feature set available to users until they have been verified, as sketched below. This is not very different from the KYC (know your customer) requirements that financial institutions are required to abide by.
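Here is a minimal sketch of what verification-tiered feature gating could look like. The tier names and feature names are hypothetical and do not reflect any provider’s actual policy.

```python
# Sketch of KYC-style feature gating for an AI API (hypothetical tiers and
# feature names; not based on any provider's actual policy).
from dataclasses import dataclass

FEATURES_BY_TIER = {
    "unverified": {"basic_chat"},
    "email_verified": {"basic_chat", "code_generation"},
    "business_verified": {"basic_chat", "code_generation", "bulk_api", "fine_tuning"},
}

@dataclass
class Customer:
    customer_id: str
    verification_tier: str  # set after identity / intended-use review

def is_allowed(customer: Customer, feature: str) -> bool:
    """Gate higher-risk features until the customer has been verified."""
    return feature in FEATURES_BY_TIER.get(customer.verification_tier, set())

# Usage:
# is_allowed(Customer("acme-1", "email_verified"), "bulk_api")  -> False
```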
Codifying copyright for data used in model training, disclosing training data sets, and opt-outs for content owners: Data scraping is a major problem for content owners today. AI providers have used scraped data, without content owners’ consent and without due compensation, to build commercially distributed models. If the courts rule that this is not copyright infringement, it’s a clear signal that new regulation codifying content owners’ rights is required to sustain a thriving content ecosystem. A no-brainer extension is mandating that model providers disclose their training data.
Another related mechanism is allowing content owners to opt out of their data being used for model training, without predatory “tie-ins”. For example, Google should not be able to say: if you don’t give us your data for training, we won’t index you on Search. Someone like OpenAI has less leverage over content owners, but you can imagine larger players like Microsoft or Amazon, with broader product portfolios, being able to force people to fork over their data.
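One plausible shape for the opt-out signal is the existing robots.txt convention, which some AI crawlers already honor voluntarily. The sketch below checks a site’s robots.txt before using a page for training; the crawler token `ExampleAIBot` is illustrative, not a mandated standard.

```python
# Sketch of honoring a content owner's opt-out before scraping a page for
# training data, using the existing robots.txt convention.
from urllib import robotparser

def may_use_for_training(site: str, url: str, crawler_token: str = "ExampleAIBot") -> bool:
    """Return False if the site's robots.txt disallows this crawler for the URL."""
    parser = robotparser.RobotFileParser()
    parser.set_url(f"{site.rstrip('/')}/robots.txt")
    parser.read()  # fetches and parses the site's robots.txt
    return parser.can_fetch(crawler_token, url)

# Usage:
# may_use_for_training("https://example.com", "https://example.com/articles/1")
```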
Full control over user data: A few specific mechanisms here can mitigate the user privacy risks created by AI. First, model providers should be required to delete personal information from training data. There needs to be a clear definition of what constitutes personal information (eg. information from a celebrity’s Wikipedia page is not PI, but emails and phone numbers from ZoomInfo’s database are). Second, companies should be prohibited from tying consumer features to a user’s willingness to fork over data for model training (eg. OpenAI cannot say they won’t provide access to chat history unless users hand over all their data for training). There is clear precedent here - Apple’s App Tracking Transparency framework (which I acknowledge is not regulation) prohibits apps from gating features behind a tracking opt-in wall, and the EU’s advertising regulation prohibits platforms from gating features behind opt-in for behavioral advertising.
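As a simple illustration of the first mechanism, the sketch below strips obvious identifiers (emails, phone numbers) from training text with regexes. Real personal-information removal is far harder and needs trained models; the patterns here are illustrative only.

```python
# Minimal sketch of stripping obvious personal identifiers from training text.
# Real PI removal is much harder: names, addresses, and context-dependent
# identifiers need trained models, and the legal definition of "personal
# information" is exactly what regulation would need to pin down.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_pii(text: str) -> str:
    """Replace email addresses and phone-number-like strings with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

# Usage:
# redact_pii("Reach me at jane.doe@example.com or +1 (555) 010-1234.")
# -> "Reach me at [EMAIL] or [PHONE]."
```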
Content watermarking / provenance: As AI-generated content explodes - text as well as images and video - it becomes increasingly important to be able to distinguish AI-generated content, particularly when it is false or misleading. There is a need for a framework that defines which situations require AI content disclosure. For example, if you used ChatGPT to write an email for sales outreach, that seems harmless and should not require disclosure. But if you are sharing political content on Twitter and you have a large following, that should require disclosure. Good regulation here would be less prescriptive about actual solutions and would instead lay out a framework for companies to work within, with the free market figuring out the actual solutions (eg. a startup could emerge to detect AI-generated political content on Twitter, which Twitter could then partner with).
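As one simplified illustration of what provenance could look like, the sketch below attaches a signed “AI-generated” record to a piece of content. It is loosely inspired by content-provenance efforts like C2PA but heavily simplified; the field names and the shared-key signing scheme are assumptions, and a real system would use public-key signatures so anyone can verify.

```python
# Sketch of binding an "AI-generated" provenance record to content (illustrative;
# field names and the shared-key HMAC scheme are simplifying assumptions).
import hashlib
import hmac
import json
import time

def make_provenance_record(content: bytes, generator: str, signing_key: bytes) -> dict:
    """Create a disclosure record tying the content hash to its AI generator."""
    record = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "generator": generator,  # eg. "example-llm-v1" (hypothetical name)
        "ai_generated": True,
        "created_at": int(time.time()),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return record

def verify_provenance(content: bytes, record: dict, signing_key: bytes) -> bool:
    """Check the signature and that the record matches this exact content."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, record.get("signature", ""))
            and unsigned["content_sha256"] == hashlib.sha256(content).hexdigest())
```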
Conclusion
Overall, I’m encouraged by the early conversations that are happening today around the topic, unlike technologies in the past where regulation has been an afterthought. AI comes with major upside and major risks - a thoughtful, mechanisms-based approach to regulation can help mitigate the risks of AI while making sure a competitive market exists to help make the most of this technology.
First article in “Tech Policy September” if you haven’t read that: