AGISF - Week 7 - AI Governance

AI Governance: Opportunity and Theory of Impact (2020) by Allan Dafoe

Fun fact: there are ~60 people working on AI governance in 2022. 

AI governance is about arriving at strategic and political solutions for managing the safe deployment of potential AGI systems, as well as about dealing with the new questions and risks arising from the existence of AGI. Sam Clarke defines it as "bringing about local and global norms, policies, laws, processes, politics, and institutions that will affect social outcomes from the development and deployment of AI systems".

An example of targeted strategy research in long-termist AI governance:

"I want to find out how much compute the human brain uses, because this will help me answer the questions of when Transformative AI will be developed."

While an example of exploratory strategy research could be:

"I want to find out what China's industrial policy is like, because this will probably help me answer important strategic questions, although I don't know precisely which ones."

This highlights the importance of TAI forecasting, and of understanding how performance scales with model size, as part of strategy research. I think the most tangible tactics research is AI & Antitrust, which proposes ways to mitigate tensions between competition law and the need for cooperative AI development.

Organizations doing Strategy research: [FHI], [GovAI], [CSER], [DeepMind], [OpenAI], [GCRI], [CLR], [Rethink Priorities], [OpenPhil], [CSET].

Organizations doing Tactics research: [FHI], [GovAI], [CSER], [DeepMind], [OpenAI], [GCRI], [CLR], [Rethink Priorities], [CSET], [LPP].

An AI deployment arms race makes it difficult to institutionalize control over, and to share the bounty from, artificial superintelligence; this is the problem of constitution design. Another problem is described in Hanson's "Age of Em", where biological humans are economically displaced by evolved machine agents in a constructed ecology of AI systems.

The mainstream perspective regards AI as a general-purpose technology, whose low-hanging fruits include reducing the labor share of value and reducing the cost of surveillance. I gathered that the author is saying that, on its own, this perspective can seem too broad and too superficial about existential risk. But he argues that working on such existential risk factors, which affect existential risk indirectly, matters: it is like how, "when trying to prevent cancer, investing in policies to reduce smoking can be more impactful than investments in chemotherapy". In that light, the general-purpose-technology perspective is substantial.

The author argues for considering misuse risks, accident risks, and structural risks simultaneously.

A precedent example is nuclear instability: sensor technology, cyberweapons, and autonomous weapons could increase the risk of nuclear war. Experts who understand nuclear deterrence, command and control, and first-strike vulnerability can study how these could change with AI processing of satellite imagery and undersea sensor data, in order to mitigate existential risk.

A point that I can see unfolding very quickly is value erosion through competition, i.e. a safety-performance tradeoff. The example here is intense global economic competition pushing against values such as non-monopolistic markets, privacy, and relative equality. Christiano refers to such dynamics as "greedy patterns".

The assets that AI researchers can contribute are: 

1. technical solutions

2. strategic insights

3. shared perception of risk 

4. a more cooperative worldview 

5. well-motivated and competent advisors

The author quotes Eisenhower: "plans are useless, but planning is indispensable." Most AI governance work is an assemblage of competence, capacity, and credibility, so that we are eventually ready to formulate a plan.


Racing through a minefield: the AI deployment problem by Holden Karnofsky 

More of an introductory blog post on AI risk; I skimmed this one.


Why and How Governments Should Monitor AI Development (2021) by Jess Whittlestone and Jack Clark

Whittlestone and Clark's article proposes monitoring the capabilities of AI systems, specifically checking that they conform to standards before deployment.

Governments would track activity, attention, and progress in AI research via specific benchmarks, and incrementally assess the maturity of AI capabilities. The authors also propose that governments can commission third parties to monitor AI development.
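To make this concrete, here is a toy sketch (my own, not from the paper) of what tracking capability maturity on a single benchmark could look like, using hypothetical yearly best scores and an assumed deployment-relevant threshold:

```python
# Toy benchmark tracker with hypothetical yearly best scores on one benchmark.
scores = {2017: 0.62, 2018: 0.68, 2019: 0.75, 2020: 0.83, 2021: 0.90}

MATURITY_THRESHOLD = 0.85  # assumed level at which a capability counts as "mature"

years = sorted(scores)
for prev, curr in zip(years, years[1:]):
    delta = scores[curr] - scores[prev]
    crossed = scores[prev] < MATURITY_THRESHOLD <= scores[curr]
    note = "  <- crosses assumed maturity threshold" if crossed else ""
    print(f"{curr}: best score {scores[curr]:.2f} (+{delta:.2f} year-over-year){note}")
```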

The problem is that traditional governance approaches cannot keep pace with AI development and deployment, which advance on something like a Moore's Law trajectory. Hence timely intervention becomes improbable; examples include Clearview AI's facial recognition capabilities already deployed in the private sector, the misuse of a cluster of AI techniques to produce "deepfakes", and the radicalization of populations by recommendation systems on YouTube and Facebook.

By building an infrastructure for measuring and monitoring the capabilities and impacts of AI systems, we gain benefits such as:

1. Measurement statistics being integrated into policymaking

2. A set of people in the public sector who build relationships with the main stakeholders in academia and industry (what a trio)

Section 4 talks about two aspects of government monitoring: 

1. Impacts of AI systems already deployed in society 

2. Development and deployment of new AI capabilities 

For deployed systems, the risks are exhibiting unexpected behaviors in environments distinct from training, being put to unintended or malicious uses, and displaying harmful biases (see The Alignment Problem; Buolamwini and Gebru, 2018). The two key properties to measure are model robustness and output interpretability.

In terms of action items, the authors suggest governments proactively create datasets that make it easy to test a system; an example is audio datasets of different regional accents, used to test how well speech recognition approaches serve speakers of those accents. They also suggest competitions and prizes to gather discrete AI capability evaluation information, and funding research organizations to gather continuous AI capability evaluation information.
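As a concrete illustration, here is a minimal sketch (mine, not the paper's) of a per-accent evaluation, assuming a hypothetical manifest of reference transcripts, system outputs, and accent labels; word error rate (WER) is the standard edit-distance metric for speech recognition:

```python
from collections import defaultdict

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance, normalized by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(substitution, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# Hypothetical manifest: (reference transcript, system output, accent label).
# In a real audit the outputs would come from the speech recognizer under test.
manifest = [
    ("turn the lights off", "turn the light off", "scottish"),
    ("what is the weather today", "what is the weather today", "scottish"),
    ("set a timer for ten minutes", "set a tie for ten minutes", "geordie"),
    ("play some music", "play some music", "geordie"),
]

per_accent = defaultdict(list)
for reference, hypothesis, accent in manifest:
    per_accent[accent].append(word_error_rate(reference, hypothesis))

for accent, wers in per_accent.items():
    print(f"{accent}: mean WER = {sum(wers) / len(wers):.2f}")
```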

"Analysis by the AI Index on 2020 data showed that robotics and machine learning saw the fastest growth in attention within AI between 2015 and 2020 (based on pre-print publications on arXiv), and that computer vision was one of the most popular areas of research within AI in 2020 (31.7% of all arXiv publications on AI in 2020) (Zhang et al., 2021)."

We then associate attention/activity with potential to grow; it looks like computer vision is gaining a lot of traction. Benchmarks like ImageNet (vision) and SuperGLUE (language) are useful because they indicate progress in a field, just as compute and parameter counts are useful indicators.

ImageNet-Adversarial, or ImageNet-A, is a collection of natural adversarial example images that contemporary AI systems find inherently challenging to label correctly. ImageNet-Rendition, or ImageNet-R, contains stylized renditions of the original classes. These ImageNet variants can be used to assess robustness and generalizability.
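A rough sketch of how such a robustness check could be run with a pretrained torchvision classifier, assuming the datasets sit in ImageFolder-style directories at the placeholder paths below. One caveat: ImageNet-A and ImageNet-R cover a 200-class subset of ImageNet, so a real evaluation must remap the folder labels onto the model's 1000-class output indices; that detail is glossed over here.

```python
import torch
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.eval()

@torch.no_grad()
def top1_accuracy(root: str) -> float:
    """Top-1 accuracy over an ImageFolder-style dataset
    (assumes folder labels already align with the model's output indices)."""
    dataset = datasets.ImageFolder(root, transform=preprocess)
    loader = torch.utils.data.DataLoader(dataset, batch_size=64)
    correct = total = 0
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

# Placeholder paths; the gap between clean and shifted accuracy gauges robustness.
for split in ("data/imagenet-val", "data/imagenet-a", "data/imagenet-r"):
    print(split, top1_accuracy(split))
```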


AI Policy Levers (2021) by Sophie-Charlotte Fischer, Jade Leung, Markus Anderljung, Cullen O'Keefe, Stefan Torges, Saif M. Khan, Ben Garfinkel and Allan Dafoe 

def policy lever: a tool or intervention point that a government has available to it, to shape and enforce changes. 

In descending order of likelihood of use for explicit national security purposes: 

1. Federal R&D spending (very likely) : Commissioning R&D projects and offering research grants

2. Foreign investment restrictions : Limiting foreign investments in U.S. companies

3. Export controls : Limiting the export of particular technologies from the U.S.

4. Visa vetting : Limiting the number of visas awarded, particularly to students and workers in key industries

5. Extended visa pathways : Increasing the number of visas awarded, particularly to workers in key industries

6. Secrecy orders : Preventing the disclosure of information in particular patent applications

7. Prepublication screening procedures : On a voluntary basis, screening papers for information that could be harmful to publish

8. The Defense Production Act : Requiring private companies to provide products, materials, and services for “national defense”

9. Antitrust enforcement (very unlikely) : Constraining the behavior of companies with significant market power; alternatively, refraining from these actions or merely threatening to take them

Conclusion:

How likely is the USG to use these policy levers? The USG has tightened restrictions on foreign investment through FIRRMA, which decreased foreign acquisitions of domestic firms developing AI technologies.


What does it take to catch a Chinchilla? Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring (2023) by Yonadav Shavit



I think this might be my favorite paper from these past 8 weeks of reading. Description: This article introduces the most coherent strategic plan to date for regulating large-scale training runs via compute monitoring.

Yonadav creates a framework for monitoring large-scale NN training, to ensure that no actor uses large quantities of specialized ML chips to execute training runs that violate agreed rules. Yonadav claims the system can intervene at these instances (a minimal sketch of the first two follows the list):

1. Using on-chip firmware to save snapshots of the NN weights

2. Saving sufficient information about each training run (I think this is to match each training run to the resulting weight snapshots)

3. Monitoring the chip supply chain to prevent the hoarding of unmonitored chips
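A minimal sketch of what the first two interventions could look like in spirit, with NumPy arrays standing in for on-device weights, SHA-256 as a stand-in commitment, and made-up serial numbers and run identifiers; the paper's actual scheme lives in chip firmware and is considerably more involved.

```python
import hashlib
import json
import numpy as np

def hash_weights(weights: dict[str, np.ndarray]) -> str:
    """Hash a snapshot of model weights into a short commitment."""
    h = hashlib.sha256()
    for name in sorted(weights):
        h.update(name.encode())
        h.update(weights[name].tobytes())
    return h.hexdigest()

# Hypothetical on-device weights at some point during training.
weights = {
    "layer0.w": np.random.default_rng(0).normal(size=(512, 512)),
    "layer0.b": np.zeros(512),
}

# The chip's log entry: enough to later tie this snapshot to a reported run.
log_entry = {
    "chip_serial": "CHIP-00042",       # hypothetical serial number
    "training_run_id": "run-2023-07",  # hypothetical run identifier
    "step": 10_000,
    "weight_hash": hash_weights(weights),
}
print(json.dumps(log_entry, indent=2))
```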

Google TPUs, NVIDIA A100 and H100 GPUs, and AMD MI250 GPUs are specialized accelerators with high inter-chip communication bandwidth; they are better suited to large-scale training than consumer-oriented GPUs such as the NVIDIA RTX 4090, which have lower interconnect bandwidth.

With scaling trends continuing, a potential risk is models that can identify cyber vulnerabilities, enabling attacks such as ransomware. In the event of non-compliance by criminal actors, negligent companies, or rival governments, there is no easily detectable difference between training for social good and training for harm. This was the US Department of Commerce's argument for export controls on high-performance chips to China.

Yonadav thinks about compliance in three steps: 

1. ML chip owner activity logging 

2. Inspectors analyze a subset of chips to check whether rule-violating training has occurred.

3. Supply-chain monitoring to keep chips from reaching actors who evade the monitoring regime.

Related work includes nascent regulations and agreements on ML development: the EU AI Act proposes risk-based regulation of AI products; the US has proposed an Algorithmic Accountability Act to oversee algorithms used in critical decisions; and China's Cyberspace Administration has established an algorithm registry for overseeing recommender systems.

In Section 2, Yonadav sets out two parties, the Verifier and the Prover. The Verifier verifies that a given set of ML training rules is being followed. The Prover develops the ML systems and wants to prove to the Verifier that it is complying with those rules.

Ideally, chip inspections could be conducted remotely via log reports, thereby deterring most Provers from misbehaving. More realistically, the Verifier needs to physically inspect chips; even a single chip from the batch used in a training run would reveal that run's weight snapshots.
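If the Verifier does not know in advance which of the Prover's chips took part in a hidden run, a quick back-of-the-envelope estimate (mine, not the paper's analysis) shows why randomly sampling even a handful of chips gives a high chance of hitting at least one participant:

```python
from math import comb

def detection_probability(total_chips: int, chips_in_run: int, sampled: int) -> float:
    """P(at least one sampled chip took part in the hidden run),
    sampling uniformly at random without replacement."""
    return 1 - comb(total_chips - chips_in_run, sampled) / comb(total_chips, sampled)

# Hypothetical numbers: the Prover owns 10,000 monitored chips and a
# rule-violating run used 1,000 of them.
for k in (1, 5, 10, 30):
    print(f"sampling {k:>2} chips -> detection probability "
          f"{detection_probability(10_000, 1_000, k):.2f}")
```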

Chip inspection procedure (a toy illustration of the hash-matching steps follows the list):

1. Verifier confirms the chip's serial number

2. Checks that it matches the serial number the Prover reported

3. Extracts the weight-snapshot hashes from the chip's logs

4. Checks that they match the Prover's reported training transcripts

5. Checks that the chip's logging mechanism has not been tampered with
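A toy illustration of the hash-matching in steps 3 and 4, where the Verifier checks that the snapshot hashes pulled off the chip appear, in order, among the hashes in the Prover's reported training transcript; the data structures and names here are mine, not the paper's.

```python
def chip_matches_transcript(chip_hashes: list[str], transcript_hashes: list[str]) -> bool:
    """The chip may only keep a sparse sample of snapshots, but every hash it
    holds must appear in the reported transcript, in the same relative order."""
    i = 0
    for reported in transcript_hashes:
        if i < len(chip_hashes) and chip_hashes[i] == reported:
            i += 1
    return i == len(chip_hashes)

# Hypothetical hashes: the transcript reports four snapshots; the chip kept two.
transcript = ["a1f3", "b2e4", "c3d5", "d4c6"]
print(chip_matches_transcript(["b2e4", "d4c6"], transcript))  # True
print(chip_matches_transcript(["d4c6", "b2e4"], transcript))  # False: order inconsistent
```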


Information security considerations for AI and the long term future (2022) by Jeffrey Ladish and Lennart Heim 

Description: Model weights and algorithms relating to AGI are likely to be highly economically valuable, meaning there will likely be substantial competitive pressure to gain these resources, sometimes illegitimately.

We might want to prevent this due to the risks associated with the proliferation of powerful AI models:

1.  They could be obtained by incautious or uncooperative actors, leading to incautious or malicious use.

2.  It reduces the potency of policies regulating actors with access to powerful AI models, if models can be obtained secretly.

I understand these points well: securing AI systems needs extraordinary effort because the hardware supply chain is globally distributed, because similar attempts at secrecy have failed in the past (even the Manhattan Project was penetrated by spies), and because the threat model includes advanced state actors and the most capable hackers.

I am not sure I follow the point about industrial espionage and how SOTA AI labs can be of help.

Recent failures: 

1. Pegasus 0-click exploit

2. Twitter account hijacking 

3. Lapsus$ group hacking NVIDIA 

