A week ago, U.S. President Biden issued an executive order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. This document is some 60 pages long and structured in 11 sections, namely (1) purpose, (2) policy and principles, (3) definitions, (4) ensuring the safety and security of AI technology, (5) promoting innovation and competition, (6) supporting workers, (7) advancing equity and civil rights, (8) protecting consumers, patients, passengers, and students, (9) protecting privacy, (10) advancing federal government use of AI, and (11) strengthening American leadership abroad.
I've divided the analysis of the executive order into 4 parts for better digestibility. In this first part, I summarise the first ~15 pages that include sections 1, 2, 3, and subsections 4.1 and 4.2 of section 4. The key points are, in my opinion:
Section 1: The purpose is short and self-evident. Obviously, AI systems, and in particular large-scale neural networks, offer tremendous benefits and pose serious risks at the same time. The executive order aims at controlling these risks without penalizing the benefits or hindering innovation.
Section 2 lays down the core principles behind the controls and other measures. The principles are much in line with what is generally considered trustworthy AI, and they are reflected in the structure and headings of sections 4 to 11.
Section 3 defines the key terms that are used throughout the document. The most notable one is "dual-use foundation model". The main part of the definition is worth citing, as these models are the focus of many requirements in the subsequent sections: "The term 'dual-use foundation model' means an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters."
Section 4 is one of the key sections and, at about 15 pages, the longest section of the executive order. Subsection 4.1 lists a set of guidelines, best practices, testbeds, red-teaming testing processes, etc. that are to be defined within the next 270 days, i.e. by Jul 26, 2024. The leading institution for most of these activities will be the National Institute of Standards and Technology (NIST), whose AI 100-1 risk management framework will form one of the key foundations for the anticipated guidance.
Subsection 4.2 is remarkable for its concreteness and its tight deadlines. It states that companies that (1) build dual-use foundation models or (2) possess very large computing clusters will have to disclose detailed information about those models or clusters within 90 days, i.e. by Jan 28, 2024. It also states that companies that provide infrastructure as a service involving dual-use foundation models and that interact with foreign persons will have to report those persons' identities, as well as their financial situation.
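The two deadlines above can be verified with simple date arithmetic, assuming the clock starts on the signing date of the executive order (October 30, 2023):

```python
from datetime import date, timedelta

# Assumption: deadlines are counted from the signing date, Oct 30, 2023.
signed = date(2023, 10, 30)

# Subsection 4.2: disclosure obligations within 90 days.
disclosure_deadline = signed + timedelta(days=90)
print(disclosure_deadline)  # 2024-01-28

# Subsection 4.1: guidance, best practices, testbeds etc. within 270 days.
guidance_deadline = signed + timedelta(days=270)
print(guidance_deadline)    # 2024-07-26
```

Note that 2024 is a leap year, which the `datetime` module handles automatically; counting on one's fingers across February would be off by a day otherwise.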
The executive order even specifies concrete numbers that define (1) very large computing clusters and (2) dual-use foundation models.
Let's start with the computing clusters: These are clusters with a computing power of at least 10 to the power of 20 integer or floating-point operations per second (= 100 exaflops) for training AI. To illustrate this number, a state-of-the-art NVIDIA DGX A100 System has a computing power of 5 petaflops and costs about US$200,000. You need 20,000 of those to get 100 exaflops, costing roughly US$4 billion (bulk discounts and Black Friday deals disregarded).
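This back-of-the-envelope calculation is easy to reproduce. The DGX A100 figures (~5 petaflops of AI compute, ~US$200,000 per system) are the assumptions used in the text, not numbers from the executive order itself:

```python
# Assumptions from the illustration above (not from the executive order):
# one NVIDIA DGX A100 delivers ~5 petaFLOPS and costs ~US$200,000.
threshold_ops_per_s = 1e20   # EO threshold: 10^20 operations per second
dgx_flops = 5e15             # ~5 petaFLOPS per DGX A100 system
dgx_price_usd = 200_000

systems_needed = threshold_ops_per_s / dgx_flops
total_cost_usd = systems_needed * dgx_price_usd

print(f"{systems_needed:,.0f} systems")                  # 20,000 systems
print(f"US${total_cost_usd / 1e9:.0f} billion in total") # US$4 billion in total
```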
Now to the definition of dual-use foundation models: These are models that have been trained using 10 to the power of 26 floating-point operations, or 10 to the power of 23 floating-point operations for primarily biological sequence data. On a computing cluster as described above, this translates to a training time of about 12 days for non-biological data and 17 minutes for biological data. Assuming only one NVIDIA DGX A100 System, a 232-day training run on biological sequence data qualifies for the definition of a dual-use foundation model, whereas you would need roughly 633 DGX A100 Systems running for a full year on non-biological data to qualify. The online community estimates that GPT-4 was trained using roughly 10 to the power of 25 floating-point operations.
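The training-time estimates above follow directly from dividing the FLOP thresholds by the sustained compute, again under the assumption of ~5 petaflops per DGX A100:

```python
# Training-time estimates behind the dual-use foundation model thresholds.
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

general_threshold = 1e26  # FLOP, general-purpose models (EO threshold)
bio_threshold = 1e23      # FLOP, primarily biological sequence data

cluster_flops = 1e20      # the 100-exaflop cluster from above
dgx_flops = 5e15          # assumed ~5 petaFLOPS per DGX A100

# On the 100-exaflop cluster:
print(general_threshold / cluster_flops / SECONDS_PER_DAY)  # ~11.6 days
print(bio_threshold / cluster_flops / 60)                   # ~16.7 minutes

# On a single DGX A100, biological sequence data:
print(bio_threshold / dgx_flops / SECONDS_PER_DAY)          # ~231.5 days

# DGX A100s needed to reach 10^26 FLOP within one year:
print(general_threshold / (dgx_flops * SECONDS_PER_YEAR))   # ~634
```

The last figure comes out at about 634 with a 365-day year; the small difference to the ~633 quoted above is pure rounding. Either way, GPT-4's estimated ~10^25 FLOP sits an order of magnitude below the general threshold.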
Stay tuned for part II of this mini-series coming up soon.