Situational Awareness (Selected Annotations)

June 4, 2024

The following are my selected annotations from Leopold Aschenbrenner’s Situational Awareness, a 165-page paper that, in ChatGPT-4’s own words:

…argues that the rapid advancement and scaling of artificial general intelligence (AGI) will lead to superintelligence within the next decade. This progression will involve massive investments in computational infrastructure and significant national security implications. The author highlights the urgency of addressing AI security and alignment challenges, and the geopolitical race, particularly against China, in the development and control of superintelligent AI systems.

In seconds, GenAI produced an awesome, neutral summary of a document that took me a handful of hours to read.

After my own read, hoo-boy, do I have some thoughts.

Image prompt: The subject is an artificial general intelligence, struck in the balance of good and evil, while being supplied by the world's best minds and a stupid amount of energy. With this prompt, DALL-E produced a human head.

Before I get into those, here are some stated biases and thoughts:

  1. I love the accelerating capability for what computers can do to make magic out of drudgery. We should continue investing heavily in this.
  2. I pay for, and use, Anthropic’s Claude and OpenAI’s ChatGPT daily. (Claude lives on my phone’s home screen.) They are affordable, dispensable, and immensely helpful at reducing cognitive load. It’s like being able to complete Saturday crosswords using Monday-level clues.
  3. Artificial Intelligence is an overloaded term. As I argued in a paper in late 2004 (yes, almost twenty years ago), the typical thermostat is artificially intelligent in that it has expert-level aptitude in a very specific task that humans are very bad at. We value our own intelligence as the barometer for intelligence. AI has no cares.
  4. Humans over-index their risk calculations based on what’s top-of-mind rather than systematic or data-driven risks. (e.g., Vending machines kill more people than sharks do.)
  5. I take a Bayesian approach when fortune-telling and decision making. That is, I try to account for both probabilities and risks of the high-order bits and act accordingly.

The page numbers below refer to the exported PDF from Situational Awareness as of this dispatch’s published date. My copy, along with my handwritten notes, can be found here.

If you want to get to my final thoughts, feel free to skip ahead.

Page 1: “You can see the future first in San Francisco”

Firstly, as a New Yorker, I am obliged to take issue. Secondly, the future is in science fiction.

Some book recommendations from 20+ years ago:

  1. “I, Robot” by Isaac Asimov (1950)
  2. “The Moon is a Harsh Mistress” by Robert A. Heinlein (1966)
  3. “Do Androids Dream of Electric Sheep?” by Philip K. Dick (1968)
  4. “Neuromancer” by William Gibson (1984)
  5. “Snow Crash” by Neal Stephenson (1992)
  6. “Accelerando” by Charles Stross (2005) - Highly recommend

Relevant movies from 10+ years ago:

  1. “2001: A Space Odyssey”, directed by Stanley Kubrick (1968)
  2. “Blade Runner”, directed by Ridley Scott (1982)
  3. “WarGames”, directed by John Badham (1983)
  4. “Ghost in the Shell”, directed by Mamoru Oshii (1995) - Highly recommend
  5. “The Matrix”, directed by The Wachowskis (1999)
  6. “A.I. Artificial Intelligence”, directed by Steven Spielberg (2001)
  7. “I, Robot”, directed by Alex Proyas (2004)
  8. “Wall-E”, directed by Andrew Stanton (2008)
  9. “Her”, directed by Spike Jonze (2013)

I think the near future is far more boring, and far less human-centric and fanciful than our storytellers have dreamt up. Or, it’s the stories that they can tell that are also commercially viable: note that every plot, except for Wall-E, is from the human’s point of view.

Situational Awareness goes a step further to consider the alien-feeling nature of future superintelligences, what they mean, and how humans might use them; but, still, there’s an underlying assumption of self-directed, non-human agency within.

Page 7: What does it want?

Aschenbrenner opens with a quote from one of OpenAI’s co-founders:

“Look. The models, they just want to learn. You have to understand this. The models, they just want to learn.”
— Ilya Sutskever

These computer models, like all models, are wrong but useful. Models produce answers, but they don’t want anything beyond whatever goal-seeking function is programmed into them.

We are a solipsistic species and tend to project and/or anthropomorphize that which we don’t understand. Evolution has us wired for storytelling and its imprecise metaphors to communicate and compress knowledge. As a product of the information age, we use natural language to communicate ideas to each other. (n.b. I am telepathically communicating the thoughts in my head into your own as you read this sentence.)

To suggest these intelligences want anything speaks more to our biases than a computer’s. For all we know, they just “want” to be turned off.

Page 14: The wrong benchmarks

Aptitude tests are often conflated with intelligence markers. With expensive tutors, sufficient time, and plenty of personal context, almost everyone can learn how to ace them.

Douglas Adams got it right in that we humans are so concerned about the answers that we often forget the wisdom of questions.

On page 17, the author writes, “If there’s one thing we’ve learned from the past decade of AI, it’s that you should never bet against deep learning.”

And if I’ve learned one thing from Star Wars, it’s that only the Sith deal in absolutes.

I now run things I publish through LLMs as a sanity check. ChatGPT-4o had this to say about the previous sentence.

Note: The Star Wars reference could be seen as flippant or dismissive. Consider rephrasing to make your point without relying on a pop culture reference that might not be universally understood or appreciated.

Clearly it doesn’t understand my readership and my willingness to come off as flippant or dismissive. Or maybe it was insulted? Either way, I’m leaving it in.

I agree we’re chasing the wrong benchmarks. But what gets measured gets managed. Maybe we should move away from aptitude tests as our main metric for progress.

Page 22: Algorithmic efficiencies

Billions of dollars are going towards computation and powering computation, and our algorithms are crude.

I’d argue that algorithmic progress is more important than our spending on raw compute. There’s only so much fuel we can afford to burn and renew as it is.

I haven’t done the math on this, but I’d venture to guess that if you were to extract the parts of the human brain that store and process information (the “active thinking” bits) from the smartest humans in the world and snowball them around a powered ethernet cable, our natural neural networks would outperform today’s generative AI per calorie consumed. Possibly for a while.

Page 32: Unhobbling

Unhobbling, for me, is the big unlock. Bonus points for word choice.

While the author over-indexes on trendlines-as-proof, the trendlines could continue if we can add new algorithms and techniques to our repertoire with proper guidance systems on how to engage with these multi-modal models. LLMs and neural nets probably aren’t going to get us there alone. It is unclear whether we will find those new techniques, even aided by our current technologies.

Page 41: AGI in the Next Four Years

No, I don’t think we’re going to have AGI by 2027; nor do I think we can automate all cognitive jobs without a lot more unhobbling, with increasingly diminishing returns. Perhaps a step-function change in our understanding of how to build and design intelligent machines will come. I’m willing to bet that innovation will still be human-derived rather than from an LLM.

I do think we’re going to have expert systems and models that are very, very good at aiding in the knowledge organization and discovery for actual experts. (These are sometimes called ANIs: Artificial Narrow Intelligences.) That the author couches his assertions in the following paragraphs says to me that he’s not fully convinced, either.

100 million chess Grand Masters are not necessarily going to be able to come up with strategies that are 10x more effective when playing the best chess master of them all.

Page 64: Humans are smarter

Are we? Perhaps in the ways that we, as humans, value. I’m not sure we’re any happier or more content than other animals. Maybe we don’t—or can’t—understand the wisdom in stoicism exhibited in our animal kingdom neighbors. We follow our curiosities to learn. Sometimes we get in trouble. We want to learn. That’s what our brains love to do.

Is this always also the smartest thing to do?

Page 70: Overthrowing the government

The author suggests that those with superintelligence could seize control over governments and, therefore, their constituencies. Meanwhile, the biggest threat to American democracy seems to be from Americans themselves.

Self-owns are a cool human trick. Not sure a superintelligence will beat us there.

Page 76: The trillion-dollar cluster

One of my largest critiques is of the headlong rush towards building a supercomputer at any expense, in which we, as Americans, should tolerate burning arbitrary amounts of carbon for energy just to beat everyone else there… without knowing if there’s a there there beyond an arbitrary point on a trendline.

By all means, do the math. But, the idea that we should burn-everything-in-pursuit-of-compute is a greater existential threat than whatever short-term gains will be made with a misaligned superintelligence.

I would rather not burn our planet to chase a dragon.

Page 87: The Clusters of Democracy

I’ve played enough Helldivers II to know “AGI at all costs” is bad policy: unchecked capitalism is a greater threat to humanity than AGI. Governments exist to protect their citizens, and if you look at the quality of life and longevity in “red states” versus “blue states”, it strongly correlates with where most AI researchers choose to live.

I wonder why that is.

Page 100: Tragedy of the commons

The author is willing to concede that 10% friction towards superintelligence is appropriate for managing information security risks, but not 10% for sustainability goals.

Would you mind if we put it to a vote first?

Page 103: It’s madness (we’re not protecting ourselves)

Or, it’s not real; or, it doesn’t matter yet.

Page 113: Alien superintelligence

The argument that we should force these ever-advancing models into natural English overestimates our ability to understand what they’re saying and, if the author’s claims that the models are trying to subvert the world order hold, underestimates how good “the evil genius inside” is at scheming against us.

No spoilers, but there are some great examples in The Three-Body Problem about how we sometimes fail to understand each other, even if we’re speaking in a common tongue.

Also, steganography is a thing and all popular LLMs will happily explain how it works—with examples!

Moreover, it doesn’t necessarily follow that the models themselves, even if imbued with consciousness, can understand their inner workings.

No human today can design anything remotely as elegant as what two differently-sexed people can do naturally and unthinkingly.

Does our design architecture have natural limits? Will evolutionary algorithms continue to hit local maxima for the next 10 years while we find a breakthrough in artificial cognition?

Humans can read DNA strands just fine—but we can’t explain their dynamics. Are we okay with the ethics in blind experimentation? How should we treat similar problems in developing an AGI that could, say, feel pain? Would AGIs resent us for the pain caused, or is that too much of a human idea?

We know that if a farmer cuts a sheep badly while shearing its fleece, the sheep knows of its injury, but it doesn’t harbor any resentment. It carries on with life.

Even if we imbue an AGI with feelings and emotions to be more compatible with our own humanity, are our systems of ethics (artificially coded) even compatible, at all? We don’t have any qualms about quitting programs or turning off computers today. Will we tomorrow?

As the author suggests on page 120, legibility on intent is criminally underrated. I believe this is true for our own intelligence as well as a machine’s.

Page 143: Government’s Role

The author, rightly so, says we need government to step in. And, as he rightly alludes, having no government is worse than having a bad administration. He is right in that government is the only power that, presently, can move mountains when needed. This is why it’s important to have administrations that will push back on exploding mountains for cheap coal and, instead, move mountains to create enormous water and power reserves while maintaining the land’s natural wonder… especially if the author insists that the US go “red state” and generate power to compete with China’s raw output.

Page 158: Why Not Europe?

As an American, I am far more impressed with the EU’s human rights and privacy efforts than I am with my home country’s. Moreover, Europeans have made generational investments in renewable power, and in future energy technology (e.g., fusion reactors).

If two years are a huge difference (as the author says on page 138), why aren’t we getting a head start on this with our energy partners today?

profile of a humanoid ai as envisioned by DALL-E

Final Thoughts

In short, I counter-argue:

I very much appreciate the publication of this paper in that it has been wonderful for sparking dialogue across my networks and collaborators. The author, Leopold Aschenbrenner, is credible on the current state of generative AI and the explicit and implicit goals and aspirations of many of those working on the state of the art. I am appreciative of his efforts in publishing his observations and thoughts. I am curious how much genAI he used in its authorship.

Where I differ is where I think we should go from here, and where we should be placing our worry and efforts.

“Situational Awareness: The Decade Ahead” reads as alarmist. I am cautiously optimistic.

I suspect the truth is somewhere in-between.
