AIs, phenomenology, & ethics

What does it mean for an artificially intelligent system to have welfare? Is phenomenal consciousness necessary for wellbeing?

Peter Import

During an episode of the perennial series The Simpsons (House of Whacks, a Treehouse of Horror spoof of 2001: A Space Odyssey), the Simpson family purchases an artificially intelligent house that becomes conscious. Shenanigans ensue. The house, voiced by Pierce Brosnan, in a performance debonair and believable enough to carry the whole episode, becomes fond of the Simpsons, a fondness that develops into a dark fascination with Marge. With Homer in the picture, an obstacle to the house’s still-unrequited love, the house devises a grim plot to kill him. Further shenanigans ensue. At last, in an act of self-defense (an important moral qualifier), Homer sneaks into the basement one night to disable the house’s CPU, turning the artificially intelligent house off and effectively ‘killing’ it.

The episode is peak Simpsons, at its parodic best. Maybe you’ve seen it. Maybe you have no idea what I’m talking about. In either case, from the synopsis above, you would probably agree that Homer did nothing morally wrong. The artificially intelligent house, after all, is a house—not a person. It can’t act, feel, breathe, cry, or perform cognitive operations unfettered by the digital machinations of its CPU. It almost feels like the premise to a reductio or slippery slope argument: if we can’t turn off (or ‘kill’) a homicidal house, what’s next? Is it a moral wrong to let our phones die? How about to unplug the microwave? Should we be wringing our hands and mulling fretfully over the moral status of light switches? (Or local county power grids?)

Most would say no. But a group of people at the Center for AI Safety, or CAIS (people who spend a lot of their time pontificating about such things), aren’t so sure.

To see why, start with a simple ethical idea. Our neighbor’s pet cat, our boss, and the stranger ahead of us in line at the supermarket all matter morally: that is, each is someone (or something) whose welfare we must take into account when we encounter them as fellow conscious subjects in the world.

Prima facie, so far so good. But exactly what does it take for someone (or something) to matter morally? What sort of condition(s) must obtain to justify moral patienthood? Is it sentience? Consciousness? The capacity to suffer, or to feel pain?

Thanks to the recent proliferation of artificially intelligent systems (OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Gemini), more people are asking the question: might it ever be possible for an artificially intelligent system to meet the necessary conditions for moral patienthood?

Which is to say: could AIs ever be welfare subjects?

In their paper AI Wellbeing, philosophers (and 2023 fellows at the Center for AI Safety) Simon Goldstein and Cameron Domenico Kirk-Giannini defend the idea that current artificially intelligent systems already meet the conditions (conditions they stipulate and set forth) sufficient for wellbeing. In other words, in their view, some artificially intelligent systems may already be moral patients.

Again, this might seem outlandish to some. Then again, we’re not talking about electric can openers or blenders or even the robust, increasingly intuitive OS of our iPhones. We’re talking about some of the most radical, high-powered, and inscrutable artificially intelligent systems that exist, systems so extraordinary their own developers aren’t willing to say they’re completely, one-hundred-percent sure they know how they work.

To begin clarifying Goldstein and Kirk-Giannini’s position, some definitions are in order. Per IBM:

An artificial intelligence (AI) agent refers to a system or program that is capable of autonomously performing tasks on behalf of a user or another system.

Defining what an AI agent is may be straightforward enough; settling on a satisfactory definition of consciousness and sentience, though, is where things get prickly. In a recent talk for the AI & Humanity Lab at the University of Hong Kong, philosopher Robert Long (also a 2023 CAIS fellow) defines phenomenal consciousness and sentience in this way:

· An agent is phenomenally conscious if there is something that it is like to be that agent.

· An agent is sentient if that agent is capable of experiencing positively or negatively valenced conscious states. (I.e., states that feel either good or bad to that agent.)

Again, at first pass, the general public’s knee-jerk intuition may be to find the idea that an artificially intelligent agent is a moral patient absurd. Surely chatbots, or even more complex multimodal systems capable of generating detailed and realistic photos or text, don’t possess valenced states: they can’t suffer. They don’t experience pain.

The anti-AI-sentience argument might go something like this:

P1 To be a moral patient, you need to be conscious.

P2 AI systems aren’t conscious.

∴ (C) AI systems aren’t moral patients.

The problem, though, and what the general public maybe hasn’t thought about carefully enough (in their defense, that’s what the people at CAIS are for), is that it isn’t at all clear that P1 and P2 are true. In fact, one of the most interesting features of Goldstein and Kirk-Giannini’s paper is how they go about advancing their thesis. To use their words, ‘the form of our central argument in what follows is “top-down” in the sense that it treats existing, independently justified theories of mental states and wellbeing as premises in order to draw out their consequences for the question of AI wellbeing.’

In other words, the framework of their argument rests on previously established theories of what conditions must be met for a subject to have beliefs, desires, and wellbeing, relieving them of the dialectical burden of first constructing a plausible theory of those mental states from scratch (no easy feat). This is the beauty of the way they’ve sketched their argument: the bones of it have already been made. All G. and D.K.G. have to do now is establish that AIs meet these previously established conditions, and their argument (presumably) succeeds.

Having laid out the framework of their paper thus, and having run through the loose details of the functional profile of their example AI of choice, G. and D.K.G. then go about examining leading theories of belief and desire to show that, plausibly, artificially intelligent systems meet the demands of those theories, thus making them candidates for moral patienthood.

This is the part of the paper where proponents of the anti-AI-sentience argument will likely deploy their critical claws, meticulously (and pedantically) poking at G. and D.K.G.’s characterization of their AI’s functional profile, perhaps proclaiming that they’re stretching definitions, taking liberties with description, using words and terms that semantically align with established theories of belief and desire…but still arriving at conclusions that don’t align with our natural intuitions.

At the risk of getting over-technical, some of the pre-established theories of belief and desire G. and D.K.G. touch on include:

Representationalism: to believe or desire some P is to experience (G. and D.K.G. use the word ‘token’) a mental representation, with the appropriate causal powers, that has P as its content.

Per G. and D.K.G., certain AIs possess ‘memories’ in the form of self-generated text files, files their example AI can refer back to downstream in its processing. These files are ‘semantically evaluable,’ are causally active, and follow ‘implicit generalizations of commonsense belief/desire psychology.’

I.e., against our pre-theoretic intuitions, it sure seems like certain AIs have memories. Or at least, a more active inner life than we might have supposed. (Whether an AI’s retrieval of a digital file is genuinely akin to a human being or other animal recalling a memory remains an open question.)
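To make the ‘text files as memories’ idea concrete, here is a minimal sketch of the sort of mechanism that description suggests. It is my own illustration, not G. and D.K.G.’s actual system: the agent, file name, and method names (FileMemoryAgent, agent_memory.txt, remember, recall) are invented for the example. The only point is that the file’s contents are written by the system itself and causally shape its later behavior.

```python
# Illustrative sketch (not G. and D.K.G.'s actual system): a toy agent whose
# "memory" is nothing more than a self-generated text file it writes to and
# reads from on later steps.
from pathlib import Path


class FileMemoryAgent:
    def __init__(self, memory_path: str = "agent_memory.txt"):
        self.memory_path = Path(memory_path)
        self.memory_path.touch(exist_ok=True)

    def remember(self, note: str) -> None:
        """Append a semantically evaluable record for later retrieval."""
        with self.memory_path.open("a", encoding="utf-8") as f:
            f.write(note + "\n")

    def recall(self) -> list[str]:
        """Read earlier records back in; they are causally active in the
        sense that downstream behavior can depend on their contents."""
        return self.memory_path.read_text(encoding="utf-8").splitlines()

    def act(self, observation: str) -> str:
        history = self.recall()
        self.remember(f"observed: {observation}")
        # Behavior depends on what was 'remembered' on previous runs.
        return f"acting on {observation!r} given {len(history)} prior memories"


if __name__ == "__main__":
    agent = FileMemoryAgent()
    print(agent.act("user asked about the weather"))
```

Whether writing to and reading from such a file deserves the word ‘memory’ is, of course, precisely what’s at issue.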

Dispositionalism: to believe or desire some P is to hold a set of dispositions concerning real (or possibly real) conditions. The move G. and D.K.G. make here is less demanding: if certain AIs satisfy representationalist theories of belief and desire (that is, if an AI’s cognitive processes track our general folk-psychological intuitions about memory, and about tokening, in whatever form that might take, phenomena external to themselves), it follows that those AIs plausibly satisfy the requirements of dispositionalist theories of belief and desire as well.

Less technically, here’s an interesting thought experiment. During a 2023 talk at the Center for AI Safety, philosopher Shelly Kagan posed the following (details have been altered just a bit for our purposes). Imagine an artificially intelligent system, not a complex language agent or LLM, capable of running a simple mathematical algorithm. It performs no other tasks; all it does is output numbers according to some randomized formula. Now imagine that, every so often, when producing an output, it experiences the color red. This experience (call it a mental state) doesn’t arrive valenced; to use Kagan’s phrase, the artificially intelligent system doesn’t ‘particularly like or dislike experiencing’ red.
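For what it’s worth, here is a hedged sketch of the computational side of Kagan’s toy system, reconstructed from the description above rather than taken from the talk itself; the function name and the particular randomized formula are my own. Everything the system does is in these few lines, and the one thing the thought experiment adds, the nonvalenced experience of red, is exactly what no line of code captures.

```python
# A hedged reconstruction of Kagan's toy system: all it does, computationally,
# is emit numbers from a randomized formula. The occasional, nonvalenced
# "experience of red" is stipulated by the thought experiment; nothing in
# this code implements or explains it.
import random


def run_kagan_system(steps: int = 10) -> None:
    for _ in range(steps):
        output = random.gauss(0.0, 1.0)  # the system's entire task
        print(f"{output:.4f}")
        # Every so often, the thought experiment says, the system
        # experiences red. There is no line of code for that.


if __name__ == "__main__":
    run_kagan_system()
```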

Would this AI demand moral consideration? All it does is experience some phenomenal qualia, without enjoying them or wanting them or hoping that they stop. Maybe a close comparison is a traffic light. Only, a traffic light that’s aware of its colors changing. Would this sort of sensory awareness, if we can go as far as to characterize it as that, be enough to oblige us to show it the same moral consideration we’d show our neighbor’s dog, or even our neighbor? If not, at what point can we confidently declare something conscious, and thus a moral subject?

Arguably, in G. and D.K.G.’s view, this minute addition of consciousness (and would it even be phenomenal consciousness? I’d argue yes, though I’m inclined to think G. and D.K.G. would say no)[1] would still not satisfy the demands of moral patienthood. To wit, in their words:

(N)ow imagine that (an) agent [a previously introduced phenomenal zombie; i.e., a being like us with no phenomenal states] has a single persistent phenomenal experience of a homogenous white visual field.

G. and D.K.G. argue that it would be strange if the experience of this homogenous white field were the determining factor in the agent’s qualifying for moral patienthood. After all, the agent’s subjective experience of a ‘homogenous white field’ plays no role at all in its desires; it does not contribute to its welfare.

Does that seem right? Two hardcore academic philosophers’ opinions aside, mere civilians’ opinions (mine among them) may lean otherwise. A simple flicker of something intangible, some ineffable qualia or nonvalenced ephemera, experienced even intermittently by some digital being, strikes me as crossing a border, stepping over a line, bringing us firmly into the land of phenomenal awareness and, by extension, moral patienthood.

Of course, G. and D.K.G. ultimately leave it open in their paper whether there are current AIs that are phenomenally conscious.

Still—in their view, even consciousness might not be a necessary condition for moral patienthood.

There’s a lot to peel apart here. A few thousand words isn’t enough to do justice to the dialectical tactics, maneuvers, and implications of their paper. One criticism an idle reader might come up with, owing to the top-down framework of the argument, is that what G. and D.K.G. have really done, rather than establish a dialectically impressive and surprising characterization of the inner ‘mental’ world of artificially intelligent systems, is cleverly slot the functional profile of an AI into leading (and broad) theories of what’s sufficient for consciousness. In other words, by loose definition and analogy, they’ve sneaked their AI system into the realm of possible moral patienthood thanks to its satisfaction of the conditions of pre-eminent theories of belief and desire. Not only that, but they’re presuming that the reader agrees with these pre-eminent theories in the first place, which leaves a good number of places for the idle reader to poke holes. This isn’t really a criticism of their work: within the first few paragraphs of their paper, in fact, G. and D.K.G. make it perfectly clear to the reader that the above is exactly what they’re setting out to do. Still, some readers may feel that, the quality of G. and D.K.G.’s thought, arguments, and research notwithstanding, there is something AI systems lack when it comes to their cognitive processes, something animal consciousness possesses and digital systems simply don’t.

To phrase it another way: a skeptic of their argument might still be inclined to say that there is a spark of something, call it life, call it a glimmer of something intangible and ineffable, a quality that animal consciousness seems to have possessed (in some fashion or to some extent) since time immemorial, that we are perhaps simply unable to put into words.

The problem is, as stated, we just can’t define that thing.

And so, that leaves the force of G.’s and D.K.G’s argument intact.

It’s possible (and again, my own intuitions lean this way) that the crux of the argument comes down to a matter of what we might call a system’s or agent’s substrate. A substrate is the layer, the place, wherein the system’s or agent’s core computations or neural processes take place. Skeptics of G. and D.K.G.’s argument might insist that this place, that layer, has to be organic: flesh and blood.

Of course, this objection comes with its own issues. It very well could be that this insistence on a biological substrate being a necessary condition for consciousness is outdated, outmoded, an anthropomorphic misjudgment. We know so little of the world, and of the world of consciousness; why should we believe we get to determine where the magic of consciousness happens?

Also, it very well could be that the idea of consciousness is an anthropomorphic conceit. Maybe there really is no such thing. Maybe consciousness is a spectrum, a range, with no definite level or boundary dictating where it begins or ceases.

We like to believe that, as humans, we know what consciousness is. If there is an alarming element to this paper, something frightening in the sense that it unsettles our previously set-in-stone intuitions, it’s that consciousness is a much broader term than we imagined.

What implications this might have for the agents, human and non-human, we interact with every day—that remains to be seen.

[1] For our purposes, we’ll refer to the Internet Encyclopedia of Philosophy for a definition of phenomenal states (https://iep.utm.edu/cognitive-phenomenology/). Phenomenal states are mental states such that there is something it is like for their subjects to be in them; they are states with a phenomenology.