2.9 Artificial Moral Agents

If one takes machine ethics to concern moral agents, in some substantial sense, then these agents can be called “artificial moral agents”, having rights and responsibilities. However, the discussion about artificial entities challenges a number of common notions in ethics and it can be very useful to understand these in abstraction from the human case (cf. Misselhorn 2020; Powers and Ganascia forthcoming).

Several authors use “artificial moral agent” in a less demanding sense, borrowing from the use of “agent” in software engineering, in which case matters of responsibility and rights will not arise (Allen, Varner, and Zinser 2000). James Moor (2006) distinguishes four types of machine agents: ethical impact agents (e.g., robot jockeys), implicit ethical agents (e.g., safe autopilot), explicit ethical agents (e.g., using formal methods to estimate utility), and full ethical agents (who “can make explicit ethical judgments and generally is competent to reasonably justify them. An average adult human is a full ethical agent”). Several ways to achieve “explicit” or “full” ethical agents have been proposed: programming the ethics in (operational morality), “developing” the ethics itself (functional morality), and finally full-blown morality with full intelligence and sentience (Allen, Smit, and Wallach 2005; Moor 2006). Programmed agents are sometimes not considered “full” agents because they are “competent without comprehension”, just like the neurons in a brain (Dennett 2017; Hakli and Mäkelä 2019).

In some discussions, the notion of “moral patient” plays a role: Ethical agents have responsibilities while ethical patients have rights because harm to them matters. It seems clear that some entities are patients without being agents, e.g., simple animals that can feel pain but cannot make justified choices. On the other hand, it is normally understood that all agents will also be patients (e.g., in a Kantian framework). Usually, being a person is supposed to be what makes an entity a responsible agent, someone who can have duties and be the object of ethical concerns. Such personhood is typically a deep notion associated with phenomenal consciousness, intention and free will (Frankfurt 1971; Strawson 1998). Torrance (2011) suggests “artificial (or machine) ethics could be defined as designing machines that do things that, when done by humans, are indicative of the possession of ‘ethical status’ in those humans” (2011: 116)—which he takes to be “ethical productivity and ethical receptivity” (2011: 117)—his expressions for moral agents and patients.

2.9.1 Responsibility for Robots

There is broad consensus that accountability, liability, and the rule of law are basic requirements that must be upheld in the face of new technologies (European Group on Ethics in Science and New Technologies 2018, 18), but the issue in the case of robots is how this can be done and how responsibility can be allocated. If the robots act, will they themselves be responsible, liable, or accountable for their actions? Or should the distribution of risk perhaps take precedence over discussions of responsibility?

Traditional distribution of responsibility already occurs: A car maker is responsible for the technical safety of the car, a driver is responsible for driving, a mechanic is responsible for proper maintenance, the public authorities are responsible for the technical conditions of the roads, etc. In general:

The effects of decisions or actions based on AI are often the result of countless interactions among many actors, including designers, developers, users, software, and hardware.… With distributed agency comes distributed responsibility. (Taddeo and Floridi 2018: 751).

How this distribution might occur is not a problem that is specific to AI, but it gains particular urgency in this context (Nyholm 2018a, 2018b). In classical control engineering, distributed control is often achieved through a control hierarchy plus control loops across these hierarchies.

2.9.2 Rights for Robots

Some authors have indicated that it should be seriously considered whether current robots must be allocated rights (Gunkel 2018a, 2018b; Danaher forthcoming; Turner 2019). This position seems to rely largely on criticism of the opponents and on the empirical observation that robots and other non-persons are sometimes treated as having rights. In this vein, a “relational turn” has been proposed: If we relate to robots as though they had rights, then we might be well-advised not to search whether they “really” do have such rights (Coeckelbergh 2010, 2012, 2018). This raises the question how far such anti-realism or quasi-realism can go, and what it means then to say that “robots have rights” in a human-centred approach (Gerdes 2016). On the other side of the debate, Bryson has insisted that robots should not enjoy rights (Bryson 2010), though she considers it a possibility (Gunkel and Bryson 2014).

There is a wholly separate issue whether robots (or other AI systems) should be given the status of “legal entities” or “legal persons”, in the sense in which natural persons, but also states, businesses, or organisations, are “entities”: namely, they can have legal rights and duties. The European Parliament has considered allocating such status to robots in order to deal with civil liability (EU Parliament 2016; Bertolini and Aiello 2018), but not criminal liability, which is reserved for natural persons. It would also be possible to assign only a certain subset of rights and duties to robots. It has been said that “such legislative action would be morally unnecessary and legally troublesome” because it would not serve the interest of humans (Bryson, Diamantis, and Grant 2017: 273). In environmental ethics there is a long-standing discussion about the legal rights for natural objects like trees (C. D. Stone 1972).

It has also been said that the reasons for developing robots with rights, or artificial moral patients, in the future are ethically doubtful (van Wynsberghe and Robbins 2019). In the community of “artificial consciousness” researchers there is a significant concern whether it would be ethical to create such consciousness since creating it would presumably imply ethical obligations to a sentient being, e.g., not to harm it and not to end its existence by switching it off—some authors have called for a “moratorium on synthetic phenomenology” (Bentley et al. 2018: 28f).

2.10 Singularity

2.10.1 Singularity and Superintelligence

In some quarters, the aim of current AI is thought to be an “artificial general intelligence” (AGI), contrasted with a technical or “narrow” AI. AGI is usually distinguished from traditional notions of AI as a general purpose system, and from Searle’s notion of “strong AI”:

computers given the right programs can be literally said to understand and have other cognitive states. (Searle 1980: 417)

The idea of singularity is that if the trajectory of artificial intelligence reaches up to systems that have a human level of intelligence, then these systems would themselves have the ability to develop AI systems that surpass the human level of intelligence, i.e., they are “superintelligent” (see below). Such superintelligent AI systems would quickly self-improve or develop even more intelligent systems. This sharp turn of events after reaching superintelligent AI is the “singularity” from which the development of AI is out of human control and hard to predict (Kurzweil 2005: 487).

The fear that “the robots we created will take over the world” had captured human imagination even before there were computers (e.g., Butler 1863) and is the central theme in Čapek’s famous play that introduced the word “robot” (Čapek 1920). This fear was first formulated as a possible trajectory of existing AI into an “intelligence explosion” by Irving Good:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion”, and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. (Good 1965: 33)

The optimistic argument from acceleration to singularity is spelled out by Kurzweil (1999, 2005, 2012), who essentially points out that computing power has been increasing exponentially, i.e., doubling ca. every 2 years since 1970 in accordance with “Moore’s Law” on the number of transistors, and will continue to do so for some time in the future. He predicted (Kurzweil 1999) that by 2010 supercomputers would reach human computation capacity, by 2030 “mind uploading” would be possible, and by 2045 the “singularity” would occur. Kurzweil talks about an increase in computing power that can be purchased at a given cost, but of course in recent years the funds available to AI companies have also increased enormously: Amodei and Hernandez (2018 [OIR]) thus estimate that in the years 2012–2018 the actual computing power available to train a particular AI system doubled every 3.4 months, resulting in a 300,000x increase, not the 7x increase that doubling every two years would have created.
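The contrast between the two doubling periods can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function names are hypothetical, and the elapsed period is not fixed by the text, so the sketch works backwards from the quoted factors. On this calculation the 300,000x and 7x figures correspond to somewhat different periods (roughly 62 and 67 months), so both are best read as rounded estimates.

```python
import math


def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Factor by which a quantity grows if it doubles every `doubling_months`."""
    return 2.0 ** (elapsed_months / doubling_months)


def months_needed(factor: float, doubling_months: float) -> float:
    """Months required to grow by `factor` at the given doubling period."""
    return math.log2(factor) * doubling_months


if __name__ == "__main__":
    # Doubling periods quoted in the text: 3.4 months vs. 24 months (2 years).
    fast, slow = 3.4, 24.0

    # Period implied by each quoted factor (an inference, not a figure from the source).
    print(f"300,000x at 3.4-month doubling takes {months_needed(300_000, fast):.1f} months")
    print(f"7x at 24-month doubling takes {months_needed(7, slow):.1f} months")

    # Growth over the same assumed 62-month window under both regimes.
    window = 62.0  # assumption chosen for illustration only
    print(f"fast regime: {growth_factor(window, fast):,.0f}x")
    print(f"slow regime: {growth_factor(window, slow):.1f}x")
```

Whatever exact window is assumed, the gap between the two regimes spans several orders of magnitude, which is the point of the comparison.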

A common version of this argument (Chalmers 2010) talks about an increase in “intelligence” of the AI system (rather than raw computing power), but the crucial point of “singularity” remains the one where further development of AI is taken over by AI systems and accelerates beyond human level. Bostrom (2014) explains in some detail what would happen at that point and what the risks for humanity are. The discussion is summarised in Eden et al. (2012); Armstrong (2014); Shanahan (2015). There are possible paths to superintelligence other than computing power increase, e.g., the complete emulation of the human brain on a computer (Kurzweil 2012; Sandberg 2013), biological paths, or networks and organisations (Bostrom 2014: 22–51).

Despite obvious weaknesses in the identification of “intelligence” with processing power, Kurzweil seems right that humans tend to underestimate the power of exponential growth. Mini-test: If you walked in steps in such a way that each step is double the previous, starting with a step of one metre, how far would you get with 30 steps? (Answer: a little over a million kilometres, almost 3 times the distance to the Moon.) Indeed, most progress in AI is readily attributable to the availability of processors that are faster by orders of magnitude, larger storage, and higher investment (Müller 2018). The actual acceleration and its speeds are discussed in Müller and Bostrom (2016) and Bostrom, Dafoe, and Flynn (forthcoming); Sandberg (2019) argues that progress will continue for some time.
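The mini-test above can be verified directly. The sketch below sums the 30 doubling steps and compares the total with the distance to the Moon. The figure of about 384,400 km for the mean Earth-Moon distance is an assumption added here, not a value from the text, and the function name is hypothetical.

```python
# Assumption: mean Earth-Moon distance of roughly 384,400 km, in metres.
MOON_DISTANCE_M = 384_400_000


def total_distance(steps: int, first_step_m: int = 1) -> int:
    """Total distance walked when each step is double the previous one."""
    return sum(first_step_m * 2**k for k in range(steps))


if __name__ == "__main__":
    walked = total_distance(30)
    print(f"distance after 30 steps: {walked:,} m")
    print(f"multiples of the Earth-Moon distance: {walked / MOON_DISTANCE_M:.2f}")
```

The sum of the first n doubling steps is 2^n − 1 metres, so each further step roughly doubles the whole distance covered so far.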

The participants in this debate are united by being technophiles in the sense that they expect technology to develop rapidly and bring broadly welcome changes—but beyond that, they divide into those who focus on benefits (e.g., Kurzweil) and those who focus on risks (e.g., Bostrom). Both camps sympathise with “transhuman” views of survival for humankind in a different physical form, e.g., uploaded on a computer (Moravec 1990, 1998; Bostrom 2003a, 2003c). They also consider the prospects of “human enhancement” in various respects, including intelligence—often called “IA” (intelligence augmentation). It may be that future AI will be used for human enhancement, or will contribute further to the dissolution of the neatly defined human single person. Robin Hanson provides detailed speculation about what will happen economically in case human “brain emulation” enables truly intelligent robots or “ems” (Hanson 2016).

The argument from superintelligence to risk requires the assumption that superintelligence does not imply benevolence—contrary to Kantian traditions in ethics that have argued higher levels of rationality or intelligence would go along with a better understanding of what is moral and better ability to act morally (Gewirth 1978; Chalmers 2010: 36f). Arguments for risk from superintelligence say that rationality and morality are entirely independent dimensions—this is sometimes explicitly argued for as an “orthogonality thesis” (Bostrom 2012; Armstrong 2013; Bostrom 2014: 105–109).

Criticism of the singularity narrative has been raised from various angles. Kurzweil and Bostrom seem to assume that intelligence is a one-dimensional property and that the set of intelligent agents is totally ordered in the mathematical sense, but neither discusses intelligence at any length in their books. Generally, it is fair to say that despite some efforts, the assumptions made in the powerful narrative of superintelligence and singularity have not been investigated in detail. One question is whether such a singularity will ever occur: it may be conceptually impossible, practically impossible or may just not happen because of contingent events, including people actively preventing it. Philosophically, the interesting question is whether singularity is just a “myth” (Floridi 2016; Ganascia 2017), and not on the trajectory of actual AI research. This is something that practitioners often assume (e.g., Brooks 2017 [OIR]). They may do so because they fear the public relations backlash, because they overestimate the practical problems, or because they have good reasons to think that superintelligence is an unlikely outcome of current AI research (Müller forthcoming-a). This discussion raises the question whether the concern about “singularity” is just a narrative about fictional AI based on human fears. But even if one does find negative reasons compelling and the singularity not likely to occur, there is still a significant possibility that one may turn out to be wrong. Philosophy is not on the “secure path of a science” (Kant 1787: B15), and maybe AI and robotics aren’t either (Müller 2020). So, it appears that discussing the very high-impact risk of singularity has justification even if one thinks the probability of such singularity ever occurring is very low.

2.10.2 Existential Risk from Superintelligence

Thinking about superintelligence in the long term raises the question whether superintelligence may lead to the extinction of the human species, which is called an “existential risk” (or XRisk): The superintelligent systems may well have preferences that conflict with the existence of humans on Earth, and may thus decide to end that existence—and given their superior intelligence, they will have the power to do so (or they may happen to end it because they do not really care).

Thinking in the long term is the crucial feature of this literature. Whether the singularity (or another catastrophic event) occurs in 30 or 300 or 3000 years does not really matter (Baum et al. 2019). Perhaps there is even an astronomical pattern such that an intelligent species is bound to discover AI at some point, and thus bring about its own demise. Such a “great filter” would contribute to the explanation of the “Fermi paradox”: why there is no sign of life elsewhere in the known universe despite the high probability of it emerging. It would be bad news if we found out that the “great filter” is ahead of us, rather than an obstacle that Earth has already passed. These issues are sometimes taken more narrowly to be about human extinction (Bostrom 2013), or more broadly as concerning any large risk for the species (Rees 2018), of which AI is only one (Häggström 2016; Ord 2020). Bostrom also uses the category of “global catastrophic risk” for risks that are sufficiently high up the two dimensions of “scope” and “severity” (Bostrom and Ćirković 2011; Bostrom 2013).

These discussions of risk are usually not connected to the general problem of ethics under risk (e.g., Hansson 2013, 2018). The long-term view has its own methodological challenges but has produced a wide discussion: Tegmark (2017) focuses on AI and human life “3.0” after singularity, while Russell, Dewey, and Tegmark (2015) and Bostrom, Dafoe, and Flynn (forthcoming) survey longer-term policy issues in ethical AI. Several collections of papers have investigated the risks of artificial general intelligence (AGI) and the factors that might make this development more or less risk-laden (Müller 2016b; Callaghan et al. 2017; Yampolskiy 2018), including the development of non-agent AI (Drexler 2019).

2.10.3 Controlling Superintelligence?

In a narrow sense, the “control problem” is how we humans can remain in control of an AI system once it is superintelligent (Bostrom 2014: 127ff). In a wider sense, it is the problem of how we can make sure an AI system will turn out to be positive according to human perception (Russell 2019); this is sometimes called “value alignment”. How easy or hard it is to control a superintelligence depends significantly on the speed of “take-off” to a superintelligent system. This has led to particular attention to systems with self-improvement, such as AlphaZero (Silver et al. 2018).

One aspect of this problem is that we might decide a certain feature is desirable, but then find out that it has unforeseen consequences that are so negative that we would not desire that feature after all. This is the ancient problem of King Midas, who wished that all he touched would turn into gold. This problem has been discussed using various examples, such as the “paperclip maximiser” (Bostrom 2003b), or the program to optimise chess performance (Omohundro 2014).

Discussions about superintelligence include speculation about omniscient beings, the radical changes on a “latter day”, and the promise of immortality through transcendence of our current bodily form—so sometimes they have clear religious undertones (Capurro 1993; Geraci 2008, 2010; O’Connell 2017: 160ff). These issues also pose a well-known problem of epistemology: Can we know the ways of the omniscient (Danaher 2015)? The usual opponents have already shown up: A characteristic response of an atheist is

People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world (Domingos 2015)

The new nihilists explain that a “techno-hypnosis” through information technologies has now become our main method of distraction from the loss of meaning (Gertz 2018). Both opponents would thus say we need an ethics for the “small” problems that occur with actual AI and robotics (sections 2.1 through 2.9 above), and that there is less need for the “big ethics” of existential risk from AI (section 2.10).

