A Sketching Look at AI Ethics

Saturday, August 12, 2017

It's common to mock economics as useless and completely detached from reality. Yet the fact that algorithms as simple as regret matching or multiplicative weights converge on various important equilibrium concepts (algorithms we believe occur frequently, in some form, in nature) tells us that the core ideas of game theory are in fact very useful. The mistake has been to confuse the simplifying assumptions chosen so human economists could hand-wave a defense of flawed policy advice (unrealistic utility functions, simple two-player games, zero-sum settings, intractable equilibrium concepts) for proof that the subject lacks utility. In actuality, game theory is extremely natural, merely poorly applied. We should move from tired discussions of zero-sum games, Nash equilibria, the prisoner's dilemma and tit-for-tat (which does not scale) to discussions of coordination, cooperation, welfare and fairness. And as it turns out, there has been a lot of recent work in that area; the only problem is its focus on ad markets and auctions, and there is no reason it can't be adapted to improve human lives. Some aspects of modern economics are indistinguishable from online machine learning or computational game theory. It's a large subject deserving of more attention. If we choose this path, the question then arises: is it ethical to turn over important decision making to machines?
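To make the claim about simple algorithms concrete, here is a minimal sketch of regret matching in self-play on rock-paper-scissors. The payoff matrix is the standard one; the loop is illustrative rather than optimized. In a two-player zero-sum game the time-averaged play of both regret matchers approaches the minimax strategy, which for rock-paper-scissors is the uniform mix:

```python
import random

# Rock-paper-scissors payoffs for player 0; player 1 gets the negative.
# Illustrative sketch of regret matching, not production code.
def payoff(a, b):
    return [[0, -1, 1], [1, 0, -1], [-1, 1, 0]][a][b]

def regret_matching_strategy(regrets):
    """Play each action with probability proportional to its positive regret."""
    pos = [max(r, 0.0) for r in regrets]
    s = sum(pos)
    if s <= 0:
        return [1/3, 1/3, 1/3]  # no positive regret: fall back to uniform
    return [p / s for p in pos]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

random.seed(0)
regrets = [[0.0] * 3, [0.0] * 3]
counts = [[0] * 3, [0] * 3]
T = 50000
for _ in range(T):
    acts = [sample(regret_matching_strategy(regrets[i])) for i in range(2)]
    for i in range(2):
        counts[i][acts[i]] += 1
        u = payoff(acts[0], acts[1]) * (1 if i == 0 else -1)
        for alt in range(3):
            # cumulative regret for not having played `alt` instead
            alt_u = payoff(alt, acts[1]) if i == 0 else -payoff(acts[0], alt)
            regrets[i][alt] += alt_u - u

freqs = [c / T for c in counts[0]]  # empirical play, drifts toward 1/3 each
```

Nothing in the update rule knows what an equilibrium is; the players only track how much better each alternative action would have done, yet the long-run frequencies settle near the game's solution.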

Current Debate on Ethics

The debate on AI ethics is currently dominated by three archetypes. I'll summarize them before arguing for another area that's at least as important, and possibly more so.

The Bias and Inequality Crowd

In this line of work, researchers point out how blindly following algorithms can lead to bad outcomes by amplifying bias and strengthening inequality. Although a few look at how to ensure that AI doesn't end up as yet another tool to suppress humans (by working on energy-efficient or decentralized learning algorithms), most of the work is on how algorithms amplify bias.

Text is highly structured: it crystallizes projections of thought and embeds the way humans use language, so simple algorithms can display high competence with little understanding. As an example, consider the output of a generative text model I recently came up with (not a Markov chain, not an RNN, similar to both; post coming soon). Prompting it with the phrase "the prefrontal cortex is" yields the following (not cherry-picked):

  1. the prefrontal cortex is the attention to prediction of the world

  2. the prefrontal cortex is the control in the mind

  3. the prefrontal cortex is the control of the general factor of intelligence

  4. the prefrontal cortex is the control attention

  5. the prefrontal cortex is the control in the human brain

None of these appear anywhere in my corpus of training documents, yet it's eerie how good a summary they are of the relevant literature. Training it on a Voat corpus and then prompting it on blacks or Jews, compared with a Wikipedia corpus, would yield very different results. This is not reflective of reality; it's reflective of how the humans of a culture use language and what they write about. What humans think with their ~10 watts of conscious cognition (assuming uniform process/energy use out of 20 watts total, with most processing not conscious) is very limited and can be divorced from reality.
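My model is unpublished, but the point about corpus statistics doesn't depend on it: even a plain bigram chain (a stand-in I'm using here, not the model above; the toy corpus is invented for illustration) will complete a prompt in whatever register its training text supplies:

```python
import random
from collections import defaultdict

# Hypothetical toy corpus; swap in Voat vs Wikipedia text and the
# completions change character, with no "understanding" involved.
corpus = ("the prefrontal cortex is the seat of planning . "
          "the prefrontal cortex is involved in attention and control . "
          "attention is the control of processing in the brain .")

# Bigram counts: for each word, the list of words observed to follow it.
tokens = corpus.split()
follows = defaultdict(list)
for w, nxt in zip(tokens, tokens[1:]):
    follows[w].append(nxt)

def complete(prompt, n_words=6, seed=0):
    """Extend the prompt by sampling from bigram statistics."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(n_words):
        choices = follows.get(out[-1])
        if not choices:
            break
        out.append(rng.choice(choices))
    return " ".join(out)

print(complete("the prefrontal cortex is"))
```

Every word the chain emits is drawn from what its authors wrote, which is exactly why the bias it reflects is the culture's, not the algorithm's.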

My main criticism of the bias work is the idea that there can be an algorithmic correction. There's a great deal of danger in a false sense of complacency, thinking the job well done once biases are "corrected". Worse, who is to say those corrections are even appropriate when applied inflexibly at scale? Indeed, beyond the inability to effectively enumerate the space of biases, the larger, more pressing problem is the combinatorial possibilities of seemingly innocuous individual actions. No system that trains until average losses are minimized, thereby assuming some stationary distribution, will be able to match the need for nuance and context in reality.

So you've handled race and gender issues. Have you corrected the sentiment bias of "short bus"? What about the countless other combinations you're unaware of? Or consider correcting for bias in health outcomes when an insurer is focused on above-normal profits (going beyond the bare minimum required to stay running). That would be putting the cart before the horse. Rather than play a continuous game of whack-a-mole, it would be better to create a culture of not blindly deferring to AI judgement in any area that could materially affect a human life. The policy aspect will make a lot more difference than almost anything that can be done with algorithms (until we get AIs that aren't blindly performant).

The AI Risk Crowd

The general idea here is to slow down work on AI to ensure no one develops AIs that view humans as inconsequential. MIRI leads the algorithmic work on the topic, and places like the Future of Life Institute lead the policy aspect. However, I do not believe anyone knows how to mitigate these claimed risks, or even has any inkling as to how they might come about, beyond very vague arguments. Despite this, and while I do not agree with the claimed magnitude of the risk nor with the framing, I do not think anyone has prepared a proper counter-argument for why the AI Risk Crowd is mistaken (I think it's possible to prepare such an argument; I've just not seen it yet). That said, while I agree that we should build kind, caring AIs which are respectful of living things and their ecosystems (weighted by their flexibility in traversing 'information manifolds'), I don't particularly care for the value-alignment, nothing-else-is-as-important framing.

The We Will "Democratize" AI Crowd

This one is mostly corporations generating slogans to keep humans at ease. Boards are created, names are listed, chants are chanted but nothing meaningful will likely ever come of it. You see, it is difficult to trust an entity that cannot imagine a future in which you are not as thoroughly dependent as possible on it.


The regret/bandit/expert algorithms which featured heavily in this post are algorithms which already run the (digital) world. Although deep learning gets a lot of public attention, it's some kind of bandit algorithm that places ads, manages auctions or perturbs site designs in order to squeeze money and attention from their human playmates. Although not something I personally agree with, it's a non-issue compared with scenarios where such algorithms are turned to subverting and redirecting attention on policy issues instead of mere purchase decisions. What tools can be provided for defense? I don't see anyone moving to democratize that.
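As a sketch of the kind of loop that places ads, here is epsilon-greedy, about the simplest bandit algorithm there is. The click-through rates are invented and real platforms run more sophisticated variants, but the structure, explore a little and exploit the best-looking arm, is the same:

```python
import random

# Hypothetical click-through rates for three ad variants; no real platform data.
true_ctr = [0.02, 0.05, 0.11]

def epsilon_greedy(n_rounds=50000, epsilon=0.1, seed=1):
    """Show ads; mostly exploit the best observed arm, explore 10% of the time."""
    rng = random.Random(seed)
    clicks, shows = [0] * 3, [0] * 3
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(3)  # explore a random variant
        else:
            # exploit: the variant with the highest observed click rate
            arm = max(range(3),
                      key=lambda a: clicks[a] / shows[a] if shows[a] else 0.0)
        shows[arm] += 1
        clicks[arm] += rng.random() < true_ctr[arm]  # simulated user click
    return shows

shows = epsilon_greedy()
```

After enough rounds the highest-paying variant absorbs nearly all impressions. Nothing here needs to understand the ad, the user, or the consequences; the same loop pointed at political attention instead of purchases is the scenario I worry about.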

The Neglected but Essential Kind of Ethics

And yet, as important as the above are, I think what subsumes them all is thinking about the ethics of algorithmic approaches to ensuring the welfare of cities, humans, animals and ecosystems. Is it ethical not to use these approaches, or is it true that some humanity is lost by algorithmically deciding important policy issues? It's clear that algorithms can't yet (and maybe never will) autonomously make important decisions which affect human agency. There's, however, a case to be made for a combined approach.

For example, we saw in this post that Nash equilibria are hard to reach and might not always be great. We also saw that correlated equilibria can lead to quite fair results, and while they are easy to compute, they are not obvious to set up. And even when we can set them up, how will people take to suggestions handed down by faceless algorithms? How do we ensure such a process doesn't end up as something to worship or enslave?
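To make "easy to compute but not obvious to set up" concrete, here is a checker for the correlated equilibrium incentive constraints on a Chicken-style game (the payoff numbers are the classic textbook choice, not from this post). A mediator's signal distribution is a correlated equilibrium exactly when no player can gain by deviating from its recommended action:

```python
# Chicken-style game; payoffs[(a1, a2)] = (u1, u2). Classic illustrative numbers.
payoffs = {
    ("D", "D"): (0, 0), ("D", "C"): (7, 2),
    ("C", "D"): (2, 7), ("C", "C"): (6, 6),
}
actions = ("D", "C")

def is_correlated_equilibrium(p, tol=1e-9):
    """Given its recommended action, no player gains by deviating."""
    for player in (0, 1):
        for rec in actions:          # recommended action
            for dev in actions:      # candidate deviation
                gain = 0.0
                for other in actions:
                    joint = (rec, other) if player == 0 else (other, rec)
                    devj = (dev, other) if player == 0 else (other, dev)
                    gain += p.get(joint, 0.0) * (
                        payoffs[devj][player] - payoffs[joint][player])
                if gain > tol:
                    return False
    return True

# A mediator signal: never (D, D), mix the asymmetric outcomes and (C, C).
# This is the well-known welfare-maximizing correlated equilibrium of this game.
p = {("D", "C"): 0.25, ("C", "D"): 0.25, ("C", "C"): 0.5}
```

Verifying the distribution is four nested loops; designing the signal, and getting two parties to trust the mediator sending it, is the hard part.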

Getting multiple agents with differing goals to cooperate is extremely difficult. There's plenty of work on single-agent machine learning and reinforcement learning, but not nearly as much on learning cooperative agents. If you are worried about AI risk, then I think you should also be concerned by this imbalance. With better cooperative-agent modelling we might begin to approximate policy impact better than anything that came before. At some point we could design systems where those affected by a policy contribute detailed descriptions of their preferences and actions. The system would then use cooperative agent modelling to approximately compute some fair correlated equilibrium or aggregate welfare concept. While superficially similar, this collective decision making differs from a planned economy in that it is ad hoc and scopes only over the collective making, and affected by, the decisions.

When the issues are too complex for such an approach, a more bottom-up alternative would be to model complex agents in a scenario approximating the decision conflict. People could search over and test different reward functions in order to guide the design of policy whose effect is to get people to behave optimally with respect to some shared goal (one which selfish behaviors could easily thwart). Again, all affected members could give detailed concerns and input, and discuss policy in terms of how the model is affected. Though the model will have far from perfect coverage of outcomes, it would still be miles better than the current approach of sparsely sampled scalar votes influenced by ever more sophisticated, technologically amplified propaganda. The results will be approximate and imperfect, but by making the simulations interactive with the participants, a powerful problem-solving capability could be gained. While much of this is probably not near (needing either good language AIs or more people good at programming; note too that this isn't calling for laws as programs, only for models to help predict outcomes), there are aspects which could be studied today.
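A toy version of searching over reward functions: in a linear public-goods game (parameters invented for illustration), best-response dynamics under the selfish material reward converge to free-riding, while adding a shared-welfare term to the reward flips the outcome to full contribution:

```python
# Toy linear public-goods game; parameters are illustrative, not calibrated.
N, MULT = 4, 1.6  # agents, pot multiplier (1 < MULT < N, so free-riding dominates)

def material_payoff(contribs, i):
    """Keep your endowment minus contribution, plus an equal share of the pot."""
    return 1 - contribs[i] + MULT * sum(contribs) / N

def best_response_dynamics(reward, steps=20):
    """Each agent in turn switches to whichever action its reward prefers."""
    contribs = [0, 1, 0, 1]  # arbitrary starting profile
    for _ in range(steps):
        for i in range(N):
            options = []
            for c in (0, 1):
                trial = contribs[:i] + [c] + contribs[i + 1:]
                options.append((reward(trial, i), c))
            contribs[i] = max(options)[1]
    return contribs

selfish = lambda cs, i: material_payoff(cs, i)
# Redesigned reward: each agent values everyone's payoff (a shared-goal term).
welfare = lambda cs, i: sum(material_payoff(cs, j) for j in range(N))

free_riding = best_response_dynamics(selfish)   # converges to no contributions
cooperative = best_response_dynamics(welfare)   # converges to full contribution
```

The game is the same in both runs; only the reward function, the thing being searched over, changes, and with it the behavior the policy induces.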

Smart Contracts

One way to set up correlated equilibria is by studying various specifications of problems and noting how different reward functions lead to different results. Auditable, traceable cryptographic smart contracts are one way to achieve this. Imagine contracts between developed and developing cities targeting climate emissions. On one hand it's important to reduce CO2 emissions; on the other, the developing city might say: "we are tired of being mired in poverty". One way around this would be to search for a policy that places both actors in a correlated equilibrium or better. Smart contracts allow for this: they are one way to route conditional exchanges such that no corrupt local actor can steal. Additionally, agreements preregistering purchase decisions, a time-gated method tracking the proper allocation of any citizen grants, and measurable quotas and trades, each conditional on future dates and automatically verifiable obligations, just might allow a fairer outcome for all. The contract mechanism then acts as an impartial mediator ensuring a positive correlated outcome.
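Stripped of any actual chain, the conditional-release logic such a contract would encode can be sketched in a few lines. The class, quota, and deadline here are hypothetical stand-ins; a real deployment would run on-chain with externally verifiable oracle data rather than trusted inputs:

```python
from dataclasses import dataclass

# Hypothetical escrow logic for an emissions agreement; illustrative only.
@dataclass
class EmissionsEscrow:
    funds: float      # grant held in escrow
    quota: float      # verified emissions must stay at or below this
    deadline: int     # release only at or after this date (e.g., epoch day)
    released: float = 0.0

    def attempt_release(self, verified_emissions: float, today: int) -> float:
        """Pay out only when the measurable, time-gated obligation is met."""
        if today >= self.deadline and verified_emissions <= self.quota:
            payout, self.funds = self.funds, 0.0
            self.released += payout
            return payout
        return 0.0  # obligation unmet: funds stay locked, nothing to divert

contract = EmissionsEscrow(funds=1_000_000.0, quota=50.0, deadline=100)
```

The point of the mechanism is that the release condition is checked by code against verifiable measurements, so a corrupt intermediary never holds the funds at all.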

By setting up mechanism-design-tested policies we can get favorable results with higher probability. But one distant ethical issue is ensuring that agents don't get so sophisticated that the question of their deserving rights becomes ascendant. If such a scenario is possible, then there must be a computable theory of consciousness and ethics, and we should figure those out before building anything too sophisticated. Sounds far-fetched? Well, I'm not saying it's imminent; I'm only saying it would be really terrible if it happened in some distant future and caught us all off guard, such that we were then motivated to continue not noticing how horrible a thing we were doing.

Machines of Loving Grace?

The other glaring issue is: would we be losing our humanity by handing off important decision making to algorithms? For me the answer is clear: the human mind is also just algorithms. If we can design a system to help us arrive at fairer decisions than we otherwise would have, say because complexity limits our brains' ability to fully traverse the space of consequences, then nothing is lost in adhering. The inputs, goals and motivations would be human-decided. Objecting to this, I believe, would be like refusing to live in a house because it was built using Caterpillar heavy machinery, or because it was 3D-printed or prefabricated by robots. Humans were the designers and architects; the structure exists to provide positive utility for the humans it contains.

Furthermore, I consider the agency and individuality of humans mythologized. Most of a person's personality is settled by chance events, genetics and experience accumulated from childhood. In adulthood, it's a system of cultural expectations and traditions which strongly binds the directions any individual might take. We would not be trading away some glorious past of human agency: not a past filled with trails of tears, slavery, indentured servitude and conscription into pointless wars. In the past, any kind of literacy was scarce, with only a tiny portion of the population able to indulge in scientific and philosophical thought. Nor should we forget that under limited diffusion of ideas, people became echoes of their neighbors and even worshipped their rulers as gods, with some cultures valuing human sacrifice.

Far from outsourcing ourselves into technology, extending our minds into the environment is a defining trait of humanity, ever since the invention of language. A larger portion of humanity than ever is improved in its ability to reason and handle abstraction [4]. Note that all humans are capable of achieving this gain; consider, as an example, how understanding permutations confers a reasoning advantage. We've long outsourced thought into abacuses, tools, books, quipu, tabulating machines and, lately, computers. In turn, we've gained access to more readily available knowledge and tools, and the ability to administer ever more complex societies.


In this essay, I note that, done correctly, auditable cryptographic smart contracts could act as neutral mediators to achieve non-trivial correlated equilibria. I give the example of cities cooperating to reduce emissions in a way that works no matter how corrupt the local government is.

I then talk about ethics in AI: how highlighting bias is important but dealing with it algorithmically is misguided. I mention AI risk, my skepticism of its urgency, and my dissatisfaction with all counterarguments so far. I also mention the case of modeled agents becoming too sophisticated.


I mentioned the difficulty of getting multiple agents to cooperate and the benefits of doing more work in this area in order to guide real-world policy, moving beyond the toy, completely unrealistic scenarios favored by traditional economics. I also note that a lot of interesting work has been done in the areas of fairness and welfare, though mostly on how better to place and monetize ads. It seems to be working out well.

I argue that allowing algorithms to play a role in human decision making is no more giving up our humanity than allowing a robot to build a prefabricated home. In both cases, human labor is better applied to design and goal-setting, not to energy-intensive labor humans are ill-suited to (such as lifting heavy objects or searching combinatorial spaces). I also argue that the past was not really some bastion of agency, free thinking or freedom; there's more of that now than ever, and augmenting our capabilities with computational devices will likely lead to better welfare outcomes.
