The prisoner’s dilemma is a term used in game theory to describe certain non-zero-sum situations in which rational, self-interested individuals make choices that lead to suboptimal results. Many games are zero sum, in that a positive result for one side (+1) entails an equal loss (−1) for the other (+1 − 1 = 0). In a prisoner’s dilemma, however, the outcome is often negative for both parties. Economists, mathematicians, and psychologists, among others, use game theory to observe and predict the choices people make when faced with various outcomes. In these games, each player has preferences among the possible outcomes and a set of choices; each knows the options open to the other party but does not know exactly how the other will behave. The outcomes are not fixed but depend on the choices the players make. Two-player games of this type are useful because they provide objective quantitative data about rational choice under variable conditions.
The name prisoner’s dilemma comes from a story developed in 1950 by the mathematician Albert Tucker, who was trying to explain a problem that arises in games devised by his colleagues Merrill Flood and Melvin Dresher as part of their work for the RAND Corporation. The narrative varies in its particulars but sets up a paradoxical dynamic in which individual benefit is balanced against mutual gain. Classically, two suspects are separated, and then an interrogator, who has sufficient evidence only for a minor charge, makes a proposition to each suspect separately: Whoever confesses first and implicates the other will get a plea bargain and a small fine, while the accomplice will face the harshest charge possible and a consequent long sentence. Each suspect gets to think about the deal and slip a note under the jail door by morning. This dilemma leaves the individual with two distinct choices, to confess or to keep quiet, but the outcome depends on what the other person does. If one keeps quiet, that person will only do well if the other suspect remains quiet as well; if the other person confesses, the nonconfessor will end up in prison for a long time. Each prisoner reasons that he or she is better off confessing irrespective of what the partner does. Yet paradoxically the best mutual outcome would result from not confessing. Central to the dilemma is the fact that the actors have to operate in the absence of full information and trust. Left to ponder what is in one’s personal best interest, each prisoner’s best rational choice (called an “equilibrium”) is to minimize the risks posed by the various options and confess as quickly as possible.
The dilemma is often represented graphically, with rows representing the choices of one party and columns the choices of the other (see Table 1). Thus, if one confesses while the other keeps quiet, the result would be that the one who kept quiet has a significant negative outcome, whereas the confessor benefits. Similarly if both confess, then there is a negative result for both.
The setup means that both parties will have a common set of individual preferences. Often the choices are given the more value-laden terms “cooperation” (c) and “defection” (d). The best individual ranking of payoffs would be confession when the other is silent (d/c), followed by mutual silence (c/c), then mutual confession (d/d), and finally, keeping quiet while being implicated by the partner (c/d). Still, both prisoners are reasoning the same way at the same time, with the result that as a group they are worse off than if they could have cooperated more. The game thus leads to an outcome that is Pareto suboptimal. In other words, there are other choices that the players could have made that would have left both better off without either being made worse off.
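The idea of Pareto suboptimality can be made concrete with a short sketch. The payoff values below are the conventional ones from Table 1 (years lost, so higher numbers are better); the function name is illustrative, not a standard library call.

```python
# Outcomes are keyed by (row choice, column choice):
# "c" = keep quiet (cooperate), "d" = confess (defect).
# Payoffs are (row player, column player), from Table 1.
payoffs = {
    ("d", "d"): (-3, -3),
    ("d", "c"): (0, -6),
    ("c", "d"): (-6, 0),
    ("c", "c"): (-1, -1),
}

def pareto_dominates(a, b):
    """True if outcome a is at least as good as b for both players
    and strictly better for at least one of them."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

# Mutual silence (c, c) leaves both players better off than mutual
# confession (d, d), so the equilibrium outcome is Pareto suboptimal.
print(pareto_dominates(payoffs[("c", "c")], payoffs[("d", "d")]))  # True
```

Note that the one-sided outcomes (d/c and c/d) are not Pareto dominated by mutual silence, since the defector would be made worse off; only mutual defection is.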
In game theory terms, the rational dynamic that leads the players to choose as they do is known as a dominant strategy. Anyone faced with the dilemma is forced to choose what to do regardless of the other person’s choice, and in this case it makes the most sense to confess in the absence of full information. The game is also symmetrical in that both parties are given identical choices and are aware of each other’s preference ordering.
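The claim that confession is a dominant strategy can be checked mechanically: whatever the other player does, confessing yields the row player a strictly better payoff. A minimal sketch, again assuming the conventional Table 1 payoffs:

```python
# Payoffs are (row player, column player); "c" = keep quiet, "d" = confess.
payoffs = {
    ("d", "d"): (-3, -3), ("d", "c"): (0, -6),
    ("c", "d"): (-6, 0),  ("c", "c"): (-1, -1),
}

def is_dominant(strategy):
    """True if `strategy` gives the row player a strictly better payoff
    than the alternative, whatever column the other player chooses."""
    other = "c" if strategy == "d" else "d"
    return all(
        payoffs[(strategy, col)][0] > payoffs[(other, col)][0]
        for col in ("c", "d")
    )

print(is_dominant("d"))  # confessing dominates: True
print(is_dominant("c"))  # keeping quiet does not: False
```

By symmetry the same holds for the column player, which is why mutual confession (d/d) is the game's equilibrium.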
Once we recognize the dilemma, we can see it in business and in everyday life, from nuclear disarmament treaties to gas pricing at stations across the street from each other. For example, most airlines would like to get rid of their frequent flyer programs, which reward travelers with free seats at given reward levels and thus deprive the airlines of potential revenue. However, it is illegal for the airlines to collude, and therefore, they have to make moves in the market that make assumptions about the way their competitors will behave. If all airlines simultaneously abandoned their frequent flyer programs, they would all be better off. However, if one announces that it is going to, the others are faced with the choice of doing so as well (cooperating) or capitalizing on the market opportunity to make short-term gains (defecting). So although they would all be better off dropping the programs, in the absence of full trust and knowledge about the future behavior of others, none is willing to make the first move.
Table 1. Payoffs (row player, column player)

|            | Confess | Keep quiet |
|------------|---------|------------|
| Confess    | −3, −3  | 0, −6      |
| Keep quiet | −6, 0   | −1, −1     |
Another everyday example is where traffic flows in a single direction on a two-lane highway. If one of the lanes is blocked due to roadwork, signals advise motorists to merge to form a single lane. Drivers then have the choice to either slow down and allow cars in the blocked lane to merge gradually or race ahead and cut in at the front. It would be mutually beneficial for all motorists to have a slower but constant flow of traffic; but if there is any suspicion that someone will not cooperate, then the drivers are faced with the choice of enduring the consequent stop-and-go traffic caused by the defector or becoming defectors themselves. Similarly, there will be a temptation for political rivals to implement a negative campaign that attacks the other side even though both realize that they would both benefit from not doing so.
The prisoner’s dilemma is especially pertinent when analyzing cases where a limited resource is held in common but there are incentives for individual gain at the cost of the general welfare, such as exploiting the environment. Thus, we can see that if fishing grounds are depleted, it makes sense for everyone concerned to agree to wait until stocks have a chance to replenish. At the same time, there are potentially huge rewards for someone who defects from the agreement. If everyone thinks the same way, then it will be rational, if not moral, to defect from a ban on fishing.
The prisoner’s dilemma is a form of mixed motive game in that the preference orderings can be adjusted so that it is not always in someone’s best interest to cooperate or defect. In the classic case above, the order is d/c > c/c > d/d > c/d, where “>” represents the preferred outcome. Other orderings have been given individual labels too. The sequence d/c > c/c > c/d > d/d represents the game of “chicken” made famous by teen movies in the 1950s. Opposing parties engage in a destructive course of action, such as driving cars toward each other, and the winner is the one who steers away (cooperates) last. The best outcome is to stay on track while the other car swerves away. However, there are no rewards and a huge downside if no one veers and a crash occurs. This game of chicken is the sequence set up by brinksmanship or hardball bargaining in business. There are great benefits if one side can cause the other to cave in, but if both act in the same adversarial way, then they are likely to lose out on a potentially profitable deal and spoil their future relationship at the same time. Lengthy labor strikes reflect this outcome.
The order c/c > d/c > d/d > c/d is sometimes called a stag hunt, after a story from Rousseau. Here, people are engaged in a cooperative enterprise that none could succeed at individually. However, if a smaller reward presents itself to one of the participants—such as an easily caught rabbit—the temptation is to abandon the team project and go for the surer reward. Again, if everyone behaves similarly, then they are all better off seeking their own pickings, but the worst outcome is to be operating for the benefit of the team when everyone else is out for themselves. This case illustrates what happens when a group project lacks strong unanimity of purpose or trust in the ultimate outcome.
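The three named games differ only in the row player's preference ordering over the four outcomes. A small sketch, using the text's notation with orderings written best-first (the dictionary and function names are illustrative):

```python
# Preference orderings (best outcome first) for the named symmetric games
# discussed above; "d/c" means the row player defects while the other
# cooperates, and so on.
GAME_NAMES = {
    ("d/c", "c/c", "d/d", "c/d"): "prisoner's dilemma",
    ("d/c", "c/c", "c/d", "d/d"): "chicken",
    ("c/c", "d/c", "d/d", "c/d"): "stag hunt",
}

def classify(ordering):
    """Look up the conventional name for a preference ordering."""
    return GAME_NAMES.get(tuple(ordering), "unnamed mixed-motive game")

print(classify(["d/c", "c/c", "c/d", "d/d"]))  # chicken
```

The swap that turns a prisoner's dilemma into chicken is instructive: moving d/d to the bottom means mutual defection (the crash) is now the worst outcome, so defection is no longer dominant.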
So far, the games described have been symmetrical and one-time choices. Considerable research has gone into studying the effects of changing these variables. The payoffs may be adjusted, and sometimes each party will have a different preference order. For example, if one party were very rich so that the marginal utility for the profit and loss would be relatively less than it would be for a poorer player, the situation allows the rich side to play chicken since it could accommodate a mutually unfavorable result, whereas the poorer player has a traditional prisoner’s dilemma ordering.
Other factors may affect the way the prisoner’s dilemma is played. Conditions may be relaxed so that the parties may confer, for example. Although communication sometimes increases cooperation, it also gives players the opportunity to set up sham agreements and lie to each other.
If we imagine our prisoners pondering what to do, it will make a difference if they have dealt with each other previously. In repeated (or iterated) games, where the payoff matrix is known, certain strategies emerge as more successful over time. Robert Axelrod has run a number of computer-versus-computer tournaments, and it turns out that when players can punish each other through defection or, alternatively, reward each other through cooperation, the most successful strategy is the one labeled “tit for tat” (TFT), whereby one initially cooperates and thereafter reciprocates the move made by the other party. Axelrod describes the program as nice in that it is never the first to defect, retaliatory in that it penalizes defection, forgiving in that it does not aim to punish beyond the move at hand, and clear insofar as its strategy is very explicit. In computer tournaments (more than 120,000 moves), TFT survived better than any other program.
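A round-robin tournament in the spirit of Axelrod's experiments can be sketched in a few lines. The payoff values are the standard tournament ones (temptation 5, reward 3, punishment 1, sucker 0), but the three strategies here are simple illustrations, not Axelrod's actual entrants.

```python
import itertools

# Payoffs per round, keyed by (my move, their move): "C" = cooperate,
# "D" = defect; values are (my score, their score).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_hist, their_hist):
    return their_hist[-1] if their_hist else "C"  # nice, then reciprocate

def always_defect(my_hist, their_hist):
    return "D"

def always_cooperate(my_hist, their_hist):
    return "C"

def play(s1, s2, rounds=200):
    """Play an iterated game and return both players' total scores."""
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        score1, score2 = score1 + p1, score2 + p2
        h1.append(m1)
        h2.append(m2)
    return score1, score2

# Round robin, including self-play as in Axelrod's tournaments.
strategies = [tit_for_tat, always_defect, always_cooperate]
totals = {s.__name__: 0 for s in strategies}
for s1, s2 in itertools.combinations_with_replacement(strategies, 2):
    a, b = play(s1, s2)
    totals[s1.__name__] += a
    totals[s2.__name__] += b

print(totals)  # tit_for_tat ends with the highest total here
```

Even against this tiny field, the tournament reproduces the qualitative result: always-defect exploits unconditional cooperators but forfeits the gains of sustained cooperation, while TFT collects them wherever they are available.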
The prisoner’s dilemma itself is a rational exercise, and therefore, the lessons we draw from it will be prudential but not necessarily moral. The research implies that over repeated encounters, each side will be better off cooperating rather than seeking short-term gain. Trust and reputation have considerable benefits because they allow parties to reach optimal solutions instead of defaulting to behavior that focuses solely on defensive postures. These insights can certainly be used to develop a practical ethics and have been used to explain the development of altruism within a self-interested population.
Unlike computers, humans bring a range of emotions and psychological drives that often make their actions subrational in a technical sense. Individuals often bring a desire to do better than the other side no matter what the cost (the so-called auction dynamic), a desire for vengeance, a need to maintain a notion of personal integrity (e.g., never to squeal or defect even in the face of considerable incentives), or numerous other factors that influence play. This reality leads some commentators to suggest that corporations with clear mandates may be more rational than humans. However, this logic suggests that corporations may just be strategic players, with an expedient egoistic morality.
The prisoner’s dilemma and similar games are necessarily artificial and do not represent the full richness of human interaction. Nevertheless, by paring down complex issues into straightforward choices, they provide useful quantitative data for many areas of social science research.
See also: Altruism; Auction Market; Decision-Making Models; Equilibrium; Free Riders; Game Theory; Marginal Utility; Nash Equilibrium; Negotiation and Bargaining; Prudence; Reciprocal Altruism; Rousseau, Jean-Jacques; Tragedy of the Commons