Zero-determinant alliances in
multiplayer social dilemmas
arXiv:1404.2886v1 [q-bio.PE] 10 Apr 2014
Christian Hilbel , Ame Traulsen2, Bin Wu2 & Martin A. Nowalc"
Program for Evolutionary Dynamics, Harvard University, Cambridge,
MA 02138, USA
2 EvolutionaryTheory Group, Max-Planck-Institute for Evolutionary Biology,
August-Thienemann-StraBe 2, 24306 Plan, Germany
3 Department of Organismic and Evolutionary Biology, Department of
Mathematics, Harvard University, Cambridge, MA 02138
April 11, 2014
Direct reciprocity and conditional cooperation are important mechanisms to prevent free
riding in social dilemmas. But in large groups these mechanisms may become ineffective,
because they require single individuals to have a substantial influence on their peers. How-
ever, the recent discovery of the powerful class of zero-determinant strategies in the iterated
prisoner's dilemma suggests that we may have underestimated the degree of control that
a single player can exert. Here, we develop a theory for zero-determinant strategies for
multiplayer social dilemmas, with any number of involved players. We distinguish several
particularly interesting subclasses of strategies: fair strategies ensure that the own payoff
matches the average payoff of the group; extortionate strategies allow a player to perform
above average; and generous strategies let a player perform below average. We use this the-
ory to explore how individuals can enhance their strategic options by forming alliances. The
effects of an alliance depend on the size of the alliance, the type of the social dilemma, and
on the strategy of the allies: fair alliances reduce the inequality within their group; extor-
tionate alliances outperform the remaining group members; but generous alliances increase
welfare. Our results highlight the critical interplay of individual control and alliance forma-
tion to succeed in large groups.
Keywords: evolutionary game theory; zero-determinant strategies; cooperation; public goods
game; volunteer's dilemma;
EFTA01074265
Cooperation among self-interested individuals is generally difficult to achieve (1, 2), but typically
the free rider problem is aggravated even further when groups become large (3—8). In small com-
munities, cooperation can often be stabilized by forms of direct and indirect reciprocity (9-16).
For large groups, however, it has been suggested that these mechanisms may turn out to be in-
effective, as it becomes more difficult to keep track of the reputation of others, and because the
individual influence on others diminishes (3-7). To prevent the tragedy of the commons, and to
compensate for the lack of individual control, many successful communities have thus established
central institutions that enforce mutual cooperation (17-20).
However, a recent discovery suggests that we may have underestimated the amount of control
that single players can exert in repeated games. For the repeated prisoner's dilemma, Press and
Dyson (21) have shown the existence of zero-determinant strategies (or ZD strategies), which
allow a player to unilaterally enforce a linear relationship between the own payoff and the co-
player's payoff — irrespective of the co-player's actual strategy. The class of zero-determinant
strategies is surprisingly rich: For example, a player who wants to ensure that the own payoff will
always match the co-player's payoff can do so by applying a fair ZD strategy, like Tit-for-Tat. On
the other hand, a player who wants to outperform the respective opponent can do so by slightly
tweaking the Tit-for-Tat strategy to the own advantage, thereby giving rise to extortionate ZD
strategies. The discovery of such strategies has prompted several theoretical studies, exploring
how different ZD strategies evolve under various evolutionary conditions (22-28).
However, ZD strategies are not confined to pairwise games. In a recently published study,
it was shown that ZD strategies also exist in repeated public goods games (29); and herein we
will show that such strategies exist for all symmetric social dilemmas, with an arbitrary number
of participants. We also use this theory to demonstrate how zero-determinant strategists can even
further enhance their strategic options by forming alliances. As we will show, the impact of an
alliance depends on its size, the type of the social dilemma and on the applied strategy of the allies:
while fair alliances reduce inequality within their group, extortionate alliances strive for unilateral
advantages, with larger alliances being able to enforce even more extortionate relationships. These
results suggest that when a single individual's strategic options are limited, forming an alliance
may result in a considerable leverage.
To obtain these results, we consider a repeated social dilemma between it players. In each
round of the game players can independently decide whether to cooperate (C) or to defect (D).
A player's payoff depends on the player's own decision, and on the decisions of all other group
members (see Fig. S1A): in a group in which j of the other group members cooperate, a co-
operator receives the payoff /kb whereas a defector obtains bj. We assume that payoffs satisfy
the following three properties that are characteristic for social dilemmas (corresponding to the
Individual-centered' interpretation of altruism in (30)): (0 Irrespective of the own strategy, play-
ers prefer the other group members to cooperate (ai+i > aj and b7+1 > bj for all j). (ii) Within
2
EFTA01074266
Number of cooperators
n. n.2 .... 2 1 0
among co-players
Cooperator's payoff an-, an-2 a2 at ao
Defectors payoff bn-r bn-2 b2 b, be
B 3
Public goods game Volunteers Dilemma
Defector 2
2 bream
boWee,7:-b
1 .1
a. IMooperator
0 Ccoper101aio-cr
arc04
-1
o d
0 2 4 8 8 10 0 2 4 6 8 10
limber of cooperating op-players Number of cooperating co-players
C Alliance Outsiders
Figure S I: Illustration of the model assumptions for repeated social dilemmas. (A) We consider
symmetric social dilemmas in which each player can either cooperate or defect. The player's
payoff depends on the own decision, and on the number of other group members who decide to
cooperate. (B) We will discuss two particular examples: the public goods game (in which payoffs
are proportional to the number of cooperators), and the volunteers dilemma (as the most simple
example of a nonlinear social dilemma). (C) Alliances are defined as a collection of individuals
who coordinate on a joint ZD strategy. We refer to the set of group members that are not part of
the alliance as outsiders. Outsiders are not restricted to any particular strategy; in particular, they
may choose a joint strategy themselves.
any mixed group, defectors obtain strictly higher payoffs than cooperators (bi n > aj for all j).
(iii) Mutual cooperation is favored over mutual defection (an_ t > b,,). To illustrate our results, we
will discuss two particular examples of multiplayer games (see Fig. S I B). In the first example, the
public goods game (31), cooperators contribute an amount c > 0 to a common pool, knowing that
total contributions are multiplied by r (with 1 Cr< n) and evenly shared among all group mem-
bers. Thus, a cooperator's payoff is aj = rc(j + 1)/n — c, whereas defectors yield bj = rajIn. In
the second example, the volunteer's dilemma (32), at least one group member has to volunteer to
bear a cost c > 0 in order for all group members to derive a benefit b > c. Therefore, cooperators
obtain aj = b — c (irrespective of j) while defectors yield bi = b if j ≥ 1 and bo = 0. Both
examples (and many more, such as the collective-risk dilemma, (6, 7, 33)) are simple instances of
3
EFTA01074267
multiplayer social dilemmas.
We assume that the social dilemma is repeated, such that individuals can react to their co-
players' past actions (for simplicity, we will focus here on the case of an infinitely repeated game,
but our results can easily be extended to the finite case, see Supporting Information). As usual,
payoffs for the repeated game are defined as the average payoff that players obtain over all rounds.
In general, strategies for such repeated games can become arbitrarily complex, as subjects may
condition their behavior on past events and on the round number in non-trivial ways. Neverthe-
less, as in pairwise games (21), zero-determinant strategies are surprisingly simple — they depend
on the outcome of the last round only. In contrast to most previous studies on repeated games,
however, we do not presume that individuals act in isolation. Instead, we allow subjects to form
an alliance, which can be considered as a collection of players who coordinate on a joint strategy
(see Fig. SIC). In the following, we will discuss how the strategic power of such an alliance de-
pends on the number of involved players, on the social dilemma, and on the applied ZD strategy
of the allies.
Results
Zero-determinant strategies in large-scale social dilemmas. ZD strategies are particular memory-
one strategies (4, 21, 34, 35); they only condition their behavior on the outcome of the previous
round. Memory-one strategies can be written as a vector (ps,j), where psi denotes the probability
to cooperate in the next round, given that the player previously played S E (C, D}, and that j of
the co-players cooperated. ZD strategies have a particular form (see also Supporting Information):
players with a ZD strategy set their cooperation probabilities such that
Pet'? = 1 + [(1 — s)(i — — en (b2-F1 a3)]
(1)
PD,j c6[(1 — s)(i — om . (b)
71 43-01,
where a j and bi are the specific payoffs of the social dilemma (as outlined in Fig. 1), and where
1, s, and > 0 are parameters that can be chosen by the player. While these ZD strategies may
appear inconspicuous, they give players an unexpected control over the resulting payoffs of the
game, as we will show below.
Instead of presuming that players act in isolation, as in previous models of zero-determinant
strategies (21-29) we explicitly allow subjects to form alliances, and to coordinate on some joint
ZD strategy. Let k be the number of allies, with 1 < k < n — 1 (in particular, this covers the
case k = 1 of solitary alliances). In the Supporting Information we prove that if each of the allies
4
EFTA01074268
applies the same ZD-strategy with parameters 1, 8, and 4), then payoffs satisfy the equation
= sAITA + (1 - 84 15 (2]
where TrA is the mean payoff of the allies, Tr _A is the mean payoff of all outsiders, and
SA _ s(rt — 1) — (k — 1)
(3)
n—k
Relation [2] suggests that by using a ZD strategy, alliances exert a direct influence on the payoffs
of the outsiders. This relation is remarkably general, as it is independent of the specific social
dilemma being played, and as it is fulfilled irrespective of the strategies that are adopted by the
outsiders (in particular, outsiders are not restricted to memory-one strategies; it even holds if some
or all of the outsiders coordinate on a joint strategy themselves). We call the parameter 1 the
baseline payoff, s the slope of the applied ZD strategy, and 5,4 the effective slope of the alliance.
In the special case of a single player forming an alliance, k = 1, the effective slope of the alliance
according to Eq. [3) simplifies to 5„4 = s.
The parameters 1, s, and 0 of a ZD strategy cannot be chosen independently, as the entries psj
of a ZD strategy are probabilities that need to satisfy 0 ≤ psj ≤ 1. In general, the admissible
parameters depend on the specific social dilemma being played. In the Supporting Information we
show that exactly those relations [2] can be enforced for which either &A = s = 1 (in which case
the parameter I in the definition of ZD strategies is irrelevant), or for which the parameters 1 and
5„4 satisfy
max { bi _ k) (61 a3_1)} ≤ i ≤ min { aj + (b3+1 al )} , (4]
Al(nt)
where the maximum and minimum is taken over all possible group compositions, 0 ≤ j ≤ n - 1.
It follows that feasible baseline payoffs are bounded by the payoffs for mutual cooperation and
mutual defection, bo ≤ I ≤ an_1, and that the effective slope needs to satisfy the inequality
—1/(n — ≤ sA ≤ 1. Moreover, as social dilemmas satisfy bj+i > aj for all j, condition [4]
implies that the range of enforceable payoff relations is strictly increasing with the size of the al-
liance — larger alliances are able to enforce more extreme relationships between the payoffs of the
allies and the outsiders. In the following, we will highlight several special cases of ZD strategies,
and we discuss how subjects can increase their strategic power by forming alliances.
Fair alliances. As a first example, let us consider alliances that apply a ZD strategy with slope
8„4 = s = 1. By Eq. [2], such alliances enforce the payoff relation TrA = Tr_A, such that the
allies' mean payoff matches the mean payoff of the outsiders. We call such ZD strategies (and
the alliances that apply them) fair. As shown in Figure 2A, fair strategies do not ensure that
5
EFTA01074269
all group members get the same payoff — due to our definition of social dilemmas, unconditional
defectors always outperform unconditional cooperators, no matter whether the group also contains
fair players. Instead, fair players can only ensure that they do not take any unilateral advantage
of their peers. Interestingly, it follows from Eq. [3] that fair alliances consist of fair players:
because 8A = 1 implies s = 1, each player i of a fair alliance individually enforces the relation
rra = r_i. It also follows from our characterization [4] that such fair ZD strategies exist for all
social dilemmas — irrespective of the particular payoffs and irrespective of the group size.
As an example, let us consider the strategy proportional ft -for-Tat (pTFT), for which the
probability to cooperate is simply given by the fraction of cooperators among the co-players in the
previous round,
pTFTsd — [5)
n — 1*
For pairwise games, this definition of pTFT simplifies to the classical Tit-for-Tat strategy. How-
ever, also for the public goods game and for the volunteer's dilemma, pTFT is a ZD strategy
(because it can be obtained from Eq. [I] by setting s = 1 and 0 = 1/c, with c being the cost of
cooperation). As s = 1, alliances of pTFT players are fair, and they enforce TrA = ir_A. Inter-
estingly, this strategy has received little attention in the previous literature. Instead, researchers
have focused on other generalized versions of lit-for-Tat, which cooperate if at least in co-players
cooperated in the previous round (3, 36), i.e. psj = 0 if j < in and psi = 1 if j > in. Unlike
p TFT, these threshold strategies neither enforce a linear relation between payoffs, nor do they
induce fair outcomes, suggesting that pTFT may be the more natural generalization of Tit-for-Tat
for large social dilemmas.
Extortionate alliances. As another interesting subclass of ZD strategies, let us consider strategies
that choose the mutual defection payoff as baseline payoff, I = bo, and that enforce a positive slope
0 < sA < 1. For such strategies, relation [2] becomes ir _A = sAfrA + (1 — sA)bo, implying that
the outsiders only get a fraction s A of any surplus over the mutual defection payoff. Moreover,
as the slope sA is positive, the payoffs TrA and Tr—A are positively related. As a consequence,
the collective best reply for the outsiders is to maximize the allies' payoffs by cooperating in
every round. In analogy to Press and Dyson (21), we call such alliances extortionate, and we
call the quantity x = 1/sA the extortion factor. Extortionate alliances are particularly powerful
in social dilemmas in which mutual defection leads to the lowest group payoff (as in the public
goods game and in the volunteer's dilemma): in that case, they enforce the relation if_A ≤ TrA;
on average, the allies perform at least as well as the outsiders (as also depicted in Figure 2B).
Similarly to the results for fair alliances, extortionate alliances consist of extortionate players: an
alliance that enforces the baseline payoff I = bo and a slope 0 < sA < 1 requires the allies to use
ZD strategies with I= bo and 0 < s < 1, such that each player i individually enforces the relation
6
EFTA01074270
Fair Alliance B Extortionate Alliance C Generous Alliance
3
Outsiders (n-k=12)
Alliance (k=ft) 2
i ramekeememe Alliance (k=8) Alliance (k=8)
Outsiders (n-k=12) 1 1
Outsiders (n-k=12)
0 0 0
0 20 40 60 80 100 0 20 40 60 80 100 0 20 40 60 80 100
Round Number Round Number Round Number
Figure S2: Characteristic dynamics of payoffs over the course of the game for three different
alliances. Each panel depicts the payoff of each individual group member (thin lines) and the
resulting average payoffs (thick lines) for the alliance (blue) and for outsiders (red). (A) An
alliance that adopts a fair strategy ensures that the payoff of the allies matches the mean payoff of
the outsiders. This does not imply that all outsiders yield the same payoff. (B) For games in which
mutual defection leads to the lowest group payoff, extortionate alliances ensure that their payoffs
are above average. (C) In games in which mutual cooperation is the social optimum, generous
alliances let their co-players gain higher payoffs. The three graphs depict the case of a public
goods game with r = 4, c = 1, group size n = 20, and alliance size k = 8. For the strategies
of the outsiders we have used random memory-one strategies, were the cooperation probabilities
were independently drawn from a uniform distribution. For the strategies of the allies, we have
used (A) pTF7', (B)pET with s = 0.8, (C) pec with s = 0.8.
IT-i = slrq + - s)bo.
For the specific example of a public goods game, let us consider the ZD strategy pE1 with
1 = 0, (b = 1/c, and 0 < s < 1, for which Eq. [1] becomes
423 = 4n r - (1 - s) [6]
pEs = WL7 — (1 — s)i
In the limit of s i 1, these extortionate strategies approach the fair strategy pTF7'. However,
as s decreases from 1, the cooperation probabilities of p Ez are increasingly biased to the own
advantage (with the probabilities ppi decreasing more rapidly than the probabilities pf.,Ti ). As
with fair strategies, such extortionate strategies exist for all repeated social dilemmas. However,
in large groups the power of alliances to extort their peers depends on the social dilemma, and on
the size of the alliance (as generally described by condition [4]).
For example, for single-player alliances (k = 1) in the public goods game, the feasible extor-
tion factors x are bounded when groups become large, with Xmax = — 1)r/((n — 1)r —
being the maximum extortion factor (see also (29)). To be able to enforce arbitrarily high extor-
tion factors, players need to form an alliance such that the fraction of alliance members exceeds
a critical threshold. By solving condition [4] for the case of extortionate coalitions with infinite
7
EFTA01074271
extortion factors (i.e., I = 0 and sA = 0), this critical threshold can be calculated explicitly as
k r—1
[7)
n r
Only for alliances that have this critical mass, there are no bounds to extortion.
Generous alliances. As the benevolent counterpart to extortioners, Stewart and Plotkin were
first to describe a set of generous strategies for the iterated prisoner's dilemma (22, 26). Unlike
extortioners, generous alliances set the baseline payoff to the mutual cooperation payoff I = a
while still enforcing a positive slope 0 < sA < 1. This results in the payoff relation Tr_A =
sAr A + (1 — sA)a„_1, such that a generous alliances accept a larger share of any loss (compared
to the mutual cooperation payoff a.„_1). In particular, for games in which mutual cooperation
is the optimal outcome (as in the public goods game and in the prisoner's dilemma, but not in
the volunteer's dilemma), the payoff of a generous player satisfies rA < _A (see also Fig. 2C
depicting the case of a public goods game). As with fair and extortionate alliances, generous
alliances consist of players that are individually generous.
For the example of a public goods game, we obtain a particularly simple generous ZD strategy
pGe by setting I = rc — c, = 1/c, and 0 < s < 1, such that
Pg'7i = +(1 8)Q1—.÷1
[8]
nee
raj = n-1
In parallel to the extortionate strategy discussed before, these generous strategies approach pTFT
in the limit of s = 1, whereas they enforce more generous outcomes for s < 1. Again, generous
strategies exist for all social dilemmas, but the extent to which players can be generous depends
on the particular social dilemma, and on the size of the alliance.
Equalizers. As a last interesting class of ZD strategies, let us consider alliances that choose
s = (k. — 1)1(n — 1), such that by Eq. [3] the effective slope becomes sA = 0. By Eq. [2],
such alliances enforce the payoff relation ir_A = &meaning that they have unilateral control over
the mean payoff of the outsiders (for the prisoner's dilemma, such equalizer strategies were first
discovered by (37)). However, as with extortionate and generous strategies, equalizer alliances
need to reach a critical size to be able to determine the outsiders' payoff; this critical size depends
on the particular social dilemma, and on the imposed payoff! (the exact condition can be obtained
from [4] by setting sA = 0).
For the example of a public goods game, a single player can only set the co-players' mean
score if the group size is below n < 2r/(r — 1). For larger group sizes, players need to form
8
EFTA01074272
Strategy Typical Prisoner's
Public goods game Volunteer's dilemma
class property dilemma
Fair = Always Always Always
1T—A
strategies ITA exist exist exist
In large groups, single players
Extortionate C 7.01 Always cannot be arbitrarily extortionate, Even large alliances cannot
IT-A
strategies exist but sufficiently large alliances be arbitrarily extortionate
can be arbitrarily extortionate
In large groups, single players
Generous -A > ?TA Always cannot be arbitrarily generous, Do not ensure that own
IT
strategies exist but sufficiently large alliances payoff is below average
can be arbitrarily generous
May not be feasible for single Only feasible if the size of
Always
Equalizers 7T-A = 1 players, but is always feasible for the alliance is k = n — 1,
exist
sufficiently large alliances can only enforce I = b — c
Table I: Strategic power of different ZD strategies for three different social dilemmas. In the
repeated prisoner's dilemma, single players can exert all strategic behaviors (21, 26, 27). Other
social dilemmas either require players to form alliances in order to gain sufficient control (as in
the public goods game), or they only allow for limited forms of control (as in the volunteer's
dilemma).
alliances, with
k > (n — 2)(r — 1)
(9)
n n + (n — 2)r
being the minimum fraction of alliance members that is needed to dictate the outsiders' payoff.
Although the right hand side of Eq. [91 is monotonically increasing with group size, equalizer
alliances are always feasible; in particular, alliances of size k = n — 1 can always set the payoff
of the remaining player to any value between 0 and re — c.
Strategic power of different ZD strategies. Table 1 gives an overview for these four strategy
classes for three examples of social dilemmas. It shows that while generally ZD strategies exist
for all group sizes, the power of single players to enforce particular outcomes typically diminishes
or disappears in large groups. Forming alliances allows players to increase their strategic scope.
The impact of a given alliance, however, depends on the specific social dilemma: while alliances
can become arbitrarily powerful in public goods games, their strategic options remain limited in
the volunteer's dilemma.
While fair, extortionate, and generous alliances enforce different payoff relations, simulations
suggest that each of these strategy classes has its particular strength when facing unknown oppo-
9
EFTA01074273
Extortionate Fair Generous
Alliance Alliance Alliance
• S. .-
10
T. 0 5
esCA
O. al
g 2
es v
a) to 1
CC
8 go 0.03
0". 0.02
>c
o.oi
S
2.0 C
ao 1.5 -s-
ro c
- a'? 1.0 t st - - - -•
m¢
° B 0.5
0.0
1 2 3 4 5 6
Size of Alliance
Figure S3: The effect of different alliance strategies and various alliance sizes. Each panel shows
the outcome of simulated public goods games in which the alliance members interact with n — k
random co-players (uniformly taken from the set of memory-one strategies). We compare the
success of different alliances along three dimensions: the relative payoff advantage of the alliance
(defined as xt4/Tr_A), the payoff inequality within a group (defined as the mean variance between
payoffs of all group members), and the absolute payoff of the alliance (as given by IrA). Simu-
lations suggest that (A) extortionate alliances gain the highest relative payoff advantages, (B) fair
alliances reduce inequality within their group, and (C) sufficiently large generous alliances get the
highest payoffs. For the simulations, we have used a public goods game (r = 3, c = 1) in a group
of size n = 7; data was obtained by averaging over 500 randomly formed groups. The strategy of
the alliance members was pTF7', pEx (with s = 0.85), and pee (with s = 0.85), respectively.
nents (Figure 3). Forming an extortionate alliance gives the allies a relative advantage compared to
the outsiders, and by increasing the alliance's size, allies can enforce more extreme relationships.
10
EFTA01074274
Forming a fair alliance, on the other hand, is an appropriate measure to reduce the payoff inequal-
ity within a group — while the other two behaviors, generosity and extortion, are meant to induce
unequal payoffs (to the own advantage, or to the advantage of the outsiders members, respec-
tively), fair players actively avoid generating further inequality by matching the mean payoff of
the outsiders. Generous alliances, however, are most successful in increasing the absolute payoffs.
While it is obvious that generous alliances are beneficial for the outsiders (and that this positive
effect is increasing in the size of the alliance), Figure 3 suggests that even the allies themselves
may benefit from coordinating on a generous alliance strategy. Fair and extortionate alliances are
programmed to fight back when being exploited; this is meant to reduce the outsiders' payoffs, but
it also reduces the payoffs of the other allies. Therefore, when the alliance has reached a critical
size, it is advantageous to agree on a generous alliance strategy instead (but without being overly
altruistic), as it helps to avoid self-destructive vendettas.
This somewhat unexpected strength of generous strategies is in line with previous evolution-
ary results for the iterated prisoner's dilemma. For this pairwise dilemma, several studies have
reported that generosity, and not extortion, is favored by selection (25-27). Such an effect has
also been confirmed in a recent behavioral experiment, in which human cooperation rates against
generous strategies were twice as high as against extortioners, although full cooperation would
have been the humans' best response in both cases (38). Our results suggest that in multiplayer
dilemmas, generous alliances are able to induce a similarly beneficial group dynamics. In the
Supporting Information we show that if a generous alliance has reached a critical mass, it be-
comes optimal for outsiders to become generous too (independent of the specific social dilemma,
and independent of the strategy of the remaining outsiders). Once this critical mass is achieved,
generosity proves self-enforcing.
Discussion
When subjects lack individual power to enforce beneficial outcomes, they can often improve their
strategic position by joining forces with others. Herein, we have used and expanded the theory of
zero-determinant strategies (21, 25, 26) to explore the role of such alliances in repeated dilemmas.
We have found that three key characteristics determine the effect of an alliance of ZD strategists:
the underlying social dilemma, the size of the alliance, and the strategy of the allies. While subjects
typically have little influence to transform the underlying dilemma, we have shown that they can
considerably raise their strategic power by forming larger alliances, and they can achieve various
objectives by choosing appropriate strategies.
Our approach is based on the distinction between alliance members (who agree on a joint ZD
strategy), and outsiders (who are not restricted to any particular strategy, and who may form an
alliance themselves). This distinction allowed us to show the existence of particularly powerful
11
EFTA01074275
alliances, and to discuss their relative strengths. As an interesting next step of research, we plan
to investigate how such alliances are formed in the first place (which is typically at the core of
traditional models of coalitions, e.g. (39)), and whether evolutionary forces would favor particular
alliances over others (40). The results presented herein suggest that subjects may have various
motives to join forces. As particular examples, we have highlighted extortionate alliances (who
aim for a relative payoff advantage over the outsiders), fair alliances (who aim to reduce the
inequality within their group), and generous alliances (who are able to induce higher payoffs
as they avoid costly vendettas after accidental defections). Whether such alliances emerge and
whether they are stable thus needs to be addressed in light of the respective aim of the alliance:
when subjects are primarily interested in low inequality, then forming a fair alliance is an effective
means to reach this aim; and once a fair alliance is formed, inequity-averse subjects have little
incentives to leave (even if leaving the alliance would allow them to gain higher payoffs).
While we have focused on the effects of alliances in multiplayer social dilemmas, it should be
noted that our results on ZD strategies also apply for solitary alliances, consisting of single players
only. Thus, even if players are unable to coordinate on joint strategies, zero-determinant strategies
are surprisingly powerful. They allow players to dictate linear payoff relations, irrespective of
the specific social dilemma being played, irrespective of the group size, and irrespective of the
counter-measures taken by the outsiders. In particular, we have shown that any social dilemma
allows players to be fair, extortionate, or generous. At the same time, zero-determinant strategies
are remarkably simple. For example, in order to be fair in a public goods game (or in a volunteer's
dilemma), players only need to apply a rule called proportional Tit-for Tat pTFT: if j of the n —1
other group members cooperated in the previous round, then cooperate with probability jfin — 1)
in the following round. Extortionate and generous strategies can be obtained in a similar way, by
slightly modifying pTFT to the own advantage or to the advantage of the outsiders.
While these results were derived for the special case of infinitely repeated games, they can be
extended to the more realistic finite case. In finitely repeated games, end-game effects may prevent
alliances to enforce a perfect linear relation between payoffs; but it is still possible to enforce
an arbitrarily strong correlation between payoffs, provided that the game is repeated sufficiently
often. Similarly, we show in the Supporting Information, that it is not necessary that all alliance
members coordinate on the same ZD strategy, and that different alliance members may apply
different strategies. However, we have focused here on the case of symmetric alliances with a
joint strategy, because they are most powerful: any payoff relationship that can be enforced by
asymmetric alliances with different ZD strategies, can also be enforced by a symmetric alliance.
Overall, our results reveal how single players in multiplayer games can increase their strategic
power by forming beneficial alliances with others, helping them to regain control in large-scale
social dilemmas.
12
EFTA01074276
Acknowledgments
CH gratefully acknowledges generous funding by the SchrOdinger stipend J3475 of the Austrian
Science Fund.
References
[ I] G. Hardin. The tragedy of the commons. Science, 162:1243-1248, 1968.
[2] M. Olson. The Logic of Collective Action: Public Goods and the Theory ofGroups. Harvard Univer-
sity Press, revised edition, 1971.
[3] R. Boyd and P. J. Richerson. The evolution of reciprocity in sizeable groups. Journal of Theoretical
Biology, 132:337-356, 1988.
[4] C. Hauert and H. G. Schuster. Effects of increasing the number of players and memory size in the
iterated prisoner's dilemma: a numerical approach. Proceedings of the Royal Society B, 264:513-519,
1997.
[5] S. Suzuki and E. Akiyama. Evolution of indirect reciprocity in groups of various sizes and comparison
with direct reciprocity. Journal of Theoretical Biology, 245(3):539-552, Apr 2007.
[6] F. C. Santos and J. M. Pacheco. Risk of collective failure provides an escape from the tragedy of the
commons. Proceedings of the National Academy ofSciences USA, 108:10421-10425, 2011.
[7] M. Abou Chakra and A. Traulsen. Evolutionary dynamics of strategic behavior in a collective-risk
dilemma. PLoS Computational Biology, 8:e1002652, 2012.
[8] W. Yang, W. Liu, A. Villa, M. Tuanmu, G. He, T. Dietz, and J. Liu. Nonlinear effects of group size
on collective action and resource outcomes. Proceedings of the National Academy ofSciences USA,
110:10916-10921,2013.
[9] R. L. Trivers. The evolution of reciprocal altruism. The Quarterly Review ofBiology, 46:35-57, 1971.
[10] R. Axelrod. The Evolution of Cooperation. Basic Books, New York, NY, 1984.
[II] R. Alexander. The Biology ofMoral Systems. Aldine de Gruyter, New York, 1987.
[12] C. Wedekind and M. Milinski. Cooperation through image scoring in humans. Science, 288:850-852,
2000.
[13] J. Henrich, R. Boyd, S. Bowles, C. F. Camerer, E. Fehr, H. Gintis, and R. McElreath. Cooperation,
reciprocity and punishment in fifteen small scale societies. American Economic Review, 91:73-78,
2001.
[14] M. A. Nowak and K. Sigmund. Evolution of indirect reciprocity. Nature, 437:1291-1298, 2005.
[ I 5] M. van Veelen, J. Garcia, D. G. Rand, and M. A. Nowak. Direct reciprocity in structured populations.
Proceedings of the National Academy ofSciences USA, 109:9929-9934, 2012.
[16] B. M. Zagorsky, J. G. Reiter, K. Chatterjee, and M. A. Nowak. Forgiver triumphs in alternating
prisoner's dilemma. PLoS One, page DOI: 10.137 I/joumal.pone.0080814, 2013.
[17] T. Yamagishi. The provision of a sanctioning system as a public good. Journal of Personality and
Social Psychology, 51:110-116, 1986.
[18] E. Ostrom. Governing the Commons: The Evolution ofInstitutionsfor Collective Action. Cambridge
Univ. Press, 1990.
13
EFTA01074277
[19] K. Sigmund, H. De Silva, A. Traulsen, and C. Hauert. Social learning promotes institutions for
governing the commons. Nature, 466:861-863, 2010.
[20] F. Guala. Reciprocity: Weak or strong? What punishment experiments do (and do not) demonstrate?
Behavioral andBrain Sciences, 35:1-15, 2012.
[21] W. H. Press and F. D. Dyson. Iterated prisoner's dilemma contains strategies that dominate any
evolutionary opponent. Proceedings of the National Academy of Sciences USA, 109:10409-10413,
2012.
[22] A. J. Stewart and J. B. Plotkin. Extortion and cooperation in the prisoner's dilemma. Proceedings of
the National Academy ofSciences USA, 109:10134-10135,2012.
[23] P. Ball. Physicists suggest selfishness can pay. Nature, page doi:10.1038/nature.2012.11254, 2012.
[24] C. Hilbe, M. A. Nowak, and K. Sigmund. The evolution of extortion in iterated prisoner's dilemma
games. Proceedings of the National Academy ofSciences USA, 110:6913-6918, 2013.
[25] E. Akin. Stable cooperative solutions for the iterated prisoner's dilemma. arXiv, page 1211.0969v2,
2013.
[26] A. J. Stewart and J. B. Plotkin. From extortion to generosity, evolution in the iterated prisoner's
dilemma. Proceedings of the National Academy ofSciences USA, in press, 2013.
[27] C. Hilbe, M. A. Nowak, and A. Traulsen. Adaptive dynamics of exortion and compliance. PLoS One,
8:e77886, 2013.
[28] A. Szolnoki and M. Perc. Evolution of extortion in structured populations. Physical Review E,
89:022804,2014.
[29] L. Pan, D. Hao, Z. Rong, and T Zhou. Zero-determinant strategies in the iterated public goods game.
arXiv, page 1402.3542, 2014.
[30] B. Kerr, P. Godfrey-Smith. and M. W. Feldman. What is altruism? Trends in Ecology and Evolution,
19(3):135-140, Mar 2004.
[31] J. O. Ledyard. Public goods: A survey of experimental research. In John H Kagel and Alvin E Roth,
editors, The Handbook ofExperimental Economics. Princeton Univ. Press, Feb 1995.
[32] A. Diekmann. Volunteer's dilemma. Journal of conflict Resolution, 29:605-610, 1985.
[33] M. Milinski, R. D. Sommerfeld, H.-J. Krambeck, F. A. Reed, and J. Marotzke. The collective-risk so-
cial dilemma and the prevention of simulated dangerous climate change. Proceedings of the National
Academy ofSciences USA, 105(7):2291-2294,2008.
[34] M. A. Nowak and K. Sigmund. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the
Prisoner's Dilemma game. Nature, 364:56-58, 1993.
[35] K. Sigmund. The calculus ofselfishness. Princeton Univ. Press, 2010.
[36] S. Kurokawa and Y. Ihara. Emergence of cooperation in public goods games. Proceedings of the
Royal Society B, 276:1379-1384, 2009.
[37] M. C. Boerlijst, M. A. Nowak, and K. Sigmund. Equal pay for all prisoners. American Mathematical
Monthly, 104:303-307, 1997.
[38] C. Hilbe, T. Rohl, and M. Milinski. Extortion subdues human players but is finally punished in the
prisoner's dilemma. Nature Communications, In press, 2014.
[39] B. Peleg and P. Sudholter. Introduction to the theory ofcooperative games. Springer, Berlin, Heidel-
berg, New York, 2003.
[40] M. Mesterton-Gibbons, S. Gavrilets, J. Gravner, and E. Akcay. Models of coalition or alliance forma-
14
EFTA01074278
lion. Journal of Theoretical Biology, 274:187-204,2011.
15
EFTA01074279
Supporting Information: Zero-determinant alliances in
multiplayer social dilemmas
Christian Hilbel, Ame Traulsen2, Bin Wu2 & Martin A. Nowak"3
Program for Evolutionary Dynamics, Harvard University, Cambridge, MA 02138, USA
2 EvolutionaryTheory Group, Max-Planck-Institute for Evolutionary Biology,
August-Thienemann-Strafe 2, 24306 Plan, Germany
3 Department of Organismic and Evolutionary Biology, Department of Mathematics, Harvard
University, Cambridge, MA 02138
In the following, we develop a theory of zero•determinant strategies (ZD strategies) and alliances for general
multiplayer social dilemmas. We begin by defining the setup of our model of repeated social dilemmas
(Section 1). Thereafter, we derive the existence and the properties of ZD strategies for solitary alliances
with a single player (Section 2), and then for general alliances with an arbitrary number of players (Section
3). Moreover, we explore which ZD strategies give rise to stable Nash equilibria, and we discuss which
alliances are self•enforcing when subjects strive for high payoffs (Section 4). As applications, we study
alliances in the repeated public goods game and in the repeated volunteer's dilemma (Section 5). The
appendix contains the proofs for our results.
1 Setup of the model: Repeated multiplayer dilemmas 17
2 ZD strategies for solitary players 18
3 ZD strategies for alliances 20
4 Self-enforcing alliances 21
5 Applications 23
5.1 Public goods games 23
5.1.1 Solitary players 23
5.1.2 Alliances 24
5.2 Volunteer's Dilemma 25
A Appendix: Proofs 25
16
EFTA01074280
1 Setup of the model: Repeated multiplayer dilemmas
We consider repeated social dilemmas between n players (as illustrated in Fig. I of the main text). In each
round, players may either cooperate (C) or defect (D), and the players' payoffs for each round depend on
their own action, and on the number of cooperators among the other group members. Specifically, in a
group with j other cooperators, a cooperator receives the payoff a1, whereas a defector obtains bd. To
qualify as a social dilemma, we assume that one-shot payoffs satisfy the following three conditions:
(i) Independent of the own action, players prefer their co-players to be cooperative,
aj+1 > ai and bj+1 > bi for all j with 0 ≤ j < n — I. IS 11
(it) Within each mixed group, defectors strictly outperform cooperators,
> (xi for all j with 0 < j < - 2. is'21
(ii) Mutual cooperation is preferred over mutal defection,
an-1 > b0.
As particular examples of such social dilemmas, we discuss the linear public goods game (e.g. (31)) and
the volunteer's dilemma (32) in Section 5.
Let us assume that the social dilemma is repeated for Af rounds (in the main manuscript, we have
focused on the special case M —> co). For such repeated games, a player's strategy needs to specify how
to act in given round, depending on the outcomes of the previous rounds. Given the strategies of all group
members, let us denote player i's expected payoff in round in as 71I (ft* and the average payoff of his
co-players as n ;(310 = E i#1 ari (in)/(n — 1). If the social dilemma is repeated for Al rounds, we define
payoffs for the repeated game as the average payoff per round,
M-1
ri = Iri(m).
Al A-0
m=0
al-1
1541
At m=Il
A—,
For infinitely repeated games, payoffs are then defined by taking the limit Al oo For simplicity, we
assume that these limits exist (which holds, for example, under the realistic assumption that players some-
times commit implementation errors, (35)).
While in general, strategies for repeated games can be arbitrarily complicated, ZD strategies are a subset
of a particularly simple class of strategies called memory-one strategies (21, 34, 35). In finitely repeated
n-player games, such strategies can be written as a vector
P= (po: pan_i, pan_2, • • • ,pc.0; PD,n-1113D.n-2 • • • ,PD.0), 1551
17
EFTA01074281
where po is the player's probability to cooperate in the first round, and ps.i is the probability to cooperate in
round in ≥ 2, given that the player previously played S E {C, D}, and that j of his co-players cooperated.
In infinitely repeated games, the outcome of the first round can often be neglected; in that case one may
drop the entry for po from the representation of memory-one strategies (21, 22, 24-29), yielding
P= (Pan-1, Pan 2 PC.0; PD.n-19PD.n-2 • • • ,PD.0)• [S6]
If the group contains only players with memory-strategies, the calculation of payoffs according to [541
becomes particularly simple. In that case, the repeated game can be described as a Markov chain, as the
outcome in the next round only depends on the outcome of the previous round (4, 34, 35). While the
assumption of memory-one strategies often simplifies calculations, we will show below that the properties
of ZD strategies hold irrespective of the strategies of the other group members (in particular, ZD strategists
do not require their co-players to apply memory-one strategies).
2 ZD strategies for solitary players
As specified in the main text, ZD strategies are particular memory-one strategies. We say that a player
adopts a ZD strategy, if the player chooses three constants 1, s, and 4, 0 0, and then calculates the respective
cooperation probabilities in [S6] as
— —
PC,/ 1 -F [(1 — 3)(1 — ni) — .7. j (634.1 ck,)]
[571
PD,J 4(1 - s)(I — /12 ) + —
n 1 /
(1
2 aj—L)] .
It is the following property of ZD strategies that is central to our analysis:
Proposition 1 (ZD strategies enforce a linear relation between payoffs.)
Suppose player i applies a ZD strategy with parameters 1,.s, and ¢ > 0 in a repeated game with M rounds.
Then payoffs obey the relation
— sic — (1 —4)1l 1 IS81
0111
In particular, in the case of infinitely repeated games, M oo, payoffs satisfy
= sr, + (1 — s)l. [S9]
It is worth to highlight several features of Proposition I. First, the Proposition makes no assumptions on the
strategies of the other group members, nor does it make restrictions on the specific social dilemma. Player
i can thus enforce linear relations of the form [581 and [591, independent of the behavior of i's co-players,
and independent of the exact strategic situation (in fact, the proof of Proposition I does not make use of
the assumption that the game is a social dilemma). In Figure S4, we illustrate this result for two different
social dilemmas (the public goods game and the volunteer's dilemma), and for two different ZD strategies
(a generous and an extortionate ZD strategy). Second, while we have focused on infinitely repeated games
in the main text, relation [S8] shows how our results generalize for finitely many rounds. Even in the finite
18
EFTA01074282
A Public Goods Game B Volunteers Dilemma
Average payoff Average payoff
of co-players of co-players
x -. x_,
• I¢-c.re-q
,b-cf(n-1))
fro'n-c.reln
iin-l)rdti.(n-t}rorn-c)
(0,0
[O. Payoff of Payoff of
focal player focal player
Figure S4: Illustration of ZD-strategies for (A) the linear public goods game and (B) the volun-
teer's dilemma. The blue-shaded area represents all feasible payoffs, with the x-axis representing
the payoff of player i, and with the y-axis representing the mean payoff of i's co-players. The
dashed diagonal gives the payoff combinations for which as = r_i. In both graphs, the strategy
of player i is fixed to some zero-determinant strategy, whereas for the co-players we have sampled
10° random memory-one strategies. Red dots represent the resulting payoff combinations, and the
grey line gives the prediction according to Eq. (S9]. For both graphs, we have considered an in-
finitely repeated game in a group of size xi = 4. Parameters: (A) Public goods game with r = 2.4
and c = 1. For the strategy of player i we have used a generous ZD strategy with parameters
1 = re — c, s = 2/3, 0 = 1/2. (B) Volunteer's dilemma with b = 1.5, c = 1; player 'i applies an
extortionate strategy with parameters /= 0, s = 9/10. sb = 1/2.
case, player i can enforce an arbitrarily strong correlation between the players' payoffs, provided that the
round number Ili is sufficiently large compared to the parameter 0. Third, Proposition I gives a natural
interpretation for the three parameters 1, a and 0. The baseline payoff corresponds to the players' payoffs
if all group members apply the same ZD strategy (in which case r_i = r,); the slope a determines how
strong the payoffs r, and are related; and by (S8] the parameter 0 determines the rate of convergence
with Al.
A player cannot enforce arbitrary payoff relations (S9] because the parameters 1, a and 0 need to be set
such that all cooperation probabilities in IS7] are in the unit interval. We thus say that a payoff relation (1,$)
is enforceable, if there is a 0 7 0 such that 0 ≤ ps,j ≤ 1 for all S E (C,D} and all j with 0 ≤ j ≤ it - 1.
The following gives a characterization of the possible payoff relations that a single player can enforce.
Proposition 2 (Characterization of enforceable payoff relations)
For a given social dilemma, the payoff relation (1, s) is enforceable if and only if either a = 1 or
max-1 (bi — 1=4i=
05j (b3 art)} ≤ ≤ min {a., -F 0J+1 — el.i)) • [S10]
<n 0<j<n-1
In particular, enforceable payoff relations satisfy - t ≤ s ≤ 1, and if s < 1 then bo ≤ I < a„_ t .
19
EFTA01074283
A Public Goods Game B Dilemma
Slope • Slope,
n=10
n= 5
n=4
n=3
Baseline rI I I Baseline
,c-c payoff payoff
Figure S5: Enforceable payoff relations for (A) the linear public goods game and (B) the vol-
unteer's dilemma. A pair (1, s) is enforceable for a given group size n if the point is within the
respectively shaded area. The set of enforceable pairs for large n is a subset of the the respective
set for smaller n, i.e. the set of enforceable pairs shrinks with increasing group size. Parameters:
(A) Linear public goods game with r = 2.4, c = 1. (B) Volunteer's dilemma with b = 1.5, c = 1.
As a corollary, we note that if for a given slope s the baseline payoffs Si and /2 are enforceable, then so is
any baseline payoff between ti and /2. Figure 55 gives an illustration of the enforceable payoff relations,
again for the two examples of a public goods game and a volunteer's dilemma. In particular, it shows that
the space of enforceable payoff relations shrinks with group size. Proposition 2 confirms that this effect can
be observed for all social dilemmas (more specifically, for all games that satisfy bj+i > aj). Thus, in larger
groups, a single player has less strategic options to enforce particular payoff combinations.
3 ZD strategies for alliances
To extend this result for single subjects to alliances, let us suppose the first k players form an alliance
A = {1..... and they agree on a joint zero•determinant with parameters /, s, 4 Thus, each of the allies
E A enforces the relation tr_i = Slii -I- (1 - .$)1. Summing up over all allies 1 ≤ i ≤ k yields
k k
(iri + irk) + -IL 1 (liki-i + • • • + irn) = s(irr ... irk) k(1 s)1. [Sill
—1
Rearranging these terms then implies
74:+1 + .. . + ir. _ s(n — 1) — (k - 1) Cr1 + ... -i- Trk ) ( s(n — 1) — (k — 1))
-F 1 1, [5121
n—k n— k k n—k
or, in a more intuitive notation,
r -.4 = sAff.4 + (1 - 54)1, [S131
20
EFTA01074284
with .2r4 being the mean payoff of the allies, ir_ 4 being the mean payoff of the outsiders, and
s(n — 1) — (k — 1)
$4 [SI41
n—k
being the effective slope of the alliance. Thus the results for a single ZD strategist naturally extend to the
case of arbitrarily many allies. For the set of enforceable payoff relations (1, 84) we can give an analogous
characterization:
Proposition 3 (Characterization of enforceable payoff relations for alliances)
For a given social dilemma, an alliance with k members can enforce the payoff relation (1, 84) if and only
if either BA = 1 or
max D (I-
j <re — 1
i —
•
tiA)(n— 3 a -1)} ≤ ≤
min-
0<j< rt I
+ _naAh„1- k) (kyr' aj )) . (SI 5]
In particular, if a given payoff relation (1, sA) can be enforced by some alliance of ZD strategists (possibly
by using different ZD strategies), then it can be enforced by a symmetric alliance (where all players apply
the same ZD strategy). Moreover, enforceable payoff relations satisfy -nom< sA < 1, and if 84 < I
then bo ≤ 1 ≤ an_ t.
As a consequence of Proposition 3 we note that space of enforceable payoff relations (1,84) increases with
the size of the alliance k (this holds true for any game in which bj+1 > a5 for all)). Thus, while Proposition
2 has led to the conclusion that players lose their strategic options as groups become large, Proposition 3
suggests that players can offset this loss of strategic power in large groups by forming alliances.
4 Self-enforcing alliances
Let us next explore which ZD-strategies form a Nash equilibrium (i.e., which ZD strategies can be stably
maintained in a population, assuming that players aim at high payoffs). Having the previous section in mind,
we may ask equivalently: suppose there is an alliance with k = n — 1players, applying some ZD-strategy.
In which case would it be optimal for the remaining player to apply this ZD-strategy too? To respond to
this question, let us consider an alliance A = {1, n — 1} applying a ZD strategy with parameters!, s,
As a minimum requirement for this strategy to be an equilibrium, the remaining player n must not have
an incentive to choose a different ZD-strategy with parameters 1„, sn, q5n. By [S41, such a deviating player
enforces the payoff relation
/TA = Srarn + (1 Sn)lni [SI'S]
whereas, by [S 13], the alliance A enforces the payoff relation
7rn = SAVA + ( 1 81) 1 *
Assuming sn < 1, this system of linear equations yields the following payoff for the deviating player
1(1 — 84) + 1041 — sn)
[Sl8J
1 — 8.48n
21
EFTA01074285
By adhering to the alliance strategy, player n would obtain the payoff I. Thus, deviating from the alliance
strategy yields an advantage if
1 — 1)&4(1 — sn)
rrn > 0. 15191
1 — S.ASn
We can distinguish three cases:
I. sA = 0: In that case ir„ — I = 0, i.e. player n cannot improve his payoff by deviating unilaterally.
2. sA > 0: In that case T„ — I > 0 if and only if I„ > 1. Thus, to be stable against invasion of other
ZD strategists, the alliance needs to apply a strategy with the maximum possible baseline payoff,
I = an _1.
3. sA < 0: then sr„ — 1 > 0 if and only if 1„ — I < 0. Thus the ZD-strategy of the alliance is stable
against invasion by other ZD-strategists if and only if I is minimal, i.e. I = bo.
As the following Proposition shows, this result also holds if deviating players are not restricted to zero-
determinant strategies.
Proposition 4 (Pure Nash equilibria among ZD-strategies)
Consider a ZD-strategy with parameters 1, s, 0, and a social dilemma in which mutual cooperation is the
best outcome for the group. whereas mutual defection is the worst possible outcome,
jai-1
60 < min (11 nbi < an-1 ' IS201
a≤i≤n
Let sA be defined as in Eq. (S 141, for k = it - I. Then the ZD-strategy is a Nash equilibrium if and only
if one of the following three cases holds:
I. sA = 0, i.e., if applied by it — 1 players, the ZD-strategy acts as an equalizer.
2. ..A > 0 and 1 = /42_1, i.e., if applied by n — 1 players, the ZD-strategy is generous.
3. sA < 0 and I = be, i.e., if applied by n — 1 players the ZD-strategy is selfish.
Some observations are in order. First, the three conditions in Propostion 4 do not depend on 0. Whether
a ZD-strategy is stable depends on the enforced payoff relation only (i.e., on I and s), but not on the exact
strategy that gives rise to this payoff relation.
Second, the stability of a ZD-strategy does not depend on the sign of .s. but on the sign of sA =
(n — 1)s — (n — 2). In particular, a generous ZD-strategy with parameters 1 = an_ i and s > 0 is only
stable if it is not too generous, s > (11 — 2)/(n — 1). Thus, only for pairwise dilemmas (with n = 2), any
generous ZD strategy is stable (25, 26); as the group size increases, generous strategies need to approach
the fair strategies (with 13 = 1) in order to be stable.
Third, Proposition 4 also allows us to respond to the question when an alliance is self-enforcing (in the
sense that outsiders prefer to join the alliance, once the alliance has exceeded a critical size k). If players are
only interested in high absolute payoffs (and not in fairness, or in relative payoff advantages), then alliances
either need to be equalizers, generous, or selfish in order to be self-enforcing.
22
EFTA01074286
5 Applications
5.1 Public goods games
As one particular example of a multiplayer dilemmas, let us first study the properties of repeated public
goods games. In a public goods game, each player of a group can cooperate by contributing an amount
c > 0 to a public pool. Total contributions are multiplied by a factor r with 1 < r < it, and evenly shared
among all group members. Thus, payoffs are given by
(j + Orc jrc
of c, and bi = [521]
For solitary alliances, some of the properties of ZD strategies for public goods games have been recently
described by (29), using an independent approach. Here we complement and extend these results; in par-
ticular, we discuss the effect of alliance sizes k > 1.
Because the payoffs of the public goods game are linear in the number of players j, the characterization
of enforceable payoff relations according to Eq. [SIS] becomes particularly simple (as only the boundary
cases j = 0 and j = rt — 1 need to be considered). We conclude that alliances of size k can enforce a linear
relation with parameters (1,84) if either sA = I or
(n — 1)rc (n — 1)c (r — n)c (n — 1)c }
max { 0, < < min f rc c, [S221
n (1 sA )(It k)} n (1 — sA)(n - k)
5.1.1 Solitary players
For k = 1, the condition [522] simplifies even further; a single player can enforce a linear relation with
parameters (I, s) if and only if either .s = 1, or the following two inequalities are satisfied
0 < I < re — c
[S231
(n — 1)rc c < (r — n)c c
I <
n 1 —s n 1—s
Figure 55 shows the set of all pairs (I, s) that satisfy the above constraints for various group sizes tz. We get
the following conclusions for the existence of extortionate strategies, generous strategies, and equalizers:
I. Extortionate strategies (I = bo = 0). By the inequalities [S231, extortionate strategies need to
satisfy
(n — 1)r — n
s> [S241
(n — 1)r
In particular, for r > n/(n — 1), solitary players cannot enforce arbitrarily high extortion factors
x = 1/s. Instead the set of feasible extortion factors is bounded from above by
(n — 1)r
x,„a„ — > 1. [S251
(n — 1)r — n
For large group sizes, n —> oc, this implies a maximum extortion factor x„,„.‘ = r/(r — 1).
23
EFTA01074287
2. Generous strategies (/ = a„_ 1 = rc — c). By the inequalities (S23], the slope of a generous ZD
strategies needs to satisfy the same constraint IS241 as extortionate strategies.
3. Equalizers (s = 0). For equalizers, the inequalities [S231 imply there are three regimes:
(a) If r ≤ n/(n — 1), all baseline payoffs 0 ≤ / ≤ rr. — c can be enforced.
(b) If n/(n — 1) < r < n/(n — 2), only a limited subset of baseline payoffs 0 < 1 < re — c can
be enforced.
(c) If r > n/(n — 2), there are no equalizers.
In particular, we conclude that for a given multiplication factor r > 1 the set of equalizer strategies
disappears as groups become large.
5.1.2 Alliances
By IS221, alliances with k > 1 can enforce a linear relation with parameters (1, 84) if and only if either
.94 = 1, or if the two following inequalities hold:
0 < 1 < re —c
(n — 1)re (n — 1)c < < (r — n)c (n — 1)c (S261
n (1 — 34)(n — k) n (1 — 8.4)(n — k)
For the special cases of extortionate strategies, generous strategies, and equalizers, these inequalities allow
us to derive the following conclusions:
I. Extortionate strategies (1 = be = 0). The inequalities [S261 can be rewritten as a critical threshold
on the fraction of alliance members that is needed to enforce a certain slope 84,
k r(1 — 84) — 1
(S271
it - - 84)
In particular, if an alliance wants to enforce arbitrarily high extortion factors x —r oc, then 89 =
1/x —r 0 and the critical threshold becomes k/n ≥ (r — 1)/r.
2. Generous strategies (1 = = re — c). The inequalities ES26I lead to the same threshold for
k/n as in the case of extortionate strategies, as given in [S27I.
3. Equalizers (s = 0). For equalizers, the inequalities [S26] lead to two critical thresholds; in order to
be able to set the payoffs of the outsiders to any value between 0 ≤ t ≤ rc — e, the fraction of the
allies needs to satisfy
A
≥ -t 1. [S281
n r
However, to be able to set the payoffs of the outsiders to some value between 0 ≤ t ≤ rc — c, the
number of allies only needs to exceed
k > (n — 2)(r — 1)
[S291
n -F (n — 2)r
24
EFTA01074288
5.2 Volunteer's Dilemma
In the volunteer's dilemma, at least one of the players needs to cooperate and pay a cost c > 0, in order for
all group members to derive a benefit b> c> 0. Thus, the payoffs are given by
ai = b — c for all j, and bi = b if j ≥ 1 and bo = O. [S301
To characterize the enforceable payoff relations, we note that by condition [S151 exactly those parameters
and s4 can be enforced for which either 84 = 1 or
max { 0,b— Cl< b — c. [S311
(1 — sA)(n - k) — —
In the special case of extortionate strategies (I = 0), this implies that a given slope 84 can only be enforced
by an alliance of ZD strategists if
k 1
>1 [S321
n n (1 — sA )6.
In particular, it follows that even large alliances cannot be arbitrarily extortionate (setting s4 = 0 on the
right hand side implies that such alliances would need to satisfy k > n-1). Instead, the maximum extortion
factor for an alliance of ZD strategists is given by ;v.. = bAb — c).
For equalizers (s = 0), the inequalities in [S31I imply that alliances can only unilaterally determine the
outsiders' payoffs if k = n — 1, in which case only the baseline payoff = b — c is enforceable.
A Appendix: Proofs
Proof of Proposition 1. The proof is an extension of previous proofs for pairwise games (38? ) to the case
of multiplayer games. It is useful to introduce some notation. Let gis.) be i's payoff in a given round if
player i chooses action S E (C,D}, and if j of the co•players cooperate, that is
ai ifS=C
[S331
b./ if S = D.
Similarly, let 6; be the average payoff of the other group members in that case, that is
=I jai -F (n - j -1)bi+i
it - 1
jai-i -F (n - j - 1)bi
n —1
ifS=C
if S = D.
Moreover, let vs,j(m) be the probability that in round in, the focal player i chooses S and that j of his
co•players cooperate. Let us collect these numbers and write them as vectors,
[S341
gi (9&n— I • • • + 9b,019iD,n-i • • • +9iD,O)
g—i = (gan-lt • • • + 901319D,n-il • • • +9Di,O) [S351
„N T
v(m) = (vC,n —1(m), • • • , vaa(m), vo,n—t(m), • • • , vo,ot.m.1) •
25
EFTA01074289
With this notation, player i's expected payoff in round in can be written as rr,(m) = gt • v(rn), whereas the
expected payoff of the co-players is given by tr_ i (m) = g_i • v(m). Finally, let 1 be the 2n-dimensional
vector with entries 1, and let go be the 2n-dimensional vector in which the first n entries are I and the last
it entries are 0,
1 = (1,...,1,1,...,1)
(S361
go = (1. . ., 1, 0,..., 0).
We note that zero-determinant strategies are exactly those strategies for which
p = g o + 0[(1 — s)(11— gt) + — g-ti • [S37]
Finally, let qc(m) denote the probability that player 2: cooperates in round in. We can write qc(m) =
go • v(rn) and qc(m + 1) = p • v(m). Thus, it follows for w(m) := qc(m + 1) — qc(m) that
w(m) = (p-go)-v(m) = #[(1— •v(m) = kiri (m)-1- (1— s)1— rr_i(m)]. [S381
Taking the definition of w(m), summing up over in, and taking the limit Al —) oo yields
al
‘—‘
m=1 w(m) = 7
T 2- E
Al
goon + 1)- qc(n) r(d' +Mi) - gem) go(m +1)
AI
_ • po
[S391
m=i
On the other hand, due to [S38],
1
E w(m) = v E siri(m) + (1 —s)i —ff-i(n) = cosr, + (1 — — (S40]
m=1 m=i
As the two limits need to coincide, and as Ige(lf + 1) — pal ≤ 1, the result follows.
Proof of Proposition 2. Let us first show that the parameters s4 of a zero-determinant strategy satisfy
I tt < I ≤ 1 and 0 7 0, and that for s < 1 the baseline payoffs fulfill 6 < I ≤ a„_ t . By the definition
of zero-determinant strategies, the cooperation probabilities after mutual cooperation and mutual defection
are given by
1 + 0(1 — s)(t — an_ i )
[S41]
PD.() — 8)0 — be).
As these two entries need to be in the unit interval, it follows that
— s)(/ — am—i) ≤ 0
[S421
0 ≤ 95(1 — s)(1 — bo)
Adding up these two inequalities implies 0(1 — s)(bo — an_i ) ≤ 0, and therefore
0(1 — s) ≥ O. (S431
Analogously, by taking into account that the entries pc,„ _2 and pa„_ I need to be in the unit interval, one
finds that
1
(s + — )≥ O. [S44]
IL - 1
26
EFTA01074290
Adding up the two inequalities [S43] and (S44] shows that 0 > 0. As a consequence, it follows from [S431
and [S44] that —1/(n — 1) ≤ s ≤ 1. Lastly, Inequalities (S42) then imply for s y 1 that bp ≤ I ≤ a„_ t.
Let us now turn to the characterization of enforceable payoff relations: If s = 1, the entries of a zero-
determinant strategy p according to representation [57] do not depend on the parameter 1. By choosing
> 0 sufficiently small, the inequalities ps.j are satisfied for any social dilemma (in fact, for any game in
which b) ≥ af- t for all j).
Now suppose s ¢ 1 and p is a zero-determinant strategy. As 0 > 0, the representation of zero-
determinant strategies [S7] implies the following constraints on the entries ps,1:
(1 — s)(1 — ai ) — — ni ) ≤ 0
[S45]
(1 — s)(1 — it(b l — ni_ ) ≥ 0.
Dividing by 1 — s > 0 shows that the inequalities [S45] are equivalent to condition [S10]. Conversely,
suppose condition [S10] and thus the inequalities [S45] are met. Then only the parameter 0 > 0 has to be
chosen sufficiently small to ensure that all entries satisfy 0 < ps.i < 1 in representation [S7].
Proof of Proposition 3. Condition [SI5] is derived from the corresponding condition [S10] for solitary
players, by using the relation between 84 and s in Eq. [S141. Thus, for symmetric alliances, the result
follows directly from Proposition 2. It is only left to show that every linear relationship of the form [S131
that can be enforced by an asymmetric alliance (in which allies may use different ZD strategies) can also be
enforced by a symmetric alliance (in which all allies use the same ZD strategy). To this end, let us consider
an asymmetric alliance A = {1, , k that collectively enforces some linear relation between payoffs,
ir_4 = sArrA -I- (1 — 84)14, [S461
and where each ally i E A individually enforces the relation
7r-i = + (1 - Si)li. [S471
By summing up all equations [S47] and rearranging the terms, we obtain
si(n— 1) — (k - 1) if; san — 1) — (k - I)\
-A = 2_, It - k k
+E( n- k ) k
ES481
i.t i=1
As the right hand sides of [S46] and [550] need to coincide, we have
k k
c—• M it —1) — (k —1) ir, + E (1 si(n — 0 — (k — I) \ li
sALI -0 -13.4 14 = 2_, [5491
Both sides of this equation represent a linear function in the variables r i. A comparison of coefficients
yields
si(n — 1) — (k — 1)
[S501
n - k
27
EFTA01074291
In particular, it follows that s1 = s2 = = .sk .9, i.e. an alliance of ZD strategists that enforces a
linear relation between payoffs needs to coordinate on the same slope. In that case, 14 can be calculated
as 1A = E;_11,/k, i.e. /A is the arithmetic mean of the individual baseline payoffs li. In particular,
min ≤ /A ≤ max 1,. As all players of the alliance applied a ZD strategy, it follows that (min 1„s) and
(max /,, s) are enforceable, and therefore also (14,$) (as outlined in the first sentence after Proposition 2).
Thus, there is a symmetric alliance, in which all allies use the same ZD strategy with parameters (1A, s),
that enforces the relation IS46].
Proof of Proposition 4. We already know that for a ZD-strategy to be a Nash equilibrium, one of the three
conditions need to be fulfilled (otherwise there would be a different ZD-strategy that yields a higher payoff).
Conversely, let us assume that one of the three conditions of the Proposition is fulfilled.
I. If sA = 0, then the remaining player obtains a payoff of 71„ = 1, irrespective of n's strategy. In
particular, there is no incentive for player n to deviate.
2. Suppose .9.4 > 0,1 = a„_ t, and let us assume to the contrary that the zero-determinant strategy is not
a Nash-equilibrium. Then there is a strategy for player n such that n's payoff satisfies Ir. > an-1.
However, as A enforces the relation an = .94114 + (1 — sMan-1, and as sA > 0, we can conclude
that 114 > an _1. It follows that the average payoff of all group members exceeds an _1, contradicting
the assumption that a„_1 is the maximum average payoff per round.
3. Under the assumption that be is the minimum average payoff per round, the case .94 < 0 and / = be
can be treated analogously to the previous case.
O
28
EFTA01074292