Cooperation and control in multiplayer social dilemmas

Christian Hilbe^a, Bin Wu^b, Arne Traulsen^b, and Martin A. Nowak^a,c

^aProgram for Evolutionary Dynamics, Harvard University, Cambridge, MA 02138; ^bDepartment of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany; and ^cDepartment of Organismic and Evolutionary Biology and Department of Mathematics, Harvard University, Cambridge, MA 02138

Edited by Joshua B. Plotkin, University of Pennsylvania, Philadelphia, PA, and accepted by the Editorial Board September 26, 2014 (received for review April 30, 2014)
Direct reciprocity and conditional cooperation are important mechanisms to prevent free riding in social dilemmas. However, in large groups, these mechanisms may become ineffective because they require single individuals to have a substantial influence on their peers. However, the recent discovery of zero-determinant strategies in the iterated prisoner's dilemma suggests that we may have underestimated the degree of control that a single player can exert. Here, we develop a theory for zero-determinant strategies for multiplayer social dilemmas, with any number of involved players. We distinguish several particularly interesting subclasses of strategies: fair strategies ensure that the own payoff matches the average payoff of the group; extortionate strategies allow a player to perform above average; and generous strategies let a player perform below average. We use this theory to describe strategies that sustain cooperation, including generalized variants of Tit-for-Tat and Win-Stay Lose-Shift. Moreover, we explore two models that show how individuals can further enhance their strategic options by coordinating their play with others. Our results highlight the importance of individual control and coordination to succeed in large groups.

evolutionary game theory | alliances | public goods game | volunteer's dilemma | cooperation

Significance

Many of the world's most pressing problems, like the prevention of climate change, have the form of a large-scale social dilemma with numerous involved players. Previous results in evolutionary game theory suggest that multiplayer dilemmas make it particularly difficult to achieve mutual cooperation because of the lack of individual control in large groups. Herein, we extend the theory of zero-determinant strategies to multiplayer games to describe which strategies maintain cooperation. Moreover, we propose two simple models of alliances in multiplayer dilemmas. The effect of these alliances is determined by their size, the strategy of the allies, and the properties of the social dilemma. When a single individual's strategic options are limited, forming an alliance can result in a drastic leverage.

Cooperation among self-interested individuals is generally difficult to achieve (1-3), but typically the free-rider problem is aggravated even further when groups become large (4-9). In small communities, cooperation can often be stabilized by forms of direct and indirect reciprocity (10-17). For large groups, however, it has been suggested that these mechanisms may turn out to be ineffective, as it becomes more difficult to keep track of the reputation of others and because the individual influence on others diminishes (4-8). To prevent the tragedy of the commons and to compensate for the lack of individual control, many successful communities have thus established central institutions that enforce mutual cooperation (18-22).

However, a recent discovery suggests that we may have underestimated the amount of control that single players can exert in repeated games. For the repeated prisoner's dilemma, Press and Dyson (23) have shown the existence of zero-determinant strategies (or ZD strategies), which allow a player to unilaterally enforce a linear relationship between the own payoff and the coplayer's payoff, irrespective of the coplayer's actual strategy. The class of zero-determinant strategies is surprisingly rich: for example, a player who wants to ensure that the own payoff will always match the coplayer's payoff can do so by applying a fair ZD strategy, like Tit-for-Tat. On the other hand, a player who wants to outperform the respective opponent can do so by slightly tweaking the Tit-for-Tat strategy to the own advantage, thereby giving rise to extortionate ZD strategies. The discovery of such strategies has prompted several theoretical studies exploring how different ZD strategies evolve under various evolutionary conditions (24-30).

ZD strategies are not confined to the repeated prisoner's dilemma. Recently published studies have shown that ZD strategies also exist in other repeated two-player games (29) or in repeated public goods games (31). Herein, we will show that such strategies exist for all symmetric social dilemmas, with an arbitrary number of participants. We use this theory to describe which ZD strategies can be used to enforce fair outcomes or to prevent free riders from taking over. Our results, however, are not restricted to the space of ZD strategies. By extending the techniques introduced by Press and Dyson (23) and Akin (27), we also derive exact conditions when generalized versions of Grim, Tit-for-Tat, and Win-Stay Lose-Shift allow for stable cooperation. In this way, we find that most of the theoretical solutions for the repeated prisoner's dilemma can be directly transferred to repeated dilemmas with an arbitrary number of involved players.

In addition, we also propose two models to explore how individuals can further enhance their strategic options by coordinating their play with others. To this end, we extend the notion of ZD strategies for single players to subgroups of players (to which we refer as ZD alliances). We analyze two models of ZD alliances, depending on the degree of coordination between the players. When players form a strategy alliance, they only agree on the set of alliance members, and on a common strategy that each alliance member independently applies during the repeated game. When players form a synchronized alliance, on the other hand, they agree to act as a single entity, with all alliance members playing the same action in a given round. We show that the strategic power of ZD alliances depends on the size of the alliance, the applied strategy of the allies, and on the properties of the underlying social dilemma. Surprisingly, the degree of coordination only plays a role as alliances become large (in which case a synchronized alliance has more strategic options than a strategy alliance).

To obtain these results, we consider a repeated social dilemma between n players. In each round of the game, players can decide whether to cooperate (C) or to defect (D). A player's payoff depends on the player's own decision and on the decisions of all other group members (Fig. 1A): in a group in which j of the other group members cooperate, a cooperator receives the payoff a_j, whereas a defector obtains b_j.

Author contributions: B.W. initiated the project; C.H., B.W., A.T., and M.A.N. designed research; C.H., B.W., A.T., and M.A.N. performed research; A.T. and M.A.N. analyzed data; and C.H. and B.W. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. J.B.P. is a guest editor invited by the Editorial Board.

Freely available online through the PNAS open access option.

To whom correspondence should be addressed.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1407887111/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1407887111 | PNAS Early Edition | 1 of 6
We assume that payoffs satisfy the following three properties that are characteristic for social dilemmas (corresponding to the individual-centered interpretation of altruism in ref. 32): (i) irrespective of the own strategy, players prefer the other group members to cooperate (a_{j+1} ≥ a_j and b_{j+1} ≥ b_j for all j); (ii) within any mixed group, defectors obtain strictly higher payoffs than cooperators (b_{j+1} > a_j for all j); and (iii) mutual cooperation is favored over mutual defection (a_{n-1} > b_0). To illustrate our results, we will discuss two particular examples of multiplayer games (Fig. 1B). In the first example, the public goods game (33), cooperators contribute an amount c > 0 to a common pool, knowing that total contributions are multiplied by r (with 1 < r < n) and evenly shared among all group members. Thus, a cooperator's payoff is a_j = rc(j+1)/n - c, whereas defectors yield b_j = rcj/n. In the second example, the volunteer's dilemma (34), at least one group member has to volunteer to bear a cost c > 0 in order for all group members to derive a benefit b > c. Therefore, cooperators obtain a_j = b - c (irrespective of j), whereas defectors yield b_j = b if j ≥ 1 and b_0 = 0. Both examples (and many more, such as the collective risk dilemma) (7, 8, 35) are simple instances of multiplayer social dilemmas.

We assume that the social dilemma is repeated, such that individuals can react to their coplayers' past actions (for simplicity, we will focus here on the case of an infinitely repeated game). As usual, payoffs for the repeated game are defined as the average payoff that players obtain over all rounds. In general, strategies for such repeated games can become arbitrarily complex, as subjects may condition their behavior on past events and on the round number in nontrivial ways. Nevertheless, as in pairwise games, ZD strategies turn out to be surprisingly simple.

[Figure 1: (A) payoff table, in which a player facing j cooperators among the n-1 coplayers receives a_j as a cooperator and b_j as a defector, for j = n-1, ..., 1, 0; (B) payoffs of the volunteer's dilemma and the public goods game as functions of the number of cooperating coplayers; (C) schematic of a ZD alliance and the remaining outsiders.]

Fig. 1. Illustration of the model assumptions for repeated social dilemmas. (A) We consider symmetric n-player social dilemmas in which each player can either cooperate or defect. The player's payoff depends on its own decision and on the number of other group members who decide to cooperate. (B) We will discuss two particular examples: the public goods game (in which payoffs are proportional to the number of cooperators) and the volunteer's dilemma (as the most simple example of a nonlinear social dilemma). (C) In addition to individual strategies, we will also explore how subjects can enhance their strategic options by coordinating their play with other group members. We refer to the members of such a ZD alliance as allies, and we call group members that are not part of the ZD alliance outsiders. Outsiders are not restricted to any particular strategy. Some or all of the outsiders may even form their own alliance.

Results

Memory-One Strategies and Akin's Lemma. ZD strategies are memory-one strategies (23, 36); they only condition their behavior on the outcome of the previous round. Memory-one strategies can be written as a vector p = (p_{C,n-1}, ..., p_{C,0}, p_{D,n-1}, ..., p_{D,0}). The entries p_{S,j} denote the probability to cooperate in the next round, given that the player previously played S ∈ {C,D} and that j of the coplayers cooperated (in SI Text, we present an extension in which players additionally take into account who of the coplayers cooperated). A simple example of a memory-one strategy is the strategy Repeat, p^Rep, which simply reiterates the own move of the previous round, p^Rep_{C,j} = 1 and p^Rep_{D,j} = 0. In addition, memory-one strategies need to specify a cooperation probability p_0 for the first round. However, our results will often be independent of the initial play, and in that case we will drop p_0.

Let us consider a repeated game in which a focal player with memory-one strategy p interacts with n-1 arbitrary coplayers (who are not restricted to any particular strategy). Let v_{S,j}(t) denote the probability that the outcome of round t is (S,j). Let v(t) = (v_{C,n-1}(t), ..., v_{D,0}(t)) be the vector of these probabilities. A limit distribution v is a limit point for t → ∞ of the sequence [v(1) + ... + v(t)]/t. The entries v_{S,j} of such a limit distribution correspond to the fraction of rounds in which the focal player finds herself in state (S,j) over the course of the game.

There is a surprisingly powerful relationship between a focal player's memory-one strategy and the resulting limit distribution of the iterated game. To show this relationship, let q_C(t) be the probability that the focal player cooperates in round t. By definition of p^Rep, we can write q_C(t) = p^Rep · v(t) = v_{C,n-1}(t) + ... + v_{C,0}(t). Similarly, we can express the probability that the focal player cooperates in the next round as q_C(t+1) = p · v(t). It follows that q_C(t+1) - q_C(t) = (p - p^Rep) · v(t). Summing up over all rounds from 1 to t, and dividing by t, yields (p - p^Rep) · [v(1) + ... + v(t)]/t = [q_C(t+1) - q_C(1)]/t, which has absolute value at most 1/t. By taking the limit t → ∞, we can conclude that

(p - p^Rep) · v = 0.  [1]

This relation between a player's memory-one strategy and the resulting limit distribution will prove to be extremely useful. Because the importance of Eq. 1 has been first highlighted by Akin (27) in the context of the pairwise prisoner's dilemma, we will refer to it as Akin's lemma. We note that Akin's lemma is remarkably general, because it neither makes any assumptions on the specific game being played, nor does it make any restrictions on the strategies applied by the remaining n-1 group members.

Zero-Determinant Strategies in Multiplayer Social Dilemmas. As an application of Akin's lemma, we will show in the following that single players can gain an unexpected amount of control over the resulting payoffs in a multiplayer social dilemma. To this end, we first need to introduce some further notation. For a focal player i, let us write the possible payoffs in a given round as a vector g^i = (g^i_{S,j}), with g^i_{C,j} = a_j and g^i_{D,j} = b_j. Similarly, let us write the average payoffs of i's coplayers as g^{-i} = (g^{-i}_{S,j}), where the entries are given by g^{-i}_{C,j} = [j·a_j + (n-j-1)·b_{j+1}]/(n-1) and g^{-i}_{D,j} = [j·a_{j-1} + (n-j-1)·b_j]/(n-1). Finally, let 1 denote the 2n-dimensional vector with all entries being one. Using this notation, we can write player i's payoff in the repeated game as π^i = g^i · v, and the average payoff of i's coplayers as π^{-i} = g^{-i} · v. Moreover, by definition of v as a limit distribution, it follows that 1 · v = 1. After these preparations, let us assume player i applies the memory-one strategy

p = p^Rep + α·g^i + β·g^{-i} + γ·1,  [2]

with α, β, and γ being parameters that can be chosen by player i (with the only restriction that β ≠ 0).
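Akin's lemma can be verified numerically when all group members happen to use memory-one strategies: the joint actions then form a Markov chain, and Eq. 1 holds exactly at its stationary distribution. The sketch below uses an arbitrary, illustrative group size and randomly drawn strategies (not parameters from the paper):

```python
import itertools
import numpy as np

n = 3  # illustrative group size (any n >= 2 works)
rng = np.random.default_rng(1)
# strategies[k][S, j]: player k's probability to cooperate after having
# played S (row 1 = C, row 0 = D) while j of the coplayers cooperated
strategies = [rng.uniform(0.05, 0.95, size=(2, n)) for _ in range(n)]

profiles = list(itertools.product([0, 1], repeat=n))  # 1 = cooperate

# Transition matrix over full action profiles (players move independently)
T = np.zeros((2 ** n, 2 ** n))
for x, cur in enumerate(profiles):
    for y, nxt in enumerate(profiles):
        prob = 1.0
        for pl in range(n):
            j = sum(cur) - cur[pl]            # cooperating coplayers of pl
            pc = strategies[pl][cur[pl], j]
            prob *= pc if nxt[pl] == 1 else 1.0 - pc
        T[x, y] = prob

# Stationary distribution via power iteration
sigma = np.full(2 ** n, 1.0 / 2 ** n)
for _ in range(5000):
    sigma = sigma @ T

# Focal player's limit distribution v over states (S, j)
focal = 0
v = np.zeros((2, n))
for x, cur in enumerate(profiles):
    v[cur[focal], sum(cur) - cur[focal]] += sigma[x]

p = strategies[focal]                      # focal memory-one strategy
p_rep = np.array([[0.0] * n, [1.0] * n])   # Repeat: cooperate iff S = C
residual = float(np.abs(((p - p_rep) * v).sum()))
print(residual)  # Akin's lemma: essentially zero
```

The same check goes through for any choice of seed, group size, or coplayer strategies, which is exactly the point of the lemma: it constrains the limit distribution without any assumption on the coplayers.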
Due to Akin's lemma, we can conclude that such a player enforces the relationship

0 = (p - p^Rep) · v = (α·g^i + β·g^{-i} + γ·1) · v = α·π^i + β·π^{-i} + γ.  [3]

Player i's strategy thus guarantees that the resulting payoffs of the repeated game obey a linear relationship, irrespective of how the other group members play. Moreover, by appropriately choosing the parameters α, β, and γ, the player has direct control over the form of this payoff relation. As in Press and Dyson (23), who were first to discover such strategies for the prisoner's dilemma, we refer to the memory-one strategies in Eq. 2 as zero-determinant strategies or ZD strategies.

For our purpose, it will be convenient to proceed with a slightly different representation of ZD strategies. Using the parameter transformation l = -γ/(α+β), s = -α/β, and φ = -β, ZD strategies take the form

p = p^Rep + φ·[(1-s)·(l·1 - g^i) + g^i - g^{-i}],  [4]

and the enforced payoff relationship according to Eq. 3 becomes

π^{-i} = s·π^i + (1-s)·l.  [5]

We refer to l as the baseline payoff of the ZD strategy and to s as the strategy's slope. Both parameters allow an intuitive interpretation: when all players adopt the same ZD strategy p such that π^i = π^{-i}, it follows from Eq. 5 that each player yields the payoff l. The value of s determines how the mean payoff of the other group members π^{-i} varies with π^i. The parameter φ does not have a direct effect on Eq. 5; however, the magnitude of φ determines how fast payoffs converge to this linear payoff relationship as the repeated game proceeds (37).

The parameters l, s, and φ of a ZD strategy cannot be chosen arbitrarily, because the entries p_{S,j} are probabilities that need to satisfy 0 ≤ p_{S,j} ≤ 1. In general, the admissible parameters depend on the specific social dilemma being played. In SI Text, we show that exactly those relations 5 can be enforced for which either s = 1 (in which case the parameter l in the definition of ZD strategies becomes irrelevant) or for which l and s < 1 satisfy

max_{0≤j≤n-1} { [ (j·a_{j-1} + (n-j-1)·b_j)/(n-1) - s·b_j ] / (1-s) } ≤ l ≤ min_{0≤j≤n-1} { [ (j·a_j + (n-j-1)·b_{j+1})/(n-1) - s·a_j ] / (1-s) }.  [6]

It follows that feasible baseline payoffs are bounded by the payoffs for mutual cooperation and mutual defection, b_0 ≤ l ≤ a_{n-1}, and that the slope needs to satisfy -1/(n-1) ≤ s < 1. With s sufficiently close to 1, any baseline payoff between b_0 and a_{n-1} can be achieved. Moreover, because the conditions in Eq. 6 become increasingly restrictive as the group size n increases, larger groups make it more difficult for players to enforce specific payoff relationships.

Important Examples of ZD Strategies. In the following, we discuss some examples of ZD strategies. At first, let us consider a player who sets the slope to s = 1. By Eq. 5, such a player enforces the payoff relation π^i = π^{-i}, such that i's payoff matches the average payoff of the other group members. We call such ZD strategies fair. As shown in Fig. 2A, fair strategies do not ensure that all group members get the same payoff; due to our definition of social dilemmas, unconditional defectors always outperform unconditional cooperators, no matter whether the group also contains fair players. Instead, fair players can only ensure that they do not take any unilateral advantage of their peers. Our characterization 6 implies that all social dilemmas permit a player to be fair, irrespective of the group size. As an example, consider the strategy proportional Tit-for-Tat (pTFT), for which the probability to cooperate is simply given by the fraction of cooperators among the coplayers in the previous round,

p^pTFT_{S,j} = j/(n-1).  [7]

For pairwise games, this definition of pTFT simplifies to Tit-for-Tat, which is a fair ZD strategy (23). However, also for the public goods game and for the volunteer's dilemma, pTFT is a ZD strategy, because it can be obtained from Eq. 4 by setting s = 1 and φ = 1/c, with c being the cost of cooperation.

As another interesting subclass of ZD strategies, let us consider strategies that choose the mutual defection payoff as baseline payoff, l = b_0, and that enforce a positive slope 0 < s < 1. The enforced payoff relation 5 becomes π^{-i} = s·π^i + (1-s)·b_0, implying that on average the other group members only get a fraction s of any surplus over the mutual defection payoff. Moreover, as the slope s is positive, the payoffs π^i and π^{-i} are positively related. As a consequence, the collective best reply for the remaining group members is to maximize i's payoff by cooperating in every round. In analogy to Press and Dyson (23), we call such ZD strategies extortionate, and we call the quantity χ = 1/s the extortion factor. For games in which l = b_0 = 0, Eq. 5 shows that the extortion factor can be written as χ = π^i/π^{-i}. Large extortion factors thus signal a substantial inequality in favor of player i. Extortionate strategies are particularly powerful in social dilemmas in which mutual defection leads to the lowest group payoff (as in the public goods game and in the volunteer's dilemma). In that case, they enforce the relation π^i ≥ π^{-i}; on average, player i performs at least as well as the other group members (as also depicted in Fig. 2B). As an example, let us consider a public goods game and a ZD strategy p^Ex with l = 0 and φ = n/[((n-r)·s + r)·c], for which Eq. 4 implies

p^Ex_{S,j} = [j/(n-1)] · [n - (1-s)·(n-1)·r] / [(n-r)·s + r],  [8]

independent of the player's own move S ∈ {C,D}. In the limit s → 1, p^Ex approaches the fair strategy pTFT. As s decreases from 1, the cooperation probabilities of p^Ex are increasingly biased toward the own advantage. Extortionate strategies exist for all social dilemmas (this follows from condition 6 by setting l = b_0 and choosing an s close to 1). However, larger groups make extortion more difficult. For example, in public goods games with n > r/(r-1), players cannot be arbitrarily extortionate any longer, as condition 6 implies that there is an upper bound on χ (SI Text).

As the benevolent counterpart to extortioners, Stewart and Plotkin described a set of generous strategies for the iterated prisoner's dilemma (24, 28). Generous players set the baseline payoff to the mutual cooperation payoff l = a_{n-1} while still enforcing a positive slope 0 < s < 1. This results in the payoff relation π^{-i} = s·π^i + (1-s)·a_{n-1}. In particular, for games in which mutual cooperation is the optimal outcome for the group (as in the public goods game and in the prisoner's dilemma, but not in the volunteer's dilemma), the payoff of a generous player satisfies π^i ≤ π^{-i} (Fig. 2C). For the example of a public goods game, we obtain a generous ZD strategy p^Gen by setting l = rc - c and φ = n/[((n-r)·s + r)·c], such that

p^Gen_{S,j} = [j/(n-1)] · [n - (1-s)·(n-1)·r] / [(n-r)·s + r] + (1-s)·n·(r-1) / [(n-r)·s + r].  [9]

For s → 1, p^Gen approaches the fair strategy pTFT, whereas lower values of s make p^Gen more cooperative. Again, such generous strategies exist for all social dilemmas, but the extent to which players can be generous depends on the particular social dilemma and on the size of the group.

As a last interesting class of ZD strategies, let us consider players who choose s = 0. By Eq. 5, such players enforce the payoff relation π^{-i} = l, meaning that they have unilateral control over the mean payoff of the other group members (for the prisoner's dilemma, such equalizer strategies were first discovered in ref. 38).
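The algebra behind the extortionate construction can be double-checked numerically. The sketch below (with illustrative parameter choices for n, r, c, and s, not values from the paper) builds p^Ex from the generic ZD form of Eq. 4 for the public goods game, confirms that it matches the closed form of Eq. 8, and confirms that s = 1 recovers pTFT:

```python
import numpy as np

# Public goods game (illustrative parameters; any 1 < r < n works)
n, r, c = 5, 3.0, 1.0
j = np.arange(n)                    # cooperators among the n - 1 coplayers
a = r * c * (j + 1) / n - c         # cooperator's payoff a_j
b = r * c * j / n                   # defector's payoff b_j

# b_{j+1} and a_{j-1} with dummy end entries (their weights vanish there)
b_next = np.append(b[1:], 0.0)
a_prev = np.append(0.0, a[:-1])

# Payoff vectors over the 2n states (C, j) and (D, j)
g_i = np.concatenate([a, b])                                      # focal payoff
g_mi = np.concatenate([(j * a + (n - 1 - j) * b_next) / (n - 1),
                       (j * a_prev + (n - 1 - j) * b) / (n - 1)])  # coplayers
p_rep = np.concatenate([np.ones(n), np.zeros(n)])                 # Repeat

s, l = 0.7, 0.0                        # extortionate: baseline l = b_0 = 0
phi = n / (((n - r) * s + r) * c)
p_ex = p_rep + phi * ((1 - s) * (l - g_i) + g_i - g_mi)           # Eq. 4

# Closed form of Eq. 8, independent of the own move S
closed = j / (n - 1) * (n - (1 - s) * (n - 1) * r) / ((n - r) * s + r)
print(np.allclose(p_ex, np.tile(closed, 2)))    # True
print(((0 <= p_ex) & (p_ex <= 1)).all())        # valid probabilities

# For s = 1 the same construction collapses to pTFT (Eq. 7)
s1 = 1.0
p_tft = p_rep + (n / (((n - r) * s1 + r) * c)) * (g_i - g_mi)
print(np.allclose(p_tft, np.tile(j / (n - 1), 2)))  # True
```

Note that s = 0.7 respects the extortion bound discussed above, s ≥ 1 - n/((n-1)r) ≈ 0.58 for these parameters; a smaller s would push some entries of p^Ex outside [0, 1].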
[Figure 2: three panels (A, fair; B, extortionate; C, generous) showing payoff trajectories over 100 rounds.]

Fig. 2. Characteristic dynamics of payoffs over the course of the game for three different ZD strategies. Each panel depicts the payoff of the focal player π^i (blue) and the average payoff of the other group members π^{-i} (red) by thick lines. Additionally, the individual payoffs of the other group members are shown as thin red lines. (A) A fair player ensures that the own payoff matches the mean payoff of the other group members. This does not imply that all other group members yield the same payoff. (B) For games in which mutual defection leads to the lowest group payoff, extortionate players ensure that their payoffs are above average. (C) In games in which mutual cooperation is the social optimum, generous players let their coplayers gain higher payoffs. The three graphs depict the case of a public goods game with r = 4, c = 1, and group size n = 20. For the strategies of the other group members, we used random memory-one strategies, where the cooperation probabilities were independently drawn from a uniform distribution. For the strategies of the focal player, we used (A) pTFT, (B) p^Ex with s = 0.8, and (C) p^Gen with s = 0.8.

However, unlike extortionate and generous strategies, equalizer strategies typically cease to exist once the group size exceeds a critical threshold. For the example of a public goods game this threshold is given by n = 2r/(r-1). For larger groups, single players cannot determine the mean payoff of their peers any longer.

Stable Cooperation in Multiplayer Social Dilemmas. Let us next explore which ZD strategies give rise to a Nash equilibrium with stable cooperation. In SI Text, we prove that such ZD strategies need to have two properties: they need to be generous (by setting l = a_{n-1} and s > 0), but they must not be too generous [the slope needs to satisfy s ≥ (n-2)/(n-1)]. In particular, whereas in the repeated prisoner's dilemma any generous strategy with s > 0 is a Nash equilibrium (27, 28), larger group sizes make it increasingly difficult to uphold cooperation. In the limit of infinitely large groups, it follows that s needs to approach 1, suggesting that ZD strategies need to become fair. For the public goods game, this implies that stable cooperation can always be achieved when players cooperate in the first round and adopt proportional Tit-for-Tat thereafter. Interestingly, this strategy has received little attention in the previous literature. Instead, researchers have focused on other generalized versions of Tit-for-Tat, which cooperate if at least k coplayers cooperated in the previous round (4, 39, 40). Such memory-one strategies take the form p_{S,j} = 0 if j < k and p_{S,j} = 1 if j ≥ k. Unlike pTFT, these threshold strategies neither enforce a linear relation between payoffs, nor do they induce fair outcomes, suggesting that pTFT may be the more natural generalization of Tit-for-Tat in large-scale social dilemmas.

In addition to the stable ZD strategies, Akin's lemma also allows us to characterize all pure memory-one strategies that sustain mutual cooperation. In SI Text, we show that any such strategy p needs to satisfy the following four conditions

p_{C,n-1} = 1,  p_{C,n-2} = 0,  p_{D,1} ≤ (a_{n-1} - a_0)/(b_{n-1} - a_{n-1}),  and  p_{D,0} ≤ (a_{n-1} - b_0)/(b_{n-1} - a_{n-1}),  [10]

with no restrictions being imposed on the other entries p_{S,j}. The first condition p_{C,n-1} = 1 ensures that individuals continue to play C after mutual cooperation; the second condition p_{C,n-2} = 0 guarantees that any unilateral deviation is punished; and the last two conditions describe whether players are allowed to revert to cooperation after rounds with almost uniform defection. Surprisingly, only these last two conditions depend on the specific payoffs of the social dilemma. As an application, conditions 10 imply that the threshold variants of Tit-for-Tat discussed above are only a Nash equilibrium if they use the most stringent threshold: k = n-1. Such unforgiving strategies, however, have the disadvantage that they are often susceptible to errors: already a small probability that players fail to cooperate may cause a complete breakdown of cooperation (41). Instead, the stochastic simulations by Hauert and Schuster (5) showed that successful strategies tend to cooperate after mutual cooperation and after mutual defection [i.e., p_{C,n-1} = p_{D,0} = 1 and p_{S,j} = 0 for all other states (S,j)]. We refer to such a behavior as WSLS, because for pairwise dilemmas it corresponds to the Win-Stay, Lose-Shift strategy described by ref. 36. Because of condition 10, WSLS is a Nash equilibrium if and only if the social dilemma satisfies (b_{n-1} + b_0)/2 ≤ a_{n-1}. For the example of a public goods game, this condition simplifies to r > 2n/(n+1), which is always fulfilled for r ≥ 2. For social dilemmas that meet this condition, WSLS provides a stable route to cooperation that is robust to errors.

Zero-Determinant Alliances. In agreement with most of the theoretical literature on repeated social dilemmas, our previous analysis is based on the assumption that individuals act independently. As a result, we observed that a player's strategic options typically diminish with group size. As a countermeasure, subjects may try to gain strategic power by coordinating their strategies with others. In the following, we thus extend our theory of ZD strategies for single individuals to subgroups of players. We refer to these subgroups as ZD alliances. Because the strategic power of ZD alliances is likely to depend on the exact mode of coordination between the allies, we consider two different models: when subjects form a strategy alliance, they only agree on the set of alliance members and on a common ZD strategy that each ally independently applies. During the actual game, there is no further communication between the allies. Strategy alliances can thus be seen as a boundary case of coordinated play, which requires a minimum amount of coordination. Alternatively, we also analyze synchronized alliances, in which all allies synchronize their actions in each round (i.e., the allies cooperate collectively, or they defect collectively). In effect, such a synchronized alliance thus behaves like a new entity that has a higher leverage than each player individually. Synchronized alliances thus may be considered as a boundary case of coordinated play that requires substantial coordination.

To model strategy alliances, let us consider a group of n_A allies, with 1 ≤ n_A ≤ n. We assume that all allies make a binding agreement that they will play according to the same ZD strategy p during the repeated game. Because the ZD strategy needs to allow allies to differentiate between the actions of the other allies and the outsiders, we need to consider a more general state space than before. The state space now takes the form (S, j_A, j_{-A}). The first entry S corresponds to the focal player's own play in the previous round, j_A gives the number of cooperators among the other allies, and j_{-A} is the number of cooperators among the outsiders. A memory-one strategy p again needs to specify a cooperation probability p_{S,j_A,j_{-A}} for each of the possible states. Using this state space, we can define ZD strategies for a player i in a strategy alliance as

p = p^Rep + φ·[(1-s)·(l·1 - g^i) + g^i - (n_A - 1)·w_A·g^A - (n - n_A)·w_{-A}·g^{-A}].  [11]
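Before turning to alliances, the WSLS equilibrium condition derived above, (b_{n-1} + b_0)/2 ≤ a_{n-1}, can be checked against its public-goods simplification r > 2n/(n+1). A minimal sketch (all parameter values are illustrative, not taken from the paper):

```python
def wsls_stable_pgg(n: int, r: float, c: float = 1.0) -> bool:
    """WSLS Nash condition (b_{n-1} + b_0)/2 <= a_{n-1} in the public goods game."""
    a_full = r * c - c             # a_{n-1}: payoff under full cooperation
    b_lone = r * c * (n - 1) / n   # b_{n-1}: lone defector among cooperators
    b_zero = 0.0                   # b_0: mutual defection
    return (b_lone + b_zero) / 2 <= a_full

# Agrees with the simplification r > 2n/(n+1) away from the threshold ...
print(wsls_stable_pgg(5, 2.0), wsls_stable_pgg(5, 1.5))      # True False
print(wsls_stable_pgg(100, 2.0), wsls_stable_pgg(100, 1.9))  # True False
# ... and r >= 2 is always sufficient, since 2n/(n+1) < 2 for every n:
print(all(wsls_stable_pgg(n, 2.0) for n in range(2, 200)))   # True
```

The n = 100 example also illustrates the point made in the text: as the group grows, the required multiplication factor 2n/(n+1) approaches 2, so weakly productive public goods games (r close to 1) lose WSLS as a stable route to cooperation.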
497 outsiders. The weights w-4 ≥ 0 and iv4 ≥ 0 are additional para- public goods games, their strategic options remain limited in the 559
498 meters that determine the relative importance of outsiders volunteer's dilemma. 560
499 and other allies, being subject to the constraint (n4 -1)w4 + 561
500 (n -nA)w-A = 1. In the special case of a single player forming Discussion 562
501 an alliance, nA = 1, this guarantees that the two definitions of When Press and Dyson (23) discovered the new class of ZD 563
ZD strategies 4 and 11 are equivalent. strategies for the repeated prisoner's dilemma, this came as a big
502 564
Similarly to the case of single individuals, we can apply Akin's surprise (24, 25): after more than five decades of research, it
503 seemed unlikely that any major property of the prisoner's di- 565
lemma to show that strategy alliances enforce a linear relation-
504 lemma has been overlooked. For repeated multiplayer dilemmas 566
ship between their own mean payoff ? and the mean payoff of
505 the outsiders rir4 (for details, see SI Text) the situation is different. Although various Folk theorems guar- 567
506 antee that cooperation is also feasible in large groups (42, 43), 568
507 there has been considerably less theoretical research on the 569
x-4 =eV + (1 - sit 112]
508 evolution of cooperation in repeated multiplayer dilemmas (4, 5, 570
39, 40). This may be due to the higher complexity: the mathe- 571
509 where the slope of the alliance is given by s4 = - (nA - I )w4]/ matics of repeated n-player dilemmas seems to be more intricate,
510 [1 — (n4 - I)r 4]. A strategy alliance can enforce exactly those and numerical investigations are impeded because the time to 572
511 payoff relationships 12 for which either el = 1 or for which I compute payoffs increases exponentially in the number of play- 573
512 and sA c 1 satisfy the conditions ers (5). Nevertheless, we showed here that many of the results for 574
513 the repeated prisoner's dilemma can be directly transferred to 575
+n —j— 1 —a, general social dilemmas, with an arbitrary number of involved
514 max a( b,t — -A 1 -3A ≤/≤ min 576
114-1Q9I-I n —nA 1 -34 ' subjects. The foundation for this progress is a new framework, ()Az 577
515
113) provided by Akin's lemma and the theory of Press and Dyson.
516 Using this framework. we extended the theory of repeated 578
517 multiplayer dilemmas into three direction& The first and most
Interestingly, to reach this strategic power, an alliance needs to
518 put a higher weight on the within-alliance payoffs (i.e., w4 needs immediate direction is our finding that ZD strategies exist in all
519 to exceed ;1r4; SI Tett), such that the allies are stronger affected social dilemmas. These strategies allow players to unilaterally dic-
520 by what the other allies do, as opposed to the actions of the tate linear payoff relations, irrespective of the specific social di-
521 outsiders. For single player alliances, n4 = 1. condition 13 again lemma being played, irrespective of the group size, and irrespective
simplifies to the previous condition 6. However, as the alliance of the counter measures taken by the other group members. In
522
size nA increases, condition 13 becomes easier to satisfy. Larger particular. we showed that any social dilemma allows players to be
523 fair, extortionate, or generous. Each of these strategy classes has its
524 alliances can therefore enforce more extreme payoff relationships.
own particular strengths: extortionate strategies give a player a rel-
525 For the example of a public goods game, we noted that single
ative advantage compared with the other group members; fair
players cannot be arbitrarily extortionate when n > r I (r - 1). Alli-
526 strategies help to avoid further inequality within a group; and
ances, on the other hand, only need to be sufficiently large. generous strategies allow players to revers to mutual cooperation
527
nAIna (r -1)1r. Once an alliance has this critical mass, there are when a coplayer defected by accident. At the same time, ZD
528 no bounds to extortion. strategies are remarkably simple. For example, to be fair in
529 In a similar way, we can also analyze the strategic possibilities a public goods game, players only need to apply a rule called
530 of a synchronized alliance. Because synchronized alliances act as proportional Tit-for-Tat: if j of the n —1 other group members
531 a single entity, they transform the symmetric social dilemma cooperated in the previous round, then cooperate with prob- 593
532 between n independent players to an asymmetric game between ability j/(tt — 1) in the following round. Extortionate and gen- 594
533 n —n4 + 1 independent players. From the perspective of the al- erous strategies can be obtained in a similar way, by slightly 595
liance, the state space now takes the form (S.j), where Se (C.Et) modifying pTFT to the own advantage or to the advantage of
534 596
is the common action of all allies and where 0 ≤j ≤n — via is the the others.
535 597
number of cooperators among the outsiders. ZD strategies for As the second direction, we explored which ZD strategies and
536 the synchronized alliance can be defined analogously to ZD which pure memory-one strategies can be used to sustain co- 598
537 strategies for single players operation in multiplayer dilemmas. Among ZD strategies, such 599
538 strategies need to be generous (such that players never try to 600
539 p = pReP + 0[(1 g-4) +g4 — outperform their peers) (27, 28), but at the same time they must 601
1141
540 not be too generous. The right degree of generosity depends on the 602
541 with le being the payoff vector for the allies and g-4 being the size of the group but not on the specific social dilemma being
played. As a rule of thumb, we obtain that in larger groups, subjects
603
542 payoff vector of the outsiders. For a single player alliance, n4 = 1, 604
this again reproduces the definition of ZD strategies in 4. By are required to show less generosity.
543 As the last direction, we extended the concept of zero- 605
544 applying Akin's lemma to Eq. 14, we conclude that synchronized 606
determinant strategies from single players to subgroups of players,
545 alliances enforce sr-A =SAKA +(I —SA)!, which is the same as re- 607
to which we refer to as ZD alliances. Depending on the degree of
lationship 12 for strategy alliances. Surprisingly, we even find that
546 coordination, we explored two forms of ZD alliances: members 608
for reasonable alliance sizes, nA S n/2, strategy alliances and syn- of a strategy alliance only agree on using a common ZD strategy
547 chronized alliances have the same set of enforceable parameters I 609
548 during the game, but they do not coordinate each of their 610
and ?, as given by Eq. 13 (see SI Tea for details). Thus, for the decisions; members of a synchronized alliance, on the other
549 two models of ZD alliances considered here, the exact mode of 611
hand, act as a single entiryt—they either all cooperate or they all
550 coordination is irrelevant for the alliance's strategic power unless defect in a given round. The effect of such ZD alliances depends 612
551 the alliance has reached a substantial size. on the size of the alliance, the applied strategy, and the prop- 613
552 Table 1 gives an overview of our findings on ZD strategies and erties of the underlying social dilemma. In general, we find that 614
553 ZD alliances in multiplayer social dilemmas. It shows that, al- by coordinating their play with others, subjects can increase their 615
though generally, ZD strategies exist for all group sizes, the power strategic options considerably. The exact mode of coordination,
554 616
of single players to enforce particular outcomes typically dimin- however, only turns out to play a minor role: As long as the size
555 617
ishes or disappears in large groups. Forming ZD alliances then of the ZD alliance is below half the group size, strategy alliances
556 allows players to increase their strategic scope. The impact of and synchronized alliances have the same strategic power. In ad- 618
557 a given ZD alliance, however, depends on the specific social di- dition to their static properties, ZD strategies for the prisoner's 619
558 lemma: although ZD alliances can become arbitrarily powerful in dilemma also have a remarkable dynamic component (23, 44): 620
Table 1. Strategic power of different ZD strategies for three different social dilemmas

Strategy class          | Typical property | Prisoner's dilemma | Public goods game                                    | Volunteer's dilemma
Fair strategies         | π^A = π^{-A}     | Always exist       | Always exist                                         | Always exist
Extortionate strategies | π^A ≥ π^{-A}     | Always exist       | In large groups, single players cannot be            | Even large ZD alliances cannot be
                        |                  |                    | arbitrarily extortionate, but sufficiently large     | arbitrarily extortionate
                        |                  |                    | ZD alliances can be arbitrarily extortionate         |
Generous strategies     | π^A ≤ π^{-A}     | Always exist       | In large groups, single players cannot be            | Do not ensure that own payoff is
                        |                  |                    | arbitrarily generous, but sufficiently large ZD      | below average
                        |                  |                    | alliances can be arbitrarily generous                |
Equalizers              | π^{-A} = l       | Always exist       | May not be feasible for single players, but is       | Only feasible if size of ZD alliance is
                        |                  |                    | always feasible for sufficiently large ZD alliances  | n_A = n − 1; can only enforce l = b − c

Analogously to the case of individual players, ZD alliances are fair when they set s_A = 1; they are extortionate when l = b_0 and 0 < s_A < 1; they are generous for l = a_{n−1} and 0 < s_A < 1; and they are equalizers when s_A = 0. For each of the three considered social dilemmas, we explore whether a given ZD strategy is feasible by examining the respective conditions in 13. In the repeated prisoner's dilemma, single players can exert all strategic behaviors (23, 28, 29). Other social dilemmas either require players to form alliances to gain sufficient control (as in the public goods game), or they only allow for limited forms of control (as in the volunteer's dilemma). These results hold both for strategy alliances and for synchronized alliances.
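The fairness property of proportional Tit-for-Tat can be illustrated with a short numerical sketch. The group size, the linear public goods payoffs a_j = rc(j+1)/n − c and b_j = rcj/n, and the coplayers' randomly drawn memory-one strategies below are illustrative assumptions, not values from the paper; the check itself — that a pTFT player's long-run payoff equals the coplayers' average payoff — is the fairness relation discussed in the text.

```python
# Player 0 plays proportional Tit-for-Tat (pTFT) in a linear public goods
# game: cooperate with probability j/(n-1) when j coplayers cooperated.
# We compute the limit distribution of the resulting Markov chain and
# check the fairness property: own payoff = average coplayer payoff.
import numpy as np
from itertools import product

n, r, c = 4, 3.0, 1.0                       # assumed group size and payoffs
outcomes = list(product([1, 0], repeat=n))  # 1 = cooperate, 0 = defect
a = lambda j: r * c * (j + 1) / n - c       # cooperator, j cooperating coplayers
b = lambda j: r * c * j / n                 # defector, j cooperating coplayers

def payoff_vec(i):
    return np.array([a(sum(s) - 1) if s[i] else b(sum(s)) for s in outcomes])

rng = np.random.default_rng(2)
others = rng.uniform(0.1, 0.9, (n - 1, len(outcomes)))      # random coplayers
ptft = np.array([(sum(s) - s[0]) / (n - 1) for s in outcomes])

M = np.zeros((len(outcomes), len(outcomes)))  # transition matrix over states
for si, s in enumerate(outcomes):
    coop = [ptft[si]] + [others[k][si] for k in range(n - 1)]
    for ti, t in enumerate(outcomes):
        M[si, ti] = np.prod([x if act else 1 - x for x, act in zip(coop, t)])

v = np.ones(len(outcomes)) / len(outcomes)  # approximate limit distribution
for _ in range(20_000):
    v = v @ M

pi = np.array([payoff_vec(i) @ v for i in range(n)])
assert abs(pi[0] - pi[1:].mean()) < 1e-8  # fair: own payoff = coplayers' mean
```

The assertion holds regardless of the coplayers' strategies, since pTFT is a ZD strategy with slope s = 1.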
when a player commits himself to an extortionate ZD strategy, then adapting coplayers learn to cooperate over time. Numerical simulations in the SI show an analogous result for multiplayer dilemmas: when ZD alliances apply strategies with a positive slope, they can trigger a positive group dynamics among the outsiders. The magnitude of this dynamic effect again depends on the size of the alliance and on the applied strategy of the allies.

Here, we focused on ZD strategies; but the toolbox that we apply (in particular Akin's lemma) is more general. As an example, we identified all pure memory-one strategies that allow for stable cooperation, including the champion of the repeated prisoner's dilemma, Win-Stay Lose-Shift (36, 45). We expect that there will be further applications of Akin's lemma to come. Such applications may include, for instance, a characterization of all Nash equilibria among the stochastic memory-one strategies or an analysis of how alliances are formed and whether evolutionary forces favor particular alliances over others (46, 47).

Overall, our results reveal how single players in multiplayer games can increase their control by choosing the right strategies and how they can increase their strategic options by joining forces with others.

ACKNOWLEDGMENTS. C.H. gratefully acknowledges generous funding by the Schrödinger scholarship of the Austrian Science Fund (J3475).
1. Hardin G (1968) The tragedy of the commons. The population problem has no technical solution; it requires a fundamental extension in morality. Science 162(3859):1243–1248.
2. Olson M (1971) The Logic of Collective Action: Public Goods and the Theory of Groups (Harvard Univ Press, Cambridge, MA), Revised Ed.
3. Nowak MA (2006) Five rules for the evolution of cooperation. Science 314(5805):1560–1563.
4. Boyd R, Richerson PJ (1988) The evolution of reciprocity in sizable groups. J Theor Biol 132(3):337–356.
5. Hauert C, Schuster HG (1997) Effects of increasing the number of players and memory size in the iterated prisoner's dilemma: A numerical approach. Proc Biol Sci 264:513–519.
6. Suzuki S, Akiyama E (2007) Evolution of indirect reciprocity in groups of various sizes and comparison with direct reciprocity. J Theor Biol 245(3):539–552.
7. Santos FC, Pacheco JM (2011) Risk of collective failure provides an escape from the tragedy of the commons. Proc Natl Acad Sci USA 108(26):10421–10425.
8. Abou Chakra M, Traulsen A (2012) Evolutionary dynamics of strategic behavior in a collective-risk dilemma. PLoS Comput Biol 8(8):e1002652.
9. Yang W, et al. (2013) Nonlinear effects of group size on collective action and resource outcomes. Proc Natl Acad Sci USA 110(27):10916–10921.
10. Trivers RL (1971) The evolution of reciprocal altruism. Q Rev Biol 46:35–57.
11. Axelrod R (1984) The Evolution of Cooperation (Basic Books, New York).
12. Alexander R (1987) The Biology of Moral Systems (Aldine de Gruyter, New York).
13. Wedekind C, Milinski M (2000) Cooperation through image scoring in humans. Science 288(5467):850–852.
14. Henrich J, et al. (2001) Cooperation, reciprocity and punishment in fifteen small-scale societies. Am Econ Rev 91:73–78.
15. Nowak MA, Sigmund K (2005) Evolution of indirect reciprocity. Nature 437(7063):1291–1298.
16. van Veelen M, García J, Rand DG, Nowak MA (2012) Direct reciprocity in structured populations. Proc Natl Acad Sci USA 109(25):9929–9934.
17. Zagorsky BM, Reiter JG, Chatterjee K, Nowak MA (2013) Forgiver triumphs in alternating Prisoner's Dilemma. PLoS ONE 8(12):e80814.
18. Yamagishi T (1986) The provision of a sanctioning system as a public good. J Pers Soc Psychol 51:110–116.
19. Ostrom E (1990) Governing the Commons: The Evolution of Institutions for Collective Action (Cambridge Univ Press, Cambridge, UK).
20. Sigmund K, De Silva H, Traulsen A, Hauert C (2010) Social learning promotes institutions for governing the commons. Nature 466(7308):861–863.
21. Guala F (2012) Reciprocity: Weak or strong? What punishment experiments do (and do not) demonstrate. Behav Brain Sci 35(1):1–15.
22. Hilbe C, Traulsen A, Röhl T, Milinski M (2014) Democratic decisions establish stable authorities that overcome the paradox of second-order punishment. Proc Natl Acad Sci USA 111(2):752–756.
23. Press WH, Dyson FJ (2012) Iterated Prisoner's Dilemma contains strategies that dominate any evolutionary opponent. Proc Natl Acad Sci USA 109(26):10409–10413.
24. Stewart AJ, Plotkin JB (2012) Extortion and cooperation in the Prisoner's Dilemma. Proc Natl Acad Sci USA 109(26):10134–10135.
25. Ball P (2012) Physicists suggest selfishness can pay. Nature.
26. Hilbe C, Nowak MA, Sigmund K (2013) The evolution of extortion in iterated prisoner's dilemma games. Proc Natl Acad Sci USA 110(17):6913–6918.
27. Akin E (2013) Stable cooperative solutions for the iterated prisoner's dilemma.
28. Stewart AJ, Plotkin JB (2013) From extortion to generosity, evolution in the Iterated Prisoner's Dilemma. Proc Natl Acad Sci USA 110(38):15348–15353.
29. Hilbe C, Nowak MA, Traulsen A (2013) Adaptive dynamics of extortion and compliance. PLoS ONE 8(11):e77886.
30. Szolnoki A, Perc M (2014) Evolution of extortion in structured populations. Phys Rev E Stat Nonlin Soft Matter Phys 89:022804.
31. Pan L, Hao D, Rong Z, Zhou T (2014) Zero-determinant strategies in the iterated public goods game.
32. Kerr B, Godfrey-Smith P, Feldman MW (2004) What is altruism? Trends Ecol Evol 19(3):135–140.
33. Ledyard JO (1995) Public goods: A survey of experimental research. The Handbook of Experimental Economics, eds Kagel JH, Roth AE (Princeton Univ Press, Princeton).
34. Diekmann A (1985) Volunteer's dilemma. J Conflict Resolut 29:605–610.
35. Milinski M, Sommerfeld RD, Krambeck HJ, Reed FA, Marotzke J (2008) The collective-risk social dilemma and the prevention of simulated dangerous climate change. Proc Natl Acad Sci USA 105(7):2291–2294.
36. Nowak M, Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game. Nature 364(6432):56–58.
37. Hilbe C, Röhl T, Milinski M (2014) Extortion subdues human players but is finally punished in the prisoner's dilemma. Nat Commun 5:3976.
38. Boerlijst MC, Nowak MA, Sigmund K (1997) Equal pay for all prisoners. Am Math Mon 104:303–307.
39. Kurokawa S, Ihara Y (2009) Emergence of cooperation in public goods games. Proc Biol Sci 276(1660):1379–1384.
40. Van Segbroeck S, Pacheco JM, Lenaerts T, Santos FC (2012) Emergence of fairness in repeated group interactions. Phys Rev Lett 108(15):158104.
41. Sigmund K (2010) The Calculus of Selfishness (Princeton Univ Press, Princeton).
42. Friedman J (1971) A non-cooperative equilibrium for supergames. Rev Econ Stud 38:1–12.
43. Fudenberg D, Tirole J (1998) Game Theory (MIT Press, Cambridge, MA), 6th Ed.
44. Chen J, Zinger A (2014) The robustness of zero-determinant strategies in Iterated Prisoner's Dilemma games. J Theor Biol 357:46–54.
45. Kraines DP, Kraines VY (1989) Pavlov and the prisoner's dilemma. Theory Decis 26:47–79.
46. Peleg B, Sudhölter P (2003) Introduction to the Theory of Cooperative Games (Springer, Berlin).
47. Mesterton-Gibbons M, Gavrilets S, Gravner J, Akçay E (2011) Models of coalition or alliance formation. J Theor Biol 274(1):187–204.
Supporting Information

Hilbe et al. 10.1073/pnas.1407887111

SI Text
In the following, we show how the mathematical framework introduced by Press and Dyson (1) and Akin (2) can be extended to explore cooperation and control in multiplayer social dilemmas. We begin by defining the setup of repeated social dilemmas, and then we discuss the existence and the properties of zero-determinant strategies. In particular, we study ZD strategies that allow a player to differentiate between the actions of different coplayers. We also identify strategies that give rise to stable cooperation. To this end, we focus on two strategy classes: ZD strategies and pure memory-one strategies. Then, we investigate how individuals can extend their strategic options by coordinating their behaviors with others, and we apply our results to two examples of multiplayer dilemmas: the public goods game and the volunteer's dilemma. The appendix contains the proofs for our propositions.
Setup of the Model: Repeated Multiplayer Dilemmas. We consider repeated social dilemmas between n players (as illustrated in Fig. 1). In each round, players may either cooperate (C) or defect (D), and the players' payoffs for each round depend on their own action and on the number of cooperators among the other group members. Specifically, in a group with j other cooperators, a cooperator receives the payoff a_j, whereas a defector obtains b_j. To qualify as a social dilemma, we assume that the one-shot payoffs satisfy the following three conditions (3):

i) Independent of the own action, players prefer their coplayers to be cooperative,

    a_{j+1} ≥ a_j and b_{j+1} ≥ b_j for all j with 0 ≤ j < n − 1.   [S1]

ii) Within each mixed group, defectors strictly outperform cooperators,

    b_{j+1} > a_j for all j with 0 ≤ j < n − 1.   [S2]

iii) Mutual cooperation is preferred over mutual defection,

    a_{n−1} > b_0.   [S3]

As particular examples of such social dilemmas, we discuss the linear public goods game (4) and the volunteer's dilemma (5) later.
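As a quick sanity check, conditions S1–S3 can be verified numerically for the linear public goods game. The payoff formulas a_j = rc(j+1)/n − c and b_j = rcj/n, and the parameter values below, are assumptions for illustration (the game itself is only discussed later in the SI):

```python
# Check the social-dilemma conditions S1-S3 for an assumed linear public
# goods game: each cooperator pays cost c, the sum of contributions is
# multiplied by r and shared equally among all n group members.
n, r, c = 4, 3.0, 1.0  # illustrative values with 1 < r < n

def a(j):  # payoff of a cooperator with j cooperating coplayers
    return r * c * (j + 1) / n - c

def b(j):  # payoff of a defector with j cooperating coplayers
    return r * c * j / n

# S1: payoffs increase with the number of cooperating coplayers.
assert all(a(j + 1) >= a(j) and b(j + 1) >= b(j) for j in range(n - 1))
# S2: in mixed groups, defectors outperform cooperators (requires r < n).
assert all(b(j + 1) > a(j) for j in range(n - 1))
# S3: mutual cooperation beats mutual defection (requires r > 1).
assert a(n - 1) > b(0)
```

For this parameterization, b_{j+1} − a_j = c for every j, so condition S2 reduces to c > 0.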
We will assume here that the social dilemma is repeated infinitely often. This is merely done for simplicity of the argument; similar results can be obtained for finitely many rounds or for games with discounting of the future, as in refs. 6 and 7. In repeated games, a player's strategy needs to specify how to act in a given round, depending on the outcomes of the previous rounds. Given the strategies of all group members, let us denote player i's expected payoff in round t as π_i(t). The payoffs for the repeated game are defined as the average payoff per round,

    π_i = lim_{T→∞} (1/T) Σ_{t=1}^{T} π_i(t).   [S4]

In the following, we will assume that these limits exist. This always holds, for example, when players only base their decisions on a finite (but arbitrarily large) number of past rounds.
Zero-Determinant Strategies for Multiplayer Dilemmas.
Memory-one strategies and Akin's lemma. Although, in general, strategies for repeated games can be arbitrarily complicated, we showed that players can achieve remarkable control over the possible payoff relations by referring to the outcome of the last round only. In particular, we focused on players who only consider their own move in the previous round and the number of cooperators in the previous round (this is a consequence of our assumption that the game is symmetric, such that payoffs do not depend on which of the coplayers cooperated, but only on how many). Such strategies are particularly relevant when players can only observe the outcome of the game, but not the coplayers' individual actions. In the context of alliances, however, it is useful to consider a slightly more general strategy set, which allows players to distinguish between different coplayers. In this section, we will therefore develop a more general theory of ZD strategies.

To this end, let us denote player i's action in a given round as S_i ∈ {C, D}, and let σ = (S_1, …, S_n) ∈ {C, D}^n denote the overall outcome of that round. A memory-one strategy is a rule that tells a player what to do in the next round, given the outcome of the previous round. Formally, such memory-one strategies correspond to a map that takes the outcome of the previous round σ as input and that returns the cooperation probability p_σ for the next round, p = (p_σ)_{σ∈{C,D}^n}. For example, player i's strategy Repeat, which simply reiterates the own move from the previous round, takes the form p^Rep with

    p^Rep_σ = 1 if S_i = C, and p^Rep_σ = 0 if S_i = D.   [S5]

Additionally, a memory-one strategy also needs to specify a cooperation probability for the first round of the game. Because the outcome of infinitely repeated games is often independent of the first round, the initial cooperation probability is typically neglected (1, 2, 8–13). In the following, we will therefore only specify a player's initial cooperation probability when necessary.
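For concreteness, a memory-one strategy can be stored as a vector indexed by the 2^n possible outcomes σ. The following minimal sketch (the group size and player index are arbitrary choices) builds p^Rep from Eq. S5:

```python
# Represent memory-one strategies as dictionaries over all outcomes in
# {C,D}^n, and construct the strategy Repeat (Eq. S5) for one player.
from itertools import product

n = 3  # illustrative group size
outcomes = list(product("CD", repeat=n))  # all states sigma in {C,D}^n

def repeat_strategy(i):
    """Entries of p^Rep for player i: repeat the own previous move."""
    return {sigma: 1.0 if sigma[i] == "C" else 0.0 for sigma in outcomes}

p_rep = repeat_strategy(0)
assert p_rep[("C", "D", "D")] == 1.0  # player 0 cooperated -> cooperate again
assert p_rep[("D", "C", "C")] == 0.0  # player 0 defected -> defect again
```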
Hilbe et al. www.pnas.org/cgi/content/short/1407887111    1 of 19
If all members of the group use memory-one strategies, the calculation of payoffs according to Eq. S4 becomes particularly simple. In that case, the repeated game can be described as a Markov chain, as the outcome in the next round only depends on the outcome of the previous round (14–16). Although the assumption of memory-one strategies often simplifies calculations, we will show below that the properties of ZD strategies hold irrespective of the strategies of the other group members (in particular, ZD strategists do not require their coplayers to apply memory-one strategies).

For a game between n players with arbitrary but fixed strategies, let v_σ(t) be the probability that the outcome of the t-th round is σ, and let v(t) = [v_σ(t)]_{σ∈{C,D}^n} be the vector of these probabilities. For example, for pairwise games v(t) becomes [v_CC(t), v_CD(t), v_DC(t), v_DD(t)]. A limit distribution v is a limit point for t → ∞ of the sequence [v(1) + … + v(t)]/t. The entries v_σ of such a limit distribution correspond to the fraction of rounds in which the group members find themselves in state σ ∈ {C, D}^n over the course of the game. If one of the players applies a memory-one strategy p, Akin's lemma again guarantees that there is a powerful relationship between p and v (which can be shown with literally the same proof as in the main text).

Lemma (Akin's Lemma). Suppose the focal player applies an arbitrary memory-one strategy p. Then, for any limit distribution v (irrespective of the outcome of the initial round), we have

    (p − p^Rep) · v = 0,   [S6]

where the product refers to the usual scalar product, p · v = Σ_{σ∈{C,D}^n} p_σ v_σ.

We note that Akin's lemma makes no assumptions on the payoff structure of the game (in particular, it also applies to games that do not have the form of a social dilemma). Moreover, there are no restrictions on the strategies applied by the remaining n − 1 group members.
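Akin's lemma can be illustrated numerically for a pairwise game. The two memory-one strategies below are random draws, not strategies from the paper; as the lemma states, no payoff structure is needed:

```python
# Numerical illustration of Akin's lemma for a pairwise game, with states
# ordered CC, CD, DC, DD from the focal player's perspective.
import numpy as np

rng = np.random.default_rng(1)
p = rng.uniform(0.05, 0.95, 4)      # focal player's memory-one strategy
q_own = rng.uniform(0.05, 0.95, 4)  # coplayer's strategy, from her own view
q = q_own[[0, 2, 1, 3]]             # reindex to the focal state ordering

# Transition matrix M[s, s']: next-round actions are chosen independently.
M = np.zeros((4, 4))
for s in range(4):
    for i1, pr1 in enumerate([p[s], 1 - p[s]]):      # focal: C, D
        for i2, pr2 in enumerate([q[s], 1 - q[s]]):  # coplayer: C, D
            M[s, 2 * i1 + i2] = pr1 * pr2

# Approximate the limit distribution v (here the stationary distribution).
v = np.ones(4) / 4
for _ in range(5_000):
    v = v @ M

p_rep = np.array([1.0, 1.0, 0.0, 0.0])  # Repeat (Eq. S5) for the focal player
assert abs((p - p_rep) @ v) < 1e-10     # Akin's lemma, Eq. S6
```

At stationarity the identity is exact: p · v is the probability that the focal player cooperates in the next round, and p^Rep · v is the probability that she cooperated in the current one.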
Zero-determinant strategies. To define zero-determinant strategies, let us first introduce some further notation. For a given round outcome σ ∈ {C, D}^n, let |σ| denote the total number of cooperators (i.e., |σ| equals the number of Cs in σ). This allows us to write player i's payoff g^i_σ for that round as

    g^i_σ = a_{|σ|−1} if S_i = C, and g^i_σ = b_{|σ|} if S_i = D.   [S7]

Let g^i = (g^i_σ)_{σ∈{C,D}^n} be the corresponding payoff vector. Using this notation, we can write player i's payoff in round t as π_i(t) = g^i · v(t), and i's expected payoff for the repeated social dilemma according to Eq. S4 becomes π_i = g^i · v. Finally, let 1 = (1)_{σ∈{C,D}^n} denote the vector with all entries being equal to one. By definition of v, it follows that 1 · v = 1. We can now introduce ZD strategies as follows.

Theorem (Press and Dyson). Let α, β_j, and γ be parameters such that all entries of the following vector p are in [0, 1]. If player i applies a memory-one strategy of the form

    p = p^Rep + α g^i + Σ_{j≠i} β_j g^j + γ 1,   [S8]

then, irrespective of the strategies of the remaining n − 1 group members, payoffs obey the equation

    α π_i + Σ_{j≠i} β_j π_j + γ = 0.   [S9]

We refer to strategies of the form S8 as zero-determinant strategies or ZD strategies.

Proof: Follows immediately from Akin's lemma, because

    0 = (p − p^Rep) · v = (α g^i + Σ_{j≠i} β_j g^j + γ 1) · v = α π_i + Σ_{j≠i} β_j π_j + γ.   [S10]

By using ZD strategies, a player can thus enforce a linear payoff relation between his own payoff and the payoffs of the coplayers. Moreover, by appropriately choosing the parameters α, β_j, and γ, the player has direct control over the form of this payoff relation.
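The theorem can be checked numerically. In the sketch below, the three-player public goods payoffs and all parameter values (α, β_j, γ, and the coplayers' random strategies) are illustrative assumptions, chosen only so that p stays inside [0, 1]:

```python
# Construct a ZD strategy via Eq. S8 for a three-player example and verify
# the enforced linear payoff relation Eq. S9 against the limit distribution.
import numpy as np
from itertools import product

n, r, c = 3, 2.0, 1.0                      # assumed public goods parameters
outcomes = list(product([1, 0], repeat=n))  # 1 = C, 0 = D
a = lambda j: r * c * (j + 1) / n - c       # cooperator, j cooperating coplayers
b = lambda j: r * c * j / n                 # defector, j cooperating coplayers

def g(i):  # payoff vector g^i over all outcomes (Eq. S7)
    return np.array([a(sum(s) - 1) if s[i] else b(sum(s)) for s in outcomes])

p_rep = np.array([float(s[0]) for s in outcomes])  # Repeat for player 1
alpha, beta, gamma = 0.2, np.array([-0.15, -0.15]), 0.05
p = p_rep + alpha * g(0) + beta[0] * g(1) + beta[1] * g(2) + gamma
assert np.all((p >= 0) & (p <= 1))  # parameters keep p a valid strategy

rng = np.random.default_rng(0)      # coplayers: arbitrary random strategies
q1, q2 = rng.uniform(0.1, 0.9, 8), rng.uniform(0.1, 0.9, 8)

M = np.zeros((8, 8))                # Markov chain over the 8 outcomes
for si, s in enumerate(outcomes):
    coop = [p[si], q1[si], q2[si]]
    for ti, t in enumerate(outcomes):
        M[si, ti] = np.prod([x if act else 1 - x for x, act in zip(coop, t)])

v = np.ones(8) / 8
for _ in range(10_000):
    v = v @ M

pi = np.array([g(i) @ v for i in range(3)])
# Eq. S9: alpha*pi_1 + beta_2*pi_2 + beta_3*pi_3 + gamma = 0
assert abs(alpha * pi[0] + beta @ pi[1:] + gamma) < 1e-8
```

Rerunning with different seeds for the coplayers changes all individual payoffs but never the enforced relation, which is the point of the theorem.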
For our purpose, it will be convenient to use a slightly different representation of ZD strategies. For a player i who applies a ZD strategy, let us consider the following parameter transformation:

    l = −γ / (α + Σ_{k≠i} β_k),   s = −α / Σ_{k≠i} β_k,   w_j = β_j / Σ_{k≠i} β_k,   φ = −Σ_{k≠i} β_k.   [S11]

Using these new parameters, ZD strategies take the form

    p = p^Rep + φ[s g^i − Σ_{j≠i} w_j g^j + (1 − s) l 1],   [S12]
subject to the constraints φ ≠ 0, w_j ≥ 0, and Σ_{j≠i} w_j = 1, which directly arise from the definitions in Eq. S11. When player i applies such a ZD strategy, the enforced payoff relation according to Eq. S9 becomes

    π^{-i} = s π_i + (1 − s) l,   [S13]

where π^{-i} = Σ_{j≠i} w_j π_j is the weighted average payoff of the coplayers. We refer to l as the baseline payoff of the ZD strategy, to s as its slope, and to w = (w_j)_{j≠i} as the strategy's weights. The parameter φ does not have a direct effect on Eq. S13; however, the magnitude of φ determines how fast payoffs converge to the enforced payoff relation as the game proceeds (7).

Examples (the impact of different weights):

i) Equal weight on all coplayers. Suppose player i applies a ZD strategy with weights w_j = 1/(n − 1) for all j ≠ i. According to Eq. S12, the entries of such a ZD strategy have the form

    p_σ = 1 + φ[(1 − s)(l − a_{|σ|−1}) − ((n − |σ|)/(n − 1))(b_{|σ|} − a_{|σ|−1})]   if S_i = C,
    p_σ = φ[(1 − s)(l − b_{|σ|}) + (|σ|/(n − 1))(b_{|σ|} − a_{|σ|−1})]   if S_i = D.   [S14]

The cooperation probabilities of player i thus only depend on the player's own action S_i in the previous round and on the number of cooperators |σ|. That is, the ZD strategies that we discussed in the main text are exactly those ZD strategies that use the same weight for each of their coplayers. According to Eq. S13, such strategies enforce π^{-i} = s π_i + (1 − s) l, with π^{-i} being the arithmetic mean of all coplayers' payoffs, π^{-i} = Σ_{j≠i} π_j / (n − 1). In Fig. S1, we illustrate this relationship for two different social dilemmas (the public goods game and the volunteer's dilemma) and for two different ZD strategies (a generous and an extortionate ZD strategy).
ii) Full weight on one coplayer. Let us now suppose instead that player i chooses w such that all entries are zero except for w_j = 1 for some j ≠ i. It follows that

p_σ = 1 + φ (1−s)(l − a_k)   if S_i = S_j = C,
p_σ = 1 + φ [ s a_k − b_{k+1} + (1−s) l ]   if S_i = C, S_j = D,
p_σ = φ [ s b_k − a_{k−1} + (1−s) l ]   if S_i = D, S_j = C,
p_σ = φ (1−s)(l − b_k)   if S_i = S_j = D,   [S15]

with k = |σ_{−i}|. That is, player i's reaction only depends on the number of cooperators, the own move, and on player j's move. The enforced payoff relation S13 becomes π^j = s π^i + (1−s) l.
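The enforced relation S13 can also be checked numerically. The sketch below is our own illustration (not part of the original analysis): it builds the 2^n-state Markov chain of a three-player public goods game (payoffs as in Eq. S43, a_j = (j+1)rc/n − c and b_j = jrc/n) in which player 1 uses an extortionate equal-weights ZD strategy from Eq. S14 with l = b_0 = 0, while the two coplayers use arbitrary, hypothetical memory-one strategies. The stationary payoffs then satisfy π^{−1} = s π^1 + (1−s) l up to numerical error.

```python
import itertools

# Three-player public goods game: a_j for a cooperator, b_j for a defector,
# where j is the number of cooperating coplayers (Eq. S43).
n, r, c = 3, 2.0, 1.0
a = lambda j: (j + 1) * r * c / n - c
b = lambda j: j * r * c / n

# Equal-weights ZD strategy of player 1 (Eq. S14); extortionate: l = b_0 = 0.
l, s, phi = 0.0, 0.6, 0.2

def p_zd(own, j):
    """Cooperation probability given own previous move (1 = C) and j cooperating coplayers."""
    if own == 1:
        return 1 + phi * ((1 - s) * (l - a(j)) - (n - 1 - j) / (n - 1) * (b(j + 1) - a(j)))
    return phi * ((1 - s) * (l - b(j)) + j / (n - 1) * (b(j) - a(j - 1)))

# Arbitrary (hypothetical) memory-one strategies for players 2 and 3.
strategies = [p_zd,
              lambda own, j: 0.7,
              lambda own, j: 0.8 if own == 1 else 0.3]

states = list(itertools.product((0, 1), repeat=n))  # 1 = cooperate

def step_prob(state, nxt):
    """Transition probability between full action profiles."""
    prob = 1.0
    for i in range(n):
        j = sum(state) - state[i]            # cooperators among i's coplayers
        pc = strategies[i](state[i], j)
        prob *= pc if nxt[i] == 1 else 1 - pc
    return prob

# Power iteration toward the stationary (limit) distribution v.
v = {st: 1 / len(states) for st in states}
for _ in range(5000):
    v = {t: sum(v[st] * step_prob(st, t) for st in states) for t in states}

def payoff(i, st):
    j = sum(st) - st[i]
    return a(j) if st[i] == 1 else b(j)

pi = [sum(v[st] * payoff(i, st) for st in states) for i in range(n)]
lhs = (pi[1] + pi[2]) / 2          # arithmetic mean of the coplayers' payoffs
rhs = s * pi[0] + (1 - s) * l      # enforced relation, Eq. S13
```

By Akin's lemma, the relation holds exactly for any stationary distribution, so the two sides agree to machine precision regardless of the coplayers' strategies.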
A player cannot enforce arbitrary payoff relations S13 because the parameters l, s, w, and φ need to be set such that the resulting cooperation probabilities according to Eq. S12 are in the unit interval. We thus say that a payoff relation (l, s, w) is enforceable if there is a φ ≠ 0 such that the resulting ZD strategy p satisfies p_σ ∈ [0, 1] for all possible outcomes σ ∈ {C, D}^n. The following proposition gives some necessary conditions for enforceable payoff relations.
Proposition 1 (Necessary Conditions for Enforceable Payoff Relations). Any enforceable payoff relation (l, s, w) satisfies −1/(n−1) ≤ s ≤ 1, and if s < 1 then b_0 ≤ l ≤ a_{n−1}. Moreover, the parameter φ ≠ 0 needs to be chosen such that φ > 0.

In addition to these necessary conditions, one can also give a characterization of all enforceable payoff relations.

Proposition 2 (Enforceable Payoff Relations). Consider a payoff relation (l, s, w) for player i such that w_j ≥ 0 for all j. Let w̃_j denote the sum of the j smallest entries of w (excluding the entry w_i), and let w̃_0 = 0. Then (l, s, w) is enforceable for a given social dilemma if and only if either s = 1 or

max_{0≤j≤n−1} { b_j − w̃_j (b_j − a_{j−1}) / (1−s) } ≤ l ≤ min_{0≤j≤n−1} { a_j + w̃_{n−1−j} (b_{j+1} − a_j) / (1−s) }.   [S16]
Remark (some observations on enforceable payoff relations):

i) A direct consequence of Proposition 2 is that ZD strategies exist for all social dilemmas and for all weights w with w_j ≥ 0 (one only needs to set s = 1). Moreover, if all weights are positive, i.e., w_j > 0 for all j ≠ i, it also follows that any baseline payoff b_0 ≤ l ≤ a_{n−1} is enforceable for some s < 1 (because b_j > a_{j−1} for all j; one only needs to choose an s that is sufficiently close to one).
ii) For a given slope and given weights, we also note that if the baseline payoffs l_1 and l_2 are enforceable, then so is any baseline payoff between l_1 and l_2.
iii) In the special case of equal weights on all coplayers, w_j = 1/(n−1) for all j, the sum of the j smallest entries of w is simply given by w̃_j = j/(n−1). In that case, a payoff relation is enforceable if and only if either s = 1 or

max_{0≤j≤n−1} { b_j − (j/(n−1)) (b_j − a_{j−1}) / (1−s) } ≤ l ≤ min_{0≤j≤n−1} { a_j + ((n−j−1)/(n−1)) (b_{j+1} − a_j) / (1−s) }.   [S17]
Fig. S2 gives an illustration of the enforceable payoff relations, again for the two examples of a public goods game and a volunteer's dilemma. In particular, in both examples the space of enforceable payoff relations shrinks with group size. This holds more generally: if the payoff advantage of a defector b_{j+1} − a_j does not increase with group size, then Proposition 2 implies that larger group sizes n make it more difficult to enforce specific payoff relations.
iv) In the special case of full weight on one coplayer, i.e., w_k = 1 for some k ≠ i and all other entries of w are zero, the sum of the j smallest entries is either w̃_j = 0 (if j < n−1) or w̃_j = 1 (if j = n−1). Because b_j ≤ b_{n−2} for j ≤ n−2 and a_j ≥ a_1 for j ≥ 1, a payoff relation is enforceable if and only if either s = 1 or

max { b_{n−2}, b_{n−1} − (b_{n−1} − a_{n−2}) / (1−s) } ≤ l ≤ min { a_0 + (b_1 − a_0) / (1−s), a_1 }.   [S18]

For games in which b_{n−2} > a_1, condition S18 cannot be satisfied. In particular, the condition cannot be satisfied for social dilemmas with group size n > 3 (because for such social dilemmas b_{n−2} > a_1 follows from S1 and S2). Thus, in large groups, the only feasible ZD strategies are those with s = 1; i.e., the only payoff relationship that player i can enforce between his own payoff and player j's payoff is the fair payoff relationship π^j = π^i.
v) A comparison of Eqs. S17 and S18 thus shows that the set of enforceable payoff relations is larger if player i uses the same weight for all coplayers. This holds more generally: if a payoff relation (l, s, w) is enforceable for some weight vector w, then it is also enforceable for the weight vector that puts equal weight on all coplayers (this follows from Proposition 2 because w̃_j becomes largest when all coplayers have the same weight).
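The bounds in Eq. S17 are easy to evaluate numerically. The sketch below is our own illustration (not part of the original text); it uses the public goods payoffs a_j = (j+1)rc/n − c and b_j = jrc/n (Eq. S43) and computes the interval of enforceable baseline payoffs l for a given slope s and equal weights.

```python
def pgg_payoffs(n, r, c=1.0):
    """Public goods payoffs (Eq. S43): a[j]/b[j] for a cooperator/defector with j cooperating coplayers."""
    a = [(j + 1) * r * c / n - c for j in range(n)]
    b = [j * r * c / n for j in range(n)]
    return a, b

def enforceable_interval(a, b, s, n):
    """Bounds on the baseline payoff l from Eq. S17 (equal weights, s < 1).

    Returns (lo, hi); the relation (l, s) is enforceable iff lo <= l <= hi.
    """
    assert s < 1
    lo = max(b[j] - (j / (n - 1)) * (b[j] - a[j - 1]) / (1 - s) if j > 0 else b[0]
             for j in range(n))
    hi = min(a[j] + ((n - 1 - j) / (n - 1)) * (b[j + 1] - a[j]) / (1 - s) if j < n - 1 else a[n - 1]
             for j in range(n))
    return lo, hi

# Equalizers (s = 0) in a public goods game with r = 2: the interval is
# nonempty for n = 3 but empty for n = 5, illustrating how the enforceable
# set shrinks with group size.
lo3, hi3 = enforceable_interval(*pgg_payoffs(3, 2.0), s=0.0, n=3)
lo5, hi5 = enforceable_interval(*pgg_payoffs(5, 2.0), s=0.0, n=5)
```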
Nash Equilibria of Repeated Multiplayer Dilemmas. Various folk theorems have shown how repetition can be used to sustain cooperation in social dilemmas (17, 18). To prove these folk theorems, one typically constructs specific strategies that give rise to given payoff combinations (π^1, …, π^n) and then shows that these strategies are indeed an equilibrium of the repeated game. Herein, we ask a somewhat different question: within a certain strategy class, what are all Nash equilibria? We answer this question for two different strategy classes, the class of ZD strategies and the class of pure memory-one strategies. The equilibria among these two strategy classes will prove to be stable against any mutant strategy (i.e., we do not need to assume that mutants are restricted to ZD strategies or memory-one strategies).

To simplify the analysis, we will focus on symmetric memory-one strategies for which the cooperation probability only depends on the own move in the previous round and on the number of cooperators in the previous round (but not on which of the coplayers cooperated). In that case, each possible outcome σ can be identified with a pair (S, j), where S ∈ {C, D} is the player's own move and j is the number of cooperators among the other n−1 group members. This allows us to represent memory-one strategies as vectors of the form p = (p_{S,j}). Using an analogous notation as in the previous section, v = (v_{S,j}) corresponds to the frequency of observing each of the states (S, j) over the course of the game, g^i = (g^i_{S,j}) corresponds to player i's payoffs, and g^{−i} = (g^{−i}_{S,j}) corresponds to the average payoffs of i's coplayers (using the arithmetic mean g^{−i} = Σ_{j≠i} g^j/(n−1)). Because symmetric memory-one strategies are a subset of all memory-one strategies, the previous results (in particular, Akin's lemma and the Press and Dyson theorem) naturally carry over.
Nash equilibria among ZD strategies. Consider a group in which all players apply the ZD strategy p with parameters l, s, φ, and w_j = 1/(n−1) for all j ≠ i, and let us suppose the nth player considers deviating. We will refer to the first n−1 players as the residents and to the nth player as the mutant, and we denote a resident's payoff by π and the mutant's payoff by π̃. Because residents apply a ZD strategy, Eq. S13 implies that each of them enforces the relationship

( (n−2) π + π̃ ) / (n−1) = s π + (1−s) l,   [S19]

which can be rewritten as

π̃ = s_R π + (1 − s_R) l,   [S20]

with

s_R = s (n−1) − (n−2).   [S21]

That is, the n−1 residents collectively enforce a linear relationship between their own payoff π and the payoff of the mutant π̃, with the same baseline payoff l and with slope s_R. By using the same strategy as the residents, the mutant obtains the payoff π̃ = π. For s_R < 1, Eq. S20 then implies that π̃ = π = l. For s_R = s = 1, the value of l is a free parameter that does not have an effect on the entries of the ZD strategy; to be consistent with the case s < 1, we define l to be the payoff that the strategy yields if applied by all group members (this payoff depends on the strategy's cooperation probability in the initial round). Thus, by using the residents' strategy, the mutant obtains the payoff l.
For p to be a Nash equilibrium, a minimum requirement is that the mutant must not have an incentive to switch to a different ZD strategy p̃ with parameters l̃, s̃, and −1/(n−1) ≤ s̃ < 1. Such a mutant would enforce the payoff relation

π = s̃ π̃ + (1 − s̃) l̃.   [S22]

Solving the two linear equations S20 and S22 for the mutant's payoff yields

π̃ = ( l (1 − s_R) + s_R (1 − s̃) l̃ ) / ( 1 − s̃ s_R ).   [S23]

The mutant has no incentive to deviate when π̃ ≤ l, that is, when

π̃ − l = s_R (1 − s̃) (l̃ − l) / ( 1 − s̃ s_R ) ≤ 0.   [S24]
Because −1/(n−1) ≤ s < 1 (by Proposition 1) and because −1/(n−1) ≤ s̃ < 1 (by assumption), the denominator of Eq. S24 is positive. We can therefore distinguish three cases.

i) s_R = 0. In that case π̃ − l = 0, and player n cannot improve his payoff by deviating.
ii) s_R > 0. In that case π̃ − l ≤ 0 if and only if l̃ ≤ l. To prevent the mutant from deviating, the residents thus need to apply a strategy with the maximum possible baseline payoff, l = a_{n−1}.
iii) s_R < 0. Then π̃ − l ≤ 0 if and only if l̃ − l ≥ 0. To prevent the mutant from deviating, the residents' ZD strategy needs to set l to the minimum value l = b_0.

This result also holds if mutants are not restricted to ZD strategies.
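As a sanity check on Eq. S23, one can solve the linear system S20/S22 directly and compare with the closed form. This is a small algebra sketch of our own (the function names are ours, not from the original text):

```python
def mutant_payoff_closed_form(l, s_R, l_t, s_t):
    """Mutant payoff from Eq. S23 (l_t, s_t denote the mutant's parameters)."""
    return (l * (1 - s_R) + s_R * (1 - s_t) * l_t) / (1 - s_t * s_R)

def mutant_payoff_solved(l, s_R, l_t, s_t):
    """Solve the 2x2 linear system by Cramer's rule:
       -s_R * pi + pi_m       = (1 - s_R) * l      (Eq. S20)
         pi     - s_t * pi_m  = (1 - s_t) * l_t    (Eq. S22)
    """
    det = s_R * s_t - 1
    pi_m = (-s_R * (1 - s_t) * l_t - (1 - s_R) * l) / det
    return pi_m

# Both routes agree; for s_R = 0 the mutant payoff reduces to l (case i below).
for (l, s_R, l_t, s_t) in [(1.0, 0.5, 0.2, 0.3), (0.0, -0.4, 1.0, 0.9), (2.0, 0.0, -1.0, 0.5)]:
    assert abs(mutant_payoff_closed_form(l, s_R, l_t, s_t)
               - mutant_payoff_solved(l, s_R, l_t, s_t)) < 1e-12
```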
Proposition 3 (Nash Equilibria Among ZD Strategies). Consider a social dilemma in which mutual cooperation is the best outcome for the group, whereas mutual defection is the worst possible outcome,

b_0 ≤ min_{0≤j≤n} ( j a_{j−1} + (n−j) b_j ) / n,   and   max_{0≤j≤n} ( j a_{j−1} + (n−j) b_j ) / n ≤ a_{n−1}.   [S25]

Let p be a ZD strategy with parameters l, s, and φ, and let s_R = (n−1) s − (n−2). Then p is a Nash equilibrium if and only if one of the following three cases holds:

s_R = 0 and b_0 ≤ l ≤ a_{n−1};   s_R > 0 and l = a_{n−1};   or   s_R < 0 and l = b_0.
Remark (some observations for stable ZD strategies):

i) The three conditions in Proposition 3 do not depend on φ. Whether a ZD strategy is stable only depends on the payoff relation that it enforces, but not on the exact strategy that gives rise to this payoff relation.
ii) The three conditions do not directly depend on the sign of s, but on the sign of s_R = (n−1) s − (n−2). Whether a given ZD strategy is stable thus depends on the group size.
iii) The second condition is of particular interest, because it states that stable mutual cooperation can be achieved by ZD strategies with l = a_{n−1} and s > (n−2)/(n−1). For pairwise games, these are exactly the generous ZD strategies with l = a_{n−1} and s > 0 (2, 10). As the group size increases, generous strategies need to approach the fair strategies (with s = 1) to allow for stable cooperation.
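Proposition 3 translates directly into a small decision procedure. The sketch below is our own hedged illustration (the payoff values b_0 and a_{n−1} are inputs, here taken from a public goods game):

```python
def s_R(n, s):
    """Collective slope of the n-1 residents, Eq. S21."""
    return (n - 1) * s - (n - 2)

def zd_is_nash(n, s, l, b0, a_n1, tol=1e-12):
    """Nash conditions of Proposition 3 for an equal-weights ZD strategy (l, s)."""
    sr = s_R(n, s)
    if abs(sr) <= tol:
        return b0 <= l <= a_n1
    return abs(l - a_n1) <= tol if sr > 0 else abs(l - b0) <= tol

# Public goods game with n = 4, r = 2, c = 1: b_0 = 0 and a_{n-1} = rc - c = 1.
# A generous strategy (l = a_{n-1}) needs s > (n-2)/(n-1) = 2/3 to be stable.
```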
Corollary 1 (Convexity of the Set of Nash Equilibria). Consider a social dilemma in which mutual cooperation is the best outcome for the group. Suppose p′ and p″ are two ZD strategies that both give rise to stable cooperation (i.e., l′ = l″ = a_{n−1} and s′ > (n−2)/(n−1), s″ > (n−2)/(n−1)). Then any linear combination p = λ p′ + (1−λ) p″ with 0 ≤ λ ≤ 1 is also a ZD strategy that gives rise to stable cooperation.

Proof: A direct computation shows that p can be written as a ZD strategy with parameters l = a_{n−1}, φ = λ φ′ + (1−λ) φ″ > 0, and s = ( λ φ′ s′ + (1−λ) φ″ s″ ) / ( λ φ′ + (1−λ) φ″ ) > (n−2)/(n−1).

A similar result can also be shown for ZD strategies that lead to mutual defection.
Nash equilibria among pure memory-one strategies. As another application of Akin's lemma, we show in the following which pure memory-one strategies allow for stable cooperation in multiplayer social dilemmas. To this end, let us again consider a group in which the first n−1 players apply some pure memory-one strategy p = (p_{S,j}) and in which the nth player considers deviating. Let v denote a limit distribution from the perspective of the n−1 resident players, and let ṽ be the corresponding limit distribution from the perspective of the mutant player. The following observation will be useful:

Lemma 1 (Relationship Between Limit Distributions). If the residents apply a pure memory-one strategy, p_{S,j} ∈ {0, 1} for all (S, j), the entries of v satisfy

v_{C,j} = 0 for all j < n−2,   v_{D,j} = 0 for all j > 1.   [S26]

Moreover, the limit distributions v and ṽ are related by

ṽ_{C,n−1} = v_{C,n−1},   ṽ_{C,0} = v_{D,1},   ṽ_{D,n−1} = v_{C,n−2},   ṽ_{D,0} = v_{D,0},   [S27]

and ṽ_{S,j} = 0 for all other states (S, j).

Proof: As all residents use the same pure strategy, they play the same action in any given round. Thus, if one of the residents cooperates, then there are at least n−2 other cooperators, and therefore the probability to end up in a state with fewer than n−2 other cooperators is zero, v_{C,j} = 0 for j < n−2. The same argument shows that v_{D,j} = 0 for j > 1. Finally, for Eq. S27, we note that the mutant is in the state (C, n−1) if and only if each of the residents is in the state (C, n−1). Similarly, the mutant is in the states (C, 0), (D, n−1), and (D, 0) if and only if the residents are in the states (D, 1), (C, n−2), and (D, 0), respectively.
Proposition 4 (Pure Memory-One Strategies That Give Rise to Stable Cooperation). Consider a social dilemma in which there is an incentive to deviate from mutual cooperation, b_{n−1} > a_{n−1}. Let p be a pure memory-one strategy that cooperates in the first round and that sticks to cooperation as long as all other players do so (i.e., p_{C,n−1} = 1). Then p is a Nash equilibrium if and only if the entries of p satisfy the three conditions
p_{C,n−2} = 0,   [S28a]

p_{D,1} ≤ (a_{n−1} − a_0) / (b_{n−1} − a_{n−1}),   [S28b]

p_{D,0} ≤ (a_{n−1} − b_0) / (b_{n−1} − a_{n−1}).   [S28c]
Remark (some remarks on Proposition 4):

i) The full proof of Proposition 4 is given in the Appendix; the step (⇒) follows from a straightforward computation of payoffs for two possible mutant strategies. The step (⇐) requires a more sophisticated argument; it is exactly in this second step where Akin's lemma comes into play.
ii) According to Proposition 4, the stability of a cooperative and pure memory-one strategy p is solely determined by the four entries p_{C,n−1} = 1, p_{C,n−2} = 0, p_{D,1}, and p_{D,0}. This observation is a consequence of Lemma 1, which allowed us to neglect all other entries of p. For pairwise games, Lemma 1 is not required, and thus pairwise games allow more general versions of Proposition 4, which are then also valid for mixed memory-one strategies (2, 6).
Examples (for stable memory-one strategies):

i) The proofs of the various folk theorems are often based on trigger strategies, which relentlessly play D after any deviation from the equilibrium path (17). An example of such a strategy is Grim, for which p_{C,n−1} = 1 and p_{S,j} = 0 for all other states (S, j). Because Grim satisfies the three conditions in Eq. S28, Grim is indeed a Nash equilibrium for all multiplayer dilemmas.
ii) In their analysis of multiplayer dilemmas, refs. 19 and 20 consider the performance of generalized versions of Tit-for-Tat. These TFT_k strategies cooperate if at least k other group members have cooperated in the previous round, i.e., p_{S,j} = 1 if and only if j ≥ k. As the conditions S28 are only satisfied for k = n−1, it follows that TFT_{n−1} is the only Nash equilibrium among these generalized versions of Tit-for-Tat.
iii) Unfortunately, neither Grim nor TFT_{n−1} is robust under the realistic assumption that players sometimes commit errors (16). For stochastic simulations of variants of the public goods game, ref. 15 found that evolution may promote strategies that only cooperate after mutual cooperation or after mutual defection, i.e., p_{C,n−1} = p_{D,0} = 1, and p_{S,j} = 0 for all other states (S, j). We refer to such strategies as WSLS. For the prisoner's dilemma, this strategy corresponds to the Win-Stay, Lose-Shift behavior described in ref. 14. According to Eq. S28, WSLS is a Nash equilibrium if and only if the social dilemma satisfies (b_{n−1} + b_0)/2 ≤ a_{n−1}. In Fig. S3, we illustrate the stability of these strategies when the social dilemma is a public goods game.
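The three conditions S28 can be checked mechanically. The sketch below is our own illustration (the strategy constructors are hypothetical helpers, not from the original text); it encodes Grim, TFT_k, and WSLS for a public goods game and tests Eq. S28:

```python
def pgg(n, r, c=1.0):
    """Public goods payoffs (Eq. S43)."""
    a = [(j + 1) * r * c / n - c for j in range(n)]
    b = [j * r * c / n for j in range(n)]
    return a, b

def is_cooperative_nash(p, a, b, n):
    """Conditions S28 for a pure memory-one strategy p[(S, j)] with p[('C', n-1)] == 1."""
    assert p[('C', n - 1)] == 1 and b[n - 1] > a[n - 1]
    return (p[('C', n - 2)] == 0
            and p[('D', 1)] <= (a[n - 1] - a[0]) / (b[n - 1] - a[n - 1])
            and p[('D', 0)] <= (a[n - 1] - b[0]) / (b[n - 1] - a[n - 1]))

def grim(n):
    return {(S, j): 1 if (S, j) == ('C', n - 1) else 0 for S in 'CD' for j in range(n)}

def tft(n, k):
    return {(S, j): 1 if j >= k else 0 for S in 'CD' for j in range(n)}

def wsls(n):
    p = grim(n)
    p[('D', 0)] = 1          # also cooperate after mutual defection
    return p
```

For n = 4 and r = 2, WSLS is stable because (b_3 + b_0)/2 = 0.75 ≤ a_3 = 1; for n = 10 and r = 1.2 the condition fails.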
With a similar approach, we can also characterize the pure memory-one strategies that result in defection.

Proposition 5 (Pure Memory-One Strategies That Give Rise to Stable Defection). Consider a social dilemma with b_0 ≥ a_0 and b_{n−1} ≥ a_{n−1}. Let p be a pure memory-one strategy that defects in the first round and that sticks to defection as long as all other players do so (i.e., p_{D,0} = 0). Then p is a Nash equilibrium if and only if at least one of the following two conditions is satisfied:

p_{D,1} = 0;   or   p_{C,n−1} = p_{C,n−2} = 0 and (b_{n−1} + a_0)/2 ≤ b_0.   [S29]
Remark (some remarks on Proposition 5):

i) As a special case of the above proposition, we conclude that AllD is an equilibrium for any social dilemma with b_{n−1} ≥ a_{n−1} and b_0 ≥ a_0.
ii) Conversely, mutual defection is never stable in social dilemmas with b_0 < a_0 (such as the volunteer's dilemma). If b_0 < a_0, it follows from condition S1 that b_j < a_j for all j. As a consequence, an equilibrium in which everyone defects can always be invaded by a player who switches to AllC.
Coordinated Play and Zero-Determinant Alliances. In the previous sections, we analyzed the amount of control that single individuals can exert over their coplayers in repeated multiplayer dilemmas. Now we are interested in the question whether individuals can gain a higher amount of control by coordinating their play with other group members. To this end, let us consider a set of n^A < n players who agree on a joint strategy. We will refer to these players as the allies and to the remaining group members as the outsiders (as depicted in Fig. 1). Depending on the degree of coordination, one can think of various forms of coordinated play. In the following, we are going to explore two modes of coordination:

i) Strategy alliances. Here, players only agree on the set of alliance members and on a common memory-one strategy that each ally applies. During the actual game, there is no further communication taking place. Strategy alliances thus require a minimum amount of coordination. This alliance type seems particularly relevant when allies do not have the possibility to communicate during the game or when it is too costly to coordinate the allies' actions in each round.
ii) Synchronized alliances. Alternatively, players may synchronize their decisions in each round such that all allies play the same action. In effect, a synchronized alliance thus behaves like a new entity that has a higher leverage than each single player.

We will use the symbol π^A to refer to the payoff of an ally and π^{−A} to refer to the average payoff of the outsiders. In the following, we investigate which linear payoff relationships ZD alliances can enforce and how their strategic strength depends on the mode of coordination.
Strategy alliances. To model strategy alliances, let us consider a group of n^A allies, with 1 ≤ n^A < n. We assume that the allies agree on using the same ZD strategy p during the game. The parameters of this ZD strategy are given by l, s, and φ > 0. To allow allies to differentiate between the actions of other allies and outsiders, the weights w = (w_j) of the ZD strategy may depend on a player's membership in the alliance. That is, if player i is a member of the alliance, then

w_j^i = w^A if j is a member of the alliance,   w_j^i = w^{−A} if j is an outsider,   [S30]

with w^A ≥ 0 and w^{−A} ≥ 0 such that the weights sum up to one, (n^A − 1) w^A + (n − n^A) w^{−A} = 1. Because all allies use the same strategy, it follows that they all get the same payoff. Moreover, by Eq. S13, each of the allies enforces the payoff relationship

(n^A − 1) w^A π^A + (n − n^A) w^{−A} π^{−A} = s π^A + (1 − s) l.   [S31]

This payoff relationship can be rewritten as

π^{−A} = s^A π^A + (1 − s^A) l,   [S32]

such that

s^A = ( s − (n^A − 1) w^A ) / ( 1 − (n^A − 1) w^A )   [S33]

is the effective slope of the strategy alliance. For w^A = 1/(n−1), we recover the case of equal weights on all group members. For general weights w^A, we say that the payoff relationship S32 with parameters (l, s^A) is enforceable if we can find φ > 0 and 0 ≤ w^A < 1/(n^A − 1) such that all entries of the resulting ZD strategy according to Eq. S12 are in the unit interval. The following gives the corresponding characterization.

Proposition 6 (Enforceable Payoff Relations for Strategy Alliances). A strategy alliance can enforce the payoff relation (l, s^A) if and only if either s^A = 1 or

s^A < 1 and max_{0≤j≤n−n^A} { b_j − (j/(n−n^A)) (b_j − a_{j−1}) / (1−s^A) } ≤ l ≤ min_{n^A−1≤j≤n−1} { a_j + ((n−j−1)/(n−n^A)) (b_{j+1} − a_j) / (1−s^A) }.   [S34]

Moreover, if n^A ≤ n/2, then −1 ≤ −n^A/(n−n^A) ≤ s^A < 1.
Remark (on strategy alliances):

i) Earlier we saw that individuals typically lose their strategic power when groups become large. The above proposition shows that players can regain control by forming alliances. In particular, the space of enforceable payoff relations (l, s^A) increases with the size of the alliance n^A. Larger alliances can therefore enforce more extreme payoff relationships, as illustrated in Fig. S4.
ii) Somewhat surprisingly, it follows from the proof of Proposition 6 that the set of enforceable payoff relationships becomes maximal when w^A approaches 1/(n^A − 1) (and therefore w^{−A} → 0). The most powerful alliances are those in which the outsiders' actions only have an infinitesimal influence.
iii) In contrast to the case of ZD strategies for individual players, we note that the effective slope s^A for alliances does not need to be bounded from below. As an example, let us assume the alliance has reached a size n^A such that b_{n−n^A} ≤ a_{n^A−1} (which can only happen when n^A > n/2). Because b_{n−n^A} is an upper bound for the left side of Eq. S34 and a_{n^A−1} is a lower bound for the right side of Eq. S34, it follows that any l with b_{n−n^A} ≤ l ≤ a_{n^A−1} can be enforced, irrespective of the value of s^A < 1.
iv) However, for alliances that have reached a size n^A such that b_{n−n^A} < a_{n^A−1}, the theory of ZD strategies becomes somewhat less relevant: such alliances are better off by cooperating in each round (if all allies cooperate, their payoff is at least a_{n^A−1}, whereas if they all defect, their payoff is at most b_{n−n^A}). In other words, if n^A individuals are able to make a binding agreement that they will all play the same strategy, in a social dilemma with b_{n−n^A} < a_{n^A−1}, then unconditional cooperation is a dominant strategy for the alliance.
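The widening of the enforceable set with alliance size can be checked numerically. The following sketch is our own illustration (it reuses the public goods payoffs of Eq. S43) and evaluates the bounds of Eq. S34:

```python
def pgg(n, r, c=1.0):
    """Public goods payoffs (Eq. S43)."""
    a = [(j + 1) * r * c / n - c for j in range(n)]
    b = [j * r * c / n for j in range(n)]
    return a, b

def alliance_interval(a, b, n, nA, sA):
    """Bounds on the baseline payoff l from Eq. S34 for a strategy alliance of size nA (sA < 1)."""
    assert sA < 1
    lo = max(b[j] - (j / (n - nA)) * (b[j] - a[j - 1]) / (1 - sA) if j > 0 else b[0]
             for j in range(n - nA + 1))
    hi = min(a[j] + ((n - j - 1) / (n - nA)) * (b[j + 1] - a[j]) / (1 - sA) if j < n - 1 else a[n - 1]
             for j in range(nA - 1, n))
    return lo, hi

# Public goods game with n = 5, r = 3: a single player (nA = 1) cannot act as
# an equalizer (sA = 0), but an alliance of three players can.
a, b = pgg(5, 3.0)
lo1, hi1 = alliance_interval(a, b, 5, 1, 0.0)
lo3, hi3 = alliance_interval(a, b, 5, 3, 0.0)
```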
Synchronized alliances. In the previous scenario, we assumed that each of the allies decides independently whether to cooperate in a given round. Let us now turn to a scenario in which allies meet after each round to decide which action they collectively play in the next round. As a result, the alliance members act as a single entity, in a game with n − n^A coplayers. To investigate such a scenario, let us first adapt our notation correspondingly. For a given set of allies, let (S, j) refer to the outcome in which all allies choose S ∈ {C, D}, and in which j of the outsiders cooperate. As in previous sections, the limit distribution v = (v_{S,j}) corresponds to the fraction of rounds the alliance finds itself in state (S, j) over the course of the game. A memory-one strategy for a synchronized alliance is a vector p = (p_{S,j}): given the outcome (S, j) of the previous round, the cooperation probability p_{S,j} is used to determine whether all allies cooperate or all allies defect in the next round. The synchronized alliance uses the strategy Repeat if cooperation probabilities are given by

p^Rep_{S,j} = 1 if S = C,   p^Rep_{S,j} = 0 if S = D.   [S35]
With literally the same proof as in the main text, one can verify Akin's lemma for synchronized alliances: if the alliance applies a memory-one strategy p, then any corresponding limit distribution v satisfies (p − p^Rep) · v = 0. Let us next write down the possible payoffs in a given round. The payoff vector g^A for the synchronized alliance has the entries

g^A_{S,j} = a_{n^A+j−1} if S = C,   g^A_{S,j} = b_j if S = D,   [S36]

and the corresponding vector g^{−A} that contains the average payoffs of the outsiders (using the arithmetic mean) takes the form

g^{−A}_{S,j} = ( j a_{n^A+j−1} + (n − n^A − j) b_{n^A+j} ) / (n − n^A)   if S = C,
g^{−A}_{S,j} = ( j a_{j−1} + (n − n^A − j) b_j ) / (n − n^A)   if S = D.   [S37]

Using these payoff vectors, the payoff of each ally is given by π^A = g^A · v, and the mean payoff of the outsiders is π^{−A} = g^{−A} · v. Analogously to the case of individual players, we can define ZD strategies for synchronized alliances as strategies of the form

p = p^Rep + α g^A + β g^{−A} + γ 1,   [S38]

with 1 being the memory-one strategy with all entries being one, and with α, β, and γ being parameters of the ZD strategy (with β ≠ 0). Akin's lemma then implies that such alliances enforce the relationship

α π^A + β π^{−A} + γ = 0.   [S39]

Again, we stress the fact that this relationship holds irrespective of the strategies of the outsiders: even if outsiders notice that they are facing a synchronized alliance, there is nothing they can do to prevent the above payoff relationship. As above, we use the parameter transformation l = −γ/(α + β), s^A = −α/β, and φ = −β to write ZD strategies as follows:

p = p^Rep + φ [ s^A g^A − g^{−A} + (1 − s^A) l 1 ].   [S40]

With these new parameters, the enforced payoff relationship according to Eq. S39 takes the usual form

π^{−A} = s^A π^A + (1 − s^A) l.   [S41]
Proposition 7 (Enforceable Payoff Relations for Synchronized Alliances). A synchronized alliance can enforce the payoff relation (l, s^A) if and only if either s^A = 1 or

s^A ≠ 1 and max_{0≤j≤n−n^A} { b_j − (j/(n−n^A)) (b_j − a_{j−1}) / (1−s^A) } ≤ l ≤ min_{n^A−1≤j≤n−1} { a_j + ((n−j−1)/(n−n^A)) (b_{j+1} − a_j) / (1−s^A) }.   [S42]

Moreover, if n^A ≤ n/2, then −1 ≤ −n^A/(n−n^A) ≤ s^A < 1.
Remark (on synchronized alliances):

i) For not too large alliances with n^A ≤ n/2, Propositions 6 and 7 imply exactly the same conditions on enforceable payoff relationships. Thus, although strategy alliances require considerably less coordination between the allies, they have the same strategic power as synchronized alliances.
ii) When n^A > n/2, synchronized alliances may be able to enforce a strictly larger set of payoff relationships, because they are not restricted to payoff relationships with s^A ≤ 1. Whether relationships with s^A > 1 are enforceable depends on the social dilemma. When the social dilemma satisfies b_{n−n^A} < a_{n^A−1}, condition S42 can be satisfied for any b_{n−n^A} ≤ l ≤ a_{n^A−1} by choosing s^A > 1 sufficiently large. Conversely, when b_{n−n^A} > a_{n^A−1}, only slopes s^A < 1 are feasible (because for s^A > 1 the left side of Eq. S42 is strictly larger than b_{n−n^A}, whereas the right side is strictly lower than a_{n^A−1}).
iii) Overall, we conclude that synchronized alliances are more powerful than strategy alliances if and only if the alliance has reached a size n^A such that b_{n−n^A} < a_{n^A−1}. However, as noted for strategy alliances, the condition b_{n−n^A} < a_{n^A−1} transforms the social dilemma into a game in which mutual cooperation is the best strategy for the alliance, such that the notion of ZD alliances becomes less important.
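To illustrate remark ii, consider a public goods game with n = 5 and r = 3 (a numerical sketch of our own; the bound evaluation mirrors Eq. S42). An alliance of n^A = 4 satisfies b_{n−n^A} = b_1 = 0.6 < a_3 = 1.4 and can enforce, e.g., l = 1 with the large slope s^A = 11, whereas a smaller alliance with n^A = 2 cannot enforce any relation with s^A > 1:

```python
def pgg(n, r, c=1.0):
    """Public goods payoffs (Eq. S43)."""
    a = [(j + 1) * r * c / n - c for j in range(n)]
    b = [j * r * c / n for j in range(n)]
    return a, b

def sync_interval(a, b, n, nA, sA):
    """Bounds on the baseline payoff l from Eq. S42 for a synchronized alliance (any sA != 1)."""
    assert sA != 1
    lo = max(b[j] - (j / (n - nA)) * (b[j] - a[j - 1]) / (1 - sA) if j > 0 else b[0]
             for j in range(n - nA + 1))
    hi = min(a[j] + ((n - j - 1) / (n - nA)) * (b[j + 1] - a[j]) / (1 - sA) if j < n - 1 else a[n - 1]
             for j in range(nA - 1, n))
    return lo, hi

a, b = pgg(5, 3.0)
lo_big, hi_big = sync_interval(a, b, 5, 4, 11.0)      # b_1 < a_3: slopes > 1 feasible
lo_small, hi_small = sync_interval(a, b, 5, 2, 11.0)  # b_3 > a_1: interval is empty
```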
Here we explored the strategic power of alliances, assuming that the allies agree on a joint ZD strategy. Given these results, one may ask which ZD strategy the allies should agree on and which combinations of alliance strategies and outsider strategies form an equilibrium of the game between allies and outsiders. This question is different from the questions explored above; there we considered a homogeneous group of players, and we explored which strategies are stable if applied by all group members. To explore equilibria for games with ZD alliances, one needs to distinguish between the strategies of the allies and the strategies of the outsiders. This inherent asymmetry makes the equilibrium analysis more intricate, and we thus leave this question for future work.
Applications. In this section, we apply our theory to two particular examples of multiplayer social dilemmas: the public goods game and the volunteer's dilemma. For simplicity, we will focus here on symmetric strategies that only depend on the number of cooperators but not on the cooperators' identities (for ZD strategies, this implies that we consider the case of equal weights on all coplayers).
Public goods games. In a public goods game, each player of a group can cooperate by contributing an amount c > 0 to a public pool. Total contributions are multiplied by a factor r with 1 < r < n and evenly shared among all group members. Thus, payoffs are given by

a_j = (j+1) r c / n − c,   and   b_j = j r c / n.   [S43]

Some of the properties of ZD strategies for public goods games have been described recently in ref. 13, using an independent approach. Here we complement and extend these results.
ZD strategies for public goods games. Plugging the payoff values S43 into representation S14 shows that ZD strategies have the form

p_{S,j} = 1 + φ [ (1−s)(l − (j+1) r c / n + c) − ((n−1−j)/(n−1)) c ]   if S_i = C,
p_{S,j} = φ [ (1−s)(l − j r c / n) + (j/(n−1)) c ]   if S_i = D.   [S44]

To explore which payoff relationships (l, s) a single player can enforce, we use the characterization given in Eq. S17. Because the payoffs of the public goods game are linear in the number of cooperators j, the corresponding conditions become particularly simple (as only the boundary cases j = 0 and j = n−1 need to be considered): we conclude that a single player can enforce a linear payoff relation with parameters (l, s) if either s = 1 or

(n−1) r c / n − c/(1−s) ≤ l ≤ (r−n) c / n + c/(1−s).   [S45]
1017 Fig. S2 shows the set of all pairs (1.$) that satisfy the above constraints for various group sizesn. We get the following conclusions for the 1079
1018 existence of extortionate strategies, generous strategies, and equalizers, depending on the size n of the group: 1080
i) Extortionate strategies (l = b_0 = 0). Let us ask which slopes s an extortionate player can enforce. The inequalities S45 then imply that slopes s ≥ (r-1)/r can always be enforced, irrespective of the group size n. However, slopes s < (r-1)/r are only enforceable if the group size is sufficiently small,

  n ≤ r(1-s) / [r(1-s) - 1].  [S46]

We conclude that in large groups, n → ∞, only extortionate strategies with s ≥ (r-1)/r are feasible.
ii) Generous strategies (l = a_{n-1} = rc - c). Using the inequalities S45, one can show that generous players can enforce exactly the same slopes as extortionate players.
iii) Equalizers (s = 0). For equalizers, the inequalities S45 imply that there are three regimes: (a) if n ≤ r/(r-1), all baseline payoffs 0 ≤ l ≤ rc - c can be enforced; (b) if r/(r-1) < n < 2r/(r-1), only a limited subset of baseline payoffs 0 ≤ l ≤ rc - c can be enforced; and (c) if n > 2r/(r-1), there are no equalizers.
In particular, we conclude that for a given multiplication factor r > 1, the set of equalizer strategies disappears as groups become large.
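The enforcement property behind Eqs. S44 and S45 can be checked numerically. The following sketch is our own construction (not code from this paper), assuming example parameters n = 3, r = 2, c = 1 and an enforceable pair l = 0.4, s = 0.5 with φ = 0.2: player 0 plays the ZD strategy of Eq. S44 against two arbitrary memory-one coplayers, and the stationary distribution of the resulting Markov chain confirms that the mean coplayer payoff equals s·π_0 + (1-s)·l.

```python
import itertools
import numpy as np

n, r, c = 3, 2.0, 1.0        # example group size and game parameters (r < n)
l, s, phi = 0.4, 0.5, 0.2    # example (l, s) satisfying Eq. S45, with a small phi

def payoff(coop, j):
    # a_j and b_j from Eq. S43; j = number of cooperating coplayers
    k = j + (1 if coop else 0)
    return r * c * k / n - (c if coop else 0.0)

def zd_prob(coop, j):
    # cooperation probability of the ZD strategy, Eq. S44 (equal weights)
    if coop:
        return 1 + phi * ((1 - s) * (l - payoff(True, j)) - (n - 1 - j) / (n - 1) * c)
    return phi * ((1 - s) * (l - payoff(False, j)) + j / (n - 1) * c)

rng = np.random.default_rng(1)
others = [rng.uniform(0.05, 0.95, size=2 ** n) for _ in range(n - 1)]

states = list(itertools.product([1, 0], repeat=n))        # 1 = cooperate
M = np.zeros((2 ** n, 2 ** n))
for a, sa in enumerate(states):
    probs = [zd_prob(sa[0] == 1, sum(sa[1:]))]            # player 0 (ZD)
    probs += [others[i - 1][a] for i in range(1, n)]      # arbitrary coplayers
    for b, sb in enumerate(states):
        M[a, b] = np.prod([p if x == 1 else 1 - p for p, x in zip(probs, sb)])

eigvals, eigvecs = np.linalg.eig(M.T)                     # stationary distribution
stat = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
stat /= stat.sum()

pi = [sum(stat[a] * payoff(sa[i] == 1, sum(sa) - sa[i]) for a, sa in enumerate(states))
      for i in range(n)]
lhs, rhs = np.mean(pi[1:]), s * pi[0] + (1 - s) * l
print(round(lhs, 8), round(rhs, 8))                       # the two values coincide
```

Any (l, s) satisfying Eq. S45 (with φ small enough to keep the entries of Eq. S44 in the unit interval) reproduces the same equality.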
Strategy alliances in the public goods game. By Proposition 6, strategy alliances with n_A members can enforce a linear relation with parameters (l, s_A) if and only if either s_A = 1 or if the two following inequalities hold:

  0 ≤ l ≤ rc - c,
  (n - n_A)rc/n - c/(1-s_A) ≤ l ≤ (r·n_A - n)c/n + c/(1-s_A).  [S47]

For the special cases of extortionate strategies, generous strategies, and equalizers, these inequalities allow us to derive the following conclusions:
i) Extortionate strategies (l = b_0 = 0). We can rewrite the inequalities S47 to obtain a critical threshold on the fraction of alliance members that is needed to enforce a certain slope s_A,

  n_A/n > [r(1-s_A) - 1] / [r(1-s_A)].  [S48]

In particular, if an alliance wants to enforce arbitrarily high extortion factors χ → ∞, then s_A = 1/χ → 0, and the critical threshold becomes n_A/n > (r-1)/r. This condition is always satisfied if n_A = n - 1, implying that an alliance with n - 1 members can always be arbitrarily extortionate toward the remaining group member.
ii) Generous strategies (l = a_{n-1} = rc - c). The inequalities S47 lead to the same threshold for n_A/n as in the case of extortionate strategies, as given in Eq. S48.
iii) Equalizers (s_A = 0). For equalizers, the inequalities S47 lead to two critical thresholds; to be able to set the payoffs of the outsiders to any value between 0 ≤ l ≤ rc - c, the fraction of the allies needs to satisfy

  n_A/n > (r-1)/r.  [S49]

However, to be able to set the payoffs of the outsiders to some value between 0 ≤ l ≤ rc - c, the number of allies only needs to exceed

  n_A/n ≥ (n-2)(r-1) / [n + (n-2)r].  [S50]
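The alliance thresholds above are easy to evaluate. A small sketch with example values of our choosing (r = 3 and n = 12; the function name is ours):

```python
# Critical alliance fractions in the public goods game, Eqs. S48 and S49,
# for example values r = 3 and n = 12 (all names here are ours).
r, n = 3.0, 12

def extortion_threshold(s_A):
    # Eq. S48: minimal fraction n_A/n needed to enforce slope s_A with l = 0
    return (r * (1 - s_A) - 1) / (r * (1 - s_A))

for s_A in (0.0, 0.3, 0.5):
    f = extortion_threshold(s_A)
    print(f"s_A = {s_A}: n_A/n > {f:.3f}, i.e. at least {int(f * n) + 1} of {n} players")

# Eq. S49: equalizer alliances need n_A/n > (r - 1)/r to enforce any l in [0, rc - c]
print("equalizer threshold:", (r - 1) / r)
```

Note that flatter slopes (stronger extortion) require larger alliances, in line with the χ → ∞ limit discussed above.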
Nash equilibria for the public goods game. In the following, let us describe a few strategies that allow for stable cooperation in the public goods game. According to Proposition 3, this can be achieved by using a ZD strategy with parameters l = a_{n-1}, φ > 0, and (n-2)/(n-1) < s ≤ 1. When we choose the boundary case s = 1 and φ = 1/c (which is the maximum value of φ, given the constraint 0 ≤ p_{S,j} ≤ 1), the resulting ZD strategy according to Eq. S44 is proportional Tit-for-Tat, with entries

  pTFT_{S,j} = j/(n-1),  [S51]

which is independent of the player's own move. This rule says that the player's cooperation probability is given by the fraction of cooperators among the coplayers in the previous round (additionally, we need to specify the cooperation probability for the first round, which needs to be set to one).
Another boundary case is given by the choice s = (n-2)/(n-1) and φ = n(n-1) / (c[n(n-2) + r]) (which is again the maximum value of φ). We refer to the resulting ZD strategy as generous Tit-for-Tat, which has entries

  gTFT_{S,j} = j/(n-1) + [(n-j-1)/(n-1)] · n(r-1)/[n(n-2) + r].  [S52]

Also, gTFT is independent of the player's own move, and it is generally more cooperative than pTFT, because gTFT_{S,j} > pTFT_{S,j} for all j < n - 1.
A last boundary case is given by φ → 0, in which case the resulting ZD strategy approaches the strategy Repeat, independent of the choice of (n-2)/(n-1) < s ≤ 1. Due to Corollary 1, we can conclude that any linear combination of these three strategies of the form

  p = λ_1·pTFT + λ_2·gTFT + λ_3·Repeat,  [S53]

with 0 ≤ λ_k ≤ 1, λ_3 < 1, and λ_1 + λ_2 + λ_3 = 1 is also a stable ZD strategy.
Among the pure memory-one strategies, Proposition 4 allows us to conclude that Grim and TFT_{n-1} are always Nash equilibria. Moreover, the strategy WSLS is a Nash equilibrium if r ≥ 2n/(n+1), as illustrated in Fig. S3.
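To make the comparison between pTFT and gTFT concrete, the following sketch tabulates both rules for an example group of our choosing (n = 5, r = 3; the function names are ours):

```python
# The two boundary strategies of Eqs. S51 and S52, tabulated for example
# parameters n = 5 and r = 3 (the function names are ours).
n, r = 5, 3.0

def pTFT(j):
    # proportional Tit-for-Tat, Eq. S51: match the fraction of cooperating coplayers
    return j / (n - 1)

def gTFT(j):
    # generous Tit-for-Tat, Eq. S52
    return j / (n - 1) + (n - j - 1) / (n - 1) * n * (r - 1) / (n * (n - 2) + r)

for j in range(n):   # j = number of cooperating coplayers in the previous round
    print(j, round(pTFT(j), 3), round(gTFT(j), 3))
```

Both rules cooperate for sure after full cooperation (j = n - 1), and gTFT exceeds pTFT in every other state, illustrating its generosity.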
Volunteer's dilemma. In the volunteer's dilemma, at least one of the players needs to cooperate and pay a cost c > 0 in order for all group members to derive a benefit b > c > 0. Thus, the payoffs are given by

  a_j = b - c for all j,   b_j = b if j ≥ 1,   and   b_0 = 0.  [S54]
ZD strategies for the volunteer's dilemma. According to Eq. S14, the ZD strategies with equal weight on all coplayers have the form

  p_{S,j} = { 1 + φ[(1-s)(l - b + c) - ((n-j-1)/(n-1))·c]   if S_i = C
            { φ[(1-s)(l - b) + (j/(n-1))·c]                 if S_i = D, j ≥ 1
            { φ(1-s)·l                                      if S_i = D, j = 0.  [S55]
By condition S17, exactly those parameters l and s can be enforced for which either s = 1 or

  max{ 0, b - c/[(n-1)(1-s)] } ≤ l ≤ b - c.  [S56]
This set of enforceable payoff relations is illustrated in Fig. S2B. In the special case of extortionate strategies (l = 0), condition S56 implies that a given slope s can only be enforced for sufficiently small groups,

  n ≤ 1 + c/[b(1-s)].  [S57]

For equalizers (s = 0), the inequalities in Eq. S56 imply that a player can only determine the average payoff of the coplayers if n = 2, and then the only enforceable baseline payoff is l = b - c.
Strategy alliances in the volunteer's dilemma. For strategy alliances, the enforceable payoff relationships according to Eq. S34 become

  max{ 0, b - c/[(n - n_A)(1-s_A)] } ≤ l ≤ b - c.  [S58]

In particular, alliances that aim to enforce an extortionate relationship (with l = 0 and some s_A > 0) need to have a critical size

  n_A/n ≥ 1 - c/[n·b(1-s_A)].  [S59]

It follows that alliances cannot be arbitrarily extortionate (setting s_A = 0 on the right side implies that such alliances would need to satisfy n_A > n - 1). Instead, even a large alliance of size n_A = n - 1 can only enforce slopes with s_A > (b-c)/b.
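The two limits just derived can be evaluated directly; a quick sketch with example values of our choosing (b = 1.5, c = 1; the function name is ours):

```python
# Group-size and slope limits in the volunteer's dilemma, Eqs. S57 and S59,
# for example values b = 1.5 and c = 1 (names are ours).
b, c = 1.5, 1.0

def max_group_size(s):
    # Eq. S57: extortion with slope s is only enforceable for n up to this bound
    return 1 + c / (b * (1 - s))

for s in (0.5, 0.8, 0.9):
    print(f"slope {s}: extortion feasible for n <= {max_group_size(s):.2f}")

# limit discussed after Eq. S59: an alliance of n - 1 players needs slope > (b - c)/b
print("minimal alliance slope:", (b - c) / b)
```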
The performance of ZD alliances against adapting outsiders. In addition to the static properties of ZD strategies, Press and Dyson (1) also highlighted a remarkable dynamic property of ZD strategies: when a player with a fixed extortionate ZD strategy is matched with an adapting opponent who is free to change his strategy over time, then the adapting opponent will move toward more cooperation. In the following, we present simulations suggesting that ZD alliances can have a similar effect.
To explore the performance of ZD alliances against adapting outsiders, we consider a group of n subjects. Let us assume that the players {1, …, n_A} form a synchronized alliance with n_A < n and that the allies commit themselves to act according to a ZD strategy with parameters l, s_A, and φ (similar results could also be obtained under the assumption that the allies form a strategy alliance instead of a synchronized alliance). Moreover, let us assume that each of the outsiders applies some arbitrary memory-one strategy. To parametrize memory-one strategies, we note that the possible outcomes of a single round of the game can be written as σ = (S_A, S_{n_A+1}, …, S_n), where S_A ∈ {C, D} is the joint action of the allies and where S_i ∈ {C, D} is the action of each of the outsiders. As a result, there are 2^{n-n_A+1} possible outcomes σ. The memory-one strategies for the outsiders are thus modeled as vectors p = (p_σ) with 2^{n-n_A+1} entries p_σ ∈ [0, 1].
We assume that the ZD alliance and the outsiders interact in a series of repeated games. Although the strategy of the ZD alliance is assumed to be fixed, outsiders are allowed to adapt their strategy from one repeated game to the next. Specifically, we assume that in each time step, the group interacts in a repeated public goods game, resulting in the payoff π^A for each of the allies and the payoff π_j for each outsider j. Because all players use memory-one strategies, these payoffs can be calculated using a Markov-chain approach (15). In the next time step, one of the outsiders is randomly picked to change his strategy from p to p′. The entries of the new strategy p′ are independently drawn from a normal distribution around the old strategy (using an SD of 0.01). If the outsider's payoff using the new strategy is π′, then we assume that the outsider keeps the new strategy with probability

  ρ = 1 / (1 + exp[-β(π′ - π)]).  [S60]

Otherwise, the outsider rejects the new strategy and continues to use the old strategy. The parameter β > 0 corresponds to the strength of selection. In the limit of weak selection (β → 0), this yields ρ = 1/2, such that the choice between the new and the old strategy is fully random. For the simulations, we consider the case of strong selection (we used β = 100); in that case, the new strategy is likely to be adopted if π′ > π, and it is likely to be rejected when π′ < π. This elementary process, in which outsiders are allowed to experiment with new strategies, is repeated for τ time steps. For the initial population of outsiders, we assume that all outsiders start with unconditional defection.
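The updating scheme of Eq. S60 can be sketched as a simple mutation-and-acceptance loop. In the sketch below, the payoff function is a hypothetical stand-in that merely rewards cooperation (computing the true repeated-game payoffs would require the Markov-chain machinery described above), so it only illustrates the acceptance rule, not the full simulation:

```python
import numpy as np

# Mutation-and-acceptance dynamics of Eq. S60 (Gaussian steps with SD 0.01,
# selection strength beta = 100); the payoff function below is a stand-in,
# not the repeated-game payoff used in the paper.
rng = np.random.default_rng(0)
beta, sigma, steps = 100.0, 0.01, 2000

def payoff(p):
    # hypothetical placeholder: strategies closer to full cooperation earn more
    return p.mean()

p = np.zeros(8)      # an outsider's memory-one strategy, initially AllD
x = payoff(p)
for _ in range(steps):
    q = np.clip(p + rng.normal(0.0, sigma, p.size), 0.0, 1.0)   # mutant strategy
    y = payoff(q)
    if rng.random() < 1.0 / (1.0 + np.exp(-beta * (y - x))):    # Eq. S60
        p, x = q, y
print(round(x, 2))   # final mean cooperation probability
```

With strong selection, beneficial mutants are almost always kept, so the strategy climbs the payoff gradient, which is the qualitative effect exploited in Fig. S5.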
Fig. S5 reports the outcome of this evolutionary scenario for three different ZD strategies: a fair strategy, an extortionate strategy, and a generous strategy. In all three scenarios, the outsiders become more cooperative over time; as a consequence, the allies also cooperate more often, because all three ZD strategies are conditionally cooperative (Fig. S5, Upper). However, there are clear differences in the final cooperation rates between the three scenarios. Extortionate alliances seem to be least successful in incentivizing cooperation, whereas fair alliances tend to achieve full cooperation in the long run. This success of fair strategies can be attributed to their higher slope values: because fair strategies use s = 1, they perfectly recoup outsiders for increasing their cooperation rates. Both other strategy classes use slopes with s < 1, which makes it less attractive for the outsiders to become more cooperative. As depicted in the lower panels of Fig. S5, fair ZD alliances therefore also yield the highest payoffs by the end of the simulation.
Of course, the numerical simulations presented here only provide a snapshot of the full dynamical properties of ZD strategies (which deserve a careful analysis on their own). The simulations serve as a proof of principle: as previously shown for the iterated prisoner's dilemma (1, 21), ZD strategies can be used to generate positive group dynamics in multiplayer dilemmas.
Appendix: Proofs.
Proof of Proposition 1: By the definition of ZD strategies, the cooperation probabilities after mutual cooperation and mutual defection are given by

  p_{(C,n-1)} = 1 + φ(1-s)(l - a_{n-1}),
  p_{(D,0)} = φ(1-s)(l - b_0).  [S61]

As these two entries need to be in the unit interval, it follows that

  φ(1-s)(l - a_{n-1}) ≤ 0,
  0 ≤ φ(1-s)(l - b_0).  [S62]
Adding up these two inequalities implies φ(1-s)(b_0 - a_{n-1}) ≤ 0, and because of Eq. S3,

  φ(1-s) ≥ 0.  [S63]

Analogously, let us consider outcomes σ in which all players but one cooperate (i.e., σ is a permutation of (C, …, C, D)), in which case

  p_σ = { 1 + φ[s·a_{n-2} - (1-w_j)·a_{n-2} - w_j·b_{n-1} + (1-s)l]   if the defector is a coplayer j ≠ i
        { φ[s·b_{n-1} - a_{n-2} + (1-s)l]                            if the defector is player i.  [S64]

Because all these entries p_σ need to be in the unit interval,

  φ[s·a_{n-2} - (1-w_j)·a_{n-2} - w_j·b_{n-1} + (1-s)l] ≤ 0,
  0 ≤ φ[s·b_{n-1} - a_{n-2} + (1-s)l].  [S65]

Adding up these inequalities yields φ(s + w_j)(b_{n-1} - a_{n-2}) ≥ 0 for all j ≠ i, and because of Eq. S2,

  φ(s + w_j) ≥ 0 for all j ≠ i.  [S66]

Combining the inequalities S63 and S66 then yields

  φ(1 + w_j) ≥ 0 for all j ≠ i,  [S67]

and because at least one of the w_j is larger than zero (because all w_j sum up to one), it follows that φ ≥ 0. The restriction φ ≠ 0 then implies φ > 0. Due to the inequalities S63 and S66, we may also conclude that -min_{j≠i} w_j ≤ s ≤ 1. Because min_{j≠i} w_j ≤ 1/(n-1) (again because the w_j sum up to one), it follows that -1/(n-1) ≤ s ≤ 1.
For s ≠ 1, the inequalities S62 and S63 imply b_0 ≤ l ≤ a_{n-1}.
Proof of Proposition 2: For a given σ = (S_1, …, S_n), the entries p_σ of a ZD strategy according to Eq. S12 can be written as

  p_σ = 1_σ^i + φ[(1-s)(l - g_σ^i) + Σ_{j≠i} w_j (g_σ^i - g_σ^j)],  [S68]

with 1_σ^i given by Eq. S5 and with g_σ^i and g_σ^j given by Eq. S7. Let σ^C denote the set of i's coplayers who cooperate in state σ, and let σ^D denote the corresponding set of coplayers who defect. Using this notation, the entry p_σ is given by

  p_σ = { 1 + φ[(1-s)(l - a_{|σ|-1}) - Σ_{j∈σ^D} w_j (b_{|σ|} - a_{|σ|-1})]   if S_i = C
        { φ[(1-s)(l - b_{|σ|}) + Σ_{j∈σ^C} w_j (b_{|σ|} - a_{|σ|-1})]        if S_i = D.  [S69]
Because φ > 0 can be chosen arbitrarily small, the condition p_σ ∈ [0, 1] is thus satisfied for all σ if and only if the following inequalities hold:

  (1-s)(l - a_{|σ|-1}) - Σ_{j∈σ^D} w_j (b_{|σ|} - a_{|σ|-1}) ≤ 0   for all σ with S_i = C,
  (1-s)(l - b_{|σ|}) + Σ_{j∈σ^C} w_j (b_{|σ|} - a_{|σ|-1}) ≥ 0   for all σ with S_i = D.  [S70]
If s = 1, these inequalities are independent of the parameter l, and they are satisfied for any social dilemma (because w_j ≥ 0 for all j by assumption and because b_{|σ|} > a_{|σ|-1} by Eq. S2). For s < 1, we may divide the above inequalities by (1-s) > 0, implying that Eq. S70 is equivalent to

  a_{|σ|-1} + Σ_{j∈σ^D} w_j (b_{|σ|} - a_{|σ|-1})/(1-s) ≥ l   for all σ with S_i = C,
  b_{|σ|} - Σ_{j∈σ^C} w_j (b_{|σ|} - a_{|σ|-1})/(1-s) ≤ l   for all σ with S_i = D.  [S71]

The inequalities S71 in turn are satisfied if and only if

  max_{σ: S_i=D} { b_{|σ|} - Σ_{j∈σ^C} w_j (b_{|σ|} - a_{|σ|-1})/(1-s) } ≤ l ≤ min_{σ: S_i=C} { a_{|σ|-1} + Σ_{j∈σ^D} w_j (b_{|σ|} - a_{|σ|-1})/(1-s) }.  [S72]

Because the terms (b_{|σ|} - a_{|σ|-1})/(1-s) are positive, the respective maxima and minima are attained by choosing the weights w_j as small as possible. That is, for a given number of total cooperators |σ|, the extrema are attained for those states σ for which Σ_{j∈σ^C} w_j and Σ_{j∈σ^D} w_j are minimal. This observation implies that condition S72 is equivalent to
  max_{0≤j≤n-1} { b_j - ŵ_j (b_j - a_{j-1})/(1-s) } ≤ l ≤ min_{0≤j≤n-1} { a_j + ŵ_{n-j-1} (b_{j+1} - a_j)/(1-s) },  [S73]

with ŵ_j being the sum of the j smallest entries in (w_j).
Proof of Proposition 3: We already know that for a ZD strategy to be a Nash equilibrium, one of the three conditions needs to be fulfilled (otherwise there would be a different ZD strategy that yields a higher payoff). Conversely, let us assume that one of the three conditions of the proposition is fulfilled.
i) If s^R = 0, then by Eq. S20, the mutant's payoff is π = l, irrespective of the mutant's strategy. In particular, there is no incentive to deviate.
ii) Suppose s^R > 0 and l = a_{n-1}, and let us assume to the contrary that the zero-determinant strategy is not a Nash equilibrium. Then there is a mutant strategy such that π > a_{n-1}. Because the residents collectively enforce the relation π^R = s^R·π + (1-s^R)·a_{n-1}, where π^R is the average payoff of the residents, and because s^R > 0, we can conclude π^R > a_{n-1}. However, then the average payoff of all group members exceeds a_{n-1}, contradicting the assumption that a_{n-1} is the maximum average payoff per round.
iii) Under the assumption that b_0 is the minimum average payoff per round, the case s^R < 0 and l = b_0 can be treated analogously to the previous case.
Proof of Proposition 4: Because p is a Nash equilibrium, the payoff of any mutant strategy p̃ satisfies π ≤ a_{n-1}. Let us first consider a mutant who applies the strategy AllD, i.e., p̃_{S,j} = 0 for all (S, j). Because the mutant never cooperates, v_{C,j} = 0 for all j, and by Lemma 1, also v_{D,j} = 0 for all j except j ∈ {0, n-1}. The values of v_{D,n-1} and v_{D,0} can be obtained by calculating the left eigenvectors of the transition matrix

              (D, n-1)        (D, 0)
  (D, n-1)  ( p_{C,n-2}    1 - p_{C,n-2} )
  (D, 0)    ( p_{D,0}      1 - p_{D,0}   ).  [S74]
If we had p_{C,n-2} = 1, then the assumption that players start with cooperation would imply v_{D,n-1} = 1, such that the payoff of AllD was π = b_{n-1} > a_{n-1}. Because this contradicts the assumption that p is a Nash equilibrium, we conclude that p_{C,n-2} = 0.
A calculation of the left eigenvector of Eq. S74 with respect to the eigenvalue 1 then yields

  (v_{D,n-1}, v_{D,0}) = ( p_{D,0}/(1 + p_{D,0}), 1/(1 + p_{D,0}) ).  [S75]

As a result, the payoff of AllD is

  π = b_{n-1}·v_{D,n-1} + b_0·v_{D,0} = (b_{n-1}·p_{D,0} + b_0) / (1 + p_{D,0}).  [S76]

The requirement π ≤ a_{n-1} then implies the condition S28c.
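The stationary distribution in Eq. S75 can be confirmed numerically for an arbitrary (not necessarily pure) value of p_{D,0}; the following check is our own sketch:

```python
import numpy as np

# Left eigenvector of the chain in Eq. S74 with p_{C,n-2} = 0, compared with the
# closed form of Eq. S75 (example value for p_{D,0}).
pD0 = 0.3
M = np.array([[0.0, 1.0],            # from (D, n-1): move to (D, 0)
              [pD0, 1.0 - pD0]])     # from (D, 0): return with probability p_{D,0}
eigvals, eigvecs = np.linalg.eig(M.T)
stat = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
stat /= stat.sum()
print(np.allclose(stat, [pD0 / (1 + pD0), 1 / (1 + pD0)]))   # True
```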
As another special case of a possible mutant strategy, let us consider a mutant p̃ such that p̃_{C,n-1} = 0 and p̃_{S,j} = 1 for all other states (S, j). With a similar calculation as in the previous case, one can determine this mutant's payoff as

  π = a_{n-1}·v_{C,n-1} + b_{n-1}·v_{D,n-1} + a_0·v_{C,0} = [(a_{n-1} + b_{n-1})·p_{D,1} + a_0] / (1 + 2·p_{D,1}).  [S77]

The requirement π ≤ a_{n-1} implies condition S28b.
Suppose the n - 1 residents apply the pure memory-one strategy p. Due to Akin's lemma,

  Σ_{j=0}^{n-1} (p_{C,j} - 1)·v_{C,j} + Σ_{j=0}^{n-1} p_{D,j}·v_{D,j} = 0,  [S78]

which because of Lemma 1 and p_{C,n-1} = 1, p_{C,n-2} = 0 simplifies to

  v_{C,n-2} = p_{D,n-1}·v_{D,n-1} + p_{D,0}·v_{D,0}.  [S79]
Then, irrespective of the mutant's strategy, the payoff π satisfies

  π - a_{n-1} = Σ_{j=0}^{n-1} (a_j - a_{n-1})·v_{C,j} + Σ_{j=0}^{n-1} (b_j - a_{n-1})·v_{D,j}
             = (a_{n-2} - a_{n-1})·v_{C,n-2} + (b_{n-1} - a_{n-1})·v_{D,n-1} + (b_0 - a_{n-1})·v_{D,0}     (by Lemma 1)
             = [(b_{n-1} - a_{n-1}) - (a_{n-1} - a_{n-2})·p_{D,n-1}]·v_{D,n-1} + [(b_0 - a_{n-1}) - (a_{n-1} - a_{n-2})·p_{D,0}]·v_{D,0}     (by Eq. S79)
             ≤ 0,     (by conditions S28b and S28c)

that is, p is a Nash equilibrium.
Proof of Proposition 5: The proof follows along the same lines as the proof of Proposition 4.
Suppose p is a Nash equilibrium and assume that p_{D,0} ≠ 0 (because p is a pure strategy, it follows that p_{D,0} = 1). We have to show that these assumptions imply p_{C,n-1} = p_{C,n-2} = 0 and (b_{n-1} + a_0)/2 ≤ b_0. To this end, let us first consider a mutant with strategy AllC, such that p̃_{S,j} = 1 for all (S, j). Because of Lemma 1, and because the mutant always cooperates, the only possible outcomes in a given round (from the perspective of the mutant) are (C, n-1) and (C, 0). The transition matrix is given by

              (C, n-1)        (C, 0)
  (C, n-1)  ( p_{C,n-1}    1 - p_{C,n-1} )
  (C, 0)    ( 1            0             ).  [S80]

The limit distribution of this transition matrix is

  (v_{C,n-1}, v_{C,0}) = ( 1/(2 - p_{C,n-1}), (1 - p_{C,n-1})/(2 - p_{C,n-1}) ),  [S81]

such that the payoff of AllC becomes

  π = a_{n-1}·v_{C,n-1} + a_0·v_{C,0} = [a_{n-1} + (1 - p_{C,n-1})·a_0] / (2 - p_{C,n-1}).  [S82]

If we had p_{C,n-1} = 1, this payoff would equal π = a_{n-1} > b_0, contradicting our assumption that p is a Nash equilibrium. Thus, p_{C,n-1} = 0.
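The limit distribution of Eq. S81 can likewise be checked numerically, again for an arbitrary value of the relevant entry (our own sketch):

```python
import numpy as np

# Limit distribution of the AllC mutant's chain of Eq. S80, compared with
# Eq. S81 (example value for p_{C,n-1}).
pC = 0.4
M = np.array([[pC, 1.0 - pC],    # from (C, n-1)
              [1.0, 0.0]])       # from (C, 0): back to (C, n-1)
eigvals, eigvecs = np.linalg.eig(M.T)
stat = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
stat /= stat.sum()
print(np.allclose(stat, [1 / (2 - pC), (1 - pC) / (2 - pC)]))   # True
```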
Let us now consider another mutant strategy p̃ with p̃_{D,0} = 1 and p̃_{S,j} = 0 for all other states. Again by constructing the transition matrix [with possible states (C, n-1), (C, 0), (D, n-1), and (D, 0)], one can compute the payoff of this mutant as

  π = [b_{n-1} + (1 - p_{C,n-2})(b_0 + a_0)] / (3 - 2·p_{C,n-2}).  [S83]

If p_{C,n-2} = 1, then π = b_{n-1} > b_0, again contradicting the assumption that p is a Nash equilibrium. Therefore, p_{C,n-2} = 0, and the mutant's payoff becomes π = (b_{n-1} + b_0 + a_0)/3. For p to be an equilibrium, this payoff needs to satisfy π ≤ b_0, which yields (b_{n-1} + a_0)/2 ≤ b_0.
Conversely, let us first consider the case p_{D,0} = 0. Because the memory-one strategy also prescribes to defect in the first round and because p_{D,0} = 0, it follows that all residents defect throughout the game, irrespective of the strategy of the mutant. Thus, any mutant's payoff can be written as π = v_{D,0}·b_0 + v_{C,0}·a_0 ≤ b_0. This shows that p is a Nash equilibrium.
Let us now consider the second case, p_{C,n-1} = p_{C,n-2} = 0. Without loss of generality, we can also set p_{D,n-1} = 0, and by assumption, p_{D,0} = 1. Under these conditions, Akin's lemma becomes

  v_{D,0} = v_{C,n-1} + v_{C,n-2}.  [S84]

Then, irrespective of the strategy of the mutant, the mutant's payoff satisfies

  π - b_0 = Σ_{j=0}^{n-1} (a_j - b_0)·v_{C,j} + Σ_{j=0}^{n-1} (b_j - b_0)·v_{D,j}
         = (a_{n-1} - b_0)·v_{C,n-1} + (a_0 - b_0)·v_{C,0} + (b_{n-1} - b_0)·v_{D,n-1}     (by Lemma 1)
         = (a_{n-1} - b_0)·v_{C,n-1} + (a_0 - b_0)·v_{D,0} + (b_{n-1} - b_0)·v_{C,n-2}     (by Lemma 1)
         = (a_{n-1} - b_0)·v_{C,n-1} + (a_0 - b_0)(v_{C,n-1} + v_{C,n-2}) + (b_{n-1} - b_0)·v_{C,n-2}     (by Eq. S84)
         = (a_{n-1} + a_0 - 2b_0)·v_{C,n-1} + (b_{n-1} + a_0 - 2b_0)·v_{C,n-2} ≤ 0,

with the last inequality being due to the fact that (b_{n-1} + a_0)/2 ≤ b_0 and a_{n-1} ≤ b_{n-1}. Again, we conclude that p is a Nash equilibrium.
Proof of Proposition 6: Due to our construction, strategy alliances require each ally to apply a ZD strategy p with parameters l, s, and φ and weights w. To enforce an effective slope s_A ≠ 1, Eq. S33 implies that the parameter s needs to be chosen such that

  s = s_A + [(n_A - 1)/n_A]·(1 - s_A).  [S85]
1671 1733
1672 For? =1, we get s=1, and Proposition 2 guarantees that the payoff relationship is enforceable (independent ofI and the weights wA). 1734
1673 Let us now assume that s <I (and therefore? < I). Because of Proposition 2, the alliance can enforce the payoff relationship (I.?) if 1735
1674 and only if we can find an appropriate weight vector w= (w)) such that 1736
  max_{0≤j≤n-1} { b_j - (ŵ_j / [1 - (n_A - 1)w_A]) · (b_j - a_{j-1})/(1-s_A) } ≤ l ≤ min_{0≤j≤n-1} { a_j + (ŵ_{n-j-1} / [1 - (n_A - 1)w_A]) · (b_{j+1} - a_j)/(1-s_A) }.  [S86]

As in Proposition 2, ŵ_j refers to the sum of the j smallest entries in w (excluding the entry corresponding to the focal player). Because w only has the two possible entries w_A and w_{-A}, we can write ŵ_j as
  ŵ_j = { j·w_A                                if w_A ≤ w_{-A}, j ≤ n_A - 1
        { (n_A - 1)·w_A + (j - n_A + 1)·w_{-A} if w_A ≤ w_{-A}, j > n_A - 1
        { j·w_{-A}                             if w_A > w_{-A}, j ≤ n - n_A
        { (j + n_A - n)·w_A + (n - n_A)·w_{-A} if w_A > w_{-A}, j > n - n_A.  [S87]

The constraint (n_A - 1)·w_A + (n - n_A)·w_{-A} = 1 implies w_{-A} = [1 - (n_A - 1)·w_A]/(n - n_A). By plugging this into Eq. S87, we can calculate the expression
  ŵ_j / [1 - (n_A - 1)w_A] = { j·w_A / [1 - (n_A - 1)w_A]                  if w_A ≤ 1/(n-1), j ≤ n_A - 1
                             { 1/[1 - (n_A - 1)w_A] - (n - j - 1)/(n - n_A) if w_A ≤ 1/(n-1), j > n_A - 1
                             { j/(n - n_A)                                 if w_A ≥ 1/(n-1), j ≤ n - n_A
                             { [1 - (n - j - 1)w_A] / [1 - (n_A - 1)w_A]   if w_A ≥ 1/(n-1), j > n - n_A.  [S88]
Due to condition S86, the space of enforceable payoff relations becomes maximal when we choose the weight w_A such that ŵ_j/[1 - (n_A - 1)w_A] becomes maximal. Eq. S88 suggests that ŵ_j/[1 - (n_A - 1)w_A] is monotonically increasing in w_A. Thus, considering the restriction 0 ≤ w_A < 1/(n_A - 1), the maximum is attained for w_A → 1/(n_A - 1), which also implies w_A > 1/(n - 1). From Eq. S88 we obtain

  ŵ_j / [1 - (n_A - 1)w_A] → { j/(n - n_A)   if j ≤ n - n_A
                             { ∞             if j > n - n_A.  [S89]
Thus, for w_A sufficiently close to 1/(n_A - 1), condition S86 is satisfied if and only if

  max_{0≤j≤n-n_A} { b_j - (j/(n - n_A)) · (b_j - a_{j-1})/(1-s_A) } ≤ l ≤ min_{n_A-1≤j≤n-1} { a_j + ((n - j - 1)/(n - n_A)) · (b_{j+1} - a_j)/(1-s_A) }.  [S90]

This is condition S34. Moreover, if n_A ≤ n/2, setting j = n/2 in the left side of Eq. S90 suggests

  b_{n/2} - (n/2)/(n - n_A) · (b_{n/2} - a_{n/2-1})/(1-s_A) ≤ l.  [S91]
Similarly, setting j = n/2 - 1 in the right side of Eq. S90 leads to

  l ≤ a_{n/2-1} + (n - n/2)/(n - n_A) · (b_{n/2} - a_{n/2-1})/(1-s_A).  [S92]

Summing up these two inequalities shows that s_A ≥ -n_A/(n - n_A).
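The piecewise formula of Eq. S87 can be validated against a direct computation; the sketch below uses example sizes of our own choosing and tests one weight value on each side of 1/(n-1):

```python
# Numerical check of Eq. S87: w-hat_j, the sum of the j smallest coplayer weights,
# for a weight vector containing only the two values w_A (allies) and w_-A
# (outsiders); example sizes n = 8, n_A = 3.
nA, n = 3, 8
max_err = 0.0
for wA in (0.12, 0.30):                      # one value below and one above 1/(n-1)
    wmA = (1 - (nA - 1) * wA) / (n - nA)     # normalization constraint
    weights = sorted([wA] * (nA - 1) + [wmA] * (n - nA))
    for j in range(n):
        direct = sum(weights[:j])
        if wA <= wmA:
            formula = j * wA if j <= nA - 1 else (nA - 1) * wA + (j - nA + 1) * wmA
        else:
            formula = j * wmA if j <= n - nA else (j - n + nA) * wA + (n - nA) * wmA
        max_err = max(max_err, abs(direct - formula))
print(max_err)   # numerically zero
```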
Proof of Proposition 7: The proof follows the lines of Propositions 1 and 2. By its definition (Eq. S40), a ZD strategy for a synchronized alliance has the form

  p_{S,j} = { 1 + φ[(1-s_A)(l - a_{n_A+j-1}) - ((n - n_A - j)/(n - n_A))·(b_{n_A+j} - a_{n_A+j-1})]   if S = C
            { φ[(1-s_A)(l - b_j) + (j/(n - n_A))·(b_j - a_{j-1})]                                     if S = D,  [S93]

for 0 ≤ j ≤ n - n_A. The conditions p_{C,j} ≤ 1 and p_{D,j} ≥ 0 imply the following sign constraints:

  0 ≥ φ[(1-s_A)(l - a_{n_A+j-1}) - ((n - n_A - j)/(n - n_A))·(b_{n_A+j} - a_{n_A+j-1})],  [S94a]
  0 ≤ φ[(1-s_A)(l - b_j) + (j/(n - n_A))·(b_j - a_{j-1})],  [S94b]

for all 0 ≤ j ≤ n - n_A. Setting j = n - n_A in Eq. S94a yields

  φ(1-s_A)(l - a_{n-1}) ≤ 0,  [S95]

and setting j = 0 in Eq. S94b yields

  0 ≤ φ(1-s_A)(l - b_0).  [S96]

Adding up Eqs. S95 and S96 shows that

  φ(1-s_A) ≥ 0.  [S97]

For s_A = 1, we can ensure that 0 ≤ p_{S,j} ≤ 1 for all entries in Eq. S93 by choosing a φ > 0 that is sufficiently small. Let us therefore assume s_A ≠ 1. Because we also have φ ≠ 0 by definition, condition S97 becomes φ(1-s_A) > 0. Dividing the sign constraints in Eq. S94 by φ(1-s_A) then implies

  max_{0≤j≤n-n_A} { b_j - (j/(n - n_A))·(b_j - a_{j-1})/(1-s_A) } ≤ l ≤ min_{0≤j≤n-n_A} { a_{n_A+j-1} + ((n - n_A - j)/(n - n_A))·(b_{n_A+j} - a_{n_A+j-1})/(1-s_A) }.  [S98]
This condition in turn is equivalent to condition S42.
Conversely, suppose Eq. S42 is satisfied for some (l, s_A) with s_A ≠ 1. It follows that the sign constraints S94 are satisfied for every choice of φ subject to the condition φ(1-s_A) > 0, which implies that the entries p_{S,j} in Eq. S93 satisfy p_{C,j} ≤ 1 and p_{D,j} ≥ 0 for all j. By choosing φ sufficiently close to zero, we can also ensure that p_{C,j} ≥ 0 and p_{D,j} ≤ 1. This shows that the payoff relationship (l, s_A) is enforceable.
Finally, suppose n_A ≤ n/2. Setting j = 0 in Eq. S94a yields

  φ[(1-s_A)(l - a_{n_A-1}) - (b_{n_A} - a_{n_A-1})] ≤ 0,  [S99]

and setting j = n_A in Eq. S94b results in

  0 ≤ φ[(1-s_A)(l - b_{n_A}) + (n_A/(n - n_A))·(b_{n_A} - a_{n_A-1})].  [S100]

Adding up these two inequalities shows that

  φ(s_A + n_A/(n - n_A)) ≥ 0.  [S101]

Combining Eqs. S97 and S101 gives φ > 0, which in turn implies -n_A/(n - n_A) ≤ s_A ≤ 1.
1903 1. Press WH, Dyson PI (2012) Iterated Prisoners Dilemma contains strategies that dominate any evolutionary opponent. Prot Nati Arad Sc) USA 109126):10409-10411 1965
wan 2. Akin E (2013) Stable cooperative solution for the iterated prisoner's dilemma. 1966
3. Kerr B, Godfrey-Smith P. Feldman MW (2004) What is altruism? Trends (col Evol 190i:13S-140.
4. Ledyard JO (1995) Public goods: A survey of experimental research. The Handbook of Experimental Economics, eds Kagel JH, Roth AE (Princeton Univ Press, Princeton).
5. Diekmann A (1985) Volunteer's dilemma. J Conflict Resolut 29:605–610.
6. Hilbe C, Traulsen A, Sigmund K (2013) Partners or rivals? Strategies for the iterated prisoner's dilemma. Working Paper.
7. Hilbe C, Röhl T, Milinski M (2014) Extortion subdues human players but is finally punished in the prisoner's dilemma. Nat Commun 5:3976.
8. Stewart AJ, Plotkin JB (2012) Extortion and cooperation in the Prisoner's Dilemma. Proc Natl Acad Sci USA 109(26):10134–10135.
9. Hilbe C, Nowak MA, Sigmund K (2013) Evolution of extortion in Iterated Prisoner's Dilemma games. Proc Natl Acad Sci USA 110(17):6913–6918.
10. Stewart AJ, Plotkin JB (2013) From extortion to generosity, evolution in the Iterated Prisoner's Dilemma. Proc Natl Acad Sci USA 110(38):15348–15353.
11. Hilbe C, Nowak MA, Traulsen A (2013) Adaptive dynamics of extortion and compliance. PLoS ONE 8(11):e77886.
12. Szolnoki A, Perc M (2014) Evolution of extortion in structured populations. Phys Rev E Stat Nonlin Soft Matter Phys 89:022804.
13. Pan L, Hao D, Rong Z, Zhou T (2014) Zero-determinant strategies in the iterated public goods game.
14. Nowak MA, Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game. Nature 364(6432):56–58.
15. Hauert C, Schuster HG (1997) Effects of increasing the number of players and memory size in the iterated prisoner's dilemma: A numerical approach. Proc R Soc Lond B 264:513–519.
16. Sigmund K (2010) The Calculus of Selfishness (Princeton Univ Press, Princeton).
17. Friedman J (1971) A non-cooperative equilibrium for supergames. Rev Econ Stud 38(1):1–12.
18. Fudenberg D, Maskin E (1986) The folk theorem in repeated games with discounting or with incomplete information. Econometrica 54(3):533–554.
19. Boyd R, Richerson PJ (1988) The evolution of reciprocity in sizable groups. J Theor Biol 132(3):337–356.
20. Kurokawa S, Ihara Y (2009) Emergence of cooperation in public goods games. Proc Biol Sci 276(1660):1379–1384.
21. Chen J, Zinger A (2014) The robustness of zero-determinant strategies in Iterated Prisoner's Dilemma games. J Theor Biol 357:46–54.
Hilbe et al. www.pnas.org/cgi/content/short/1407887111 16 of 19
Fig. S1. Illustration of ZD strategies in the case of equal weights, w_j = 1/(n − 1) for all j ≠ i, for (A) the linear public goods game and (B) the volunteer's dilemma. The blue-shaded area represents all feasible payoffs, with the x axis representing the payoff of player i and the y axis representing the mean payoff of i's coplayers. The dashed diagonal gives the payoff combinations for which π_i = π_−i. In both graphs, the strategy of player i is fixed to some ZD strategy, whereas for the coplayers we sampled 10^4 random memory-one strategies. Red dots represent the resulting payoff combinations, and the gray line gives the prediction according to Eq. S13. For both graphs, we considered an infinitely repeated game in a group of size n = 4. Parameters: (A) public goods game (a_j = (j + 1)rc/n − c and b_j = jrc/n) with r = 2.4 and c = 1. For the strategy of player i, we used a generous ZD strategy with parameters l = rc − c, s = 2/3, φ = 1/2. (B) Volunteer's dilemma (a_j = b − c for all j, b_j = b for j ≥ 1, and b_0 = 0) with b = 1.5, c = 1; player i applies an extortionate strategy with parameters l = 0, s = 9/10, φ = 1/2.
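The payoff parameters quoted in this legend can be written out directly. As a check on the numbers above, here is a minimal Python sketch of both payoff structures (function names are ours, not from the paper): a cooperator facing j cooperating coplayers in a group of size n receives a_j, a defector receives b_j.

```python
# Sketch of the payoff structures used in Figs. S1 and S2.
# a_j: payoff to a cooperator when j of the n - 1 coplayers cooperate;
# b_j: payoff to a defector when j of the n - 1 coplayers cooperate.
# Helper names are ours; parameter values match the caption.

def public_goods_payoffs(j, n, r=2.4, c=1.0):
    """Linear public goods game: a_j = (j+1)rc/n - c, b_j = jrc/n."""
    a_j = (j + 1) * r * c / n - c
    b_j = j * r * c / n
    return a_j, b_j

def volunteers_dilemma_payoffs(j, b=1.5, c=1.0):
    """Volunteer's dilemma: a_j = b - c; b_j = b if anyone volunteers, else 0."""
    a_j = b - c
    b_j = b if j >= 1 else 0.0
    return a_j, b_j

if __name__ == "__main__":
    n = 4
    for j in range(n):
        print(j, public_goods_payoffs(j, n), volunteers_dilemma_payoffs(j))
```

With r = 2.4 < n = 4, defection dominates in the public goods game (b_j > a_j for every j), which is why the game constitutes a social dilemma.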
Fig. S2. Enforceable payoff relations in the case of equal weight on all coplayers for (A) the linear public goods game and (B) the volunteer's dilemma. A pair (l, s) is enforceable for a given group size n if the point lies within the respectively shaded area. The set of enforceable pairs for large n is a subset of the respective set for smaller n; i.e., the set of enforceable pairs shrinks with increasing group size. Parameters: (A) linear public goods game with r = 2.4, c = 1; (B) volunteer's dilemma with b = 1.5, c = 1.
Fig. S3. Stable memory-one strategies for the linear public goods game. The figure illustrates for which parameter regions the strategies Grim, TFT_k, and WSLS are Nash equilibria, provided that the public goods game constitutes a social dilemma (1 < r < n). Grim and TFT_{n−1} are always Nash equilibria; TFT_k for k < n − 1 is never a Nash equilibrium. WSLS is a Nash equilibrium when (b_{n−1} + b_0)/2 ≤ a_{n−1}, which yields r ≥ 2n/(n + 1). In particular, WSLS is always a Nash equilibrium when r ≥ 2, irrespective of group size.
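The threshold in this legend follows from the stated inequality. A minimal Python check (helper names are ours), verifying numerically that (b_{n−1} + b_0)/2 ≤ a_{n−1} is equivalent to r ≥ 2n/(n + 1):

```python
# Numerical check of the WSLS condition in Fig. S3 (helper names are ours).
# Claim: (b_{n-1} + b_0)/2 <= a_{n-1} holds exactly when r >= 2n/(n + 1).

def wsls_is_nash(r, n, c=1.0):
    a_top = r * c - c              # a_{n-1}: mutual-cooperation payoff
    b_top = (n - 1) * r * c / n    # b_{n-1}: lone defector among cooperators
    b_bottom = 0.0                 # b_0: payoff in the all-defect state
    return (b_top + b_bottom) / 2 <= a_top

def wsls_threshold(n):
    """Closed-form threshold r >= 2n/(n + 1)."""
    return 2 * n / (n + 1)

if __name__ == "__main__":
    for n in range(2, 9):
        t = wsls_threshold(n)
        # The direct payoff check must flip just above/below the threshold.
        assert wsls_is_nash(t + 1e-6, n)
        assert not wsls_is_nash(t - 1e-6, n)
        # r >= 2 is always sufficient, since 2n/(n + 1) < 2 for all n.
        assert wsls_is_nash(2.0, n)
    print("WSLS condition matches r >= 2n/(n+1) for n = 2..8")
```

For the parameters of Figs. S1 and S2 (r = 2.4, n = 4), the threshold is 8/5 = 1.6, so WSLS is a Nash equilibrium there.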
Fig. S4. Strategic power of strategy alliances in the public goods game. Each colored area illustrates the set of enforceable payoff relations according to Proposition 6 for different alliance sizes n_A. The set of enforceable payoff relations for small n_A is a subset of the respective set with larger n_A. Consequently, the larger the alliance, the more extreme payoff relations it can enforce. Parameters: linear public goods game with r = 2.4, c = 1, and group size n = 10.
[Fig. S5 graphic: panels (A) extortionate alliance, (B) fair alliance, and (C) generous alliance; each panel shows cooperation rates (0 to 1) and the payoffs of alliance members vs. outsiders over 200,000 time steps.]
Fig. S5. Performance of three different ZD alliances against adaptive outsiders. For each simulation, the strategy of the ZD alliance was fixed, whereas outsiders were allowed to adapt their strategy as described in the text. (Upper) Average cooperation rate during each repeated game. (Lower) Resulting payoffs for allies and outsiders. Each panel depicts the average of 20 simulations. All simulations were run for a public goods game with r = 3 and c = 1, in a group of size n = 5 with n_A = 2 allies. For the strategies of the ZD alliances we used (A) an extortionate strategy with l = 0, s = 0.8, and φ = 1/2; (B) proportional Tit-for-Tat with s = 1 and φ = 1; and (C) a generous strategy with l = rc − c = 2, s = 0.8, and φ = 1/2.
AUTHOR QUERIES
AUTHOR PLEASE ANSWER ALL QUERIES 1
Q: 1_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This is merely done...."
Q: 2_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This always holds ...."
Q: 3_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This allows us to...."
Q: 4_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This holds more generally...."
Q: 5_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This holds more generally ...."
Q: 6_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This allows us to...."
Q: 7_Please clarify if (2, 10) refers to references or equation numbers.
Q: 8_Please clarify if (2, 6) is equations or references.
Q: 9_Please rewrite the following for clarity: "...is also a stable...."
Q: 10_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This shows that...."
Q: 11_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This is condition ...."
Q: 12_PNAS mandates unambiguous pronoun antecedents. Please provide an appropriate noun after
"This": "This shows that the...."
Q: 13_For SI ref. 2, please provide journal name, volume, issue, and page range.
Q: 14_For SI ref. 5, please provide issue number.
Q: 15_Is SI ref. 6 published? References must be published to be listed in the reference list. Please provide
type of reference and all necessary publishing information or delete if unpublished and renumber
reference section accordingly.
Q: 16_For SI ref. 12, please provide issue number.
Q: 17_For SI ref. 13, please provide journal name, volume, issue, and page range.
Q: 18_For SI ref. 15, please provide issue number.
Q: 19_For SI ref. 17, please provide issue number.