by date← Nov 10, 2015·Nov 10, 2015 →

EFTA01139453

Dataset 9

November 10, 201534 pages24,253 words

https://www.justice.gov/epstein/files/DataSet%209/EFTA01139453.pdf

Extracted Text

l het CrossMark Computational complexity of ecological and evolutionary spatial dynamics Rasmus Ibsen-lensenbi, Krishnendu Chatterjeeb, and Martin A. Nowakb °Institute of Science and Technology Austria, A-3400 Klosterneuburg, Austria; and °Program for Evolutionary Dynamics, Departments of Organismic and Evolutionary Biology and Mathematics, Harvard University, Cambridge, MA 02138 Edited by Christos Papadimitriou, University of California, Berkeley, CA. and approved November 10, 2015 (received for review June 10, 201S) There are deep, yet largely unexplored, connections between by modifying the fixation probability. The classic studies of computer science and biology. Both disciplines examine how Maruyama (2) and Slatkin (3) showed that symmetrical pop- information proliferates in time and space. Central results in ulation structures, such as regular lattices, do not change the computer science describe the complexity of algorithms that solve fixation probability compared with the well-mixed population. certain classes of problems. An algorithm is deemed efficient if it can The effect of population structure on evolution is also at the solve a problem in polynomial time, which means the running time of heart of the famous Wright-Fisher debate (4-6). the algorithm is a polynomial function of the length of the input. There In 2005, evolutionary graph theory was introduced as a tool to are classes of harder problems for which the fastest possible algorithm study how generalized population structures affect evolutionary requires exponential time. Another criterion is the space requirement outcome (7), and it has been studied in many other works (8-12). The individuals occupy the vertices of the graph. The links (edges) of the algorithm. There is a crucial distinction between algorithms determine who interacts with whom for receiving payoff and for that can find a solution, verify a solution, or list several distinct reproduction. There can be a single graph for game dynamical in- solutions in given time and space. The complexity hierarchy that is teraction and evolutionary replacement, or the interaction and re- generated in this way is the foundation of theoretical computer placement graphs can be distinct (13). Often, the graph is held science. Precise complexity results can be notoriously difficult. The constant during evolutionary updating, but it is also possible to famous question whether polynomial time equals nondetenninistic study dynamically changing graphs (14-22). The original Moran polynomial time 0.e., P = NP) is one of the hardest open problems in process is recovered as a special case given by the complete graph. computer science and all of mathematics. Here, we consider simple It turns out that all isothermal graphs. where all vertices have the processes of ecological and evolutionary spatial dynamics. The basic same rate of updating (the same "temperature"), have the same question is: What is the probability that a new invader (or a new fixation probability as the well-mixed population (7). All symmet- mutant) will take over a resident population? We derive precise com- rical population structures lead to isothermal graphs, but the con- plexity results for a variety of scenarios. We therefore show that verse is not true. some fundamental questions In this area cannot be answered by Many evolutionary contests, however, are not fought out with simple equations (assuming that P Is not equal to NP). constant fitness. Instead, the fitness of individual types depends on the frequency (is, relative abundance) of types in the pop- evolutionary games I fixation probability I complexity classes ulation. A well-known approach to frequency-dependent selection is evolutionary game theory (23-27). Here, the fitnesses of in- dividuals are linear functions of the frequencies. The coefficients E volution occursin populations of reproducing individuals. Mutation generates distinct types. Selection favors some types over others. The mathematical formalism of evolution de- of the linear function are the elements of the payoff matrix. Again, constant selection is a special case of frequency-dependent selection; for constant selection, all entries in a row of the payoff scribes how populations change in their genetic (or phenotypic) matrix are the same and this property holds for all rows. Evo- composition over time. Deterministic models of evolution are based lutionary game theory has traditionally been investigated with on differential equations. They assume infinitely large population size and ignore demographic and other stochasticity. The more Significance precise descriptions of evolutionary dynamics, however, use sto- chastic processes, which take into account the intrinsic randomness An important question in evolution is: how does population of when and where individuals reproduce and how many of their structure affect the outcome of the evolutionary process? The offspring survive. They also describe populations of finite size. theory of evolution in structured population has provided an A well-known stochastic process of evolution was formulated impressive range of results, but an understanding of the com- by Moran in 1958 (1). In any one-time step, a random individual putational complexity of even simple questions was missing. We is chosen proportional to fitness for reproduction and a random individual is chosen for death. The offspring of the first indi- prove that some fundamental problems in ecology and evolution vidual is added to the population. The total population size re- can be precisely characterized by well-established computational mains constant and is given by N. The original process was complexity classes. This implies that the problems cannot be formulated for constant fitness, which means the fitness value of answered by simple equations. For example, there cannot be individuals does not depend on the relative abundance of various simple formulas for the fixation probability of a mutant given types in the population; it is a fixed number. The crucial question frequency-dependent selection in a structured population. is: What is the probability that a newly introduced mutant will We also show that, for example, calculating the molecular generate a lineage that takes over the entire population? This clock of neutral evolution in structured populations admit quantity is called the fixation probability. For the original Moran efficient algorithmic solutions. process, there is a simple formula. If the resident has fitness 1 and the mutant has fitness r, then the fixation probability of the Author contributions: LC. and MAN. designed research performedresearch, and mutant is given by p= (1 — 1/r)/(1 — 1/rw). wrote the paper. The Moran process assumes that the biological population is The author declare no conflict of interest. well mixed. The offspring of any one individual can replace any This article Is a PNAS Direct Submissica. other individual. If there is a spatial or social population struc- 'To whom correspondence should be addressed. Email: ribseninst ac et ture, then such is not the case. The question arises as to which This article contains supportin information online at v.ww.pnas.orgilcokuprsupplidoi:10. population structures affect the outcome of evolution, for example, I073/tinat. 1511366f IZNIX5uPPlernentat www.poeS.0rg/Cgild0i/10.107344MS.15I B66112 PNAS Early Edition I 1 of 6 EFTA01139453 deterministic equations describing infinitely large populations problem P2, is a translation such that a solution for P2 can provide (23-26) and, more recently, has moved to finite population size a solution for Pi. More precisely, if there is a polynomial-time and stochastic processes (27-38). reduction from Pr to P2, then a polynomial-time algorithm for P2 Evolutionary games have a long history of being studied on implies a polynomial-time algorithm for Pi. spatial lattices (28, 39-42) and, more recently, on graphs (7, 27, 43, A given problem is NP-hard if for every problem in NP, there 44). The crucial quantity that needs to be calculated to evaluate is a polynomial reduction to the given problem. A problem is NP- natural selection is the fixation probability, p, of a newly introduced complete if it is both NP-hard and there is an NP algorithm for mutant that arises at a random position on the graph. If p > 1/N, the problem. where N is the population size, then natural selection favors the For example, consider a Boolean formula over variables, and fixation of the new mutant, because a neutral mutant would have a the question of whether there exists an assignment to the vari- fixation probability of I/N. In a contest between two strategies, ables such that the formula is true. A polynomial candidate so- another question would be if the fixation probability of the first lution is an assignment of truth values to variables, and given a strategy exceeds the fixation probability of the second strategy. If candidate assignment, the formula can be evaluated in poly- such is the case, then the first strategy would be more abundant in nomial time. This question is the famous satisfiability (SAT) mutation-selection equilibrium for low mutation rate. The crucial problem in computer science. The SAT problem is NP-complete. problem is of the following form: Given a graph and a payoff The class P is contained in NP, and a major long-standing open matrix, what are the fixation probabilities of an individual of a new question in computer science is whether P = NP. A polynomial- type arising in a population of the other type? time algorithm for an NP-complete (or an NP-hard) problem would Spatial structure plays an important role in many cases. For imply that P = NP, resolving the long-standing open problem. example, spatial structure can affect the rate of neutral evolution 'Ile class sharp (# P) intuitively corresponds to counting the (45), and there are results that describe which spatial structures number of solutions. A problem is in # P if it counts the number do or do not affect the outcome of constant selection (46-48). of distinct solutions such that (i) every possible candidate for a Some population structures can be amplifiers or suppressors of solution is of polynomial length, and (ii) given a candidate for a constant selection (7, 49, 50), meaning that they modify the in- solution, it can be checked in polynomial time whether the candi- tensity of selective differences. Finally, for evolutionary games, date is a solution. For example, given a Boolean formula, the spatial structure can favor evolution of cooperation (28, 51). problem of whether there are at least k distinct satisfying assign- The study of spatial dynamics also has a long tradition in ments to the formula is a # P-problem. A given problem is ecology (52-56). Here, the typical setting is that different species # P-hard, if for every # P-problem, there is a polynomial-time re- compete for ecological niches. Many evolutionary models are duction to the given problem. A # P-complete problem is a problem formally equivalent to ecological ones, especially if we consider that is both # P-hard and for which there is a # P-solution. For only selection and not mutation. Then we can interpret the dif- example, counting the number of solutions in SAT is # P-oomplete. ferent types of evolutionary games as different species. Again, a The class NP is contained in # P because, given the enumer- crucial question is: What is the probability that a newly introduced ation of solutions for # P, it is easy to check if there exists at least species can get established or take over an ecological niche? one solution. Intuitively, an NP problem asks whether there is at This paper is structured as follows. First, we give an intuitive ac- least one solution, whereas # P is the counting version that asks count of the foundation of theoretical computer science. We de- if there are least k distinct solutions (and the special case ofIt =I scribe classes of problems that can be solved by algorithms in certain gives NP). Again, a major open question is whether NP = # P. time and space constraints. Subsequently, we present two simple Note that a polynomial-time algorithm for a # P-complete problems of evolutionary dynamics in spatial settings. The first problem would be an even bigger result, because it would imply problem is motivated by a very simple ecological dynamic: the sec- both P = NP and P = # P. ond problem is the general setting of evolutionary games on graphs. The class PSPACE consists of problems that can be solved with In both (-Agog the basic question is to calculate the takeover prob- polynomial space. Note that a polynomial space algorithm can ability (or fixation probability) of a new type. That is. we introduce a reuse space and can, in general, require exponential time. Every new type in a random position in the population, and we ask what is # P problem can be solved in PSPACE by simply enumerating the complexity of an algorithm that can characterize the probability each candidate for a solution and checking if it is a solution. Because that the new type takes over the population (becomes fixed). Un- we can reuse space to enumerate the candidates for solutions, the expectedly, we are able to prove exact complexity, results (Table 1). enumeration can be achieved in polynomial space. Moreover, The class PTIME (denoted as P) consists of problems whose every polynomial-time algorithm uses, at most, polynomial space. solutions can be computed by an algorithm that uses polynomial Hence, it follows that # P is contained in PSPACE. The notion of time. Formally, an algorithm uses polynomial time if the running PSPACE-hardness and PSPACE-completeness is similar to the time of the algorithm grows as a polynomial function of the size of notion of NP-hardness and NP-completeness, but with respect to the input. In computer science, P represents the class of problems the problems in PSPACE. Again, a long-standing open question in that can be solved efficiently. computer science is whether #P = PSPACE, and a polynomial- The class nondeterministic polynomial time (denoted as NP) time algorithm for a ['SPACE-complete (or ['SPACE-hard) consists of problems for which solutions exist that are of polynomial problem would imply P = NP = # P = PSPACE. length. and given a candidate for a solution of polynomial length, We have mentioned that the major questions about the equality whether the candidate is indeed a solution can be checked in of the complexity classes are open problems, but the widely be- polynomial time. Therefore, an NP algorithm can verify a solution lieved conjecture is that P is strictly contained in NP, NP is strictly in polynomial time. contained in # P, and # P is strictly contained in PSPACE. To proceed further, we need the notion of "reduction" between In other words, it is widely believed that NP-complete problems classes of problems. A reduction, from a given problem Pi to a cannot be solved in polynomial time, # P-complete problems are harder than NP-complete problems, and ['SPACE-complete problems are harder than # P-complete problems. A pictorial Table 1. Complexity results for various models and illustration of the complexity classes is shown in Fig. I. computational questions Results Model Qualitative Quantitative The first problem is motivated by ecological dynamics. There is Ecological scenario NP-complete N P-complete an ecosystem occupied by resident species. The spatial structure Linear fitness PSPACE-complete PSPACE-complete of the ecosystem is given by a graph. An invading species is in- Exponential fitness P PSPACE-complete troduced (an illustration is provided in Fig. 2). We assume the invading species has a competitive advantage in the sense that 2 of 6 I www.pnes.agfegilda/10.10734,natISI1366112 Ibsen-Jensen et al. EFTA01139454 We are interested in the probability that a single A individual starting in a random position on the graph generates a lineage that will take over the entire population; this probability is generally called fixation probability. As before, there are two types of ques- tions. The qualitative question is whether the fixation probability is positive. The quantitative question is concerned with computing the Fig. I. Pictorial illustration of the complexity classes P, NP, a P, and PSPACE. The o complexity class P is contained in NP, NP is contained in 0 P, and I P is contained in PSPACE. The widely believed conjecture is that these complexity classes are * different. A problem is NP-hard if it is at least as hard as each problem in NP, and the case is similar for a P-hardness and PSPACE-hardness. The intersection of NP and NP-hard gives the NP-complete problems, the intersection of fl P and a P-hard gives the I P-complete problems, and the intersection of PSPACE and PSPACE-hard gives the PSPACE-complete problems. A polynomial-time solution for an NP-hard or NP-complete problem would imply P = NP. once a position is occupied by the invading species, the resident cannot get it back. The invading species, however, has a density constraint: If the number of invaders around a focal invader is Qt: Probability >IR above a threshold, it, then the invader in the focal vertex cannot Q2: Appracimate probability? colonize another vertex. We are interested in the probability that the invader starting from a random initial position will take over the entire ecosystem (and therefore drive the resident to extinction). There are two types of questions The "qualitative question- is whether the takeover prob- ability is greater than 0. The "quantitative question- is concerned with computing the takeover probability subject to a small enor. Fig. 2 gives a pictorial illustration. We prove the following results. The qualitative question is NP-complete (SI Appendir, Theorem 4). The quantitative question is # P-complete (SI Appendix, Theorem 8). The second problem is concerned with evolutionary games in structured populations. There are two types, A and B, whose reproductive rates depend on local interactions. We consider the setting of games on graphs. Each vertex is occupied by one in- dividual, which is either A or B. Interactions occur pairwise with all neighbors. The payoff matrix is given by Fig. 2. Illustration of mutant introduction. The residents (type A) are col- ored blue, and the mutants (type 8) are colored red. The black edges are the A B edges of the interaction graph and the red edges are the edges of the re- A la b\ production graph. The probability of introducing a mutant in a specific Ill vertex is always 1 over the number of vertices. The computational questions B kc di' of interest regarding the takeover probability are as follows: whether the probability is positive (qualitative question) and what is an approximation of The entries of the payoff matrix can be positive or negative (or the probability (quantitative question). 0). Each individual interacts with all of its neighbors on the graph to derive a payoff sum. The payoff sum is translated into repro- ductive success as follows. If the payoff sum is positive, then the fixation probability subject to a small error. We prove the following fecundity equals the payoff sum. If the payoff sum is negative, results. The qualitative question is NP-hard and in PSPACE. The then the fecundity is 0. We refer to this translation as linear quantitative question is # P-hard and in PSPACE. The results fitness. In any one time step, a random individual is chosen for follow from SI Appendir, Theorems 4. 8, and 15. reproduction proportional to its fecundity. The offspring, which Note that the first problem can also be obtained as a special is of the same type as the parent, is placed into an adjacent case of the second problem. In the payoff matrix (1), we can set, position on the graph (illustrations are provided in Figs. 3 and 4). for example, a =-1, b =1,c =d= O. This "game- has the property Ibsen-lensen et al. PNAS Fatty Edition I 3 of 6 EFTA01139455 We also consider a variation of the second problem. In par- ticular, we change the mapping from payoff to fecundity. We now assume that fecundity is an exponential function of payoff, and refer to it as exponential fitness (an illustration is provided in Fig. 4). Therefore, the fecundity of an individual is always positive (even if its payoff sum is negative). In this setting, the qualitative question can be decided in polynomial time. The reason is that the IA fixation probability is positive if the graph is connected. Thus, to answer the qualitative question, the algorithm only needs to check whether the graph is connected; this problem is in P. However, the VA quantitative question has the same complexity as the previous problem (SI Appendix, Theorems 16 and 17). a A very special case of games on graphs is constant selection. Type A has constant fecundity a and type B has constant fecundity b independent of any interactions (i.e., fecundity is independent of the population structure). The qualitative question concerning the fixation probability of A is in P. The quantitative question is in PSPACE, but any nontrivial lower bound is an open question. Finally, although we establish computational hardness for sev- eral problems, we also show that two classic problems can be solved in polynomial time (SI Appendix, section 7). First, we consider the molecular clock, which is the rate at which neutral mutations ac- cumulate over time. The molecular clock is affected by population structure (13). We show that the molecular clock can be computed in polynomial time because the problem reduces to solving a set of linear equalities, which can be achieved in polynomial time using Gaussian elimination. Second, we consider evolutionary games in a well-mixed population structure, where the underlying structure is the complete graph (38). We show that the exact fixation proba- bility can be computed in polynomial time. In this case the prob- lem can be reduced to computing absorption probabilities in Markov chains, where each state represents the number of mu- tants. Hence, the Markov chain is linear in the number of vertices of the graphs, and because absorption probabilities in Markov chains can be computed in polynomial time (by solving a set of linear equalities), we obtain the desired result. Methods: Proof Ideas We now present the key intuition and main ideas of our results. The most interesting and technically insightful results are the lower bounds (i.e., the hardness proofs), and we present the key ideas only for them. NP-Hardness of the Qualitative Ecological Problem. One of the most classic he- complete problems is the 3SAT-problem. whidy is the SAT problem where every dause has exactly three literals (a literal is a Boolean variable x or a negation of a a a variable x). Given instances of the 3SAT problem, we construct itstances of the Sg. 3. Illustration of reproduction with matrix a ( e ) . The residents ecological problem where we have a start vertex, where the mutant arises, fol- lowed by a sequence of vertices fie., each vertex can reproduce a mutant to the (type A) are colored blue, and the mutant (type 8) afe adored red. The black next), one for each dause. By means of mg construction, a vertex in this sequence edges are the edges of the interaction graph, and the red edges are the can reproduce, at most, three times, one of which must be the next vertex of the edges of the reproduction graph. In the first figure beside each vertex, the payoff sequence, with the others corresponding to, at most, two literals of the clause of the vertex (which is the sun of the payoff of the interactions) is shown. Because (intuitively, these two literals represent the ones that are not set to true by a the first figure shows the payoff computation, the interaction edges that are re- candidate-satisfying assignment). The last vertex of the sequence reproduces to a sponsible for the payoff calculation are boldfaced. In the second figure, the vertex new sequence of vertices that corresponds to an assignment of truth values to the labeled 3 is selected for reproduction. The reproduction edges from vertex 3 are variables. Each vertex in this new sequence can reproduce twice, one to the next bddfaced and each edge has the probability 1/2. Finally, the successor S is chosen vertex of the sequence and other to a variable or its negation. The variables or the for replacement (i.e. vertex 3 reproduces to vertex 5). corresponckm negation can then reproduce mutants to the corresponding literals of the clauses. After this sequence, all vertices that do not correspond to a literal in a clause become mutants. In essence, our construction ensures that if there is that type B never reproduces and type A reproduces until half of a satisfying assignment then with positive probability, all vertices can become its neighbors are also of type A. This parameter choice leads to mutant, and conversely. if there is no satisfying assignment then the proba- the same qualitative behavior and the same complexity bounds as bility that all vertices become mutants is O. described in the first problem. A generalization of games on graphs is the setting where the Madness of the Quantitative Ecologkal Problem. A Y P-complete problem interaction graph and the replacement graph are distinct (13). is counting the number of perfect matthings in a bipartite graph (which also Thus, each individual interacts with all of its neighbors on the corresponds to computing the permanent of a Boolean matrix). A bipartite graph consists of two set of vertices, a set on the left side arid a set on the interaction graph to receive payoff. Subsequently, an individual right side, with edges from the left side to the right side. A perfect matching is chosen for reproduction proportional to its fecundity. The off- is a one-to-one mapping of each vertex of the left side to a vertex on the spring is placed randomly among all neighbors of the focal indi- right side such that there is an edge between them. First, we argue that for vidual on the replacement graph. In this case, both the qualitative the hardness proof, it suffices to consider bipartite graphs in which each and quantitative questions become PSPACE-complete (S1 Ap- vertex on the left side has an out-degree of 2v for some integer k. A key idea pendix, Theorem 15). in our construction is that in a full binary tree, if the root becomes a mutant 4 of 6 I wwwonts.orgfcgildoi/10.1073/6nas.1511366112 Ibsen-Jensen et al. EFTA01139456 Peve eq144.161 &lino now Fig. 4. Illustration of different payoffs to fitness A with A ( -I z II . The residents (type A) are blue. -2 and themutants (type B) are red. The black edges are the edges of the interaction graph, and the red edges are the edges of the reproduction graph. In the fgure of the first row, we show the payoff for every vertex_ ki the next row, we show the fitness, which is either a linear function of the payoff but at least 0 or an n4•111,. exponential function of the payoff. Finally, in the thid row, with each vertec we show the mobabikty, which is the normalized fitness, that the vertex is selected for reproduction fin the last figure. the number xis the sum of the fitness x =e2 +2e+ 2ei and every vertex can reproduce exactly once, then the set of mutants will dynamics in structured populations. Our main results are summa- eventually consist of a path from the root to a leaf, chosen uniformly at rized in Table 1. We now discuss the significance of our findings. random. Our construction is then as follows: We have a start vertex where the mutant arises, which reproduces to turn each of the vertices on the left interdisdplinary Connection. Although both computer science and side of the bipartite graph to mutants. Each of the vertices on the left side is biology examine the proliferation of information in time and space, the root of a full binary tree, where the leaves correspond to the right side the deep connection between them has been largely unexplored. of the bipartite graph. We show that the fixation process corresponds to a Our work provides precise computational complexity insults for perfect matching (defined from the path in the full binary trees), and given several well-studied problems in biology and can be viewed as a step an approximation of the fixation probability, the exact number of perfect to establish a connection between the two disciplines. matchings of the bipartite graph can be computed. Well-Studied Open Problem. The problems we have considered are PSPACE-Hardness for the Game on Evolutionary Graph Problem. Our PSPACE- hardness proof shows that the evolutionary process can solve the following basic aspects of well-studied questions for ecological and evo- concurrent-if problem, which we show is PSPACE-hard. The concurrent-if lutionary dynamics in structured populations (7, 28, 37, 42, 50, problem consists of a set of Boolean variables xi,x2, ,x,,, with a given 51). Several reviews have been written on this topic (8, 11, 43, initial truth assignment to the variables, and a set of if-statements. Each if- 57). We first discuss the significance of an algorithmic approach statement s, is of the following form: If a conjunctive clause C, over the in evolutionary graph theory. An efficient algorithm, which con- variables is true, then assign a truth value to a variable (e.g„ if (xinx.nis), siders all (even worst-case) graphs for evolutionary processes, is then xi is assigned false). The problem is to decide whether the first variable important for the following reasons. (Which is the accepting variable) eventually becomes true. We show that First, it has been shown that some population structures each variable can be represented as four vertices, and each if-statement as a (called amplifiers) can increase the effect of natural selection single vertex, in the evolutionary graph and the evolutionary process can (7,51), but amplifiers are rare and constructing them is difficult (7, mimic the execution of the concurrent-if problem. Finally, if the accepting 11, 50, 51, 58, 59). If there were an efficient algorithmic approach variable becomes true, then it corresponds to making a special vertex in the that worked for all graphs, then one could design candidates for evolutionary graph as a mutant. There exists a part of the evolutionary amplifiers and efficiently check their fixation probabilities. Be- graph that can only become mutants after the special vertex has become a cause there exists no algorithmic approach, research has to focus mutant. Using this construction, we show that both the qualitative and on special classes of graphs to identify simple formulas, such as quantitative problems are PSPACE-hard for evolutionary games on graphs. calculating the fixation probabilities on star-like graphs (58). Second, it is known that some population structures and evo- Discussion lutionary dynamics promote evolution of cooperation but that In summary, we have established computational complexity results others do not (51). An important open problem is to characterize for some fundamental problems of ecological and evolutionary the set of graphs that promote cooperation. An efficient algorithmic Ibsen-Jensen et al. PNAS Early Edition I 5 of 6 EFTA01139457 approach would be useful to check candidate structures. Because widely believed conjecture that P is different from NP. A simple no efficient algorithm exists, one has to study special cases, for equation-based solution would give an efficient algorithm; thus, our example, by considering nearly regular graphs (51). result formally shows that for evaluating the fixation probability in Thus, a general algorithmic approach is a very important problem spatial settings there does not exist a simple equation-based solu- for the well-studied question concerning the effect of population tion in general. Our results are significant for the following reasons: structures on evolutionary dynamics. An algorithmic approach has (1) They establish the computational complexity for fundamental been studied for important special cases, such as for complete problems of ecological and evolutionary dynamics in structured graphs (60), and NP-hardness was stated for the quantitative populations (e.g., considered in 7, 8, 11, 28, 37, 42, 43, 50, 51, 57), problem (7). The review by Shakarian et al. (57) identifies the complexity of computing fixation probabilities on evolutionary and (ii) they significantly improve the complexity result of Lieber- graphs as an important open question in the area In that review, man et al. (7) and solve the computational complexity questions of two open problems (2.1 and 2.2) are identified that ask for the the area as identified in the review by Shakarian et al. (57). complexity of computing the exact fixation probabilities for graphs and for games on graphs. Our results not only present answers to Methodological Insight. Our proof ideas also reveal some important thaw crucial questions but also show that both the approximation points. We show how evolutionary processes in structured pop- problem and the qualitative question are computationally hard. The ulations can mimic aspects of computation. This insight could be most interesting aspects of our results are the lower bounds, which useful for future research on understanding the computational show that there exists no efficient algorithm in most cases, under the complexities of other stochastic processes on population structures. 1. Moran PAP (1958) Random processes in genetics. Proc Camb Philos Sac 59(1):60-71. 32. Szabo G, Antal T, Stab:. P, Droz M (2000) Spatial evolutionary prisoner's dilemma 2. Maruyama T (1970) Effective number of alleles In a subdivided population. Theo, game with three strategies and external constraints. Phys Rev E Stet Phys Names Papal Nol 1(31:273-306. fluids Rear Intesdisca Topics 62(1 Pt €1:1095-1103. 3. Slatkin M 09811Fixation probabilities and fixation tines in a subdivided population. 33. Kerr B. Riley MA. Feldman MW, Bohannon B/M (2002) local dispersal promotes EvohrtIOn 35131:477-898. blodiversity m a real-life game of rock-paper-scissors. Nature 418(6694):ln-174. 4. Wright S (1931) Evolution in Mendelian populations. Generics 16(21:97-159. 34. Hating D. Yu W (2008) Migration as a mechanism to promote cooperation. Adv 5. Mir 5 (1932) The roles of fatilatION inbreeding crosstreedng and selection in evo- Complex Syst 110):641-652. iution. PrOststatiVS of the Sixth International Congress oFGenetics Mach New Yerl). PP 35. TarnitaCE,Ohtsukift Antal T, fu F, Nowak MA (2009) Strategy selection in structured 356-M4 populations. 1 Meer Riot 259(3):570-581. 6. Fisher RA (19501 The - Sewell Wright Effect'. Heredity (Edina) 4(1)117-119. 36. Pert M 520fiCki A 17010) Cr:evolutionary €amesa Mill review. Biosystems 9903:109-125, 7. Ueberman E. Haven C. NOwak MA (2005) Evolutionary dynamo on graphs. Nature 37. van Veelen PA, Garcia 1, Rand DG, Nowak MA (2012) Direct reciprocity in structured 931(7023):112-316. populations. Prot Nati Arad Sd USA 109(25):9929-9934. 8. Stab0 G. Rath G (2007) Evolutionary games on graphs. Phys Rep 446(4-6)17-216. 38. Nowak MA Sasaki A Taylor C. Fudenberg D (2004) Emergence of cooperation and 9. Yang It.X. Wu 2X. Du WB (2012) Evolutionary games on scale free networks with evolutionary stability in finite populations. Nature 428(69831646-650. tunable degree distribution. Europhys tett 99(1):10806. 39. KIllingisadC T. DOeteli M (1996) Spatial evolutionary game theory: Hawks and doves 10. Own YT (2013) Sharp benefit-to-cost rules for the evolution of cooperation on reg- revisited. PIO( Viol Sc? 263(1374):1135-1144. ular graphs. Ann App.( Nasals 23(2)617-664. 40. Szabo G, Take C (1998) Evolutionary prisoner's dilemma game on a square lattice. Allen B. Nowak MA (2014) Games on graphs. European Mathematical Society Surveys Phys Rev E Star Phys Plasmas Hui& Rein Mterdiscip Topics S8(1):69-73. in Mathematical Sciences 1(1):113-151. 41. Static) G. Hauert C (2002) Phase Venation, and volunteering in spatial pubic goods 12. Oebarre f, Hauert C, Doebeli M 0014) Social evolution in structured populations. Nat games. Phys Rev Lett 89(111:118101. Common 5:3409. 42. Hallett C. Doebell M (2004) Spatial structure often Inhibits the evolution of co- 13. Ohtsuki µ Pacheco lµ Nowak MA (2007) Evolutionary graph theory: Breaking the operation in the snowdrift game. Nature 426(6983).643-646. symmetry between interaction and replacement. f Theor Riot 246(4681-69a. 43. Nowak MA Tamita CE, Antal T (2010) Evolutionary dynamics in structured pop- 19. Skyrms B. Penland. R (2000) A dynamic model of social network formation. Not Natl ulations. Philos Trans R Soc Lond a Riot Sc) 365(15371:19-30. Aced Sci USA 97(104340-9346. 49. Allen B, Tamita CE (2012) MeasureS of success In a dass of evolutionary models with Is. Pacheco /M. Traulsen A. Nowak MA (2006) Coevolution of strategy and structure m fixed population size and structure. I Math Blot 680-21:109-143. complex networks with dynamical linking. Phys Rev Lett 97(25):258103. 45. Allen B, et al. (2015) The molecular dock of neutral evolution can be accelerated or 16. fu F, Rouen C. Now* MA Wang L (2008) Reputation-based partner choice promotes slowed by asymmetric spatial structure. PLOS Comput Biel 11(2):01004108. cooperation In sooal new.orks My,kW E Stet Months Soft Matter Phys 78(2 Pt 2) 026117. 46. Adlarn B, Nowak MA (2014) Universality of fixation probabilities in randomly struc- 17. Antal T, Ohtsuicilt Wakeley s. Taylor PO, Nowak MA (2009) Evolution of cooperation tured populations. Sc? Rep 4:6692. by Phenotypic similarity. Not Nati Aced SO USA 106(21):8597-8600. 47. Martryarna T (1974) A Markov process of gene frequency change In a geographically It Tamita CE, Antal T, Ohtsuki µ Nowak MA (2009) Evolutionary dynamics In set structured populations. Proc Nati Arad Sc? USA 106(211:8601-8604. structured population. Genetics 76(21:367-377. 19. Szolnoki A. Peet M (2009) Resolving social dilemmas on evolving random networks. 401. Barton NH (1993) The probability of fixation of a favoured allele in a subdivided population. Genet Res 62(21:149-157. EurophyS Lett 86(3)30007. 20. Cava€ere M, Sedwards S, Tamita CE. Nowak MA, Csiluisz.Nagy A (2012) Prosperity is 49. Nowak MA Who, F. lwase Y (2003) The linear process of somatic evolution. Pr* associated with instability in dynamical networks.), Meth: Skil 299(0):126-138. Ned Aced Sci LISA 100(25):14966-14969. 21. Rand DG, Arbesman S, Christakis NA (2011) Dynamic social networks promote co- 50. NOwak MA (2006) erolutionary Dynamics (Harvard Unit/Press. Cambridge. MA). operation in experiments with humans. ProcNadArad Sri LISA 108(48):19193-19198. 51. Ohtsuki µ Hauert C. Lieberman E. Nowak MA (2006) A simple rule for the evolution 22. Wu B. et al. (2010) E./OK:Hall of cooperation on stochastic dynamical networks. PLOS of cooperation on graphs and social networks. Nature 441(7092):502-505. One S(6)e11187. 52. Durrett R. Levin S (1994) The importance of being discrete (and spatial). Meer Popo' 23. Smith /M (1982) Evolution and the Theory of Games (Cambridge Univ Press. Blot 4601:363-394. Cambridge. UK). 53. Levin SA Paine RT (1974) Disturbance, patch formation, and community stnxtwe. 24. Hoibauer 1, Sigmund K. Sr (1988) The Theory of Evolution and Dynamical Systems: Pros Nad Aad Sci USA 71(7)2744-2747. Mathematkal Aspects of Selection (Cambridge Vniv Press. Cambridge. UK). 54. Hassell MP, [mins MN, May RM (1904)kPaCies coexistence and self-organizing spatial 25. Hofbauer 1, Sigmund K (1998) Evolutionary Games and Population Dynamics (Cam- dynamics. Nature 370(6487):290-292. bridge Univ Press, Cambridge, UK). 55. Tilman D, Karelva PM (1997) Spatial Ecology. The Role of Space in Population 26. CreSSman R (2003) Evolutionary Dynamics and Extensa.. Form GEM'S, Economic DYnamks and Mterspecilk Interactions (PIntetta INN Press. Princeton). Learning and Social Evolution (MIT Press. Cambridge, MA). 56. Dieckmann U, Law R, Metz NU, eds (2000) The Geometry of Ecological Interactions 27. Broom M. Rychter 1 0013) Game-theoretical models in biology. Chapman 6 HaNCRC (Cambridge Univ Press, Cambridge, UK). Mathematkal and Computational Biology (Chapman P. mewcac. Boa Baton. FL). Si. Shakarian P, Roos P, Johnson A (2012) A review of evolutionary graph theory with 28. Nowak MA May RM (1992) Evolutionary games and spatial chaos. Nature 359(6398) applications to game theory. Biosystems 107(2):66-80. 826-829. 58. 441am B, Chatterlee K. Nowak is (2015) Amplifiers of selection. Prot R Sot Load A 29. Eliseo G (1993) teaming. bed iMeraction, and coordna6on. Economehica 61(5k1047-1071. Math Phys Sat 4710181):20150114. 30. Herz AV (1994) Collective phenomena in spatially extended evolutionary games. 59. Diaz 1, et aL (2013) On the fixation probability of superstars. Roc R Soc Land A Math I 771e0r Blot 1690).65-87. Phys Sci469(21S6)tt0130191. 31. Nakamaru M, Nogami µ Nana Y (1998) Score-dependent fertility model for the 60. Diaz a at al. 0014) Approximating Nation probabilities in the generalized Moran process. evolution of cooperation In a lattice. $ Theo, &NJ 194(0:101-124. Algodrivnica 690X78-kk 60f 6 I www.pnes.orgfcgildoi/10.1073/06.44.1511366112 lbSen.lensen et al. EFTA01139458 Supplementary Information: The computational complexity of ecological and evolutionary spatial dynamics Rasmus Ibsen-Jensent Krishnendu Chatterjeet Martin A. Nowakt t IST Austria PED, Harvard University 1 Introduction and Organization In this supplementary information we will present the detailed proofs of the results mentioned in the main text. To present a uniform treatment of the results we will consider the following notations: I. We will always consider that there are two types of individuals that occupy the vertices of the graph, and call them as mutants and residents (they represent type A and type B individuals, respectively, as mentioned in the main article). 2. To model the ecological scenario, that the mutants has an advantage that once they occupy a position, then the residents cannot win it over, we will model it as the case that the residents do not reproduce (note that a residents reproducing to another vertex which is a resident does not change the scenario of the graph). Organization of the results. The supplementary information is organized as follows: I. We start with the formal definitions of the model and the computational questions in Section 2. We will consider two functions to change payoffs to fitness, namely, linear bounded fitness (where the fitness is linear function of the payoff, but at least 0), and the exponential fitness function. In the three following sections after Section 2 we present results about the linear bounded fitness model. 2. In Section 3 and Section 4, we consider linear bounded fitness and no resident reproduction (that models the ecological scenario with advantage for the mutants). We establish NP-completeness of the qualitative question in Section 3, and #P-completeness of the quantitative question in Section 4. 3. In Section 5 we consider linear bounded fitness with resident reproduction, and show that both the qualitative and quantitative questions are PSPACE-complete. 4. We consider the exponential fitness function in Section 6. We show that the quantitative question for no resident reproduction, and the qualitative question (even with resident reproduction) can be solved in polynomial time. We show that the quantitative problem with resident reproduction is PSPACE-complete. 5. Finally, in Section 7 we argue that evolutionary games on well-mixed population, and finding the rate of molec- ular clock problem can be solved in polynomial time. 2 Models of Evolution on Graphs In this section we present the basic definitions related to the different models of evolution on graphs and the basic computational questions. Evolutionary graphs. An evolutionary graph C = (V, El , ER) consists of a finite set V of vertices; a set Et C V x V of interaction edges; and a set Eft C V x V of replacement (or reproduction) edges [121. The sets El and ER consist EFTA01139459 of directed edges. and the graph Of = (V, El) is called the interaction graph, and OR = (V, ER) is called the replacement graph. The graph GI is responsible for determining the interaction of individuals in the graph (which affects the fitness or payoff), and the graph OR captures the underlying structure for reproduction and replacement of individuals in the graph. Given an edge (v, ti) we say a is a successor of v and v is a predecessor of a. Payoff of individuals. Each vertex of the graph will be occupied by one of two types of individuals, namely, the resident type and the mutant type. In evolutionary games, along with the evolutionary graph there is a payoff matrix, which is defined as follows: R R (a b M c d where the entries of the matrix are rational numbers and represent the payoff of an interaction, i.e., a (resp., b) is the payoff of a resident type interacting with another resident (resp., mutant) type, and c (resp., d) is the payoff of a mutant type interacting with a resident (resp.. mutant) type. Given two vertices, x and v. we denote by pay(x, y) the payoff of the type of vertex x versus the type of vertex y. Fitness of individuals. The fitness of an individual denotes the fecundity (or reproductive rate) and must be a non- negative number. Let El(v) = {u I (v, u) E Er} denote the set of interaction successors of v. We define two natural (but not equivalent) ways of defining the fitness of v, denoted as f(v), as follows: I. Linear boundedfitness. The linear bounded fitness is the average payoff of the interactions but at least 0, i.e., {EuEEt (v)PaY(u, u) f(v) = max IE1(4 Note that since the fitness is non-negative it is bounded from below by 0. 2. Exponentialfitness. The exponential fitness is an exponential function of the average payoff of the interactions, i.e., f (a) exp EuE PaY(u, u) IE,(v)I Note that the fitness function ensures that the fitness is always positive. We will use LBF to refer to the linear bounded fitness function and ExF to refer to the exponential fitness function. The evolutionary process. The evolutionary process we consider is the classical birth-death process on an evolution- ary graph defined as follows: I. Initially all vertices of the graph are of the resident type and a mutant type is introduced uniformly at random at one of the vertices of the graph. 2. Repeat the following step (referred to as a generation): In every generation, a vertex v is selected proportional to the fitness of the individual at the vertex to reproducer. A new born individual replaces one of the replacement successors of v, i.e., it replaces a vertex chosen uniformly at random from the set ER(v) = fts I (v,u) E ER}. Step 2 (or generations) is repeated until nothing can change (in particular, if all vertices have fitness 0 or have the same type, then nothing can change). Fixation probability. The most relevant question from an evolutionary perspective is thefixation probability which is the probability that the mutant takes over the population, i.e., eventually all vertices become the mutant type. Computational questions. Given an evolutionary graph, a payoff matrix, and the payoff to fitness function (linear bounded, or exponential) we consider the following questions: I. the qualitative decision question asks whether the fixation probability is positive; and If every vertex has fitness 0. then no venex is selected for reproduction. EFTA01139460 2. the quantitative approximation question, given e > 0, asks to compute an approximation of the fixation proba• bility within an additive error of e. In this work we will establish several complexity bounds for the problem, and our most interesting results are the lower bounds. Lower bounds establish computational hardness of a problem, and if the lower bounds can be established even in restricted cases, then it shows that even special cases of the general problem is computationally hard, and thus the lower bounds become even more significant (e.g., a single lower bound for a special case can be applied to all generalizations of the special case). Special cases. There are several special cases of interest that we will explore. I. Constant fitness with density constraints. A special case of the payoff matrix is the constant fitness (aka constant selection) matrix defined as follows: R Df R r r M k 1 1) i.e., the mutant types always have fitness I and the resident types fitness r, where r ≥ 01. Along with the evolutionary graph and the payoff matrix, we have two thresholds, namely, Oj and Om, for the resident type and the mutant type, respectively. Intuitively, the thresholds represent a density constraint, and if an individual is surrounded by a lot of individuals of the same type, then its reproductive strength decreases. The density con• straint is relevant in many applications of evolution (see books [2, page 4701 [13, page 3201, also see Remark I). Let the selected vertex for reproduction be v. Let Same(v) denote the number of vertices in El(v) that are of the same type as v. If v is a mutant type, and SiatV < Ohy (resp., if v is a resident type, and ≤ OR), then the individual gives birth to an individual of the same type. Note that the density constraint implies that if the constraint is violated, then the selected individual does not reproduce. 2. The l&R and IEQR models. One important special case is when the interaction and the replacement graphs coincide. i.e., El = Eft [9, III. We refer to the general model as the l&R model (with possibly different interaction and replacement graphs) and the special case where the graphs coincide as the IEQR model. 3. No resident reproduction. Another special case is when the payoff matrix is the constant payoff matrix with r = 0. In this case, the resident types cannot reproduce. This represents the scenario that a mutant has an advantage over the residents such that if a mutant occupies a position, then the residents cannot win it back. Remark 1 (Matrix encoding of density constraints in LBF.). For many of our lower bounds, we will use constant selection with density constraints, and we argue that the density constraints of our lower bounds, are special cases of the linear bounded fitness without any density constraints. In our results for lower bounds we consider two types ofdensity constraints: (1)6iAr = s— 6, for 0 < d <1,110 (in Section 3 and Section 4), where there is no resident reproduction (hence O is irrelevant); and and (2)0m = O = 0 in Section 5. In all the lower bounds, the payoff matrix is constant. These two density constraints can be encoded as a payoffmatrix (that is not constant) with linear boundedfitness function as follows: R M R M R r0 0 \ R —N 1 \ M 1 —1) ; M -fto The first payoff matrix encodes that a vertex that is a mutant can reproduce only if strictly less than half of the successors in El are mutants, and thus encode Om = s - 4, for 0 < d < 1/10, in graphs where the outdegree is at most five. The second matrix (for a graph with N vertices) encodes that a vertex can reproduce only if all the successors in Et are of the opposite type. 21n the literature, an alternative notion is to consider that the mutant have fitness r and the residents have fitness 1. we follow the notation that leads to uniform treatment EFTA01139461 Organization. Our results are organized as follows. I. In Section 3, Section 4, and Section 5, we consider the linear bounded fitness function. We will present upper bounds for the general case and lower bounds for the special case of constant payoff matrix with density con- straints (which is a restricted case as explained in Remark I). In Section 3 we consider no resident reproduction, and present results for the qualitative case; and in Section 4 we again consider no resident reproduction and present results for the quantitative approximation. Finally, in Section 5 we present results for the qualitative and quantitative analysis in the general model with resident reproduction. 2. In Section 6 we present the results for the exponential fitness function. No Resident Reproduction Resident Reproduction IEQR model I&R model IEQR model I&R model Qual. NP-c ((LB) Lem. 3) NP-c ((UB) Lem. 2) NP-h, PSPACE PSPACE-c ((LB) Lem. II, (U11) Lem. 9) Appr. #P-c ((LB) Thm. 8) #P-c ((UB) Thm. 8) #P-h, PSPACE PSPACE-c ((LB) Lem. II, (UB) Lem. 9) Table I: Complexity of evolution on graphs with linear bounded fitness. Qual is short-hand for qualitative and appr for approximation. Our main contributions of lower bounds (LB) and upper bounds (UB) are boldfaced. NP-c (resp., #P- c, PSPACE-c) means NP-complete (resp., #P-complete, PSPACE-complete). Similarly, NP-h (resp., #P-h) means NP-hard (resp., #P-hard). 3 Qualitative Analysis: No Resident Reproduction with LBF In this section we establish two results for the no resident reproduction model with LBF: the qualitative analysis problem is (I) in NP for the general I&R model; and (2) is NP-hard in the special case of IEQR model, and even in a special case of LBF, where we have constant fitness with density constraints, (using density constraints mentioned in Remark l). 3.1 Upper bound The upper bound is relatively straightforward. We simply check if there exists an initial choice v1 for the initial mutant and a sequence (ei)2<i.c„ of edges of length n — 1 in the replacement graph for reproductions that ensures that all vertices are mutants. The initial vertex v1 and the sequence of edges together define a unique sequence of vertices for reproduction; and at every stage we check that for the vertex chosen for reproduction can reproduce (i.e., has fitness strictly positive), and it is a mutant. We also need to check that in the end all vertices are mutants. The choice of the initial vertex and the sequence of reproductions then happen with positive probability and we are done. Observe that since there is no resident reproduction, if a vertex becomes a mutant, then it remains a mutant. Note that there always exists a sequence of length it - 1, because if the fixation probability is positive, then we can WLOG assume (till all vertices are mutants) that in each step i there is a vertex v that is a mutant, with strictly positive fitness, and an edge ( v, vi) in the replacement graph such that if is not a mutant (and becomes a mutant in step i), as otherwise nothing can change. This shows that if the answer to the qualitative decision question is yes in the no resident reproduction model, then there is a polynomial witness and polynomial-time verification procedure. Lemma 2. The qualitative decision question for no resident reproduction in the general la model with LBF is in NP 3.2 Lower bound In this section we present an NP lower bound, and we will prove it for the IEQR model with no resident reproduction. We will also consider a special of LBF, which is constant fitness with density constraints. Moreover, since there is no EFTA01139462 Figure I: Illustration of a predecessor gadget (u, v). resident reproduction, the threshold OR does not matter. We will present a reduction from the 3-SAT problem (which is NP-complete 13, 8, 5I) and use threshold Om as z— 8, for any 0 C d < A. However it would be easy to modify our construction for any threshold Om in (0, 1). The "right" way to think of the threshold is that it is , and that the density constraint uses a strict inequality. The upper bound is chosen because we will use vertices with degree five or less; recall Remark I. Notation. Let X = {xi ,.... . ..x,,} be a set of n Boolean variables. Consider a 3-CNF formula co = Cr A C2 A • • • A C„„ where each C, is a clause of a list of (precisely) three literals (where a literal is a variable x or its negation where x E X). Each clause represents a disjunction of the literals that appear in it. An instance of the 3-SAT problem, given a 3-CNF formula <p, asks whether there exists a satisfying assignment. We will now construct an evolutionary graph G(,o), given an instance of a 3-SAT problem, with (i) El = Eft, (ii) no resident reproduction, and (iii) threshold = z — 8, for 0 C d ≤ A such that there is a satisfying assignment iff the answer to the qualitative decision problem is YES. We first present two gadget constructions that will be used in the reduction. Predecessor gadget. We present a predecessor gadget for a vertex pair (s, v) such that v is the only successor of it The gadget ensures the following property (namely, the predecessor gadget property): if all vertices become mutants, then the vertex to must have become a mutant before vertex v. The construction of the gadget is as follows: Add a new dummy vertex Let the successors of u be v and til, and the successor of u' be only v. Then the only way for til to become a mutant is if u is a mutant, since u is the only predecessor of te. But le can only become a mutant if u is a mutant and v is not (since otherwise the threshold condition with Om = s- 8 is not satisfied for it, for any 0 < 8 < 16). Hence, if all vertices become mutant, then n must become a mutant before v. There is an illustration of the predecessor gadget for (u, v) in Figure I. We will denote by PredEdges(te, v,It') the set ((It, v), (u, u'), (u', v)} of edges of the predecessor gadget. (Extended) Binary tree gadget. Given a vertex rt, and a set L of vertices, we will denote by BinTr(rt,L) a binary tree with rt as root and L as leaf vertices3. In a binary tree, every non-leaf vertex has out-degree 2. Note that the binary tree gadget adds additional vertices, and has Oa LI) vertices. By an abuse of notation we will use BinTr(rt,L) to denote both the set of vertices and the set of edges of the binary tree, and it would either be clear from the context or explicitly mentioned. Given a binary tree T and an extension vertex z 0 T, an extended binary tree (EBT) consists of T and an edge from every non-leaf vertex to z. Given a root vertex rt, a set L of leaf vertices, and an extension vertex z, we denote by ExBinTr(rt, L, z) the edge set of the extended binary tree that extends the binary tree of rt and L. We will explicitly use the following property for an EBT (namely, qualitative EBT (QEBT) property): • (QEBT Property). In an EBT, every non-leaf vertex has out-degree 3, and for density constraint with threshold 2 — 8, for 0 < b < -I- (the construction works even if 43 is up to l), if the root becomes a mutant and z is not - 10 a mutant, then the root can be responsible for making every vertex in the tree a mutant. However, note that if z is a mutant, then any vertex in the tree with out-degree 3 cannot make both its children in the underlying tree mutants due to the density constraint. There is an illustration of a binary tree BinTr(x, v2, v31) and the corresponding EBT ExBinTr(x, {vi , v2, v3}, z) in Figure 2. The evolutionary graph G(yo). We now present the evolutionary graph G(9), see Figure 3 for an illustration, where we first describe the vertex set and then the edges. Recall that n is the number of variables and m the number of clauses of the 3-SAT instance 0. 3For a fixed L and rt there exists many possible binary trees BinTr(rt, L). however every one of them will work for our purpose EFTA01139463 Figure 2: A binary tree BinTr(x. {v1, v2, v3}) and the corresponding EBT ExBinTr(x, {vi , v2, v3}, z), where we extend with the vertex z. The edges to z are dotted to make the similarities easier to see. The vertex set. The set V of vertices is as follows (intuitive descriptions follow): {v-r, zs, ys, es). U I C21 • • • cm} Li {4,4,4 I1 ≤i ≤ m} u {1}g>4 I 1 ≤i≤rn} U {xi,4,xf I xi E U {vo, U {uLuill<i<n} U (,)r<;<„(BinTr(4,L;)UBinTr(xf,L;)) The vertex v-r will be the start vertex; and the vertices z1 , ys, and es are end vertices (that will form a predecessor gadget for (zi , ) with dummy vertex es). We have a vertex r:i for each clause Ci (named the clause vertices); and one for each literal 4,4, and c in the clause (named the clause-literal vertices). Similarly, we have a vertex xi for each variable in X (named the variable vertices), and vertices 4 and xt (named the variable-value vertices) to represent the truth values to be assigned to xi. Corresponding to xl and 4 we also have vertices and of (named the duplicate vertices). The vertex Lb forms a predecessor gadget (using the dummy vertex 4) to tip Let L4 = {Pk I 1 ≤ k ≤ in, 1 ≤ j < 3, 4 = xi } denote a copy of the clause-literal vertices that correspond to xi and Lf = {4 I 1 ≤ k ≤ m, 1 ≤ j ≤ 3, 4= z} denote a copy of the clause-literal vertices that correspond to negation of xi. The set BinTr(4, (resp., BinTr(rif,Lt)) represents the vertices of a binary tree with the root vertex 4 (resp., xi') and leaf vertices L4 (resp., Lb. The edge set. We now describe the edge set: • There is an edge from the initial vertex trr to the first clause vertex c1; and we have two predecessor gadgets; (i) (zs • ys ) with dummy vertex es; and (ii) (vo,u1) with dummy vertex 1(3. 4 • For each clause vertex ci, there are five edges, three to clause-literal vertices (for j = 1, 2,3) of the clause, one to the next clause vertex (for c„, this next vertex is x1), and to the vertex tel. • For each variable vertex xi, there are three edges: to 4 and 4, and to the next variable vertex x1÷1 (for x7, the next vertex is v )). • Each duplicate vertex 14 has three edges: to uf, to 4, and to ys. Similarly, each vertex of has three edges: to 14.1 (4, has edge to zs instead), to xi', and to ys. • Finally, we have the EBT with x7 (for a E {t, f}) as root, L? as leaf vertices and y1 as the extension vertex. For each vertex in L7, for a E {t, f }, we add edges to the corresponding clause-literal vertex and to 14. This ensures that every internal vertex of the binary tree has degree three, and leaf vertices have degree two. EFTA01139464 The formal description is as follows: {Or ern U Fred Edgets , , ily) U FredEdgetro, al , v,S) {(ci, 4) I 1 tn, 1 ≤ j U {(ci, up I 1 i U {(ci, ci+i)Il≤i≤m— U {(c,„, xi)} 1.) {(xi, 4),(x; x!) I 1 ≤ U xi-Er) I 1 n — 1} U {(xn, von U {04,4) I 1 ≤ {(4,14.") I 1 ≤ i ≤ n — 1} U {(u,f„ as)} U {OLT, xn, (LT, Y.L ) I 1 n, a E {t, f}} (t_h<i<n(ExBinTr(xli , ys ) U ExBinTr(xf U {(4,4), g, ut) I4 E L7',1 ≤ tha E {t, f}} Example. We will now give an example of the graph G(co) for so = (2 V y V x) A (z V x V2). See Figure 3. The edges to rei are dashed and the edge from rre for all 1 ≤ i ≤ 3 and a E {t, f } are dotted, for readability. Also, the vertex at is included twice to make it clearer that it is in a predecessor gadget. Basic facts. We first mention some basic facts about the evolutionary graph obtained. I. First, observe that the predecessor gadget property implies that for fixation the vertex tb must become a mutant before vertex uI; and vertex z1 before vertex ys . 2. Second, for a vertex with degree t, it can reproduce a mutant as long as at most I' • (1 — 8) successors are mutants. In particular, for vertices with five (resp., three) successors, like the clause (resp., variable) vertices, it can reproduce a mutant until at most three (resp., two) successors are mutants, because of the bounds on em. If a vertex has out-degree two (or one), then it can reproduce a mutant until at most one successor is a mutant, because of the bounds on Om. The conditions follow from the density constraint with threshold z— b. Two phases for fixation. For mutants to attain fixation (i.e., all vertices become mutants), certain conditions must be fulfilled. The first basic fact above implies that for the evolutionary process to attain fixation, it must make vertex x„ a mutant (then vertex vo a mutant) before vertex t4. We thus split the process of fixation in two phases: in the first phase al is not a mutant, and in the second phase rel will be a mutant. We further split the first phase into two sub-phases, the first sub-phase is related to clause vertices becoming mutants, and the next sub-phase is related to the variable vertices becoming mutants. The description of the phases for fixation are as follows: I. (Phase I:Part A). The mutant must be initialized at the start vertex vT (since vT has no predecessor). After yr, the clause vertex c1 becomes a mutant. Since at most half (three) successors can become mutant from c1 (recall that c1 has five successors), and one of them must be c2 (as the only incoming edge for c2 is from c1), it follows that c2 and at most two clause-literal vertices for clause C1 becomes mutant from c1. This process is then repeated for all the clause vertices c, till x1 becomes a mutant. 2. (Phase I:Part B). Each of the vertices xi has three successors, and hence can make two of them mutants. One of them must be x,+i (as 21+1 has only xi as the predecessor), and the other one is at most one of x! or xf . This continues till we reach vo. Note that once xt, becomes a mutant, then the entire EBT under 4, including the corresponding clause-literal vertices, but not y1 and ut, can become mutants, as long as yl and al are not mutants. The reasoning is as follows: the leaf vertices has two out-going edges, and since ul is not a mutant, it can reproduce a mutant to the corresponding clause-literal vertices, and the rest follows from the QEBT property. The phase 1 ends with the predecessor gadget of ( vo, up becoming mutants. Note that this phase corresponds to a partial assignment of truth values to the variables as follows: for a variable x„ if x: was chosen (made mutant), it corresponds to assigning true to x,; if x; was chosen, it corresponds to assigning false to x,; otherwise, if neither was chosen, then it corresponds to no assignment to x, (if fixation is reached without having made an assignment to some set U of variables, then any possible assignment of values to the variables of U will make the partial assignment a satisfying assignment). 3. (Phase 2). This phase starts after al is a mutant. We establish a key property of this phase that will be used in the proof. Consider the EBT under some variable-value vertex. Each leaf vertex of the tree has out-degree two: EFTA01139465 one of the successors is t4 and the other is a clause-literal vertex. It follows that once 211 has become a mutant, then the leaf vertices cannot reproduce any more. Thus the key property of Phase 2 is as follows: leaf vertices of EBTs cannot reproduce mutants to clause-literal vertices after Phase 2 starts. The graph G(9) has positive fixation probability iff is satisfiable. We present two directions of the proof. I. Satisfiablity implies positive fixation. Consider a satisfying assignment to co, and intuitively the assignment chooses at least one literal in each clause. The sequence of mutants reproduced in the two phases for fixation is as follows: • (Phase I). The sequence in Phase I is the following: (I) initial vertex in- becomes a mutant which then reproduces a mutant to ci ; (2) in vertex ci, it reproduces upto three mutants, one to ci÷i (to xi for i = in) and upto two mutants for vertices of the clauses which are not chosen by the satisfying assignment (this corresponds to Phase 1:Part A); (3) for a vertex xi it reproduces two mutants, one to xi÷i (to vo for i = n), and the other to xt, (resp., 4) if the assignment chooses xi to be true (resp., false); and moreover, the entire EBT under x! (resp., 4) including the clause-literal vertices become mutants (other than e4 and yi ); and (4) then viS becomes a mutant and thenui becomes a mutant from vo, and proceed to Phase 2. • (Phase 2). The sequence in Phase 2 is the following: (1) In every vertex u7 (for a = t or f) it makes x7 mutant (if it is not already a mutant) and then it makes the next vertex in line a mutant (if i = n and a = f, then the next vertex is as, otherwise, the next vertex is 24 if a = t and u:+1 if a = f); moreover, once x7 becomes a mutant, so does the entire binary tree (other than yi ) under it (but not the clause-literal vertices since ui is a mutant); and (2) finally the (z1 , yi ) predecessor gadget becomes mutants. The claim follows. 2. No satisfying assignment implies no fixation. Note that for fixation we need the two phases. In every clause c, at least one of the clause-literal vertices was not made a mutant by c, in Phase l:Part A (or even after that). This implies that if Phase 2 has started and not all clause-literal vertices c; of a clause r, have become mutants, then at least one of these vertices cannot become a mutant, by the key property of Phase 2. For each (partial) assignment that is not satisfying, there exists at least one clause, in which no literals are chosen. Recall that the reproduction of mutants in Phase I:Part B gives a partial assignment of truth values to variables. Hence, in the process of reproducing mutants in Phase I:Part B, there must remain a clause where at most two clause-literal vertices are mutants. Therefore it implies that if there is no satisfying assignment, then fixation is not possible. We obtain the following result. Lemma 2 and Lemma 3 give Theorem 4. Lemma 3. The qualitative decision questionfor no resident reproduction in the IEQR model with LBF is NP-hard. Theorem 4. The qualitative decision question for no resident reproduction in both the general l&R model and the IEQR model with LBF is NP-complete. 4 Quantitative Approximation: No Resident Reproduction with LBF In this section we show that in the no resident reproduction model with LBF the following assertions hold: (i) the precise fixation probability can be computed in #P (for the general I&R model); and (ii) for c > 0, the problem of approximating the fixation probability within an additive error of c is #P-hard (even in the IEQR model). Again in our lower bound we will consider a special case of LBF where we have constant fitness with density constraint. 4.1 Upper bound Consider a sequence of reproductions (( v„ u,)), that leads to fixation, where vi is the first vertex to be a mutant. In other words, for a given i, the i-th reproduction consists of vi reproducing a mutant to ui. The sequence ((vi, EFTA01139466 Clause vertices • Clause- iteral vertices Variable vertices 8 ...... I ........ ................ 913A 3111CA-3i .... .. ........ v .. u2 . . ....... ...... ........ ......... ......... . . . ...... .................. X: Predecessor gadget (zi , yj ) u i< Predecessor gadget (v2, tit ) Figure 3: The graph G(<p) for co = (2 V y V x) A (z V x V ±). Edges to ul are dashed and edges from et,' are dotted for readability. The vertex t4 is included twice to make it clearer that it is in a predecessor gadget. The notation c? = y indicates that the second variable of the first clause is variable y. The notation x1 = x indicates that he first variable is variable x. EFTA01139467 defines the sequence of probabilities (p,), such that p, is the probability for vertex v, to reproduce to u1 in the graph where (vi , ut, is the set of mutants. Let d be the least common multiplier of the denominators of (i) -k, where N is the number of vertices (this represents the probability that a given vertex is the start vertex), and (ii) p, for each i in each sequence (p1)1 defined from a sequence of reproductions leading to fixation. We will next argue that d is at most exponential. Consider some payoff matrix R M R r0 0 \ (1) M b where a and b are integers given in binary. The denominator for selecting a certain mutant to reproduce is the sum of the fitness of the mutants that can reproduce (SFMR4). The SFMR is a number on the form a•i-Fb•j where i (resp., j) is the number of edges (u, v) E Eh where u is a mutant that can reproduce and v is a resident (resp., mutant). Note that the description of i and j shows that they are non•negative numbers such that i + j ≤ N2. Hence, there are at most k < N4 different SFMRs. A mutant can only reproduce if it has positive fitness and thus each SFMR is positive. Also, each SFMR have value at most S ≤ lal • N2 + • N2. Therefore, d ≤ N • Sk, which is at most exponential (the N factor is to select the start mutant). For a fixed sequence of reproductions ((v„ u,)),, let p =DIP. Observe that p is the probability that the first mutant is v, and the reproductions happen exactly according to ((v,,u1)),. Also, note that t, is an integer. We consider the following probability counting problem: Pick a sequence of reproductions ((v„ u,)), leading to fixation, and an integer c between I and -di. This is a candidate solution, and it is easy to check in polynomial time if it is indeed one. Thus counting the number of solutions is in #P. Observe that a fixed vertex v and sequence of reproductions ((v1, u,)), is used in exactly * solutions. Hence, by multiplying the number of solutions of the probability counting problem with d" we get the probability that the original system fixate for the mutants. We get the following result: Lemma 5. The fixation probability computation for no resident reproduction in both the general l&R model and the IEQR model with LBF is in #P. Remark 6. Note that the quantitative approximation problem is not defined as a decision problem. For #P upper bound above, for the approximate (as well as exact) fixation probability we mean that given the number of solutions to an #P problem we can compute in polynomial time the exactfixation probability. 4.2 Lower bound The remainder of this section will be on our lower bound, where we reduce the problem of counting perfect matchings in bipartite graphs to approximating the probability that mutants fixates. Like in the previous section, we will consider constant fitness with density constraints, and the threshold Om will be q — 43, for any 0 < b < k ) (because the degree is again bounded by 5). Moreover our lower bound will be for the IEQR model. Perfect matching in bipartite graphs. We present a reduction from the computation of the number of perfect match• ings in a bipartite graph G = (V,E). In a bipartite graph G, the vertex set V is partitioned into vertices Vt (left vertices) and Vr (right vertices) and all edges go from a vertex in Vt to a vertex in Vr (Le., E C Vt x Vr ). We also have lilt' = I Vr I = n. A perfect matching PM is a set {ei , e2, e„ } of n edges from E such that for every vertex vt E Vt (resp., v,. E Vr) there exists an edge et = (vt,v;.) (resp., er = (4, v,.)) in PM. Given a bipartite graph, the problem of computing the number of distinct perfect matchings was shown by Valiant 1161 to be #P.complete. Uniform degree property. First, we will show that we only need to consider bipartite graphs for which there exists an integer k such that all vertices in Vt have either degree 2k or I. We refer to the property as the uniform degree property. 4S stands sum. F for fitness. M for mutant R for reproduce EFTA01139468 Reduction to uniform-degree graphs. We present a reduction from counting the number of perfect matchings in a general bipartite graph G = (V, E) (with I VtI = I Vr I = n) to counting the number of perfect matchings in a bipartite graph G' = (V', E') with at most 6n vertices and which has the uniform degree property. Let k = [log 4.1, where dm,„ is the maximum degree of any vertex in G. The graph G' will have precisely as many perfect matchings as G. Observe that 2k < 2n. We construct G' by adding 2k new pairs of vertices, one on each side, and for each new pair (v, vi), we add an edge from v to Then, for vertex v E Vt, we add edges from v to some newly added vertex in 11,f until v has degree 2k. It is clear that any perfect matching in C corresponds to a perfect matching in G' using the same edges, and the edges between newly added pairs. Conversely, we also see that in each perfect matching in G', for each newly added pair (v, v'), the matching must use the edge between v and since the vertex in (Vg! Vt) has degree I. Thus every perfect matching in C' corresponds to one in G. Perfect binary trees. We will consider perfect binary trees as gadgets. • A perfect binary tree (PBT) is a balanced binary tree (every internal vertex has exactly two children) with all leaves at the same level (i.e. with 2k leaf vertices, for some non-negative integer k). For a PBT we will use the following property, which we refer to as the pmbabilistic PBT (PPBT) property: if the root becomes a mutant, then eventually all vertices in a path from the root to some leaf will become mutants, where such a path is chosen uniformly at random. Since every non-leaf vertex has out-degree two, due to the density constraint, each internal vertex can make one of its children (chosen uniformly) a mutant and hence the PPBT property follows. The graph Red(G). Given a bipartite graph G with the uniform degree property, let the vertex sets be Vt and respectively. Let N(v) = iv I (v, u) E E} denote the successors of a vertex v E Vt. Let VP = iv E Vt I IN(v)I = 29 be the set of vertices with degree 2k; and 12 = Vt 1 Vek be the set of vertices in Vt with degree I. Our reduction, denoted Red(G), will construct an evolutionary graph (with Ef = ER and hence we only specify one set of edges), which consists of three parts: part I sub-graph, then edges related to V., and a copy of part I with some additional edges. We first describe the part I sub-graph and then its copy. • (Pan 1). We have a start vertex v,, a final vertex vs , and we create an EBT B, as follows: ExBinTr(v,, Vt, i.e., the start vertex is the root, Vt is the set of leaf vertices, and yi is the extension vertex. For every vertex v E VI', let N(v) = itst, u2,..., ta and we consider the set Liu' = , un of j = 2k vertices and construct a PBT Pt, = BinTr(v, L!). Note that B, is an EBT, but the underlying binary tree is not necessarily perfect. • (Edges related laic). From every vertex v E VI, and every est in 14, we add two edges: one to ul E N(v) and one to From every vertex v E Vet (with degree I), we add two edges: to the unique rt E N(v) and to Every vertex in V, has an edge to ys . • (Copy I of Pan 1 with additional edges). First, we create a copy of the part of the graph described in part 1, along with one additional vertex z1 . For every vertex v of part I, let the corresponding vertex in the copy be called v, and the copy of the extension vertex is pa. We describe the difference in the copy as compared to the graph of part I: (i) first there is an edge from y1 to the copy 79, of the start vertex; (ii) for every vertex z which is a copy of a non-leaf vertex z in for some v E Vic, (i.e., z sf L„k ), there are three additional edges from (a) to z (i.e., from the copy to the original vertex), (b) to y1 , and (c) to z1 ; and (iii) for every vertex z which is a copy of a leaf vertex z in P„, for some v E Vic, (i.e., z E Lb, there is only one edge which goes to z (i.e., there is no edge to V, or y1 , but an edge from the copy to the original vertex). Hence in the copy of Ps, for any v, internal vertices have degree five, and leaf vertices have degree I. • Finally, we have the following edges: {(ys , Vs ), zi )}. We denote by at the number of vertices in Red(G), and note that ire = 0(m), where in is the number of edges in G. Example. We consider the graph G with six vertices, where Vt = v2, vs} and V, = {v4, v5, vg}, such that v1 and v2 each have edges to v4 and v5 and v3 has an edge to vs. See Figure 4 for an illustration. Observe that G satisfies the uniform degree property. In Figure 5 we have part I of the graph Red(G) along with V7. In Figure 6 we have the remainder of Red(G). Consider some fixed perfect matching PM in G, i.e. v1 —> v4 and v2 -, v5 and vs —) v6. The EFTA01139469 graph Red(G)PM is then the same graph as in Figure 5 and Figure 6, except that in Figure 5 it does not contain the edges from or4 q. The process of fixation in Red(G). The process of fixation in Red(G) can be decomposed in two phases. The first phase (Phase I) is over when yj. becomes a mutant; and the second phase (Phase 2) is over with the fixation. A key property of Phase 2 is as follows: vertices in V,. cannot become a mutant after yj. has become a mutant: This is because for each vertex is in Vr, every predecessor v of u has exactly two successors, and one them is y1 (and hence the density constraint with threshold i — 8 ensures that if yj. is a mutant, then vertices in Vr cannot become mutants after that). • Phase 1. In Phase 1. the vertex v, must be the first vertex to become a mutant (since it has no predecessor). After vt , all vertices in Bs turn into mutants (by the QEBT property). Once a vertex v E Vek becomes a mutant, then a path in the PBT P, under v is chosen uniformly at random to become mutants (by the PPBT property), and then the leaf of the path can make the corresponding vertex in V,. a mutant. Once a vertex v in V/ with degree I becomes a mutant, then it can reproduce a mutant to the unique neighbor in V,.. In the end, some vertex in V,. reproduces a mutant to yj. and Phase 1 ends. • In Phase 2. first the copy rp, becomes a mutant from yj.. After all vertices which are copy of vertices in Bs become mutants (again by the QEBT property). Once copies of vertices in V, become mutant, then the tree underneath them in the copy become mutants. Consider a vertex 11 which is a copy of a vertex it E P„, for some v E Vek, and there are two cases: (i) if 21 is a non-leaf vertex, then to has degree five, and can reproduce mutants until the two children in the tree and the original vertex 2/ are mutants (note if v, or z1 is a mutant, then both the children and the original copy cannot all become mutants due to the density constraint); (ii) if is is a leaf-vertex, then ff has degree one, and can reproduce mutant for is. Finally, yj. makes y1 a mutant, which then makes z1 a mutant. Fixation and a perfect matching. Observe that fixation implies that all vertices in V. have become mutant, and no vertex in V, can become a mutant in the second phase. Each vertex in lig is responsible for making at most one neighbor in It,. a mutant (for vertices with degree I it is the unique successor in V,, and for vertices with degree 2k, it corresponds to the leaf of the path in the perfect binary tree chosen uniformly at random by the PPBT property). This defines a perfect matching. Conversely, given a perfect matching, Phase I and Phase 2 of fixation can be described using the pairs of the matching (to chose paths uniformly at random in the perfect binary trees). Thus given fixation, it defines a perfect matching, and we say that fixation has used the perfect matching. Exact fixation probability. Consider some perfect matching PM. Observe that if there are s > 0 perfect matchings, then the exact fixation probability is s xpm, where xpm is the probability that we have fixation and used PM. This is because each perfect matching has the same probability to be the chosen matching in Phase 1 by the PPBT property. In Phase 2. any vertex v which is either a vertex in Ve' or a leaf in P,,, for v E Vic, cannot reproduce by the key property of Phase 2 (and thus can be viewed as having no out-going edges). Thus in Phase 2, by symmetry, the probability Xp0,4 of fixation for a perfect matching PM is independent of PM. Bounds on x and s. We show that the probability x for fixation of a fixed matching is at least q = where al is the number of vertices in Red(G). Each possible way that all vertices can become mutants happens with probability at leasta-2", because there are at most ii reproductions (effective reproductions which produce a new mutant) and each specific reproduction chooses two vertices v and v' at random from some set of vertices and thus, a specific choice happens with probability at least fi -2 . Thus the lower bound n on x follows. Finally, observe that the number 12 of perfect matchings can be at most rt! (i.e., upper bound on .s is n!). The graph Red(G)'M. Given a perfect matching PM. we can find x as the fixation probability for the graph Red(G)'M, which is similar to Red(G), except that each leaf vertex 1st in P„, for v E Vek, if (v, ui) is not in the matching, then we remove all out-edges from ut, and otherwise ist has the same edges as in Red(G). It is clear that the fixation probability in Red(G)'M is x. Approximating the fixation probability is #P-hard. Our reduction is as follows: Given a graph G with the uniform degree property, we want to find the number of perfect matchings .s in it. First, (i) we find an arbitrary perfect matching PM in polynomial time using the algorithm of [61 (if there exists none, we are done); (ii) construct Red(G) EFTA01139470 and Red(G)F94 in polynomial time; and (iii) compute the approximation y' of the fixation probability y* in Red(G) for e = 316- , and the approximation x' of the fixation probability x in Red(G)PM for epm = ni• —9— 16 = We now show how to obtain s from y' and x'. We have that y' is such that x•s e < (il + er,m) • s e= •s+ —< • s -I- ??! • 16 16 — 8 and similarly y' ≥ xi • 13 - This shows that s— 7/ < <st —21 8x' x' 82" . Since we also have x' ≥ x — e =17 — we see that < 1/3 and thus s is the integer closest to 14. Lemma 7. The quantitative approximation problemfor 0 < e < 1, with e given in binary, for no resident reproduction in both the general I &R model and the IEQR model with LBF is #P -hard. Lemma 5 and Lemma 7 gives the following result (also recall Remark 6). Theorem 8. The quantitative approximation problem, where the approximation number 0 < e < 1 is given in binary, for no resident reproduction in both the general I&R model and the IEQR model with LBF is #P-complete (and even the exact fixation probability can be computed in # P). EFTA01139471 Figure 4: The graph G. Pan I The EBT The PBT Pt, is Figure 5: Part 1 and the edges related to V,. of the graph Red(G). The edges to yi are dotted for readability. EFTA01139472 Copy of part I The copy of B, The copy of P,, The copy of P., Part I I V •1 Figure 6: The copy of part I of the graph Red(G). Most edges to -gi and to z1 are dotted for readability. 5 Qualitative Analysis and Quantitative Approximation: I&R Model with Resident Reproduction and LBF In this section we will establish the polynomial space upper bound and lower bound in the I&R model with resident reproduction, when the fitness function is LBF. 5.1 Upper bound Our algorithms is based on an exponential Markov chain construction. We first describe what is a Markov chain and Markov chains associated with an evolutionary graph. Markov chain. A Markov chain M = (S, A) consists of a finite set S of states, and a probabilistic transition function A that assigns transition probabilities A(s, s') for all s, E 5, i.e., 0 ≤ A(s, ≤ 1 for all s, s' E S and for all E S we have a •Es A(s, s') .1. Given a Markov chain, its graph (5, E) consists of the set S as the set of vertices, and E = {(s, s') I A(s, s') > 0) positive transition probabilities as the set of edges. EFTA01139473 Exponential Markov chain. Given an evolutionary graph G = (V, El , ER), with a payoff matrix, and the LBF function, an exponential Markov chain ME = (.9,A) is constructed as follows: (I) consists of subsets of V which denotes the set of vertices of V which are currently mutants; (2) for s E S and s' E there is positive transition probability if the cardinality of 13 and s' differ by 1 and the transition probability A(s, s') is computed depending on the payoff matrix, El, and ER. Observe that for the Markov chain Aft, the transition probabilities of a state in the Markov chain can be constructed in polynomial space. and hence the Markov chain can be constructed in polynomial (working) space. Qualitative analysis and approximation of Markov chains. We sketch the arguments for the upper bounds. • The qualitative analysis is achieved by simply checking if in the graph of the Markov chain the state sh = V is reachable from some state s = {v} for v E V. Since the reachability problem can be solved in non-deterministic logspace [141, applying such a reachability algorithm on the Markov chain ME (constructed on the fly) we get a non-deterministic polynomial space algorithm. Since by Savitch's theorem [151 non-deterministic polynomial space is equivalent to deterministic polynomial space, it follows that the qualitative question is in PSPACE. • For the approximation problem we simulate the Markov chain as follows. We start at an initial state uniformly at random among those where there is exactly one mutant. Consider a trial run of the Markov chain as follows. Given the current state, we first check if (i) the current state is V; else we check (ii) if there is a path from the current state to s! = V. If (i) is true we have a success; and if (ii) is false we have a failure. If we neither succeed or fail, we use the transition probability of the Markov chain to obtain the next state till we succeed or fail. Note that each trial run succeeds or fails eventually with probability I. We can view the outcome of each trial run as the outcome of a Bernoulli distributed random variable with success probability equal to the fixation probability. Hence repeating the trial runs independently an exponential number of times, we can approximate the fixation probability using Chemoff bounds, within any given e 7 0, with double-exponential small error probability. Then using the result of [101 we get a polynomial space upper bound for the quantitative approximation problem. Lemma 9. For the general I &R model with resident reproduction and LBF, the following assertions hold: (1) The qualitative decision problem is in PSPACE; and (2) the quantitative approximation problem can be solved in polyno• mial space. Remark 10. Observe that since precise probabilities to reach a state in a Markov chain can be computed in polynomial time in the size of the Markov chain f71, itfollows that the precisefixation probabilities can be computed in exponential time. 5.2 Lower bound In this section we present two lower bounds: (i) the qualitative decision question is PSPACE-hard; and (ii) the question that given an evolutionary graph with the promise that the fixation probability is close to either 0 or I, deciding which is the case is PSPACE-hard (which implies PSPACE-hardness for the quantitative approximation problem). For simplicity, we present our lower bounds in two steps. We will first reduce the problem to a problem which we call concurrent-if and then show that the concurrent-if problem is PSPACE-complete. Concurrent-if problem. The intuitive description of the concurrent-if problem is as follows: it consists of a set of Boolean variables, and a set of if statements where each conditional is a conjunction of some of the Boolean variables or their negation, and if the conditional is true, then a Boolean variable is set to a truth value. At each step, any of the if-statements can be executed. The process ends either when the first Boolean variable is true or nothing can change (i.e., the conditional of all if-statements are false). Note that the execution can loop, and perhaps run forever. We first define an if-statement. If-statement. Let B = {61,62, ...,b„} be a set of it. Boolean variables. An if-statement a is as follows: A(cn1, cnk) := val, EFTA01139474 V 1 • Figure 7: Boolean-value gadget: Dashed edges are in Et and non-dashed are in Eq. where 1 ≤ a, k ≤ nr, val is either true or false, each cnj is either a Boolean variable be or its negation Ige. An if- statement is satisfied if each of the cnj is true (i.e., cnj is true if one of the following holds: if crt, is be and be is true, or cnj is bi and bt is false). Concurrent-if system. A concurrent-if system consists of a set B , b2 b„} of n Boolean variables and a set P = {at, 82, s„,} of rn if-statements over the Boolean variables in B. The set of statements defines an execution from an initial setting of the Boolean variables as follows: repeatedly, a satisfied if-statement Mcn , cnk) := val is selected and then bi is set to val. If the first Boolean variable Eh is eventually true, then the execution is accepted. If at each point of an execution there is at most one satisfied if-statement, then we say that the execution is deterministic. The decision problem. Given a concurrent-if system, the associated decision problem is as follows: Given a set B of Boolean variables, an initial setting of the variables of B, and a set P of concurrent if-statements, such that the execution e from the initial setting is deterministic, whether e is accepting. 5.2.1 Reduction of the concurrent-if problem to evolutionary games on graphs We first describe how we encode the Boolean variables and the if-statements of a concurrent-if system in evolutionary games on graphs. Later we show how to construct a graph such that if the fixation of the mutant happens, then the fixation happens according to three phases which are as follows. I. First phase: The setup phase (to initialize the Boolean variables). 2. Second phase: The execution of the concurrent-if system. 3. Third phase: The fixation phase. The fixation can only happen if the execution of the concurrent-if system accepts. Density constraint Again our lower bound result will be for a special case of LBF, where we have constant fitness with density constraints (recall Remark l). Our construction will be for Oft = Bit = 0, but a similar construction will work for any choice of OR,Oni E [0, 1). The thresholds Oft = b yt = 0 indicates that a vertex v can reproduce precisely as long as all its successors in El are of the opposite type of v, because of the density constraint. Ideas and gadgets behind the reduction. We first introduce some key ideas and gadgets behind the reduction. • States which are nearly always a mutant/resident: Similar to the previous lower bounds, we have a vertex vs without any predecessor in Eq. Thus, if vs is not made a mutant at the start, then it cannot become a mutant. Hence we will only consider the case when vs is a mutant in the beginning and stays a mutant forever. We will also have a vertex and our construction will ensure that it stays a resident until all other vertices are mutants and then (after a few more steps) all vertices become mutants, and we get fixation. We will use the vertices vs and it s to ensure that a given vertex has a desired type, and otherwise the vertex cannot reproduce. Our construction will ensure (using the density constraint) the following properties: — A vertex v with 6, as a successor under El can only reproduce if it is a mutant (using the density constraint and re, is a resident). Similarly, a vertex v with vs as a successor under Ef can only reproduce if it is a resident. EFTA01139475 • Boolean-value gadgets: We describe how to implement boolean-value gadgets in evolutionary graphs for the Boolean variables of the concurrent-if system. Each boolean-value gadget j consist of four vertices vt, (the true-value-vertex) t4 (thefalse-value-vertex), iris (the true-setter-vertex) and t4, (the false-setter-vertex). In the second phase (the execution of the concurrent-if system phase) each boolean-value gadget is such that the two setters, vt, and t4,, are mutants. Also, at most one of the value vertices vio and v:;„ can be a mutant at any given point. If 4.is a mutant, then the value of j is true. If viv is a mutant, then the value of j isfalse. If neither is a mutant, then we say that j has no value. The edge set is as follows: (i) both vt, and vis have vs, , t4 as successors under El ; (ii) vis (resp., q v) has only via, (resp., vild as a successor under Eft (see Figure 7). The purpose of the edges in E1 are as follows: the edge to vs enforces that the setter vertex is a mutant before reproduction; and the other two edges enforce that only if the gadget has no value (i.e., both value vertices are resident), then the setter vertex can reproduce a mutant (by the density constraint and that Oj = B,w = 0). Observe that when the gadget has no value, then each of the setter vertices can set the value of the gadget to either true or false with positive probability in any such step. • If-statement gadgets: Each if-statement gadget, for the if-statement A(crit , cnk) b, := val, is imple- mented using a single vertex v (the if-statement-vertex). The if-statement gadget works under the requirement that v is a resident, and our construction will ensure that in the second phase (the execution of the concurrent-if system phase) each if-statement-vertex v is a resident. The edge set is as follows: I. The vertex v has the following edges in Ei: an edge to v,; and for each Boolean variable j in (cn t, cnk) an edge to tp,„, and for each negation of a Boolean variable j' in (cni , cnk) an edge to vilvt. 2. The vertex v has v.},, (resp., d„) as successor under Eft if val is true (resp., false). The purpose of the edges in Et are as follows: the edge to v, enforces that the if-statement-vertex is a resident before v reproduces; the other edges enforces that each literal in (cn , cnk) has the correct value before reproduction. Consider the case where val is true (the case where it is false is similar). If v can reproduce at a given point in time, then Mcn cnk) must be true. In that case, if the boolean-value-gadget for b, has value false, then v reproduces to set bi to no value. This then allows the setter-vertices of bi to reproduce, and set b, eventually to a value again. Observe that even though v tries to set b, to true, the value of b, might not be set to true immediately. The process is as follows: v tries to set b, to true by ensuring that if it is false, then it sets it to no value, and ensures that the true-setter vertex has positive probability to set it to true. Hence eventually with probability I it is set to true. Note that given A(cn cn k ) is still true, v can simply reproduce until bi becomes true. Since there is a fixed positive probability that the setter-vertices will set b, to either value, eventually b, becomes true with probability I. We will only use the boolean-value gadgets for deterministic executions and thus, the condition A(cn , cnk ) remains true until bi becomes true. This is because the execution is deterministic and thus, no other if-statement is satisfied in the current situation as long as b, is false or has no value. Especially, for the next if-statement to be satisfied it must depend on b, being true. We now describe how to construct the graph such that fixation implies the existence of an accepting execution of the concurrent-if system. First we will describe the vertices that reproduces in the setup phase, then how to use the boolean-value gadgets and the if-statement gadgets to encode some execution of the concurrent-if system, and finally how to ensure fixation if the execution is accepted. The first phase: Setup. First, as mentioned, we consider the case where vs becomes a mutant at the start (as otherwise fixation does not happen). The setup phase is split into two parts. The first part ensures that each boolean-value-gadget of the concurrent-if system has the right initial value, and the second part ensures that all setter-vertices of the boolean- value-gadgets are mutants. Each part corresponds to a boolean-value-gadget, 81, 92. respectively, and starts when the false-setter vertex, v1, v2, respectively, of the corresponding gadget becomes a mutant, and ends when the gadget has true value. The vertex v, has only 11, as successor in E1 (ensuring that it is a mutant when it reproduces) and v1 as a successor in Eft. Hence, eventually v1 becomes a mutant and this starts the first part of the setup phase. • The start of the first part of setup: The vertex v1 has onlyu, as successor in Et (ensuring that it is a mutant when it reproduces), but has the following successors under Eft: The setter vertices of al and the value vertices EFTA01139476 corresponding to the initial value of each Boolean variable of the concurrent-if system. Hence, eventually all vertices which are successor of vl in Eft become mutants as well. • The end of the first part ofsetup: There is an if-statement vertex 91, (which, since it has not become a mutant yet, must be a resident), who has v1 and all states which are successors of vi in Eft as successors under Er and v1 as the lone successor in Eft. This vertex then can first reproduce once v1 is done reproducing and then eventually sets st to true. This completes the first part. • Between first and second pan: The true-value-vertex vk, of 81 has only 9, as successor in El ; and vt, v2, and 91 as successor in Eft. Hence, after 81 has become true, each of the vertices vt, v2 and 91 becomes mutants. • The start of the second part of setup: The vertex v2 has only 9, as successor in Ef (ensuring that it is a mutant when it reproduces), but each setter vertex of .s2 and the boolean-value gadgets used in the concurrent-if system as successors in Eft. Hence, eventually, every successor of v2 under Eft becomes a mutant. • The end of the second pan ofsetup: Similarly to the end of the first part of setup, there is an if-statement vertex 92, (which, since it has not become a mutant yet, must be a resident), who has v2 and all states which are successors of v2 in ER as successors under Et; and v2 as the lone successor in Eft. This vertex then can first reproduce once v2 is done reproducing and then eventually sets .92 to true. This completes the second part of setup. • The end ofsetup: The true-value-vertex of s2 has Q. as successor in Et (ensuring that it can only reproduce if it is a mutant) and v2 and v2 as successors in Eft. They then eventually become mutants. The second phase: Execution of the concurrent-if system. We extend the construction of the graph for the concurrent-if system slightly as follows: Each if-statement-vertex v has some additional successors in El , besides the ones described in construction: The vertex v, (ensuring that v is a resident before reproduction), the true-value- vertex of 82 (ensuring that the setup phase is complete before v reproduces), and the false-value-vertex of the boolean- value-gadget for the first Boolean variable bi (ensuring that the second phase is not over). Hence, this ensures that the if-statement vertices are only active in the second phase. Clearly, if the execution is accepting, then the Boolean variable bi is eventually true. The third phase: Fixation. The fixation part uses two special vertices 4 and (that have not been used before). If the Boolean variable bi is true, then fixation will be achieved in the following steps as follows: • (step I): the true-value vertex of the boolean-value-gadget for bi will reproduce to set to a mutant; • (step 2): then 4 reproduces to turn all vertices (other than v, and 9,), including 4, into mutants; • (step 3): after this step, 4 again becomes a resident (but other than 9, and 4, all vertices are mutants); • (step 4:) finally, v! makes 9, a mutant, and at the end the true-value-vertex for to, again makes 4 a mutant. We now describe the above steps. • The true-value-vertex ti = it of to, has no successor in El, and 4 as a successor in Eft. Hence once v' is a mutant, it reproduces to turn t4 into a mutant (step I). • The vertex has 9, as successor in El (hence can only reproduce if it is a mutant); and each vertex (including 4) other than vertex v, (since no vertex should have v, as a successor in the replacement graph) and vertex 9, as successors in ER. All the successors of 4 in Eft becomes mutants (observe that the if-statement vertices might try to make the value-vertices of the boolean-value gadgets into residents, but sooner or later 4 will have made all the if-statements vertices into mutants). This is step 2 above. • After this only v, is a resident. The vertex v, has all other states as successors in Er (ensuring that it can only reproduce at this point), and vertex 4 as a successor in ER. Thus now 9, turns 14 into a resident. Note that at this point other than and 9, all vertices are mutants. This is step 3 above. EFTA01139477 • The vertex v: has vet as successor in El and re, as successor in Eft. Notice that when v! has been made a resident by v, both v' and v: can reproduce. Whenever v' does so, we are back to the situation before u, reproduced, which then reproduces again. Hence, sooner or later ve2 reproduces making ii, a mutant before v' and then v' reproduce after that making v! a mutant. At this point all vertices are mutants and fixation is achieved. This is step 4 above. Hence it follows that if the first mutant is vs, then (i) if the execution is accepting, then fixation happens with proba- bility I, and (ii) if the execution is not accepting, then the fixation probability is 0. Also note that if the first mutant is not v,, then the fixation probability 0 because no vertex has v, as successor in Eft. Lemma 11. Given a concurrent-ifsystem that is deterministic we can construct in polynomial time an evolutionary game graph in the I&R model with resident reproduction and LBF such that (i) if the concurrent-if system accepts, then thefixationprobability is positive; and (ii) if the concurrent-ifsystem does not accept, then thefixation probability is O. Fixation amplification. In the construction described above, the fixation happens only if vs is the first mutant and the concurrent-if system executes an accepting run. However, the probability that the first mutant is v, is as the first mutant is selected uniformly at random, where n is the number of vertices. We now present a construction that amplifies the fixation probability. Modified construction. Consider a set S of new vertices. Each vertex in S has v, as the only successor in Er and vs as the only successor in ER. The vertex vl has also the vertices of S as successors in Eft. (Also, the vertex v, still has all other vertices as successors.) Correctness argument Observe that if a vertex of S becomes the first mutant, then v, becomes the next mutant, and then fixation happens if and only if the concurrent-if system accepts similar to before. Hence, we get the following statement. • If the execution is accepting, then the fixation probability is at least Rlit (i.e., the probability that any of the vertices in S is the first mutant). • If the execution is not accepting, then the fixation probability is at most "÷n (i.e., the probability that none of the vertices in S is the first mutant). By making S much larger than the remaining graph (e.g., I-9I = n 2 or 151 = na), the fixation probability is close to 1, if the execution is accepting, and close to 0 otherwise. Lemma 12. Given a concurrent-if system that is deterministic we can construct in polynomial time an evolutionary game graph in the l&R model with resident reproduction and LBF such that(i)ifthe concurrent-ftsystem accepts, then the fixation probability is close to I; and (ii) if the concurrent-if system does not accept, then the fixation probability is close to 0. 5.2.2 Concurrent-if can simulate a deterministic space-bounded Turing machine In this section we show that the concurrent-if problem is PSPACE-complete. The PSPACE upper bound. The upper bound is straightforward and the argument is as follows. A PSPACE algorithm uses memory (or tape cells of the Turing machine) to store the Boolean variables, and then repeatedly execute in every round the following steps: (a) check if th is true, and if so then accept; (b) check if more than 2' rounds has been executed (by keeping a counter of n + 1 bits and incrementing at the end of every round), in which case the system must be cycling (by the pigeonhole principle), and we can reject; (c) find a satisfied if-statement (along with checking that there is exactly one, and otherwise reject), and update the Boolean variable according to the if-statement. The PSPACE lower bound. We show that the concurrent-if problem is PSPACE-hard by showing that concurrent-if systems can simulate polynomial space-bounded deterministic Turing machines. Our simulation will be such that (a) if the Turing machine rejects the input or exceeds the space bound, then the execution stops; and (b) if the Turing machine accepts, then the special boolean tot becomes true; and (c) if the Turing machine loops, then the execution EFTA01139478 loops. Also, each step of the Turing machine corresponds to between two and three iterations of the concurrent if- system (three in case when the space should be updated, two otherwise). For the remainder of this section, fix some deterministic Turing machine Al, an input I for Al, and a polynomial bound N on spaces. We will next describe the concurrent-if system simulating Al on /. Notations. Every tape cell i E 10,1, ..., N} of the Turing machine is a bit (i.e., either 0 or l). A configuration of the Turing machine consists of the valuation of every tape cell, the current state of the Turing machine, and the current head position 0 ≤ p ≤ N of the Turing machine. A tape-configuration (v, p, c) consists of the current state v of the Turing machine, the current head position p, and the content c of the tape cell at the current head position. Note that the number of tape-configurations is polynomial given the input. The Boolean variables. To describe the encoding of the Turing machine as the concurrent-if system, we first describe the Boolean variables and the encoding. • Product Turing machine: We will use three copies Ma of the given Turing machine Al and modify it such that each move takes the current state to the next Turing machine (starting over when the third is reached). More precisely, if the current state of M forms the sequence (v1)„ then the current state of Ma forms the sequence ((v„ i mod 3))i. This allows us to achieve the following: given two adjacent states of the sequence to distinguish which is the first. • Tape encoding: We will use a tape-boolean ti for the i-th bit of the tape. We will have such a Boolean variable for i E (0, , N + 1) (note that we have additional Boolean variables 0 and N + 1 so that we can check if the space usage has been exceeded in each direction). • Configuration: We will use a configuration-boolean b(v,p, c) for each possible current tape configuration (v,p,c) of the Turing machine defined as follows — The current state v of the Turing machine. — The current position p of the tape head (i.e., p has a value in (0 N -F 1}, in other words, there are configuration-booleans also corresponding to being just outside the space bound). — The content of the tape c (either I or 0) under the tape head as the Turing machine entered the current state and position. In other words, we have a single boolean for simultaneously being in state v, position p, and the content of the tape head being c as the tape head was moved to the current position. Initialization. Initially, the tape-boolean ti is true iff the i-th bit of the input I is true. Also, the only true configuration- boolean is the one for being in the start state, at the start position, and having ti as the content of the tape. The if-statements. Observe that the current tape-configuration (vi, ), for pt E 11, N) of the Turing ma- chine. and the current content of the tape cells, uniquely defines the next tape-configuration (v2, p2, c2). Simulating a move of the Turing machine is split into three iterations of the concurrent-if system. namely, update space; 2. tape-configuration super-position; and 3. resolve super-position. In the first step, the tape-boolean tp, is possibly updated. In the second step, either b(v2,p2, true) or b(v2,p2, false) is set to true°. In the third step. the configuration-boolean b(vi, P1, c1) is set to false. 5For readers not familiar with computer science. we point out that the problem we consider is similar to the halting problem for Turing machines which is undecidablc, however, here we have the restriction that the Turing machine must operate with a polynomial space bound N. which makes the problem PSPACIi-complete r'Observe that after doing so the configuration-boolcan b(vi pt, et ) is true and at least one of b(v2, p2, true) or b(v2, p2, false) is also true. This represents being in two tape-configurations at once, which we refer as super-position. EFTA01139479 • Update space. If the current tape-configuration (vi, pi , el ) updates the tape cell at position pi from true to false (rap.. from false to true) before moving, then there is an if-statement that checks as the conditional that the configuration-boolean variable b(vi, pt , rn ) is true, all other configuration-booleans are false, and the tape- boolean variable tp, is true (resp., false). If the conditional is true, then as its assignment it sets the tape-boolean variable tp, to false (resp.. true). • Tape-configuration super-position. For the second step there are two if-statements. One (rap., the other) if-statement checks as its conditional that the configuration-boolean variable b(vi, pi , ci) is true, all other configuration-boolean variables are false (i.e., the current configuration is b(vi, c1)), the tape-boolean vari- able tp, has been updated, and the tape-boolean variable tp., is true (resp., false). If this is the case, then it sets b(v2. p2, true) (rap.. b(v2,p2, false)) to true. • Resolve super-position. For the third step there are again two if-statements. One (rap., the other) if-statement checks that the configuration-boolean variables b(v1, pt ,ci) and b(v2, p2, true) (resp., b(v2, p2, false)) are true, and all other configuration-boolean variables are false. That is, intuitively speaking, the current configuration is the super position of (vi, pi, cn ) with either (v2, p2, true) or (v2, p2, false). In that case it sets the configuration- boolean variable b(vi, pr,c1) to false. Besides the above if-statements, we also have some additional if-statements. They are as follows: If the current state of the Turing machine is accepting, then set bi to true (formally: for each accepting state v and each p E {1, , and c E {true, false}, there is an if-statement that checks whether b(v, p, c) is true, and all other configuration-boolean variables are false, and if so, then sets bi to true). Furthermore there are no if-statements for configurations (vi ,pt, CI ) where pi E {0, 1, A/ +1} or where vi is rejecting, ensuring that the execution ends in that case (and it is especially not accepting). Note that the reduction is polynomial time since we use constantly many if-statements for every tape- configuration. Deterministic. We now argue that for the concurrent-if system we obtain in the reduction, the execution from the initial setting is deterministic. The reasoning is as follows: (I) If there are at least three configuration-boolean variables that are true, then no if-statement can be satisfied. (2) We now consider the case that two configuration-boolean variables, (v1. pi, rn ) and (v2, p2, c2), are true. Note that in this case the if-statements for update-space, and tape-configuration super-positions cannot be satisfied. Now we consider if-statements for resolve super-position case, which checks for the truth of two configuration-boolean variables, such that the tape-configurations can be consecutive. Since we consider A13. at most one of the tape-configurations can immediately precede the other. Hence at most one if-statement can be satisfied. (3) Finally, we consider when one configuration-boolean variable is true. In this case, precisely one of an if-statement from update-space or tape-configuration super-position can be true, depending on the content of the current and next position of the tape head. From the above case analysis, it follows that the concurrent-if system is deterministic. Lemma 13. The problem of deciding whether a concurrent-if system, that is deterministic, accepts is PSPACE- complete. Lemma 13 along with Lemma 11 and Lemma 12 gives us the following result. Lemma 14. For the general I&R model with resident reproduction and LSE the following assertions hold: (1) The qualitative decision problem is PSPACE-hard; and (2) given the promise that the fixation probability is close to either 0 or 1, deciding which is the case is PSPACE-hard. The previous lemma and Lemma 9 gives Theorem 15 which summarizes the result of this section. Theorem 15. For the general 18ER model with resident reproduction and LBF, the following assertions hold: (1) The qualitative decision problem is PSPACE-complete; and (2) the quantitative approximation problem can be solved in polynomial space, and even given the promise that the fixation probability is close to either 0 or 1, deciding which is the case is PSPACE-hard (hence the quantitative approximation problem is PSPACE-hard). EFTA01139480 6 Complexity Results for the Exponential Fitness Function ExF In this section we consider the case where the fitness of an individual at a vertex is an exponential function of the payoff, i.e., the fitness of an individual at vertex v is If f(v) =°c13 EuEEIZ P v7I ti) ) and we do not have density constraint. We first present the results, and describe how to obtain them. I. Result I. The qualitative problem can be solved in polynomial time. 2. Result 2. For the no resident reproduction case (i.e., when the fitness of a resident is set to 0), the quantitative approximation problem can be solved in polynomial time. 3. Result 3. For the resident reproduction case, we have the same complexity bounds as in the case where we have the LBF. Result I. Since the fitness is an exponential function and always positive, the fixation probability is positive if the replacement graph is connected, and otherwise the fixation probability is 0. Since whether a graph is connected or not can be checked in polynomial time, it follows that the qualitative problem can be solved in polynomial time. Result 2. In case the resident does not reproduce, and the fitness of the mutant is always positive due to an exponential function of the payoff, then for every vertex a mutant at the vertex reproduces to turn all its successor in the replacement graph as mutants. Hence given a vertex v. if the mutant arises at v, then all vertices that are reachable from v in the replacement graph become mutants with probability I. Given a vertex v, we say that v E All, iff all vertices in the graph are reachable from v in the replacement graph. Then the fixation probability is IAIII/In i.e., the probability that the mutant arises initially in any one of the vertices in All. Since reachability can be computed in polynomial time, it follows that the exact fixation probability (and hence also the approximation) can be computed in polynomial time. Theorem 16. For the general l&R model with the fitnessfunction is exponentialfunction of the payoff the following assertions hold: I. The qualitative problem can be solved in polynomial time. 2. For no resident reproduction, the exactfixation probability (hence also the quantitative approximation problem) can be computed in polynomial time. Result 3. Given Theorem 16 the only remaining problem is to consider the problem of quantitative approximation for the model with resident reproduction. We prove that the complexity results of Section 5 also hold when the fitness is an exponential function of the fitness. The PSPACE upper bound. The same argument for the PSPACE upper bound of Section 5.1 also holds when the entries of the payoff matrix are polynomial in the input, as an exponential size Markov chain can be constructed (the probabilities are obtained using the exponential fitness function), and hence we obtain the PSPACE upper bound. The PSPACE lower bound. The rest of the section is devoted to show how to modify the PSPACE lower bound of Section 5.2 to show that the lower bound is also applicable to the case where the fitness is exponential of the payoff (and there is no density constraint). Note that Remark I shows that density constraints can be encoded in LBF, but it does not show how to encode density constraint in ExF. In the sequel, we use "with high probability" to refer to that the probability is at least 1 — tp ) , where it is the size of input, and poly is a polynomial function. The key idea. The key idea is as follows: First step: First, we consider the problem with constant payoff along with density constraints and argue that the PSPACE hardness result holds even in the case where either mutants or residents fixate within an exponential number of steps with high probability. EFTA01139481 2. Second step: In the hardness proof in the model with density constraints we require that a vertex can reproduce iff all its successors are of the opposite type. In the model with fitness exponential of payoff, there is always a positive probability to reproduce. Thus even if a vertex has all its successors of the opposite type, it can still reproduce, and we refer to such reproductions as "undesired reproductions" (for the hardness proof). We show that a payoff matrix (with exponential payoff and no density constraints) can encode that if a vertex does not have all its successes of the other type, then the probability to reproduce is exponentially small (i.e., the undesired reproduction probability is exponentially small). Since in the hardness result of the previous item, the fixation happens within exponentially many steps, using union bounds it is easy to argue that the probability that an undesired reproduction happens before fixation is negligible. Hence the PSPACE hardness result for the model with density constraints can be translated also to the model with fitness exponential of the payoff and no density constraints. Also in the hardness result we only need to consider that the entries of the payoff matrix are polynomial in the input. Achieving thefirst step. Achieving the first step is done using the following: (a) first, modify the concurrent-if problem; and (b) then modify our reduction from concurrent-if problem to evolutionary games on graphs. The modified concurrent-if problem. We consider a modification of the concurrent-if problem, where given a concurrent-system, we accept an execution if iq is set to true, and we reject an execution if 62 is set to true. We consider concurrent-systems that are deterministic, along with the promise, that within an exponential number of steps either tot is set to true or 62 is set to true. We refer to a system of the above form as modified concurrent-if system. The decision problem, given a modified concurrent-if system whether it accepts or rejects, is PSPACE-complete. The argument for the lower bound is as follows: in our original PSPACE-hardness reduction in Section 5.2, we consider a space-bounded Turing machine At, with input I, and space bound N which is polynomial. Instead of M we can consider another Turing machine M' that simulates AI, and keeps in a counter the number of steps of Al that have been executed. If the number of steps exceeds exponential, then M loops, and thus Al' can reject (which is modeled in the concurrent-if system by setting 62 to true). Also M' can reject if the space bound N is exceeded, again modeled by setting 62 to true. The Turing machine M' can be reduced to a modified concurrent-if problem similar to the reduction of Section 5.2. Hence in our reduction we now consider the modified concurrent-if problem, which either accepts or rejects within exponentially many steps. Properties to be ensured by the reduction. We will now consider a modified concurrent-if system, and then construct an evolutionary game on graph with the following properties: (a) if a vertex has all its successors in E1 of the opposite type, then it has a constant fitness (say, a constant c > 0); (b) if a vertex has at least one successor in E1 of the same type, then it cannot reproduce; (c) if the modified concurrent-if system accepts (resp., rejects), then in the evolutionary graph the mutants (resp., residents) fixate within an exponential number of steps with high probability. Note that the first two properties reiterate that we are considering the model with constant fitness and density constraints. This is the main idea to achieve the first item of the key idea. We now describe the key changes to the reduction of Section 5.2. Changes to the setup phase. The main change in the setup phase is to construct a copy of v,. Recall that since the size of S is much larger than the rest of the graph, we only need to consider that the first mutant starts in a vertex in S. In our reduction in Section 5.2, the vertex vs played two different roles, which are as follows: (i) first, after a vertex in S, it becomes the second vertex to be a mutant, and is responsible for starting the process of reproducing mutants; and (ii) second, given v, is a mutant, to ensure that a vertex v can reproduce only if it is a resident, we made vs a successor of v in El. In this reduction we create a new vertex v's, such that v, achieves the first role, whereas plays the second role. The modification is as follows: (i) for all vertices v such that v, is a successor of v in Et in the reduction of Section 5.2, then v has v's as successor in Et instead of vs; (ii) v's is a successor of v, in Eft. Finally, rei has tea as a successor in Er, ensuring that it becomes a mutant in the first part of setup. Changes to the fixation phase. We first consider how to ensure fixation for residents if 62 is set to true, and then describe the changes to ensure fixation for mutants if bi is set to true. We introduce additional nodes, 4,4, .0,4, vif and d i (note that there are no vertices Or, vMor vt, but the naming scheme is used since vim is associated with vb. The subtle issue about the fixation. Our goal is to ensure that if b2 (resp., 60 is set to true, then the residents (resp., mutants) fixate. However, we must ensure in the evolutionary graph, that once 62 (resp., bi) is set to true, then the EFTA01139482 fixation of mutants (resp., residents) does not happen. The fixation of mutants (resp., residents) is triggered by the boolean-value-gadget for bi (resp., 62) being set to true by making it (resp., 4,2,) a mutant. Vertex 14 for fixation of residents. We consider the case when the concurrent-if system has rejected by making vat?, a mutant. Then vertex 4 plays a crucial role in ensuring fixation for the residents. The vertex ve3 has the vertex v's and 4 = 4 2 as successors in E1; the edge to xi, ensures that 4 is a resident before 4 reproduces, and the other edge ensures that the concurrent-if systems has rejected before 4 reproduces. The successors of 4 under Eft are as follows: (I) the vertices of 5; (2) the vertex vs; (3) all the vertices of the boolean-value-gadgets .si and 32 together with vt and ii2; and (4) the setter vertices of all other boolean-value-gadgets. We now argue that once 4 is a mutant. then within polynomially many steps, all successors of 4 under Eft become residents with high probability. The vertices in (1) (i.e., 5) are the only ones that can make vertices in (2) (i.e., vs) mutants, and in general for 2 ≤ ≤ 4, vertices described above in (i) can be made mutants by the vertices in (i — 1), except for the vertices in (3) (which can be made mutants by the constant number of vertices of either (2) or (3)). Thus since there is a probability of one over a polynomial that 43 makes a given successor a resident in one step (when it can reproduce), it follows that within a polynomial number of steps all vertices which are successor of ve3 under Eft are residents with high probability. Checking that vv is done with reproduction. The other vertices and edges are as follows: • The successors of vertex 4 1 in Er are the the successors of ui3 in Eft; this ensures that 44 can only reproduce after 0 has made all its successors residents. The successor of 4 under Eft is 4.; this ensures that once 4 has made all its successors under Eft residents, then 4 becomes a mutant. • The vertex ve4 has 41 as successor in Er; and all value vertices of the boolean-value-gadgets, the vertex via, and the vertex 4 as successors in Eft. Thus when and vv has reproduced to make all their successors under Eft residents, then only the vertex 4 can be a mutant. • The vertex has all other vertices as successors in E1 and 4 4 as successor in Eft. This ensures that 4 can only reproduce when all other vertices are residents. • The vertex 0 has 4 and 4 t as successors in Er; and 4 as successor in Eft. Thus the vertex 0 can only reproduce when has turned 4, to a mutant. qi • The vertex v: has 4 as the only successor in both Er and Eft. This ensures that v: can only reproduce once 4 4 has become a mutant. The construction ensures that after 4 is the only mutant, then it makes trli a mutant, and then both 0 and v: can reproduce. If does so first, then we are back to the case when 4 is the only mutant. Otherwise, we have that 0 reproduces and then vs reproduces to turn all the remaining vertices to residents, and thus we obtain fixation for residents. Changes for fixation ofmutants. To ensure that fixation of the mutant phase cannot trigger the fixation of the resident phase, we divide the fixation of mutants into three parts, using a boolean-value-gadget to describe when each part is over. We present informally the construction (as the construction is similar to the setup phase of the original construction of Section 5.2). We consider the case when the concurrent-if system has accepted by making 2116:, a mutant. • In the first part we make the vertices 0 and mutants. Observe that 0 being mutant does not matter, since all its successors in Eft are already mutants. The vertex 0 will make 4 into a mutant, but since only ves and have 4,5 as a successor in El and 0 will not reproduce since 4 is a resident, this does not trigger fixation of residents. Note that no vertex (other than the boolean-value-gadget for this part) has or yes as a successor under El/. • The second phase makes 4 a mutant, which will cause all the value vertices of the boolean-value-gadgets to become mutants, but cannot trigger fixation of the residents anymore, since 4 is already a mutant. • The last part is like the fixation of the mutant as in the original construction of Section 5.2, except that it also makes the vertices used in the boolean-value-gadgets for the parts of mutant fixation, the vertex 0, the vertex EFTA01139483 vet, the vertex 4,,and the vertex vb into mutants. Observe that if v: or vii were mutants earlier than 4, then vv could conceivably have triggered fixation of residents. in the above reduction, it is easy to see that setup and fixation each takes a polynomial number of steps in expectation, and the execution of the concurrent-if system takes at most exponential steps in expectation. Let Ty be the expected number of steps for fixation, and T' is exponential. Hence using Markov inequality, there exists T ≥ T' such that T is also exponential, and the fixation happens within T reproductions with high probability. The hardness result. The above reduction achieves the following. Given evolutionary games on graphs with the following properties (a) if a vertex has all its successors in Er of the opposite type, then it has a constant fitness (say, a constant c > 0); (b) if a vertex has at least one successor in El of the same type, then it cannot reproduce; (c) either the mutants or residents fixate within an exponential number T of steps with high probability; and the promise that the mutants fixate with probability close to either 0 or 1, decide which is the case is PSPACE-complete. Reduction to fitness as exponential of payoff. We now show how the above reduction is modified to obtain the same PSPACE hardness result, when the fitness is exponential of the payoff and there is no density constraint. Them is no modification in the graph, and we only define the payoff matrix R M I? —x 1 M 1 —x for some x, such that p = T • N • exp(1 — x/N) is small (to make p exponentially small the number x only needs to be polynomially small), where N is the number of vertices. Recall that T is such that in the previous model we have fixation within T steps with high probability and T is exponential. The idea is that if the payoff of a vertex is negative, then it is at most 1— 7f7 . Consider a given step in the evolutionary graph such that some vertex has non-negative payoff, and fix a vertex v that has negative payoff. The probability to pick v in this step to reproduce is at most exp(1 — x/N). Hence by union bound, the probability of an undesired reproduction in a given step is at most N • exp(1 — x/N). On the other hand, by picking vertices with non-negative payoffs, the fixation happens with high probability (say p') in T steps (see property c above). Thus we have the following: (i) the conditional probability of fixation within the first T steps, given that there are no undesired reproduction in first T steps, is at least p'; and (ii) the probability that there are no undesired reproduction within the first T steps is at least 1 — p. Hence the probability of fixation without any undesired reproductions within the first T steps is at least p' • (1 — p), which is close to I, since p is small. Hence the fixation probability in the model with the above matrix where the fitness is exponential of the payoff is close to the fixation probability of the previous model. The hardness result follows. The following theorem summarizes the result. Theorem 17. For the general l&R model with the fitness function is exponentialfunction of the payoff, where each payoffofthe matrix ispolynomial in the size ofthe graph, thefollowing assertion holds: the quantitative approximation problem can be solved in polynomial space, and even given the promise that thefixationprobability is close to either 0 or I. deciding which is the case is PSPACE-hard (hence the quantitative approximation problem is PSPACE-hard). 7 Solutions in Polynomial Time The molecular clock problem. We consider the problem of molecular clock of neutral evolution. It was shown in [1] that the molecular clock can be accelerated or slowed due to spatial structure, which is represented as a graph. The problem is as follows: given a graph that represents a population structure, over time, the population acquies neutral genetic substituitions due random drift. Reproductions happen asexually, might depend on the precise vertex of the graph. For each vertex, there is a given probability that a neutral mutation arises. The molecular clock, denoted as K, is the average number of fixations of the mutations per generation. The basic computation question is to compute the value of K. The molecular clock solution. The solution of the molecular clock problem is as follows: the fixation probabilities in the neutral case can be obtained as a unique solution to a set of linear equalities. A system of linear equalities can be solved in polynomial-time, for example, using Gaussian elimination (which is cubic time) 141. The value of the EFTA01139484 molecular clock K is characterized by a simple linear expression in the fixation probabilities [II. Given the fixation probabilities, the linear expression can be evaluated in polynomial time. Evolutionary games on well-mixed population. We consider the problem of evolutionary games on well-mixed popu- lation, i.e., both the interaction and replacement graphs are complete graphs. The basic computational problem is to compute the fixation probability. The problem we consider is actually a special case of the computational problems we have considered, and the special case is that the graphs are complete. Well-mixed population solution. We show that the fixation probability can be computed in polynomial time for both LBF and ExF. Note that in a given generation, with a given number of mutants, the probability of the following events are defined, independent of the precise locations of the mutants and the generation number: the number of mutants (i) increase by 1; (ii) decrease by 1; (iii) remains the same; for the next generation. In other words, the distribution of the number of mutants in the next generation, depends only on the number of mutants of the present generation and the payoff matrix (in both the LBF and ExF model for fitness). Hence, we can construct a linear Markov chain, where each vertex is a single number i, which corresponds to having i mutants and it - i residents, where there are n vertices. For each 0 < i < it, the transition probabilities from i to i itself, i — 1 and i + I can be computed in polynomial time given the payoff matrix. The fixation probability, is the probability to eventually reach vertex n starting at vertex 1. Since the eventual reachability probability in Markov chains can be computed in polynomial time (again solving a set of linear equalities [41), it follows that the fixation probabilities for well-mixed population can be computed in polynomial time. 8 Conclusion and Open Problems In this work we studied the complexity of basic computation questions for related to ecology and evolution on graphs. We established many lower and upper bounds for complexity, and our most interesting and significant results are the lower bounds. We have established NP-hardness, #P-hardness, and PSPACE-hardness for several problems, and it implies that polynomial-time algorithms for any of the problems would imply P is equal to NP, solving a long- standing open problem. Moreover, under the widely believed conjecture that P is different from NP, it follows that for the problems for which we establish lower bounds, there exists no efficient algorithm. A simple equation based solution implies an efficient algorithm. The significance of our result is that it shows that for several fundamental problems in ecology and evolutionary games on graphs, in general there is no simple equation based solution for the fixation probability (in other words, a simple equation based solution would imply that P=NP). There are several interesting open problems. First, is the problem of evolutionary games on graphs with constant fitness function, which is a special case of the general problem. In this case, the qualitative problem can be solved in polynomial time, and while the quantitative problem can be solved in PSPACE, a non-trivial lower bound for the problem is an open question. Second, another interesting direction would be to explore the computational question in the regime of weak selection. Acknowledgments. We thank Erez Lieberman and Michal Koucky for insightful discussions and sharing their thoughts on the problem. References [1] B. Allen, C. Sample, Y. Dementieva, R. C. Medeiros, C. Paoletti, and M. A. Nowak. The molecular clock of neu- tral evolution can be accelerated or slowed by asymmetric spatial structure. PLoS Comput Biol, 11(2):e1004.108, 02 2015. [2] N. Barton, D. E. Briggs, J. A. Eisen, D. B. Goldstein, and N. H. Patel. Evolution. Cold Spring Harbor Laboratory Press, 2007. [3] S. A. Cook. The complexity of theorem-proving procedures. In Proceedings of the Third Annual ACM Syvnpo• shun on Theory of Computing, STOC '71, pages 151-158, New York, NY, USA, 1971. ACM. EFTA01139485 [4] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms, Third Edition. The MIT Press, 2009. [5] M. R. Garey and D. S. Johnson. Computers andIntractability: A Guide to the Theory ofNP-Completeness. W. H. Freeman, 1979. [6] J. E. Hoperoft and R. M. Karp. A n 512 algorithm for maximum matchings in bipartite graphs. In Proceedings of the 12th Annual Symposium on Switching and Automata Theory (Swat 1971), SWAT '71, pages 122-125, Washington, DC, USA, 1971. IEEE Computer Society. [7] J. Kemeny, J. Snell, and A. Knapp. Denumerable Markov Chains. D. Van Nostrand Company, 1966. [8] L. A. Levin. Universal sequential search problems. Problems ofInformation Transmission, 9(3):265-266, 1973. [9] E. Lieberman, C. Hauert, and M. A. Nowak. Evolutionary dynamics on graphs. Nature, 433(7023):312-316, Jan. 2005. [10] N. Nisan. RL C SC. STOC '92, 1992. [11] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak. A simple rule for the evolution of cooperation on graphs and social networks. Nature, 441:502-505, 2006. [12] H. Ohtsuki, J. M. Pacheco, and M. A. Nowak. Evolutionary graph theory: Breaking the symmetry between interaction and replacement. Journal of Theoretical Biology, 246(4):681 —694, 2007. [13] S. Otto and T. Day. A Biologist's Guide to Mathematical Modeling in Ecology and Evolution. Princeton Univer- sity Press, 2011. [14] C. M. Papadimitriou. Computational complexity. Addison-Wesley, Reading, Massachusetts, 1994. [15] W. J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal ofComputer and System Sciences, 4(2):177 — 192, 1970. [16] L. G. Valiant. The complexity of computing the permanent. Them: Comput. Sci., 8:189-201, 1979. EFTA01139486

AI Analysis

Summarize this document or ask questions about its contents using Claude.

Typical cost: less than $0.01 per query with Haiku. Model can be changed in Settings.

Add API Key in Settings