%% This document created by Scientific Word (R) Version 2.5


\documentclass[12pt,thmsa,a4paper]{article}
\usepackage{amsfonts}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{sw20jart}

%TCIDATA{TCIstyle=article/art4.lat,jart,sw20jart}

%TCIDATA{Created=Sat Dec 13 21:56:00 1997}
%TCIDATA{LastRevised=Sun Nov 14 12:01:38 1999}
%TCIDATA{Language=American English}

\input{tcilatex}
\begin{document}

\title{Evolutionary Dynamics and Backward Induction\thanks{%
Previous versions: September 1999; May 1999. Research partially supported by
a grant of the Israel Academy of Sciences and Humanities. The starting point
for this research was a discussion during the Seventh Annual Conference of
the Center for Rationality. The author thanks Bob Aumann, Dani Cohen, Ilan
Eshel, Michihiro Kandori, Andreu Mas-Colell, Jean-Fran\c{c}ois Mertens, Uzi
Motro, Abraham Neyman, Bezalel Peleg, Motty Perry, Phil Reny, Larry
Samuelson, Karl Schlag, Reinhard Selten, Avi Shmida, J\"{o}rgen Weibull,
Benji Weiss, Asher Wolinsky, and Peyton Young for their useful comments.}}
\author{Sergiu Hart\thanks{%
Center for Rationality and Interactive Decision Theory; Department of
Economics; and Department of Mathematics; The Hebrew University of
Jerusalem, 91904 Jerusalem, Israel.$\quad $ \emph{E-mail}:
hart@math.huji.ac.il\emph{\ }$\quad $ \emph{URL: }http://www.ma.huji.ac.il/%
\symbol{126}hart}}
\maketitle

\begin{abstract}
The backward induction (or subgame-perfect) equilibrium of a perfect
information game is shown to be the unique evolutionarily stable outcome for
dynamic models consisting of selection and mutation, when the mutation rate
is low and the populations are large.

\emph{Keywords}: games in extensive form, games of perfect information,
backward induction equilibrium, subgame-perfect equilibrium, evolutionary
dynamics, evolutionary stability, mutation, selection, population games.

\emph{Journal of Economic Literature Classification}: C7, D7, C6.
\end{abstract}

\setlength{\baselineskip}{.25in}%

%BeginExpansion

\def\@biblabel#1{#1\hfill}
\def\thebibliography#1{\section*{References}\list
{}{
   \labelwidth 0pt
   \leftmargin 1.8em
   \itemindent -1.8em
   \usecounter{enumi}}
\def\newblock{\hskip .11em plus .33em minus .07em}
\sloppy\clubpenalty4000\widowpenalty4000
\sfcode`\.=1000\relax\def\baselinestretch{1}\large \normalsize}
\let\endthebibliography=\endlist%
%EndExpansion

\section{Introduction\label{s-introduction}}

\subsection{Background\label{ss-background}}

A fascinating meeting of ideas has occurred in the last two decades between
Evolutionary Biology and Game Theory. Now this may seem strange at first.
The players in game-theoretic models are usually assumed to be fully
rational, whereas genes and other vehicles of evolution are assumed to
behave in ways that are entirely mechanistic. Nonetheless, once a player is
replaced by a population of individuals, and a mixed strategy corresponds to
the proportions of the various strategies in the population, the formal
structures in the two fields turn out to be very closely related. This has
led to many ideas flowing back and forth. On the one hand, game-theoretic
constructs---at times quite sophisticated---find their way into evolutionary
arguments; on the other, the basic paradigm of natural selection is used to
justify and provide foundations for many aspects of rational behavior. For a
discussion of these issues, including a historical overview, the reader is
referred to Hammerstein and Selten [1994] and to Aumann [1998].

The basic parallel notions in the two fields are ``strategic equilibrium''
(introduced by J. Nash in 1950) and ``evolutionarily stable strategy''
(introduced by J. Maynard Smith and G. R. Price in 1973). Roughly speaking,
when a game is played by populations of individuals (with identical payoff
functions), then a Nash equilibrium point essentially yields evolutionarily
stable strategies. This type of relation has been established in a wide
variety of setups, both static and dynamic (see the books of Hofbauer and
Sigmund [1998], Weibull [1995] and Vega-Redondo [1997]).

Most of these models deal with games in strategic (or normal) form. Here we
consider instead \emph{games in extensive form}, where a complete
description of the game is given, specifying exactly the rules, the
order of moves, the information of the players, and so on. Specifically, we
look at the simplest such games: \emph{finite games of perfect information}.
In such games, an equilibrium point can always be obtained by a so-called
``backward induction'' argument: Starting from the final nodes, each player
chooses a best-reply given the (already determined) choices of all the
players that move after him. This results in an equilibrium point also in
each subgame (i.e., the game starting at any node of the original game),
whether that subgame is reached or not. Such a point is called a \emph{%
subgame-perfect equilibrium}, or a \emph{backward induction equilibrium}, a
notion introduced by R. Selten in 1965, 1975.
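
The backward induction procedure just described can be sketched in code. The
tree encoding below is an illustrative assumption, not from the paper; players
are indexed $0$ and $1$, and the example instance is the game $\Gamma _{1}$ of
Figure \ref{ex1}.

```python
# A minimal sketch of backward induction. A tree is encoded as nested
# tuples: a leaf is a tuple of payoffs (one per player), and an internal
# node is a pair (player, children), where children maps each action at the
# node to a subtree. The encoding is illustrative, not from the paper.

def backward_induction(node):
    """Return (plan, payoffs): a local best-reply for every agent in the
    subtree at `node`, and the payoff vector these choices induce."""
    if not isinstance(node[1], dict):    # leaf: payoffs already determined
        return {}, node
    player, children = node
    best = None
    for action, subtree in children.items():
        plan, payoffs = backward_induction(subtree)
        # keep the action whose (already determined) continuation payoff
        # is highest for the agent moving at this node
        if best is None or payoffs[player] > best[2][player]:
            best = (action, plan, payoffs)
    action, plan, payoffs = best
    plan = dict(plan)
    plan[player] = action                # record this agent's choice
    return plan, payoffs

# The game Gamma_1 of Figure 1: player 0 chooses c1 (ending with payoffs
# (1, 2)) or b1; after b1, player 1 chooses c2 (payoffs (0, 0)) or b2
# (payoffs (2, 1)).
gamma1 = (0, {"c1": (1, 2), "b1": (1, {"c2": (0, 0), "b2": (2, 1)})})
plan, payoffs = backward_induction(gamma1)   # plan picks b1 and b2
```

Starting from the final nodes and moving toward the root, the recursion
resolves each subgame before its predecessor, exactly as in the argument above.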

Evolutionary models are based on two main ingredients: selection and
mutation. \emph{Selection} is a process whereby better strategies prevail;
in contrast, \emph{mutation}, which is relatively rare, generates strategies
at random, be they better or worse. It is the combination of the two that
allows for natural adaptation: New mutants undergo selection, and only the
better ones survive. Of course, selection includes many possible mechanisms,
be they biological (the payoff determines the number of descendants, and
thus the share of better strategies increases), individual (experimentation,
stimulus response), social (learning, imitation), and so on. What matters is
that the process is ``adaptive'' or ``improving,'' in the sense that the
proportion of better strategies increases. Evolutionary models have been
extensively analyzed in various classes of games in strategic form (starting
with Kandori, Mailath and Rob [1993] and Young [1993]; see also the books of
Young [1998] and Fudenberg and Levine [1998]).

Since mutations are like small perturbations that make everything possible
(i.e., every pure strategy has positive probability), and this yields in the
limit (as the perturbations go to zero) the subgame-perfect equilibrium
points,\footnote{%
Recall that these are games of perfect information, where ``trembling-hand
perfection'' is the same as ``subgame perfection.''} it is only natural to
expect that evolutionary models with low mutation rates should lead to these
same points. However, the literature up to now has found the above claim to
be false: Evolutionary models do not necessarily pick out the backward
induction equilibria. Specifically, except in special classes of games,
other equilibria besides the backward induction ones also turn out to be
``evolutionarily stable'' (see N\"{o}ldeke and Samuelson [1993]; Gale,
Binmore and Samuelson [1995]; Cressman and Schlag [1997]; and the books of
Samuelson [1997] and Fudenberg and Levine [1998]).

\subsection{Examples\label{ss-examples}}

Even without specifying exactly how selection and mutation operate, one can
get some intuition by considering a few examples. The first one is the $2$%
-person game $\Gamma _{1}$ of Figure \ref{ex1}%
%TCIMACRO{\TeXButton{BeginFigure-ex1}{\begin{figure}[tb] \centering}}
%BeginExpansion
\begin{figure}[tb] \centering%
%EndExpansion
%BeginExpansion
 
\unitlength 1.00mm

\begin{picture}(70.00,67.00)(5,13)
\linethickness{1.0pt}
\put(30.00,70.00){\line(1,-1){20.00}}
\put(50.00,50.00){\line(1,-1){20.00}}
\put(50.00,50.00){\line(-1,-1){20.00}}
\put(30.00,70.00){\circle*{1.50}}
\put(50.00,50.00){\circle*{1.50}}
\put(30.00,74.00){\makebox(0,0)[cc]{$1$}}
\put(52.00,53.00){\makebox(0,0)[cc]{$2$}}
\put(30.00,26.00){\makebox(0,0)[cc]{$0$}}
\put(30.00,21.00){\makebox(0,0)[cc]{$0$}}
\put(30.00,70.00){\line(-1,-1){20.00}}
\put(10.00,46.00){\makebox(0,0)[cc]{$1$}}
\put(10.00,41.00){\makebox(0,0)[cc]{$2$}}
\put(70.00,26.00){\makebox(0,0)[cc]{$2$}}
\put(70.00,21.00){\makebox(0,0)[cc]{$1$}}
\put(19.00,63.00){\makebox(0,0)[cc]{$c^1$}}
\put(42.00,63.00){\makebox(0,0)[cc]{$b^1$}}
\put(39.00,43.00){\makebox(0,0)[cc]{$c^2$}}
\put(62.00,43.00){\makebox(0,0)[cc]{$b^2$}}
\end{picture}
%
%EndExpansion
\caption{The game $\Gamma _1$\label{ex1}}%
%TCIMACRO{\TeXButton{EndFigure}{\end{figure}}}
%BeginExpansion
\end{figure}%
%EndExpansion
. It possesses two Nash equilibria in pure strategies: $b=(b^{1},b^{2})$ and 
$c=(c^{1},c^{2})$; the first, $b$, is the backward induction equilibrium.
Assume that at each one of the two nodes $1$ and $2$ there is a population
of individuals playing the game in that role. The populations are distinct,
and each individual plays a pure action at his node.\footnote{%
We say, for example, that an individual at node $2$ ``plays $b^{2}$'' if he
is programmed (by his ``genes'') to play $b^{2}$ whenever he is in a
situation to choose between $c^{2}$ and $b^{2}$.} If everyone at node $1$
plays $b^{1}$ and everyone at node $2$ plays $b^{2}$, then any mutant in
population $1$ that plays $c^{1}$ will get a payoff of $1$ instead of $2$,
so selection will wipe him out; the same goes for any mutant at node $2$.
Therefore the backward induction equilibrium $b$ is ``stable.'' Now assume
that we are in the $c$ equilibrium: All the individuals at $1$ play $c^{1}$
and all the individuals at $2$ play $c^{2}$. Again, a mutant at $1$ loses
relative to his population: He gets $0$ (since the individuals at $2$ that
he will meet play $c^{2}$) instead of $1$. But now a mutant at $2$ that
plays $b^{2}$ gets the same payoff as a $c^{2}$-individual, so selection has
no effect at node $2$. Since node $2$ is not reached, all actions at $2$
yield the same payoff; there is no ``evolutionary pressure'' at $2$.
Mutations in the population at $2$, because they are not wiped out, keep
accumulating (this is called ``genetic drift''). Eventually, after a
sufficiently long time,\footnote{%
The assumption is that mutations have positive---though small---probability
at each period.} more than half the population at $2$ will consist of $b^{2}$%
-individuals. At this point the action $b^{1}$ at $1$ gets a higher expected
payoff than the action $c^{1}$, and thus selection at $1$ favors $b^{1}$. So
the proportion of $b^{1}$ at node $1$ becomes positive (and increases),
which renders node $2$ reachable. Once $2$ is reached, evolutionary pressure
there---i.e., selection---becomes effective, and it moves population $2$
towards the better strategy $b^{2}$. This only increases the advantage of $%
b^{1}$ over $c^{1}$, and the whole system gets to the $b=(b^{1},b^{2})$
equilibrium.
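
The payoff comparison driving this argument can be checked numerically; the
small function below is an illustrative sketch, with the payoffs read off
Figure \ref{ex1}.

```python
# A quick numeric check of the argument above. If a fraction p of
# population 2 plays b2, an individual at node 1 of Gamma_1 gets
#   c1 -> 1                     (the game ends immediately at node 1)
#   b1 -> 2*p + 0*(1-p) = 2p    (payoff 2 against b2, payoff 0 against c2)
# so b1 overtakes c1 exactly when p exceeds 1/2.

def node1_payoffs(p):
    """Expected payoffs at node 1 against a node-2 population with
    fraction p playing b2."""
    return {"c1": 1.0, "b1": 2.0 * p}

assert node1_payoffs(0.5)["b1"] == node1_payoffs(0.5)["c1"]  # indifferent
assert node1_payoffs(0.6)["b1"] > 1.0   # drift past 1/2: selection favors b1
assert node1_payoffs(0.4)["b1"] < 1.0   # below 1/2: c1 still prevails
```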

To summarize: In $\Gamma _{1}$, evolutionary dynamics lead necessarily to $b$%
, the backward induction equilibrium; in other words, $b$ is the
evolutionarily stable equilibrium.

The next example is the $3$-player game $\Gamma _{2}$ of Figure \ref{ex2} 
%TCIMACRO{\TeXButton{BeginFigure-ex2}{\begin{figure}[tb] \centering}}
%BeginExpansion
\begin{figure}[tb] \centering%
%EndExpansion
%BeginExpansion
 
\unitlength 1.00mm

\begin{picture}(70.00,92.00)(5,-2)
\linethickness{1.0pt}
\put(30.00,80.00){\line(1,-1){20.00}}
\put(50.00,60.00){\line(1,-1){20.00}}
\put(50.00,60.00){\line(-1,-1){20.00}}
\put(30.00,80.00){\circle*{1.50}}
\put(50.00,60.00){\circle*{1.50}}
\put(30.00,84.00){\makebox(0,0)[cc]{$1$}}
\put(52.00,63.00){\makebox(0,0)[cc]{$2$}}
\put(30.00,80.00){\line(-1,-1){20.00}}
\put(10.00,56.00){\makebox(0,0)[cc]{$0$}}
\put(10.00,51.00){\makebox(0,0)[cc]{$0$}}
\put(70.00,36.00){\makebox(0,0)[cc]{$1$}}
\put(70.00,31.00){\makebox(0,0)[cc]{$1$}}
\put(19.00,73.00){\makebox(0,0)[cc]{$c^1$}}
\put(42.00,73.00){\makebox(0,0)[cc]{$b^1$}}
\put(39.00,53.00){\makebox(0,0)[cc]{$c^2$}}
\put(62.00,53.00){\makebox(0,0)[cc]{$b^2$}}
\put(30.00,40.00){\line(1,-1){20.00}}
\put(30.00,40.00){\line(-1,-1){20.00}}
\put(30.00,40.00){\circle*{1.50}}
\put(28.00,43.00){\makebox(0,0)[cc]{$3$}}
\put(8.00,16.00){\makebox(0,0)[cc]{$-1$}}
\put(9.70,11.00){\makebox(0,0)[cc]{$5$}}
\put(48.00,16.00){\makebox(0,0)[cc]{$-1$}}
\put(48.00,11.00){\makebox(0,0)[cc]{$-1$}}
\put(19.00,33.00){\makebox(0,0)[cc]{$c^3$}}
\put(42.00,33.00){\makebox(0,0)[cc]{$b^3$}}
\put(10.00,46.00){\makebox(0,0)[cc]{$0$}}
\put(70.00,26.00){\makebox(0,0)[cc]{$1$}}
\put(8.00,6.00){\makebox(0,0)[cc]{$-1$}}
\put(49.70,6.00){\makebox(0,0)[cc]{$5$}}
\end{picture}
%
%EndExpansion
\caption{The game $\Gamma _2$\label{ex2}}%
%TCIMACRO{\TeXButton{EndFigure}{\end{figure}}}
%BeginExpansion
\end{figure}%
%EndExpansion
(see N\"{o}ldeke and Samuelson [1993] or Samuelson [1997, Example 8.2]). The
backward induction equilibrium is $b=\left( b^{1},b^{2},b^{3}\right) $; the
other pure Nash equilibrium---which is not perfect---is $%
c=(c^{1},c^{2},c^{3})$. Start from a state where all individuals at each
node $i$ play their backward induction action $b^{i}$. Nodes $1$ and $2$ are
reached, whereas node $3$ is not. Therefore there is no selection operating
at node $3$, and mutations move the population at $3$ randomly. As long as
the proportion of $b^{3}$ is at least $2/3$, the system is in equilibrium.
Once it goes below $2/3$, the best-reply of $2$ becomes $c^{2}$; selection
then moves the population at $2$ toward $c^{2}$. But then node $3$ is no
longer unreached, so selection starts affecting the population at $3$%
---towards the best-reply there, $b^{3}$. Thus, as soon as the proportion of 
$b^{3}$ drops below $2/3$, the evolutionary dynamic immediately pushes it
back up; it is as if there is a ``reflecting barrier'' below the $2/3$ mark.
All this happens fast enough that the population at $1$, which is playing $%
b^{1}$, can move only a little, if at all. Therefore we have essentially
shown that the equilibrium component\footnote{%
Two (mixed) Nash equilibria belong to the same (\emph{equilibrium})\emph{\
component} if their equilibrium paths coincide, and they differ only at
unreached nodes.} of $b$---where $b^{i}$ is played at $i=1,2$ and $b^{3}$ is
played at $3$ with proportion at least $2/3$---is evolutionarily stable.
Moreover, since $b$ and its component must eventually be reached from any
state---by appropriate mutations---it follows that other equilibria, in
particular $c$, are \emph{not} stable. This conclusion is in contrast to the
result of N\"{o}ldeke and Samuelson [1993]: In their model,\footnote{%
The main difference between their model and ours is that each one of their
individuals, in addition to playing a pure action (as ours do), has
conjectures about the composition of the populations at all nodes, whether
reached or not. The dynamic then affects both actions and conjectures. In a
biological world, it is hard to see how such conjectures are encoded into
genes; our individuals are therefore characterized by their actions only.}
the non-subgame-perfect equilibrium $c$ also belongs to the stable set.
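
The $2/3$ threshold in this argument can be verified directly. The sketch
below is illustrative; the payoffs are read off Figure \ref{ex2}, and exact
rational arithmetic avoids rounding at the threshold itself.

```python
from fractions import Fraction

# A numeric check of the 2/3 mark in Gamma_2. Player 2's payoffs: b2 yields
# 1; c2 leads to node 3, where c3 gives player 2 a payoff of 5 and b3 gives
# -1. Against a fraction r of b3-players at node 3,
#   c2 -> r*(-1) + (1-r)*5 = 5 - 6r,
# so b2 is a best-reply exactly when 5 - 6r <= 1, i.e. r >= 2/3.

def node2_payoffs(r):
    return {"b2": Fraction(1), "c2": r * (-1) + (1 - r) * 5}

assert node2_payoffs(Fraction(2, 3))["c2"] == 1   # indifferent at r = 2/3
assert node2_payoffs(Fraction(7, 10))["c2"] < 1   # above 2/3: b2 prevails
assert node2_payoffs(Fraction(3, 5))["c2"] > 1    # below 2/3: c2 prevails
```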

Consider now another $3$-player game: the game $\Gamma _{3}$ given in Figure 
\ref{ex3}%
%TCIMACRO{\TeXButton{BeginFigure-ex3}{\begin{figure}[tb] \centering}}
%BeginExpansion
\begin{figure}[tb] \centering%
%EndExpansion
%BeginExpansion
 
\unitlength 1.00mm

\begin{picture}(72.00,92.00)(5,-2)
\linethickness{1.0pt}
\put(50.00,80.00){\line(-1,-1){20.00}}
\put(30.00,60.00){\line(-1,-1){20.00}}
\put(30.00,60.00){\line(1,-1){20.00}}
\put(50.00,80.00){\circle*{1.50}}
\put(30.00,60.00){\circle*{1.50}}
\put(50.00,84.00){\makebox(0,0)[cc]{$1$}}
\put(28.00,63.00){\makebox(0,0)[cc]{$2$}}
\put(50.00,80.00){\line(1,-1){20.00}}
\put(70.00,56.00){\makebox(0,0)[cc]{$4$}}
\put(70.00,51.00){\makebox(0,0)[cc]{$0$}}
\put(10.00,36.00){\makebox(0,0)[cc]{$5$}}
\put(10.00,31.00){\makebox(0,0)[cc]{$9$}}
\put(62.00,73.00){\makebox(0,0)[cc]{$b^1$}}
\put(39.00,73.00){\makebox(0,0)[cc]{$c^1$}}
\put(42.00,53.00){\makebox(0,0)[cc]{$b^2$}}
\put(19.00,53.00){\makebox(0,0)[cc]{$c^2$}}
\put(50.00,40.00){\line(-1,-1){20.00}}
\put(50.00,40.00){\line(1,-1){20.00}}
\put(50.00,40.00){\circle*{1.50}}
\put(52.00,43.00){\makebox(0,0)[cc]{$3$}}
\put(70.00,16.00){\makebox(0,0)[cc]{$0$}}
\put(69.00,11.00){\makebox(0,0)[cc]{$10$}}
\put(30.00,16.00){\makebox(0,0)[cc]{$0$}}
\put(30.00,11.00){\makebox(0,0)[cc]{$0$}}
\put(62.00,33.00){\makebox(0,0)[cc]{$b^3$}}
\put(39.00,33.00){\makebox(0,0)[cc]{$c^3$}}
\put(70.00,46.00){\makebox(0,0)[cc]{$0$}}
\put(10.00,26.00){\makebox(0,0)[cc]{$0$}}
\put(70.00,6.00){\makebox(0,0)[cc]{$1$}}
\put(30.00,6.00){\makebox(0,0)[cc]{$0$}}
\end{picture}
%
%EndExpansion
\caption{The game $\Gamma _3$\label{ex3}}%
%TCIMACRO{\TeXButton{EndFigure}{\end{figure}}}
%BeginExpansion
\end{figure}%
%EndExpansion
. The backward induction equilibrium is $b$, at which both nodes $2$ and $3$
are unreached. The populations at $2$ and $3$ therefore move by mutations.
Eventually, when the proportion of $b^{2}$ at node $2$ goes below $1/5$,
selection at $1$ will move the population at $1$ from $b^{1}$ to $c^{1}$. At
that point both $2$ and $3$ are reached, and which action of $2$ is the
best-reply at $2$ depends on the composition of the population at $3$. If
less than $9/10$ of them play $b^{3}$ (which is possible, and even quite
probable,\footnote{%
We take ``quite probable'' to mean that the probability of its happening is
positive and bounded away from zero (as the rate of mutation goes to zero).}
given that only random mutations have affected $3$ until now), then $c^{2}$
is the best-reply at $2$, and selection keeps decreasing the proportion of $%
b^{2}$. Again, it is quite probable for the proportion of $b^{2}$ to get all
the way down to $0$ (from $1/5$) long before the proportion of $b^{3}$ at $3$
has increased to $9/10$. What this discussion shows is that, in the game $%
\Gamma _{3}$, the non-subgame-perfect equilibrium $c$ (together with its
equilibrium component) cannot be ruled out; evolutionary dynamic systems may
well be in such states a positive fraction of the time.
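
The two critical proportions in this discussion, $1/5$ at node $2$ and $9/10$
at node $3$, can likewise be checked numerically; the sketch below is
illustrative, with the payoffs read off Figure \ref{ex3}.

```python
from fractions import Fraction

# Numeric checks of the two thresholds in Gamma_3.
# Player 1: b1 yields 4; c1 yields 5 if node 2 plays c2 and 0 otherwise,
# so against a fraction q of b2 at node 2, c1 -> 5*(1-q).
# Player 2 at node 2: c2 yields 9; b2 yields 10 if node 3 plays b3 and 0
# otherwise, so against a fraction r of b3 at node 3, b2 -> 10*r.

def payoff_c1(q):          # player 1's expected payoff from c1
    return 5 * (1 - q)

def payoff_b2(r):          # player 2's expected payoff from b2
    return 10 * r

assert payoff_c1(Fraction(1, 5)) == 4    # indifferent with b1 at q = 1/5
assert payoff_c1(Fraction(1, 10)) > 4    # below 1/5: selection favors c1
assert payoff_b2(Fraction(9, 10)) == 9   # indifferent with c2 at r = 9/10
assert payoff_b2(Fraction(1, 2)) < 9     # below 9/10: selection favors c2
```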

However, we claim that such behavior \emph{cannot occur if the populations
are large enough}.

\subsection{This Paper\label{ss-this-paper}}

As stated above, the games we study in this paper are finite extensive-form
games with perfect information. We assume that the backward induction
equilibrium is unique; this holds when the game is generic (i.e., in almost
every game). At each node there is a distinct population of individuals that
play the game in the role of the corresponding player. Each individual is
fully characterized by his action, i.e., by the pure choice that he makes at
his node (of course, this goes into effect only if his node is reached%
\footnote{%
This action is thus the individual's ``genotype''---the hard-wired
programming by the genes; it becomes his ``phenotype''---his actual revealed
behavior---when his node is reached and it is his turn to play.}). We will
refer to such a population game as a ``gene-normal form'' (it parallels the
``agent-normal form'').

The games are analyzed in a dynamic framework. The model is as follows: At
each period, one individual is chosen at random\footnote{%
Uniformly, i.e., each individual has the same probability of being chosen.}
in each population. His current action, call it $a^{i}$, may then change by
selection, by mutation, or it may not change at all. Selection replaces $%
a^{i}$ by another action which, against the other populations currently
playing the game (i.e., ``against the field''), yields a higher payoff than $%
a^{i}$. Of course, this can happen only if such a better action exists; if there
are many, one of them is chosen at random. Mutation replaces $a^{i}$ by an
arbitrary action, chosen at random. Finally, all the choices at each node
are made independently. This model---to which we refer as the ``basic
model''---is essentially a most elementary process that provides for both
adaptive selection and mutation. It turns out that the exact probabilities
of all the above choices do not matter; what is essential is that all of
them be bounded away from zero (this is the ``general model'').
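
One period of the basic model can be sketched as follows, specialized to the
game $\Gamma _{1}$ of Figure \ref{ex1}. The population sizes and the
particular probabilities of selection and mutation below are illustrative
assumptions; as noted above, only their being bounded away from zero matters.

```python
import random

# A minimal sketch of one period of the "basic model", for Gamma_1.
# Each population is a list of pure actions; p_sel and p_mut are assumed
# illustrative probabilities of selection and mutation.

ACTIONS = {1: ["c1", "b1"], 2: ["c2", "b2"]}

def payoff(node, action, pops):
    """Expected payoff of `action` at `node` "against the field", i.e.,
    against the current composition of the other population (Figure 1)."""
    if node == 2:
        p1 = pops[1].count("b1") / len(pops[1])
        # if node 1 plays c1, node 2 is unreached and player 2 gets 2
        # regardless of his own action: no evolutionary pressure at 2
        return (1 - p1) * 2.0 + p1 * {"c2": 0.0, "b2": 1.0}[action]
    p2 = pops[2].count("b2") / len(pops[2])
    return {"c1": 1.0, "b1": 2.0 * p2}[action]

def step(pops, p_sel=0.5, p_mut=0.01, rng=random):
    """One period: in each population, one individual chosen uniformly
    mutates, switches to a better reply, or stays put."""
    for node in (1, 2):
        k = rng.randrange(len(pops[node]))
        u = rng.random()
        if u < p_mut:                       # mutation: an arbitrary action
            pops[node][k] = rng.choice(ACTIONS[node])
        elif u < p_mut + p_sel:             # selection: a better reply, if any
            a = pops[node][k]
            better = [b for b in ACTIONS[node]
                      if payoff(node, b, pops) > payoff(node, a, pops)]
            if better:
                pops[node][k] = rng.choice(better)
```

With the mutation rate set to zero and everyone playing $c$, the state never
changes: node $2$ is unreached, so all actions there earn the same payoff and
`better` is empty; it is the rare mutations that eventually tip population $2$
past the $1/2$ mark.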

Since such dynamics yield an ergodic system,\footnote{%
Mutations make every state reachable from any other state.} the long-run
behavior is well described by the corresponding (unique) invariant
distribution, which, for each state, gives the (approximate) frequency of
that state's occurrence during any large time interval. The mutations are
rare; we are therefore interested in those states which occur with positive
frequency however low the mutation rate is.\footnote{%
More precisely, states whose probability, according to the invariant
distribution, is bounded away from zero as the probability of mutation goes
to zero.} Such states are called ``evolutionarily stable.'' Two preliminary
results are that only Nash equilibria can be evolutionarily stable, and that
the backward induction equilibrium is always evolutionarily stable. The
examples show that other Nash equilibria may however be evolutionarily
stable as well.
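
The role of the invariant distribution can be illustrated with a toy two-state
chain, in which ``selection'' pushes toward a favored state and only a rare
``mutation'' at rate $\varepsilon $ leads back; the numbers are illustrative,
not from the paper.

```python
# A toy illustration of the ergodicity point above: a two-state Markov
# chain where "selection" moves A -> B with probability p_ab, and a rare
# "mutation" moves B -> A with probability p_ba = eps.

def invariant(p_ab, p_ba):
    """Stationary distribution (pi_A, pi_B) of the two-state chain with
    transition probabilities A->B = p_ab and B->A = p_ba."""
    return p_ba / (p_ab + p_ba), p_ab / (p_ab + p_ba)

for eps in (0.1, 0.01, 0.001):
    pi_a, pi_b = invariant(p_ab=0.5, p_ba=eps)
    print(eps, round(pi_b, 4))
# as eps goes to zero, the invariant distribution concentrates on the
# selection-favored state B; states keeping positive mass in this limit
# are the "evolutionarily stable" ones
```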

We therefore add another factor: The populations are large. This yields:

\begin{quote}
\textbf{Main Result.} \emph{The backward induction equilibrium becomes in
the limit the only evolutionarily stable outcome as the mutation rate
decreases to zero and the populations increase to infinity.\footnote{%
With an additional proviso that the population sizes be large enough,
depending on the mutation rate; see the precise statement in Subsection \ref
{ss-main-result}.}}
\end{quote}

\noindent In other words: Evolutionary dynamic systems, consisting of
adaptive selection and rare mutations, lead in large populations to most of
the individuals most of the time playing their backward induction strategy.
Observe that this applies to reached as well as to unreached nodes; for
example, in game $\Gamma _{2}$ we have most of the individuals at all nodes $%
i$---\emph{including} node $3$---playing $b^{i}$. Evolutionary stability in
large populations picks out not merely the equilibrium component of $b$, but 
$b$ itself.\footnote{%
Actually, an arbitrarily small neighborhood of $b$.} The intuition for the
role of the large population assumption will be provided in Subsection \ref
{ss-intuition} below. Suffice it to say here that it has to do with a change
of action (whether by mutation or selection) being less likely for a
specific individual than for an arbitrary individual in a large population.
This leads to considerations of ``sequential'' rather than ``simultaneous''
mutations. As a further consequence, unlike most of the evolutionary
game-theoretic literature,\footnote{%
An exception is N\"{o}ldeke and Samuelson [1995].} our result does \emph{not}
rely on comparing different \emph{powers} of the infinitesimal mutation rate
(which require extremely long waiting times); single mutations suffice.%
\footnote{%
The static notion of an ``evolutionarily stable strategy'' is also based on
single mutations.}

To conclude the introduction, we note that two almost diametrically opposed
approaches each lead to the backward induction equilibrium. One approach
(Aumann [1995]), in the realm of full rationality, assumes that all players
are rational (i.e., they never play something which they know is not
optimal), and moreover assumes that this fact is commonly known to them
(i.e., each player knows that everyone is rational, and also knows that
everyone knows that everyone is rational, and so on). The other approach
(this paper), in the realm of evolutionary dynamics, is essentially
machinelike and requires no conscious optimization or rationality.\footnote{%
The biological mechanisms of selection are entirely automatic; other
selection processes (like learning, imitation, and so on) may well use some
form of rationality or ``bounded rationality.''} It is striking that such
disparate models converge.\footnote{%
For an interesting discussion of these matters, see Aumann [1998] (in
particular pages 191--195).}

The paper is organized as follows: Section \ref{s-model} presents the model:
the extensive form game (in Subsection \ref{ss-game}), the associated
population game (in Subsection \ref{ss-gene-normal}), and the evolutionary
dynamics (in Subsection \ref{ss-dynamics}). The results are stated in
Section \ref{s-results}, which also includes the proof of the preliminary
result for fixed populations (in Subsection \ref{ss-preliminary-result}).
The Main Result, stated in Subsection \ref{ss-main-result}, is proved in
Section \ref{s-proof}. The intuition behind our result is presented in
Subsection \ref{ss-intuition}, followed by an informal outline of the proof
(in Subsection \ref{ss-outline}). We conclude in Section \ref{s-discussion}
with a discussion of various issues, including possible extensions and
generalizations of our results.

\section{The Model\label{s-model}}

\subsection{The Game\label{ss-game}}

Let $\Gamma $ be a finite extensive-form game with perfect information. We
are thus given a rooted tree; each non-terminal vertex corresponds to a 
\emph{move}. It may be a chance move, with fixed positive probabilities for
all outgoing branches; or a move of one of the players, in which case the
vertex will be called a \emph{node}. The set of nodes is denoted $N$. It is
convenient to view the game in ``agent-normal form'': At each node there is
a different agent, and a player consists of a number of agents with
identical payoff functions. For each node $i\in N$, the agent there---called
``agent $i$''---has a set of choices $A^{i}$, which is the set of outgoing
branches at $i$. We refer to $a^{i}$ in $A^{i}$ as an \emph{action }of $i$,
and we put $A:=\prod_{i\in N}A^{i}$ for the set of $N$-tuples of actions. At
each terminal vertex (a \emph{leaf}) there are associated payoffs to all
agents; let\footnote{$\Bbb{R}$ is the real line.} $u^{i}:A\rightarrow \Bbb{R}
$ be the resulting payoff function of agent $i$ (i.e., for each $a=\left(
a^{j}\right) _{j\in N}\in A$: if there are no chance moves, then $u^{i}(a)$
is the payoff of $i$ at the leaf that is reached when every agent $j\in N$
chooses $a^{j}$; if there are chance moves, it is the appropriate
expectation). Of course, if $i$ and $j$ are agents of the same player, then $%
u^{i}\equiv u^{j}$. As usual, the payoff functions are extended
multi-linearly to \emph{randomized }(or \emph{mixed}) actions; thus $%
u^{i}:X\rightarrow \Bbb{R}$, where $X:=\prod_{i\in N}X^{i}$ and $%
X^{i}:=\Delta (A^{i})=\left\{ x^{i}\in \Bbb{R}_{+}^{A^{i}}:\sum_{a^{i}\in
A^{i}}x_{a^{i}}^{i}=1\right\} $, the unit simplex on $A^{i}$, is the set of
probability distributions over $A^{i}$.

For each node $i\in N$, let $N(i)$ be the set of nodes that are successors
(not necessarily immediate) of $i$ in the tree, and let $\Gamma (i)$ be the
subgame of $\Gamma $ starting at the node $i$. For example, if $1\in N$ is
the root then $N(1)=N\backslash \{1\}$ and $\Gamma (1)=\Gamma $; in general, 
$j\in N(i)$ if and only if the unique path from the root to $j$ goes through 
$i$, and the set of nodes of $\Gamma (i)$ is $N(i)\cup \{i\}$.

An $N$-tuple of randomized actions $x=\left( x^{i}\right) _{i\in N}\in X$ is
a \emph{Nash equilibrium} of\emph{\ }$\Gamma $ if\footnote{%
We write $x^{-i}$ for the $\left( \left| N\right| -1\right) $-tuple of
actions of the other agents, i.e., $x^{-i}=\left( x^{j}\right) _{j\in
N\backslash \{i\}}.$} $u^{i}(x)\geq u^{i}(y^{i},x^{-i})$ for every $i\in N$
and every $y^{i}\in X^{i}$. It is moreover a \emph{subgame-perfect }(or 
\emph{backward induction}) \emph{equilibrium }of $\Gamma $ if it is a Nash
equilibrium in each subgame $\Gamma (i)$, for all $i\in N$. This is
equivalent to each $x^{i}$ being a best-reply of $i$ in $\Gamma (i)$ when
every $j\in N(i)$ plays $x^{j}$. Such an equilibrium is therefore obtained
by \emph{backward induction, }starting from the final nodes (those nodes $i$
with no successors, i.e., with $N(i)=\emptyset $) and going towards the
root. We will denote by $EQ$ and $BI$ the set of Nash equilibria and the set
of backward induction equilibria, respectively, of the game $\Gamma $; thus $%
BI\subset EQ\subset X$.

At this point it is useful to point out the distinction between a best-reply
of $i$ in the whole game $\Gamma $---which we call a \emph{global best-reply}%
---and a best-reply of $i$ in the subgame $\Gamma (i)$---which we call a 
\emph{local best-reply. }Thus a local best-reply is always a global
best-reply, but the converse is not necessarily true.\footnote{%
One should not confuse this with the parallel notions for optima (where
global implies local).} If $i$ is reached (i.e., when all agents on the path
from the root to $i$ make the choice along the path with positive
probability), then the two notions coincide. If $i$ is not reached, then the
payoff of $i$ in $\Gamma $ is independent of his action, and thus every
action in $A^{i}$ (and every mixed action in $X^{i})$ is a global best-reply
of $i$ (but not necessarily a local best-reply). The difference between a
Nash equilibrium and a subgame-perfect equilibrium is precisely that in the
former each action of an agent that is played with positive probability is a
global best-reply to the others' (mixed) actions, whereas in the latter it
is additionally a local best-reply.

The classical result of Kuhn [1953] states that there always exists a \emph{%
pure} backward induction equilibrium; the proof constructs it by backward
induction. We assume here that the game $\Gamma $ has a \emph{unique }%
backward induction equilibrium, which must therefore be pure; we denote it%
\emph{\ }$b=\left( b^{i}\right) _{i\in N}\in A$, and refer to $b^{i}$ as the
``backward induction action of $i$.'' This uniqueness is true generically,
i.e., for almost every game. For instance, when there are no chance moves,
it suffices for each player to have different payoffs at different leaves.

\subsection{The Gene-Normal Form\label{ss-gene-normal}}

We now consider a \emph{population game} associated to $\Gamma $: At each
node $i\in N$ there is a non-empty population $M(i)$ of individuals playing
the game in the role of player $i$. The populations are assumed to be
distinct (i.e., $M(i)\cap M(j)=\emptyset $ for $i\neq j$). Each individual $%
q\in M(i)$ plays a pure action in $A^{i}$, which we denote by $\omega
_{q}^{i}\in A^{i}$; put $\omega ^{i}=\left( \omega _{q}^{i}\right) _{q\in
M(i)}$ and $\omega =\left( \omega ^{i}\right) _{i\in N}$. For each $a^{i}\in
A^{i}$, let\footnote{$\left| Z\right| $ denotes the number of elements of
the finite set $Z.$} 
\begin{equation}
x_{a^{i}}^{i}\equiv x_{a^{i}}^{i}\left( \omega ^{i}\right) :=\frac{\left|
\left\{ q\in M(i):\omega _{q}^{i}=a^{i}\right\} \right| }{\left| M(i)\right| 
}  \label{eq-f}
\end{equation}
be the proportion of population $M(i)$ that plays the action $a^{i}$; then $%
x^{i}\equiv x^{i}\left( \omega ^{i}\right) :=\left( x_{a^{i}}^{i}\left(
\omega ^{i}\right) \right) _{a^{i}\in A^{i}}\in X^{i}$ may be viewed as a
mixed action of $i$. The payoff of an individual $q\in M(i)$ is defined as
his average payoff against the other populations, i.e., $u^{i}(\omega
_{q}^{i},x^{-i})$; we shall slightly abuse notation by writing this as $%
u^{i}(\omega _{q}^{i},\omega ^{-i})$.
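The proportion formula (\ref{eq-f}) is just a frequency count over the population; a minimal sketch (illustrative names, not the paper's) follows.

```python
from collections import Counter

# mixed_action computes x^i(omega^i): the fraction of population M(i)
# playing each action in A^i.
def mixed_action(omega_i, actions):
    counts = Counter(omega_i)
    m = len(omega_i)
    return {a: counts[a] / m for a in actions}

# A population of size m = 4 at node i, with A^i = {"b", "c"}:
x_i = mixed_action(["b", "b", "c", "b"], ["b", "c"])  # {"b": 0.75, "c": 0.25}
```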

We refer to the above model as the \emph{gene-normal form of }$\Gamma $\emph{%
\ }(it is the counterpart, in population games, of the ``agent-normal
form'').

This model is clear and needs no explanation when all the players in $\Gamma 
$ are distinct (i.e., each player plays at most once in $\Gamma $). When
however a player may play more than once (and thus has more than one agent),
then the ``biological'' interpretation is as follows: Each one of the
player's decisions (i.e., each one of his agents $i$) is controlled by a
``gene,'' whose various ``alleles'' correspond to the possible choices at
node $i$ (i.e., the set of alleles of gene $i$ is precisely $A^{i})$. The
genes of different nodes $i$ and $j$ of the same player are distinct (i.e.,
at different locations, or ``loci''); were it the same gene, then the player
would behave identically at the two nodes---and the appropriate
representation should have the two nodes $i$ and $j$ in the same information
set.\footnote{%
One would thus get a game of imperfect information, where information sets
are not necessarily singletons. Moreover, observe that here a path may
intersect an information set more than once; we are thus led naturally to
``games of imperfect recall.''}$^{,}$\footnote{%
A case where, say, the decision at $i$ is controlled by the two genes in
locations $1$ and $2,$ and the decision at $j$ by the two genes in locations 
$2$ and $3,$ is not allowed here.} For a discussion of these issues and
their import, the reader is referred to Subsection \ref{ss-agents}.

\subsection{The Dynamics\label{ss-dynamics}}

We come now to the dynamic model. A \emph{state }$\omega $ of the system
specifies the pure action of each individual in each population; i.e., $%
\omega =\left( \omega ^{i}\right) _{i\in N}$, where $\omega ^{i}=\left(
\omega _{q}^{i}\right) _{q\in M(i)}$ and $\omega _{q}^{i}\in A^{i}$ for each 
$i\in N$ and each $q\in M(i)$. Let $\Omega :=\prod_{i\in N}\left(
A^{i}\right) ^{M(i)}$ be the state space. We consider discrete-time
stochastic systems: Starting with an initial state\footnote{%
As we shall see below, the process is ergodic; thus, in the long run, the
starting state does not matter. Hence there will be no need to specify it.} $%
\omega _{1}\in \Omega $, a sequence of states $\omega _{1},\omega
_{2},...,\omega _{t},...$ in $\Omega $ is generated, according to certain
probabilistic rules. These so-called ``transition probabilities'' specify,
for each $t=1,2,...,$ the probabilities $P\left[ \omega _{t+1}=\widetilde{%
\omega }\mid \omega _{1},...,\omega _{t}\right] $ that $\omega _{t+1}$
equals a state $\widetilde{\omega }\in \Omega $, given the history $\omega
_{1},\omega _{2},...,\omega _{t}$. Our processes will be \emph{stationary
Markov chains}: The transition probabilities depend only on $\omega _{t}$,
the state in the previous period (and depend neither on the other past
states $\omega _{1},...,\omega _{t-1}$, nor on the ``calendar time'' $t$).
That is, there is a stochastic matrix\footnote{%
I.e., $Q\left[ \widetilde{\omega }\mid \omega \right] \geq 0$ for all $%
\widetilde{\omega },\omega \in \Omega $ and $\sum_{\widetilde{\omega }\in
\Omega }Q\left[ \widetilde{\omega }\mid \omega \right] =1$ for every $\omega
\in \Omega .$} $Q=\left( Q\left[ \widetilde{\omega }\mid \omega \right]
\right) _{\widetilde{\omega },\omega \in \Omega }$ such that $P\left[ \omega
_{t+1}=\widetilde{\omega }\mid \omega _{1},...,\omega _{t}\right] =Q\left[ 
\widetilde{\omega }\mid \omega _{t}\right] $ for every $\omega ,\widetilde{%
\omega }\in \Omega $ and $t=1,2,...$. The matrix $Q$ is called \emph{the
one-step transition probability matrix.}

We present first a simple dynamic model, which we call the \emph{basic model}%
. Assume that all populations are of equal size, say $m=\left| M(i)\right| $
for each $i\in N$. Let $\mu >0$ and $\sigma >0$ be given, such that $\mu
+\sigma \leq 1$. The one-step transition probabilities $Q\left[ \widetilde{%
\omega }\mid \omega \right] $ are given by the following process, performed
independently for each $i\in N$:

\begin{itemize}
\item  Choose an individual $q(i)\in M(i)$ at random: All $m$ individuals in 
$M(i)$ have the same probability $1/m$ of being chosen.

\item  Put $\widetilde{\omega }_{q}^{i}:=\omega _{q}^{i}$ for each $q\in
M(i),q\neq q(i)$; i.e., all individuals in $M(i)$ except $q(i)$ do not
change their actions.

\item  Choose one of SE$\left( i\right) $ (``selection''), MU$\left(
i\right) $ (``mutation'') and NC$\left( i\right) $ (``no change''), with
probabilities $\sigma ,\mu $ and $1-\mu -\sigma $, respectively.

\item  If selection SE$\left( i\right) $ was chosen, then define 
\begin{equation}
B^{i}\equiv B^{i}\left( q(i),\omega \right) :=\left\{ a^{i}\in
A^{i}:u^{i}(a^{i},\omega ^{-i})>u^{i}(\omega _{q(i)}^{i},\omega
^{-i})\right\} ;  \label{eq-Bi}
\end{equation}
this is the set of actions of agent $i$ that are strictly better in $\Gamma 
$, against the populations at the other nodes, than the action $\omega
_{q(i)}^{i}$ of the chosen individual $q(i)$. If $B^{i}$ is not empty, then the
new action $\widetilde{\omega }_{q(i)}^{i}$ of $q(i)$ is a randomly chosen
better action: $\widetilde{\omega }_{q(i)}^{i}:=a^{i}$ with probability $%
1/\left| B^{i}\right| $ for each $a^{i}\in B^{i}$. If $B^{i}$ is empty, then
there is no change in $q(i)$'s action: $\widetilde{\omega }%
_{q(i)}^{i}:=\omega _{q(i)}^{i}$.

\item  If mutation MU$\left( i\right) $ was chosen, then $\widetilde{\omega }%
_{q(i)}^{i}$ is a random action in $A^{i}$; i.e., $\widetilde{\omega }%
_{q(i)}^{i}:=a^{i}$ with probability $1/\left| A^{i}\right| $ for each $%
a^{i}\in A^{i}$.

\item  If no-change NC$\left( i\right) $ was chosen, then the action of $%
q(i) $ does not change: $\widetilde{\omega }_{q(i)}^{i}:=\omega _{q(i)}^{i}$.
\end{itemize}

For example, in the game $\Gamma _{1}$ of Subsection \ref{ss-examples} (see
Figure \ref{ex1}; here $N=\{1,2\},A^{1}=\{c^{1},b^{1}\},A^{2}=\{c^{2},b^{2}%
\} $), with populations of size $m=3$, let $\omega =\left(
(c^{1},c^{1},c^{1}),(b^{2},b^{2},c^{2})\right) $ and $\widetilde{\omega }%
=\left( \left( b^{1},c^{1},c^{1}\right) ,\left( b^{2},b^{2},b^{2}\right)
\right) $, then $Q\left[ \widetilde{\omega }\mid \omega \right] =(1/3)\cdot
(\mu /2+\sigma )\cdot (1/3)\cdot \left( \mu /2\right) $. Indeed, the
probability that $q(1)=1$ is $1/3$; then $c^{1}$ changes to $b^{1}$ either
by mutation, with probability $\mu \cdot (1/2)$, or by selection (since $%
B^{1}=\{b^{1}\}$), with probability $\sigma $; similarly, the probability
that $q(2)=3$ is $1/3$, and then $c^{2}$ changes to $b^{2}$ by mutation only
(since $B^{2}=\emptyset $), with probability $\mu \cdot (1/2)$.
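The itemized process above can be sketched directly. In the following (hypothetical helper names, not the paper's), $u_i$ stands for agent $i$'s expected payoff in $\Gamma$ against the current mixed actions of the other populations, so the comprehension computes the set $B^{i}$ of (\ref{eq-Bi}).

```python
import random

def one_step(omega_i, actions, u_i, mu, sigma, rng):
    """One period of the basic model at node i: pick q(i) uniformly, then
    apply selection (prob. sigma), mutation (prob. mu), or no change."""
    q = rng.randrange(len(omega_i))          # q(i): uniform over M(i)
    new = list(omega_i)
    r = rng.random()
    if r < sigma:                            # selection SE(i)
        current = u_i(omega_i[q])
        better = [a for a in actions if u_i(a) > current]  # the set B^i
        if better:
            new[q] = rng.choice(better)      # uniform over B^i
    elif r < sigma + mu:                     # mutation MU(i)
        new[q] = rng.choice(actions)         # uniform over A^i
    # otherwise NC(i): no change
    return new
```

For instance, with $\sigma = 1$ and $\mu = 0$, a ``c''-player facing $u_i(\text{b}) > u_i(\text{c})$ switches to ``b'' with certainty; performing `one_step` independently at every node $i$ gives one transition of the chain.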

A few remarks are now in order.\medskip

\textbf{Remarks.}

\begin{enumerate}
\item  We have assumed that in each period there is at most one individual
in each population that may change his action. This defines what is meant by
``one period'': It is that time interval which is small enough that the
probability of more than one individual changing his action in the same
period is (relatively) negligible.\footnote{%
Our results do not need simultaneous mutations; they may indeed be ignored.}
This is a standard construct in stochastic setups (recall, for instance, the
construction of the Poisson distribution as an approximation to the binomial
distribution). Our assumption may thus be viewed as essentially nothing more
than a convenient rescaling of time; if one expects, say, $k$ changes each
period, then time should be rescaled by a factor of $1/k$. As we shall see
below (in particular in Subsection \ref{ss-intuition}), our arguments are
based on \emph{comparing} occurrence times, and are thus independent of the
units in which time is measured.

\item  The difference between mutation and selection is that mutation is
``blind''---in the sense that \emph{all} actions are possible---whereas
selection is ``directional''---\emph{only better} actions are possible. We
emphasize that ``better'' is understood---as it should be---with respect to
the payoffs in the whole game, i.e., ``globally better.'' Of course,
``selection'' may stand for various processes of adaptation, imitation,
learning, experimentation, and so on. Our model abstracts away from
particular selection mechanisms, and merely assumes that better actions fare
better. See also Subsection \ref{ss-extensions} for the case where the
selection probability of a better action is proportional to its current
proportion in the population.

\item  The basic dynamic is determined by two parameters,\footnote{%
Once the game $\Gamma $ and the population size $m$ are given.} $\mu $ and $%
\sigma $. As we shall see below, what really matters is that $\mu $ be small
relative to $\sigma $; formally, $\mu \rightarrow 0$ while $\sigma >0$ is
fixed: Mutations are rare relative to selection. Equivalently, we could well
take $\sigma =1-\mu $, and thus have only one parameter. We have preferred
to add the no-change case since it allows for more general interpretations.
The no-change periods may be viewed as ``payoff accumulation'' periods, or
as ``selection realization'' periods (i.e., periods during which the actual
selection occurs\footnote{%
This may help justify the fact that our selection mechanism is not
continuous (any ``better'' action has probability bounded away from zero,
whereas an ``equally good'' action has zero probability): Indeed, selection
makes even a slightly better action ``win,'' given enough time. See also
Subsection \ref{ss-extensions}.}).

\item  The one-step transition probabilities were defined to be independent
over the agents; this just means that the transitions are \emph{%
conditionally independent. }In general, the evolution of one population will
depend substantially on that of other populations.
\end{enumerate}

The basic model is essentially the simplest model that captures the
evolutionary paradigm of selection and mutation. It may, however, appear
too specific. Therefore we now present a general class of dynamic models,
which will turn out to lead to the same results.

The \emph{general model} is as follows: We are given a \emph{mutation rate}
parameter $\mu >0$ and populations $M(i)$ at all nodes $i\in N$, which may
be of different size at different nodes. The process is a stationary Markov
chain, whose one-step transition probability matrix $Q=\left( Q\left[ 
\widetilde{\omega }\mid \omega \right] \right) _{\widetilde{\omega },\omega
\in \Omega }$ satisfies:

\begin{itemize}
\item  Conditional independence over $i\in N$, i.e.,\footnote{%
For each $\omega \in \Omega ,$ we view $Q\left[ \cdot \mid \omega \right] $
as a probability distribution over $\Omega $; derived probabilities---like
its marginals---will also be denoted by $Q\left[ \cdot \mid \omega \right] .$%
} 
\begin{equation}
Q\left[ \widetilde{\omega }\mid \omega \right] =\prod_{i\in N}Q\left[ 
\widetilde{\omega }^{i}\mid \omega \right] .  \label{eq-independence}
\end{equation}

\item  For each $i\in N$, one individual $q(i)\in M(i)$ is chosen, such that
there exist constants $\gamma _{1},\gamma _{2}>0$ with 
\begin{equation}
\frac{\gamma _{1}}{\left| M(i)\right| }\leq Q\left[ q(i)=q\mid \omega
\right] \leq \frac{\gamma _{2}}{\left| M(i)\right| }\text{ for each }q\in
M(i);\text{ and}  \label{eq-M}
\end{equation}
\begin{equation}
Q\left[ \widetilde{\omega }_{q}^{i}=\omega _{q}^{i}\text{ for all }q\in
M(i)\backslash \{q(i)\}\mid \omega \right] =1.  \label{eq-q}
\end{equation}

\item  There exists a constant $\beta >0$ such that, for each $i\in N$, 
\begin{equation}
Q\left[ \widetilde{\omega }_{q(i)}^{i}=a^{i}\mid \omega \right] \geq \beta 
\text{ for each }a^{i}\in B^{i},  \label{eq-beta}
\end{equation}
where $B^{i}\equiv B^{i}(q(i),\omega )$ is the set of strictly better
actions, as defined in (\ref{eq-Bi}).

\item  There exist constants $\alpha _{1},\alpha _{2}>0$ such that, for each 
$i\in N$, 
\begin{eqnarray}
Q\left[ \widetilde{\omega }_{q(i)}^{i}=a^{i}\mid \omega \right] &\geq
&\alpha _{1}\mu \text{ for each }a^{i}\in A^{i};\text{ and}  \label{eq-mu} \\
Q\left[ \widetilde{\omega }_{q(i)}^{i}=a^{i}\mid \omega \right] &\leq
&\alpha _{2}\mu \text{ for each }a^{i}\notin B^{i},a^{i}\neq \omega
_{q(i)}^{i}.  \label{eq-mu2}
\end{eqnarray}
\end{itemize}

Without loss of generality, all parameters $\alpha _{1},\alpha _{2},\beta
,\gamma _{1},\gamma _{2}$ are taken to be the same for all $i\in N$ (if
needed, replace them by the appropriate maximum or minimum over $i)$. To see
that the basic model is a special case of the general model, take $\gamma
_{1}=\gamma _{2}=1,\beta =\sigma /\left| A^{i}\right| $ and $\alpha
_{1}=\alpha _{2}=1/\left| A^{i}\right| $.

The general model thus assumes that: (i) the probabilities of various
individuals in the same population being chosen are comparable;\footnote{%
I.e., the ratios $Q\left[ q(i)=q\mid \omega \right] /Q\left[ q(i)=q^{\prime
}\mid \omega \right] $ are uniformly bounded.} (ii) the effect of
selection---towards better actions---is bounded away from zero
(independently of $\mu $); and (iii) the effect of mutation---with every
action being possible---is of the order $\mu $. The reader is referred to
Subsections \ref{ss-selection} and \ref{ss-extensions} for generalizations.

\section{The Results\label{s-results}}

\subsection{Preliminary Results\label{ss-preliminary-result}}

A general model with a one-step transition matrix $Q$ satisfying (\ref
{eq-independence})--(\ref{eq-mu2}) yields a Markov chain which is \emph{%
ergodic}, since the probability of reaching any state $\omega ^{\prime }\in
\Omega $ from any other state $\omega \in \Omega $ is positive (this follows
from (\ref{eq-M}) and (\ref{eq-mu}), by using an appropriate sequence of
mutations). Therefore there exists a unique \emph{invariant distribution} $%
\pi $ on $\Omega $; i.e., a unique $\pi \in \Delta (\Omega )$ satisfying $%
\pi =\pi Q$, or 
\[
\pi \left[ \widetilde{\omega }\right] =\sum_{\omega \in \Omega }\pi \left[
\omega \right] Q\left[ \widetilde{\omega }\mid \omega \right] 
\]
for every $\widetilde{\omega }\in \Omega $. The long-run behavior of the
process is well described by $\pi $, in the following two senses:

\begin{itemize}
\item  The relative frequency, in any long enough period of time, of the
process visiting a state $\omega $, is approximately $\pi \left[ \omega
\right] $; i.e., for every $\omega \in \Omega $: 
\[
\lim_{T_{2}-T_{1}\rightarrow \infty }\frac{\left| \left\{ t:T_{1}<t\leq
T_{2},\omega _{t}=\omega \right\} \right| }{T_{2}-T_{1}}=\pi \left[ \omega
\right] . 
\]

\item  The probability that the process is in state $\omega $ at some period 
$t$ is approximately $\pi \left[ \omega \right] $ for large $t$; i.e., for
every $\omega \in \Omega $: 
\[
\lim_{t\rightarrow \infty }P\left[ \omega _{t}=\omega \right] =\pi \left[
\omega \right] . 
\]
\end{itemize}

\noindent Note that the two properties hold regardless of the initial state;
moreover, they hold also for any set of states $\Theta \subset \Omega $.
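Since $\Omega $ is finite and the chain ergodic, $\pi $ can be computed numerically. The following generic sketch (a power iteration on an illustrative two-state chain, not the paper's $Q$) finds the unique $\pi $ with $\pi =\pi Q$ for a row-stochastic matrix.

```python
# Power iteration for the invariant distribution pi = pi Q of an ergodic
# chain: repeatedly apply Q on the right to any initial distribution.

def invariant_distribution(Q, iterations=10_000):
    n = len(Q)
    pi = [1.0 / n] * n                       # any starting distribution works
    for _ in range(iterations):
        pi = [sum(pi[w] * Q[w][v] for w in range(n)) for v in range(n)]
    return pi

# A two-state ergodic chain (illustrative numbers):
Q = [[0.9, 0.1],
     [0.2, 0.8]]
pi = invariant_distribution(Q)               # approximately [2/3, 1/3]
```

The detailed-balance check $\pi _{0}\cdot 0.1=\pi _{1}\cdot 0.2$ confirms the limit $(2/3,1/3)$; ergodicity is exactly what guarantees that the starting distribution does not matter.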

We are interested in the behavior of the process when the mutation rate is
low; i.e., in the limit of the invariant distribution $\pi $ as $\mu
\rightarrow 0$ and all the other parameters (the game $\Gamma $, the
population sizes $\left| M(i)\right| $ and the constants $\alpha _{1},\alpha
_{2},\beta ,\gamma _{1},\gamma _{2})$ are fixed. We call a state $\omega \in
\Omega $ \emph{evolutionarily stable} if its invariant probability $\pi
\left[ \omega \right] $ is bounded away from zero as $\mu \rightarrow 0$;
i.e., if $\lim \inf_{\mu \rightarrow 0}\pi \left[ \omega \right] >0$. Recall
that each state $\omega \in \Omega $ may be viewed as an $N$-tuple of mixed
actions $x\left( \omega \right) =\left( x^{i}\left( \omega ^{i}\right)
\right) _{i\in N}\in X$ (see (\ref{eq-f})). The invariant distribution $\pi $
on $\Omega $ therefore induces a probability distribution\footnote{$\left(
x\right) ^{-1}:X\rightarrow \Omega $ denotes the inverse of the mapping $%
x:\Omega \rightarrow X.$} $\widehat{\pi }:=\pi \circ \left( x\right) ^{-1}$
over $X$; i.e., $\widehat{\pi }\left[ Y\right] :=\pi \left[ \left\{ \omega
\in \Omega :x(\omega )\in Y\right\} \right] $ for every (measurable) $%
Y\subset X$. We therefore call an $N$-tuple of mixed actions $x\in X$ \emph{%
evolutionarily stable} if there are evolutionarily stable states $\omega \in
\Omega $ with $x(\omega )=x$; i.e., if $\lim \inf_{\mu \rightarrow 0}%
\widehat{\pi }\left[ x\right] >0$. The following result states that only
Nash equilibria can be evolutionarily stable, and that the backward
induction equilibrium $b$ is always so.

\begin{theorem}
\label{th-mu}For each $\mu >0$, let $\pi _{\mu }$ be the unique invariant
distribution of a dynamic process given by a one-step transition matrix $%
Q\equiv Q_{\mu }$ satisfying (\ref{eq-independence})--(\ref{eq-mu2}). Then 
\begin{eqnarray*}
\lim_{\mu \rightarrow 0}\widehat{\pi }_{\mu }\left[ EQ\right] &=&1,\text{ and%
} \\
\stackunder{\mu \rightarrow 0}{\lim \inf }\widehat{\pi }_{\mu }\left[
b\right] &>&0\text{.}
\end{eqnarray*}
\end{theorem}

%TCIMACRO{\TeXButton{Proof}{\proof}}
%BeginExpansion
\proof%
%EndExpansion
Assume without loss of generality that $Q_{\mu }\rightarrow Q_{0}$ and $\pi
_{\mu }\rightarrow \pi _{0}$ as $\mu \rightarrow 0$ (take a convergent
subsequence if needed; recall that the state space $\Omega $ is finite and
fixed). The invariance property $\pi _{\mu }=\pi _{\mu }Q_{\mu }$ becomes $%
\pi _{0}=\pi _{0}Q_{0}$ in the limit; thus $\pi _{0}$ is an invariant
distribution of $Q_{0}$ (but $Q_{0}$ is in general not ergodic, so its
invariant distribution is not unique). Now $Q_{0}$ allows no mutations (by (%
\ref{eq-mu2}) and $\mu \rightarrow 0$), so only selection applies. Therefore 
$\omega \in \Omega $ is a Nash equilibrium state (i.e., $x(\omega )\in EQ$)
if and only if $\omega $ is an absorbing state of $Q_{0}$ (i.e., $%
Q_{0}\left[ \omega \mid \omega \right] =1$); denote by $\Omega _{EQ}$ the
subset (of $\Omega $) of all such states. To prove that an absorbing state
must always be reached---and thus $\pi _{0}\left[ \Omega _{EQ}\right] =1$%
---we shall show that $Q_{0}$ allows no cycles. Indeed, assume that $\omega
_{0},\omega _{1},...,\omega _{t},...,\omega _{T}\in \Omega $ satisfy $\omega
_{t}\neq \omega _{t-1}$ and $Q_{0}\left[ \omega _{t}\mid \omega
_{t-1}\right] >0$ for every $t=1,...,T$, and $\omega _{T}=\omega _{0}$. At a
final node $i\in N$ (i.e., with $N(i)=\emptyset $), selection can only
increase the sum of the ``local'' payoffs (in $\Gamma (i)$) of the
population $M(i)$, i.e.,\footnote{%
We write $u_{\Gamma (i)}^{i}$ for the payoff function of $i$ in the subgame $%
\Gamma (i)$.} $\sum_{q\in M(i)}u_{\Gamma (i)}^{i}\left( \omega
_{q}^{i}\right) $; therefore $\omega _{T}^{i}=\omega _{0}^{i}$ implies that $%
\omega _{t}^{i}=\omega _{0}^{i}$ for all $t$; thus the population at $i$
never moves. The same applies at any node $i$ for which there were no
changes at all its descendant nodes $N(i)$; backward induction thus yields $%
\omega _{t}^{i}=\omega _{0}^{i}$ for all $t$ and all $i\in N$, a
contradiction.

Next, we shall show that $\widehat{\pi }_{0}\left[ b\right] >0$. First, we
claim that, given two Nash equilibrium states $\omega ,\omega ^{\prime }\in
\Omega _{EQ},$ if $\pi _{0}\left[ \omega \right] >0$ and $\omega ^{\prime }$
can be reached from $\omega $ by one mutation step in one population
followed by any number of selection steps, then $\pi _{0}\left[ \omega
^{\prime }\right] >0$ too. Indeed, the invariance property $\pi _{\mu }=\pi
_{\mu }Q_{\mu }$ implies\footnote{$Q^{k},$ the $k$-th power of the one-step
transition probability matrix $Q,$ gives precisely the $k$-steps transition
probabilities.} $\pi _{\mu }=\pi _{\mu }Q_{\mu }^{k}$ for any integer $k\geq
1$, and thus 
\[
\pi _{\mu }\left[ \omega ^{\prime }\right] \geq \pi _{\mu }\left[ \omega
^{\prime }\right] Q_{\mu }^{k}\left[ \omega ^{\prime }\mid \omega ^{\prime
}\right] +\pi _{\mu }\left[ \omega \right] Q_{\mu }\left[ \omega ^{\prime
\prime }\mid \omega \right] Q_{\mu }^{k-1}\left[ \omega ^{\prime }\mid
\omega ^{\prime \prime }\right] , 
\]
where $\omega ^{\prime \prime }$ satisfies $Q_{\mu }\left[ \omega ^{\prime
\prime }\mid \omega \right] \geq c_{1}\mu $ for some $c_{1}>0$ (by (\ref
{eq-M}) and (\ref{eq-mu}); this is the mutation step) and $Q_{0}^{k-1}\left[
\omega ^{\prime }\mid \omega ^{\prime \prime }\right] =c_{2}>0$ (these are
the selection steps; thus $Q_{0}$ rather than $Q_{\mu }$). Also, since $%
\omega ^{\prime }$ is a Nash equilibrium state, it can change only by
mutations, so $Q_{\mu }^{k}\left[ \omega ^{\prime }\mid \omega ^{\prime
}\right] \geq 1-c_{3}\mu $ for an appropriate constant $c_{3}>0$ (by (\ref
{eq-mu2})). Therefore $c_{3}\mu \pi _{\mu }\left[ \omega ^{\prime }\right]
\geq c_{1}\mu \pi _{\mu }\left[ \omega \right] Q_{\mu }^{k-1}\left[ \omega
^{\prime }\mid \omega ^{\prime \prime }\right] $, which, after dividing by $%
\mu $ and then letting $\mu \rightarrow 0$, yields 
\[
\pi _{0}\left[ \omega ^{\prime }\right] \geq \frac{c_{1}}{c_{3}}\pi
_{0}\left[ \omega \right] Q_{0}^{k-1}\left[ \omega ^{\prime }\mid \omega
^{\prime \prime }\right] =\frac{c_{1}c_{2}}{c_{3}}\pi _{0}\left[ \omega
\right] >0, 
\]
and thus $\pi _{0}\left[ \omega ^{\prime }\right] >0$.

Second, given a final node $i\in N$, we claim that there is a Nash
equilibrium state $\omega \in \Omega _{EQ}$ with $\pi _{0}\left[ \omega
\right] >0$, at which all the population at $i$ plays the backward induction
action $b^{i}$. If not, let $\omega \in \Omega _{EQ}$ be such that $\pi
_{0}\left[ \omega \right] >0$ and the proportion $x_{b^{i}}^{i}(\omega )$ of 
$b^{i}$ is maximal; thus $x_{b^{i}}^{i}(\omega )<1$. Consider a mutation of
a non-$b^{i}$-individual into $b^{i}$ (and no changes at all nodes $j\neq i$%
; recall that $\omega \in \Omega _{EQ}$), followed by any number of
selection periods until a state $\omega ^{\prime }\in \Omega _{EQ}$ is
reached. By the claim of the previous paragraph, we have $\pi _{0}\left[
\omega ^{\prime }\right] >0$. But $x_{b^{i}}^{i}(\omega ^{\prime
})>x_{b^{i}}^{i}(\omega )$, since the first mutation step increased this
proportion, and the selection steps could not have decreased it; this
contradicts our choice of $\omega $. Therefore there are Nash equilibrium
states $\omega \in \Omega _{EQ}$ with $\pi _{0}\left[ \omega \right] >0$ and 
$x_{b^{i}}^{i}(\omega )=1$. The same argument applies at any node $i\in N$
for which there are Nash equilibrium states $\omega \in \Omega _{EQ}$ with $%
\pi _{0}\left[ \omega \right] >0$ and $x_{b^{j}}^{j}(\omega )=1$ for all $%
j\in N(i)$ (just choose among these states one with maximal $%
x_{b^{i}}^{i}(\omega )$). Therefore, by backward induction, we get $\pi
_{0}\left[ \omega _{b}\right] >0$ for that state $\omega _{b}\in \Omega
_{EQ} $ where $x_{b^{i}}^{i}(\omega _{b})=1$ for all $i\in N$.%
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion

We note that the above proof shows that the result of Theorem \ref{th-mu}
holds also in the non-generic case where the backward induction equilibrium
is not unique (the second statement then applies to each $b\in BI$).

\subsection{The Main Result\label{ss-main-result}}

As the examples show, other equilibria besides the backward induction
equilibrium $b$ (including some that are very different from $b$) may be
evolutionarily stable when the populations are fixed. We now consider the
case where the populations increase, i.e., $\left| M(i)\right| \rightarrow
\infty $ for $i\in N$. Put $\mathbf{m=(}\left| M(i)\right| )_{i\in N}$ for
the vector of population sizes; we will refer to $\mathbf{m}$ as the \emph{%
population profile}. As $\mathbf{m}\rightarrow \infty $, the state space
changes and becomes infinite in the limit; we need therefore\footnote{%
The probability of a single point may become $0$ in the limit. For instance,
if $1/m$ is much smaller than $\mu $ (i.e., if $m\mu \rightarrow \infty $),
then we may well get $\widehat{\pi }[b]\rightarrow 0$ (consider even the
simplest $1$-person game: The transition probability from the state $\omega
_{0}$ where everyone plays $b,$ to the state $\omega _{1}$ where all but one
individual play $b,$ is of the order of $\mu ,$ whereas the transition from $%
\omega _{1}$ to $\omega _{0}$ has probability of the order of $1/m$).} to
consider (small) neighborhoods of $BI=\{b\}$ in the set of mixed actions $X$%
: For every $\varepsilon >0$, put $BI_{\varepsilon }:=\left\{ x\in
X:x_{b^{i}}^{i}\geq 1-\varepsilon \text{ for all }i\in N\right\} $. That is, 
$x$ belongs to the $\varepsilon $-neighborhood $BI_{\varepsilon }$ of $b$ if
most of the individuals, in all populations, play their backward induction
action; we emphasize that this holds for \emph{all} $i\in N$---whether node $%
i$ is reached or not.

\begin{theorem}[Main Theorem]
\label{th-main}For every mutation rate $\mu >0$ and population profile $%
\mathbf{m=(}\left| M(i)\right| )_{i\in N}$ with $m:=\min_{i\in N}\left|
M(i)\right| $, let $\pi _{\mu ,\mathbf{m}}$ be the unique invariant
distribution of a dynamic process given by a one-step transition matrix $%
Q\equiv Q_{\mu ,\mathbf{m}}$ satisfying (\ref{eq-independence})--(\ref
{eq-mu2}). Then, for every $\varepsilon >0$ and $\delta >0$, 
\[
\lim\Sb \mu \rightarrow 0  \\ \mathbf{m}\rightarrow \infty  \\ \mu m\geq
\delta  \endSb \widehat{\pi }_{\mu ,\mathbf{m}}\left[ BI_{\varepsilon
}\right] =1. 
\]
Moreover, there exists a constant $c$, depending on the game, on the
dynamics parameters $\alpha _{1},\alpha _{2},\beta ,\gamma _{1},\gamma _{2}$%
, and on $\varepsilon ,\delta ,$ such that\footnote{$E_{\mu ,\mathbf{m}}$
denotes the expectation with respect to the probability distribution $\pi
_{\mu ,\mathbf{m}}.$} 
\begin{equation}
E_{\mu ,\mathbf{m}}\left[ x_{b^{i}}^{i}(\omega )\right] \geq 1-c\mu \text{
for all }i\in N,\text{ and}  \label{eq-th-e}
\end{equation}
\begin{equation}
\pi _{\mu ,\mathbf{m}}\left[ x_{b^{i}}^{i}\left( \omega \right) \geq
1-\varepsilon \text{ for all }i\in N\right] \geq 1-c\mu ,  \label{eq-th-pi}
\end{equation}
for all $\mu >0$ and all $\mathbf{m=}\left( \left| M(i)\right| \right)
_{i\in N}$ with $\left| M(i)\right| \geq \delta /\mu $ for all $i\in N$.
\end{theorem}

Thus, when the mutation rate is low and the populations are large enough, the
proportion of each population $i$ that does not play the backward induction
action is small. Hence, in the long run, the dynamic system spends most of
its time in states where almost every individual plays his backward induction
action.\medskip
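This conclusion can be observed in simulation. The sketch below uses a hypothetical two-node game (illustrative payoffs, \emph{not} the paper's $\Gamma _{1}$) whose backward induction equilibrium is $b=(c^{1},b^{2})$; it runs the basic dynamic with $\mu $ small relative to $\sigma $ and tracks the long-run average proportion of $b$-play, which should end up close to $1$.

```python
import random

# Hypothetical leaf payoffs (generic, so b is unique): agent 1 plays b1 or c1;
# after c1, agent 2 plays b2 or c2.  Backward induction gives b = (c1, b2).
U1 = {("b1", "b2"): 1, ("b1", "c2"): 1, ("c1", "b2"): 2, ("c1", "c2"): 0}
U2 = {("b1", "b2"): 0, ("b1", "c2"): 0, ("c1", "b2"): 2, ("c1", "c2"): 0}

def payoff(i, a, other_pop):
    """Expected global payoff of action a at node i against the other
    population's current mixed action."""
    if i == 1:
        return sum(U1[(a, a2)] for a2 in other_pop) / len(other_pop)
    return sum(U2[(a1, a)] for a1 in other_pop) / len(other_pop)

def step(pop, i, other_pop, actions, mu, sigma, rng):
    """One period of the basic model at node i (modifies pop in place)."""
    q = rng.randrange(len(pop))
    r = rng.random()
    if r < sigma:                                    # selection over B^i
        better = [a for a in actions
                  if payoff(i, a, other_pop) > payoff(i, pop[q], other_pop)]
        if better:
            pop[q] = rng.choice(better)
    elif r < sigma + mu:                             # mutation over A^i
        pop[q] = rng.choice(actions)

rng = random.Random(7)
m, mu, sigma, T = 10, 0.05, 0.5, 20_000
pop1 = ["b1"] * m                                    # start away from b
pop2 = ["b2"] * (m // 2) + ["c2"] * (m - m // 2)
freq_b = 0.0
for _ in range(T):
    snapshot1 = list(pop1)                           # both depend on omega_t
    step(pop1, 1, pop2, ["b1", "c1"], mu, sigma, rng)
    step(pop2, 2, snapshot1, ["b2", "c2"], mu, sigma, rng)
    freq_b += (pop1.count("c1") + pop2.count("b2")) / (2 * m * T)
```

In line with the theorem, after an initial drift phase the system reaches and then stays near the state where everyone plays $b$, with deviants of order $\mu /\sigma $ created by mutation and removed by selection; `freq_b` is the empirical counterpart of $E_{\mu ,\mathbf{m}}[x_{b^{i}}^{i}(\omega )]$ averaged over nodes and time.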

\textbf{Remarks.}

\begin{enumerate}
\item  The only assumption made on the relative rates of convergence of $\mu 
$ and $m$ is that $\mu m$ is bounded away from $0.$ This implies that 
\begin{equation}
\lim\Sb \mu \rightarrow 0  \endSb \lim_{\mathbf{m}\rightarrow \infty }%
\widehat{\pi }_{\mu ,\mathbf{m}}\left[ BI_{\varepsilon }\right] =1
\label{eq-lim2}
\end{equation}
(however this need not hold for $\lim_{\mathbf{m}\rightarrow \infty
}\lim_{\mu \rightarrow 0}$).

\item  No assumptions are made on the relative population sizes $\left|
M\left( i\right) \right| $; one population may well be much larger than
another. However, the mutation rates in the different populations are
assumed to be of the same order of magnitude---see (\ref{eq-mu}) and (\ref
{eq-mu2}).

\item  The estimates we get in (\ref{eq-th-e}) and (\ref{eq-th-pi}) involve
the mutation rate $\mu $ but no higher powers of $\mu $ (unlike much of the
existing literature, in particular on evolutionary dynamics for games in
strategic form, where higher powers do appear). This means that the effect
of \emph{simultaneous}
mutations }(whose probability---a power of $\mu $---is relatively small) may
indeed be ignored. Thus our result does not rely on the fact that, when $\mu 
$ is small, $100$ simultaneous mutations are much more probable than $101$
simultaneous mutations (both of these events are extremely improbable).%
\footnote{%
In a sense, the comparison here is between different coefficients of $\mu $
(i.e., of $\mu $ to the power $1$), rather than between the first powers of $%
\mu $ with non-zero coefficient.} Our proof therefore does not use any of
the techniques based on ``counting mutations.''
\end{enumerate}

\section{Proof of the Main Theorem\label{s-proof}}

\subsection{An Informal Argument\label{ss-intuition}}

We begin by presenting informally the main ideas of the proof of our result;
in particular, we will explain the role of large populations. We do so for
the simpler basic model; the same arguments apply to the general model.

Clearly, if a node $i$ is reached (i.e., if at every node along the path
from the root to $i$ there are individuals playing the action that
corresponds to the path), then mutation MU($i$) at $i$ has probability of
the order of $\mu $, which is much smaller\footnote{%
We take the term ``$f$ is much smaller than $g$'' to mean that the ratio $%
f/g $ goes to $0$ as $\mu \rightarrow 0$ and $m\rightarrow \infty .$} than
the probability of selection SE($i$) at $i$. Therefore, at reached nodes
most of the individuals play their best-reply actions. (This is essentially
the argument of Theorem \ref{th-mu}.) The problem is how to get a similar
conclusion at the \emph{unreached }nodes.

Consider first the $3$-player game $\Gamma _{2}$ of Figure \ref{ex2} in the
Introduction. Assume that the dynamic system is in a state where all
individuals at nodes $1$ and $2$ play $b^{1}$ and $b^{2}$, respectively;
thus node $3$ is not reached. Then selection SE($3$) does not affect the
population at $3$; only mutation MU($3$) does. Mutation by itself will lead
in the long run to a distribution close to $(1/2,1/2)$ (since each
individual is eventually chosen, and then his action is replaced with equal
probabilities by $c^{3}$ and $b^{3}$). However, there are also \emph{%
mutations} \emph{at node} $2$ that yield a $c^{2}$ action, with a frequency
of $\mu /2$. After such a mutation, the probability that the action of the
mutant individual will revert back to $b^{2}$ is at most $\left( 1/m\right)
\rho $ (since his probability of being chosen is $1/m$; here $\rho =\sigma
+\mu /2$); thus, it will take on the average\footnote{%
An event whose probability is $p$ every period will occur on average $pT$
times during $T$ periods, or once every $1/p$ periods. In our arguments we
shall go back and forth between the two computations as needed.} $m/\rho $
periods for it to happen. Therefore, during a long stretch of time, say $T$
periods, the number of periods that there is a $c^{2}$ in population $2$ is
about $\left( \mu /2\right) (m/\rho )T=\mu mT/(2\rho )$. These are periods
at which $3$ is reached and thus selection SE($3$)---into $b^{3}$---is
effective. At the same time, mutation MU($3$) occurs at $3$ in roughly $\mu
T $ periods. Comparing the two implies that, when the population is large
(i.e., as $m\rightarrow \infty $), selection has a much greater effect than
mutation. Therefore, in the long run, we will get most of the population at
node $3$ playing $b^{3}$---even though $3$ is unreached most of the time.
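
In the notation above, the comparison at node $3$ can be summarized as
follows: out of $T$ periods, selection SE($3$) is effective in about $\left(
\mu /2\right) \left( m/\rho \right) T$ of them, while mutation MU($3$)
occurs in about $\mu T$ of them, so 
\[
\frac{\left( \mu /2\right) \left( m/\rho \right) T}{\mu T}=\frac{m}{2\rho }%
\rightarrow \infty \quad \text{as }m\rightarrow \infty . 
\]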

Consider next the $4$-person game $\Gamma _{4}$ of Figure \ref{ex4}%
%TCIMACRO{\TeXButton{BeginFigure-ex4}{\begin{figure}[tb] \centering}}
%BeginExpansion
\begin{figure}[tb] \centering%
%EndExpansion
%BeginExpansion
 
\unitlength 1.00mm

\begin{picture}(90.00,117.00)(5,3)
\linethickness{1.0pt}
\put(50.00,110.00){\line(1,-1){20.00}}
\put(70.00,90.00){\line(1,-1){20.00}}
\put(70.00,90.00){\line(-1,-1){20.00}}
\put(50.00,110.00){\circle*{1.50}}
\put(70.00,90.00){\circle*{1.50}}
\put(50.00,114.00){\makebox(0,0)[cc]{$1$}}
\put(72.00,93.00){\makebox(0,0)[cc]{$2$}}
\put(50.00,110.00){\line(-1,-1){20.00}}
\put(30.00,86.00){\makebox(0,0)[cc]{$0$}}
\put(30.00,81.00){\makebox(0,0)[cc]{$0$}}
\put(90.00,66.00){\makebox(0,0)[cc]{$1$}}
\put(90.00,61.00){\makebox(0,0)[cc]{$1$}}
\put(39.00,103.00){\makebox(0,0)[cc]{$c^1$}}
\put(62.00,103.00){\makebox(0,0)[cc]{$b^1$}}
\put(59.00,83.00){\makebox(0,0)[cc]{$c^2$}}
\put(82.00,83.00){\makebox(0,0)[cc]{$b^2$}}
\put(50.00,70.00){\line(1,-1){20.00}}
\put(50.00,70.00){\line(-1,-1){20.00}}
\put(50.00,70.00){\circle*{1.50}}
\put(48.00,73.00){\makebox(0,0)[cc]{$3$}}
\put(68.30,46.00){\makebox(0,0)[cc]{$-1$}}
\put(70.00,41.00){\makebox(0,0)[cc]{$0$}}
\put(39.00,63.00){\makebox(0,0)[cc]{$c^3$}}
\put(62.00,63.00){\makebox(0,0)[cc]{$b^3$}}
\put(30.00,76.00){\makebox(0,0)[cc]{$0$}}
\put(90.00,56.00){\makebox(0,0)[cc]{$1$}}
\put(70.00,36.00){\makebox(0,0)[cc]{$0$}}
\put(30.00,50.00){\line(1,-1){20.00}}
\put(30.00,50.00){\line(-1,-1){20.00}}
\put(30.00,50.00){\circle*{1.50}}
\put(28.00,53.00){\makebox(0,0)[cc]{$4$}}
\put(8.00,26.00){\makebox(0,0)[cc]{$-1$}}
\put(9.70,21.00){\makebox(0,0)[cc]{$5$}}
\put(48.00,26.00){\makebox(0,0)[cc]{$-1$}}
\put(48.00,21.00){\makebox(0,0)[cc]{$-1$}}
\put(19.00,43.00){\makebox(0,0)[cc]{$c^4$}}
\put(42.00,43.00){\makebox(0,0)[cc]{$b^4$}}
\put(9.70,16.00){\makebox(0,0)[cc]{$5$}}
\put(48.00,16.00){\makebox(0,0)[cc]{$-1$}}
\put(30.00,71.00){\makebox(0,0)[cc]{$0$}}
\put(90.00,51.00){\makebox(0,0)[cc]{$1$}}
\put(70.00,31.00){\makebox(0,0)[cc]{$0$}}
\put(8.00,11.00){\makebox(0,0)[cc]{$-1$}}
\put(49.70,11.00){\makebox(0,0)[cc]{$5$}}
\end{picture}
%
%EndExpansion
\caption{The game $\Gamma _4$\label{ex4}}%
%TCIMACRO{\TeXButton{EndFigure}{\end{figure}}}
%BeginExpansion
\end{figure}%
%EndExpansion
. Assume again that everyone plays $b^{i}$ at nodes $1$ and $2$, and thus
both $3$ and $4$ are not reached. In the same way as in the previous
example, we get the following: At node $4$, mutations MU($4$) occur with a
frequency of $\mu $, whereas selection SE($4$) there---which requires
mutations at \emph{both} nodes $2$ and $3$ in order for $4$ to be
reached---occurs with a frequency of $\left( \mu /2\right) ^{2}m/\left(
2\rho \right) =\mu ^{2}m/(8\rho )$ (indeed, the probability of a mutation at 
$2$ into $c^{2}$ is $\mu /2$; the same goes for a mutation into $c^{3}$ at $%
3 $; and then it takes about $\left( m/\rho \right) /2=m/\left( 2\rho
\right) $ periods until at least one of the mutants reverts back). But we
cannot say that $\mu ^{2}m$ is much larger than $\mu $ (we only assumed that 
$\mu m$ is bounded away from zero), so we cannot conclude that, at node $4$,
selection ``overpowers'' mutation. Without this happening at $4$, there is
no reason for the populations at the higher nodes (like $3$) to choose their
backward induction action either. Moreover, when a node is even further
away from the equilibrium path---say, $k$ nodes away---the previous
argument works only if $\mu ^{k}m$ is much larger than $\mu $.

A more careful analysis is thus called for at this point.

Let us consider the first time that there is a $c^{3}$ individual in
population $3$; this happens (by mutation) on the average once every $2/\mu $
periods. If, at that point, there is a $c^{2}$ action in population $2$,
then $4$ is reached and we are done. If not, then, as long as there is no $%
c^{2}$ in population $2$, node $3$ is not reached. Therefore, the only way
that the $c^{3}$ individual can revert back to $b^{3}$ is by mutation at $3$%
; the probability of that happening is $\left( 1/m\right) \left( \mu
/2\right) =\mu /\left( 2m\right) $ (since that individual must be chosen out
of $M(3)$, and then undergo mutation). At the same time, the probability of
getting a $c^{2}$ individual in population $2$---by mutation---is $\mu /2$.
Since $\mu /2$ is much larger than $\mu /\left( 2m\right) $ for large $m$,
the latter will typically happen much later than the former (and can
therefore be essentially ignored). Thus altogether we have to wait at most on the
order of $2/\mu $ periods for the mutation at $3$, and then another $2/\mu $
periods for the mutation at node $2$; in sum, $4/\mu $ periods until node $4$
is reached (compare this to $4/\mu ^{2}$ in the
previous---unsuccessful---argument). Once $4$ is reached, it takes on the
order of $m/\left( 2\rho \right) $ periods for either the $c^{2}$ or the $%
c^{3}$ individual to revert back, so selection SE($4$) operates at node $4$
with a frequency of approximately $\left( \mu /4\right) m/(2\rho )=\mu
m/(8\rho )$. When $m\rightarrow \infty $, this is much larger than the
mutation rate $\mu $; so, again, selection ``wins'' at\footnote{%
This argument clearly generalizes; for a node that is at distance $k$ from
the equilibrium path, the frequency of selection is of the order of $\mu
m/(2k^{2}\rho ),$ which, as $m\rightarrow \infty ,$ is much larger than the
mutation rate $\mu .$} $4$. Once this has been established, it follows that
most of the population at node $4$ plays $b^{4}$ most of the time, and we
are essentially\footnote{%
See the last paragraph in this subsection.} left with a $3$-player game
(like $\Gamma _{2}$); the proof is completed by (backward) induction.%
\footnote{%
Similar arguments apply to the $3$-player game $\Gamma _{3}$ of Figure \ref
{ex3} of the Introduction. When less than $90\%$ of population $3$ plays $%
b^{3},$ in order for $3$ to be reached one needs mutants at $2$ and at $1,$
and the computations are exactly as for node $4$ in $\Gamma _{4}.$ When the $%
90\%$ proportion of $b^{3}$ is exceeded, then $b^{2}$ becomes a best-reply
of $2,$ so a mutant is needed only at $1$---and the estimates are as for
node $3$ in $\Gamma _{2}.$}
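
Schematically, the expected waiting times in this sequential argument 
\emph{add} rather than \emph{multiply}: 
\[
\frac{2}{\mu }+\frac{2}{\mu }=\frac{4}{\mu }\quad \text{periods, rather
than }\frac{4}{\mu ^{2}}, 
\]
so selection SE($4$) operates with a frequency of the order of $\left( \mu
/4\right) m/(2\rho )=\mu m/(8\rho )$, which dominates the mutation rate $%
\mu $ as $m\rightarrow \infty $.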

The crux of the argument is that, after a mutation in population $3$
generated a $c^{3}$, this $c^{3}$ action is ``stuck'' there for a long time%
\footnote{%
It is a so-called ``neutral mutation'' that does not affect the payoffs.}%
---at least until a mutation generates a $c^{2}$ in population $2$. Thus,
what appeared to require \emph{simultaneous mutations} (with a frequency of
the order of $\mu ^{k}$ for some $k\geq 2$), turns out instead to rely on 
\emph{sequential mutations} (with a frequency of the order of $\mu $).%
\footnote{%
This kind of argument may also explain how matching mutations occur in
interacting populations; i.e., mutations that yield no advantage in their
own population, unless there are fitting mutations in the other populations.
The computations above show that, in large populations, the frequency of
such events may well be much higher than commonly thought: of the order of $%
\mu $ rather than a power of $\mu .$}

It should now be clear what role the large populations play: The smaller a
group of individuals is, the (relatively) less probable it is for a change
of action to occur in that group. This is particularly true when comparing a 
\emph{specific} individual (like the $c^{3}$ mutant in the analysis of game $%
\Gamma _{4}$, or the $c^{2}$ mutant in $\Gamma _{2}$), to \emph{any}
individual in a whole population (population $2$ in $\Gamma _{4}$, or
population $3$ in $\Gamma _{2}$).

Finally, to understand the use of the condition that $\mu m\geq \delta >0,$
note that the above arguments show that the effect of selection is of the
order of $\mu m$, whereas that of mutation is $\mu $. The possibility that a
sizeable fraction of the population does not play the backward induction
action, albeit an event of low probability for large $m$, is not negligible.
A simple estimate\footnote{%
Using Markov's inequality. More refined probabilistic computations may well
lead to weaker conditions.} shows this probability to be of the order of at
most $1/m$. When this event occurs at some descendant node, it may affect
selection at the current node---away from the backward induction action.
However, as long as $1/m$ is at most a constant times $\mu $---which is the
case when $\mu m\geq \delta >0$---this effect (like that of random
mutations) is again small relative to $\mu m$ for large $m.$
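
The order-of-magnitude comparison of this subsection can be checked
numerically; the following sketch is purely illustrative (the function
names and the value of $\rho $ are assumptions for illustration, not part
of the formal model):

```python
# Illustrative check of the selection-vs-mutation comparison at node 3
# of Gamma_2 (all names and numerical values here are assumptions).

def reached_fraction(mu, m, rho):
    """Approximate fraction of periods in which node 3 is reached:
    a c^2 mutant appears with frequency mu/2 and persists for about
    m/rho periods before reverting."""
    return (mu / 2.0) * (m / rho)

def selection_to_mutation_ratio(mu, m, rho):
    """Ratio of the frequency of effective selection SE(3) to the
    mutation frequency mu at node 3; this equals m/(2*rho)."""
    return reached_fraction(mu, m, rho) / mu

if __name__ == "__main__":
    rho = 0.6  # rho = sigma + mu/2; an assumed illustrative value
    for m in (10, 100, 10_000):
        mu = 1.0 / m  # keeps mu*m bounded away from 0, as in the theorem
        print(m, selection_to_mutation_ratio(mu, m, rho))
```

As $m$ grows (with $\mu m$ held fixed), the ratio $m/(2\rho )$ grows
linearly in $m$: selection overpowers mutation at the unreached node.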

\subsection{An Outline of the Proof\label{ss-outline}}

We now provide an outline that may help the reader follow the formal proof
given in the next subsection. The proof proceeds by backward induction,
starting from the final nodes and working back towards the root; see
Proposition \ref{p-1}. The main claims that are proved for each node $i$ are
as follows:

\begin{enumerate}
\item  The probability that $b^{i}$ is not the local best-reply of $i$ is
low (this happens only when a sizeable proportion of the population at some
descendant node $j$ does not choose $b^{j}$, which, by induction, has low
probability); in fact, this probability is of the order of $\mu $; see (\ref
{p1-1}).

\item  When $b^{i}$ is the local best-reply of $i$, the expected proportion
of population $i$ that does not play $b^{i}$ when $i$ is reached is small
(this holds since selection has probability at least $\beta $, which is
bounded away from $0$, while mutation has probability $\mu $); again, it is
of the order of $\mu $; see (\ref{p1-2}).

\item  When $b^{i}$ is the local best-reply of $i$, the ratio between the
expected proportion of population $i$ that does not play $b^{i}$ when $i$ is
reached, and the same expected proportion when $i$ is not reached, is of the
order of $\mu m$; see (\ref{eq-mu-m}). This is the central step in the
proof, and its essence is the ``sequential mutations'' argument above; see
Proposition \ref{p-z}. Together with the previous claim, it follows that the
expected proportion of population $i$ that does not play $b^{i}$ when $i$ is
not reached is of the order of $1/m$; see (\ref{p3-4}).

\item  Adding the above three estimates and noting that $1/m\leq \left(
1/\delta \right) \mu $ implies that the expected proportion of population $i$
that does not play $b^{i}$ is of the order of $\mu $, which yields (\ref
{eq-th-e}) and (\ref{eq-th-pi}).
\end{enumerate}
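
Schematically, the last step decomposes the expected proportion as follows
(using $Y^{i}\leq 1$ and $1/m\leq \mu /\delta $): 
\[
E\left[ Y^{i}\right] \leq E\left[ Y^{i}L^{i}R^{i}\right] +E\left[
Y^{i}L^{i}(1-R^{i})\right] +P\left[ L^{i}=0\right] =O\left( \mu \right)
+O\left( \frac{1}{m}\right) +O\left( \mu \right) =O\left( \mu \right) . 
\]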

\subsection{The Proof\label{ss-proof}}

We now prove the Main Theorem. Fix $\varepsilon ,\delta >0$, the mutation
rate $\mu >0$, the population profile $\mathbf{m=}\left( m^{i}\right) _{i\in
N}$ (with $\mu m\geq \delta $, where $m:=\min_{i\in N}m^{i}$) and the
transition probability matrix $Q$ that satisfies (\ref{eq-independence})--(%
\ref{eq-mu2}). Let $\pi $ be the resulting unique invariant distribution
over the state space $\Omega $. Take the state $\omega \in \Omega $ to be
distributed according to $\pi $, and let $\widetilde{\omega }\in \Omega $ be
the next state, given by the one-step transition probabilities $Q\left[ 
\widetilde{\omega }\mid \omega \right] $; then $\widetilde{\omega }$ is also
distributed according to $\pi $. From now on all probability statements and
expectations will be according to this distribution.

Before proceeding with the proof, we introduce a number of useful notations:

\begin{itemize}
\item  For each node $i\in N$, put $Y^{i}:=1-x_{b^{i}}^{i}(\omega )$ (the
mapping $x:\Omega \rightarrow X$ is given by (\ref{eq-f})); this is the
proportion of population $i$ that does \emph{not} play the backward
induction action in state $\omega $. Similarly, put $\widetilde{Y}%
^{i}:=1-x_{b^{i}}^{i}\left( \widetilde{\omega }\right) $ for the same
proportion in the next-period state $\widetilde{\omega }$. The random
variables $Y^{i}$ and $\widetilde{Y}^{i}$ are identically distributed (their
distribution is $\pi \circ \left( 1-x_{b^{i}}^{i}\right) ^{-1}$); thus in
particular $E\left[ \widetilde{Y}^{i}\right] =E\left[ Y^{i}\right] $.

\item  Given two nodes $i,j\in N$ such that $i$ is a descendant of $j$
(i.e., $i\in N(j))$, let $R^{j,i}$ be an indicator random variable, defined
to be $1$ if node $i$ is \emph{reached} from node $j$ in state $\omega $ and 
$0$ otherwise (i.e., $R^{j,i}=1$ if and only if for every $k\in N$ on the
path from $j$ to $i$ there is at least one individual $q\in M(k)$ whose
choice $\omega _{q}^{k}$ is the action that leads towards $i)$. When $j$ is
the root we will write $R^{i}$ for the indicator that $i$ is reached. Again, 
$\widetilde{R}^{j,i}$ is defined in the same way for $\widetilde{\omega }$.

\item  When everyone plays the backward induction action---i.e., when $%
Y^{j}=0$ for all $j\in N$---the unique local best-reply of each $i\in N$ is $%
b^{i}$ (recall that $b$ is the unique backward induction equilibrium).
Therefore there exists a $\lambda >0$ (appropriately small) such that $b^{i}$
is the unique local best-reply of $i$ for all $i\in N$ when $Y^{j}<\lambda $
for all $j\in N$ (i.e., when the proportion of the individuals at each node
that do \emph{not} play the backward induction action is less than $\lambda $%
). This $\lambda $ depends on the game only, and will be fixed from now on.

\item  Let $L^{i}$ be an indicator random variable, defined to be $1$ in
state $\omega $ if $Y^{j}<\lambda $ for all $j\in N(i)$ and $0$ otherwise.
Thus, when $L^{i}=1$ the backward induction action $b^{i}$ is the unique 
\emph{local} best-reply of $i$ in state $\omega $, i.e., $u_{\Gamma
(i)}^{i}(b^{i},\omega ^{N(i)})>u_{\Gamma (i)}^{i}(a^{i},\omega ^{N(i)})$ for
every $a^{i}\in A^{i},a^{i}\neq b^{i}$. We denote by $\widetilde{L}^{i}$ the
indicator that $\widetilde{Y}^{j}<\lambda $ for all $j\in N(i)$. When $i$ is
a final node (i.e., $N(i)=\emptyset $) we have $L^{i}\equiv \widetilde{L}%
^{i}\equiv 1$.
\end{itemize}

We note that selection SE($i$) has an effect only when $i$ is reached, i.e.,
when $R^{i}=1$; if $i$ is not reached, i.e., if $R^{i}=0$, then all actions
of $i$ yield the same payoff in $\Gamma $ and only mutation MU($i$) affects $%
\omega ^{i}$. If $R^{i}=1$ and $L^{i}=1$ then $b^{i}$ is the global
best-reply of $i$, and thus certainly a ``better action'' for a ``non-$b^{i}$%
-individual'' (i.e., $b^{i}\in B^{i}(q(i),\omega )$ when $\omega
_{q(i)}^{i}\neq b^{i}$). Since these arguments will be repeatedly used in
the proof, for convenience we state the following implications of (\ref
{eq-independence})--(\ref{eq-mu2}) here:\footnote{%
We use the ``big-$O$'' notation: $f(x)=O(g(x))$ if there exists a constant $%
K<\infty $ such that $\left| f(x)\right| \leq K\left| g(x)\right| $ for all $%
x$ in the relevant region. Thus $f(\mu ,\mathbf{m})=O(\mu )$ means that
there exists $K$ such that $\left| f(\mu ,\mathbf{m})\right| \leq K\mu $ for
all $0<\mu <1$ and all vectors $\mathbf{m=}\left( m^{i}\right) _{i\in N}$
with integer coordinates $m^{i}\geq \delta /\mu $ for all $i\in N.$} 
\begin{equation}
P\left[ \widetilde{\omega }_{q(i)}^{i}=a^{i}\mid \omega \right] \geq \alpha
_{1}\mu \text{ for every }a^{i}\in A^{i}.  \label{r0-1}
\end{equation}
\begin{equation}
\text{If }R^{i}=0\text{ then }P\left[ \widetilde{\omega }_{q(i)}^{i}\neq
\omega _{q(i)}^{i}\mid \omega \right] =O\left( \mu \right) .  \label{r0-all}
\end{equation}
\begin{equation}
\text{If }L^{i}R^{i}=1\text{ and }\omega _{q(i)}^{i}\neq b^{i}\text{ then }%
P\left[ \widetilde{\omega }_{q(i)}^{i}=b^{i}\mid \omega \right] \geq \beta .
\label{r1l1}
\end{equation}

The crucial argument in our proof is the following:

\begin{proposition}
\label{p-z}Consider the path from the root to a node $i\in N$; without loss
of generality, assume that the nodes along this path are $1,2,...,i-1,i$ (in
that order, with $1$ the root). Let $Z$ be a non-negative random variable.
Then 
\begin{eqnarray*}
\mu E\left[ Z\left( 1-R^{i}\right) \right] &\leq &O\left( \frac{1}{m}\right)
E\left[ ZR^{i}\right] +O\left( \frac{\mu }{m}\right) \\
&&+\sum_{j=1}^{i-1}\left( E\left[ Z\left( 1-R^{j,i}\right) \right] -E\left[
Z\left( 1-\widetilde{R}^{j,i}\right) \right] \right) .
\end{eqnarray*}
\end{proposition}

%TCIMACRO{\TeXButton{Proof}{\proof}}
%BeginExpansion
\proof%
%EndExpansion
For each $j=1,2,...,i-1$, let $c^{j}\in A^{j}$ be the choice along the given
path (i.e., towards $j+1$), and put $\theta ^{j}(\omega ):=\left(
1/m^{j}\right) \left| \left\{ q\in M(j):\omega _{q}^{j}=c^{j}\right\}
\right| $, the proportion of individuals at node $j$ that choose $c^{j}$;
denote $V^{j}:=\theta ^{j}(\omega )$ and $\widetilde{V}^{j}:=\theta ^{j}(%
\widetilde{\omega })$. We first prove three
lemmata.

\begin{lemma}
\label{l-1} 
\[
E\left[ Z\left( 1-\widetilde{R}^{j,i}\right) R^{i}\right] =O\left( \frac{1}{m%
}\right) E\left[ ZR^{i}\right] . 
\]
\end{lemma}

%TCIMACRO{\TeXButton{Proof}{\proof}}
%BeginExpansion
\proof%
%EndExpansion
For each $\omega $ with $R^{i}=1$, to get $\widetilde{R}^{j,i}=0$ there must
be some node $k$ with $j\leq k<i$ and $\widetilde{V}^{k}=0$. But $V^{k}>0$
(since $R^{i}=1$), so in fact $V^{k}=1/m^{k}$ (since $\left| \widetilde{V}%
^{k}-V^{k}\right| $ is $0$ or $1/m^{k}$). Hence $P\left[ \widetilde{V}%
^{k}=0\mid \omega \right] \leq \gamma _{2}/m^{k}=O\left( 1/m\right) $ (by (%
\ref{eq-M}), since the single $c^{k}$-individual must be chosen). Therefore $%
P\left[ \widetilde{R}^{j,i}=0\mid \omega \right] \leq
\sum_{k=j}^{i-1}P\left[ \widetilde{V}^{k}=0\mid \omega \right] =O\left(
1/m\right) $, from which it follows that 
\[
E\left[ Z\left( 1-\widetilde{R}^{j,i}\right) R^{i}\right] =E\left[ ZR^{i}\,P%
\left[ \widetilde{R}^{j,i}=0\mid \omega \right] \right] =O\left( \frac{1}{m}%
\right) E\left[ ZR^{i}\right] . 
\]
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion

\begin{lemma}
\label{l-2} 
\[
E\left[ Z\left( 1-\widetilde{R}^{j,i}\right) \left( 1-R^{j}\right)
R^{j,i}\right] =O\left( \frac{\mu }{m}\right) . 
\]
\end{lemma}

%TCIMACRO{\TeXButton{Proof}{\proof}}
%BeginExpansion
\proof%
%EndExpansion
For each $\omega \in \Omega $ with $R^{j}=0$ and $R^{j,i}=1$ (i.e., $j$ is
not reached, and $i$ is reached from $j$), to get $\widetilde{R}^{j,i}=0$
there must be some node $k$ with $j\leq k<i$ and $\widetilde{V}^{k}=0.$ But $%
R^{j,i}=1$ implies that $V^{k}>0$, and thus $V^{k}=1/m^{k}$. Therefore $%
P\left[ \widetilde{V}^{k}=0\mid \omega \right] \leq \left( \gamma
_{2}/m^{k}\right) O\left( \mu \right) =O\left( \mu /m\right) $ (by (\ref
{eq-M}) and (\ref{r0-all}): the single $c^{k}$-individual must be chosen,
and its action can change by mutation only, since $j$, and thus \emph{a
fortiori} $k$, is not reached). Hence $P\left[ \widetilde{R}^{j,i}=0\mid
\omega \right] \leq \sum_{k=j}^{i-1}P\left[ \widetilde{V}^{k}=0\mid \omega
\right] =O\left( \mu /m\right) $, from which the result follows.%
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion

\begin{lemma}
\label{l-3} 
\[
E\left[ Z\left( 1-\widetilde{R}^{j,i}\right) \left( 1-R^{j,j+1}\right)
R^{j+1,i}\right] \leq \left( 1-\alpha _{1}\mu \right) E\left[ Z\left(
1-R^{j,j+1}\right) R^{j+1,i}\right] +O\left( \frac{\mu }{m}\right) . 
\]
\end{lemma}

%TCIMACRO{\TeXButton{Proof}{\proof}}
%BeginExpansion
\proof%
%EndExpansion
Take $\omega \in \Omega $ with $R^{j,j+1}=0$ (i.e., $V^{j}=0$), and $%
R^{j+1,i}=1$. To get $\widetilde{R}^{j,i}=0$ there must be some node $k$
with $j\leq k<i$ and $\widetilde{V}^{k}=0$. Now $P\left[ \widetilde{V}%
^{j}>0\mid \omega \right] =P\left[ \widetilde{\omega }_{q(j)}^{j}=c^{j}\mid
\omega \right] \geq \alpha _{1}\mu $ (this follows from $V^{j}=0$ and (\ref
{r0-1})), and $P\left[ \widetilde{V}^{k}=0\mid \omega \right] \leq \gamma
_{2}/m^{k}\leq \gamma _{2}/m$ for $k=j+1,...,i-1$ (by (\ref{eq-M}), since $%
R^{j+1,i}=1$ implies $V^{k}>0$). Therefore $P\left[ \widetilde{R}%
^{j,i}=1\mid \omega \right] =P\left[ \widetilde{V}^{k}>0\text{ for all }%
j\leq k<i\mid \omega \right] \geq \left( \alpha _{1}\mu \right)
(1-\gamma _{2}/m)^{i-j-1}$, hence $P\left[ \widetilde{R}^{j,i}=0\mid \omega
\right] \leq \left( 1-\alpha _{1}\mu \right) +O\left( \mu /m\right) $, and
the result follows as in the Proof of Lemma \ref{l-1}.%
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion

The Proof of Proposition \ref{p-z} can now be completed.

\textbf{Proof of Proposition \ref{p-z} (continued).} We have $\left(
1-R^{j}\right) R^{j,i}=R^{j,i}-R^{i}$ and $\left( 1-R^{j,j+1}\right)
R^{j+1,i}=R^{j+1,i}-R^{j,i}$. Adding the inequalities of Lemmata \ref{l-1}, 
\ref{l-2} and \ref{l-3} together with 
\[
E\left[ Z\left( 1-\widetilde{R}^{j,i}\right) \left( 1-R^{j+1,i}\right)
\right] \leq E\left[ Z\left( 1-R^{j+1,i}\right) \right] 
\]
yields: 
\begin{eqnarray*}
E\left[ Z\left( 1-\widetilde{R}^{j,i}\right) \right] &\leq &O\left( \frac{1}{%
m}\right) E\left[ ZR^{i}\right] +\left( 1-\alpha _{1}\mu \right) E\left[
Z\left( R^{j+1,i}-R^{j,i}\right) \right] \\
&&+E\left[ Z\left( 1-R^{j+1,i}\right) \right] +O\left( \frac{\mu }{m}\right)
.
\end{eqnarray*}
Rearranging terms gives 
\begin{eqnarray*}
\alpha _{1}\mu E\left[ Z\left( R^{j+1,i}-R^{j,i}\right) \right] &\leq
&O\left( \frac{1}{m}\right) E\left[ ZR^{i}\right] +O\left( \frac{\mu }{m}%
\right) \\
&&+\left( E\left[ Z\left( 1-R^{j,i}\right) \right] -E\left[ Z\left( 1-%
\widetilde{R}^{j,i}\right) \right] \right) .
\end{eqnarray*}
Adding these inequalities for $j=1,2,...,i-1$ and noting that $R^{i,i}=1$
and $R^{1,i}=R^{i}$ completes the proof.%
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion
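
For clarity, the telescoping used in this last step is 
\[
\sum_{j=1}^{i-1}\left( R^{j+1,i}-R^{j,i}\right) =R^{i,i}-R^{1,i}=1-R^{i}, 
\]
so the left-hand sides add up to $\alpha _{1}\mu E\left[ Z\left(
1-R^{i}\right) \right] $, which yields the inequality of Proposition \ref
{p-z}.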

The next proposition proves the Main Theorem. The argument is divided into 8
steps.

\begin{proposition}
\label{p-1}For each node $i\in N$: 
\begin{eqnarray}
P\left[ L^{i}=0\right] &=&O\left( \mu \right) ;  \label{p1-1} \\
P\left[ \widetilde{Y}^{i}<Y^{i}\right] =P\left[ \widetilde{Y}%
^{i}>Y^{i}\right] &=&O\left( \mu \right) ;  \label{eq-p1-1} \\
E\left[ Y^{i}L^{i}R^{i}\right] &=&O\left( \mu \right) ;  \label{p1-2} \\
E\left[ \widetilde{Y}^{i}\widetilde{L}^{i}R^{i}\right] &=&O\left( \mu
\right) ;  \label{p2-2} \\
E\left[ \left| \widetilde{Y}^{i}\widetilde{L}^{i}-Y^{i}L^{i}\right| \left(
1-R^{i}\right) \right] &=&O\left( \frac{\mu }{m}\right) ;  \label{p3-1} \\
E\left[ \widetilde{Y}^{i}\widetilde{L}^{i}(1-R^{i})\right] &=&O\left( \frac{1%
}{m}\right) ;  \label{p3-3} \\
E\left[ Y^{i}L^{i}(1-R^{i})\right] &=&O\left( \frac{1}{m}\right) ;\text{ and}
\label{p3-4} \\
E\left[ Y^{i}\right] &=&O\left( \mu \right) .  \label{p-9}
\end{eqnarray}
\end{proposition}

%TCIMACRO{\TeXButton{Proof}{\proof}}
%BeginExpansion
\proof%
%EndExpansion
The proof is by backward induction on $i$. Assume that (\ref{p1-1})--(\ref
{p-9}) hold for all\footnote{%
The induction starts from final nodes $i$ for which $N(i)=\emptyset $ (and
thus there is no assumption).} $j\in N(i)$; then each of the claims (\ref
{p1-1})--(\ref{p-9}) for $i$ will be proved in turn.

\emph{Step 1: (\ref{p1-1}) holds for }$i$. Indeed,\footnote{%
We thank Michihiro Kandori for pointing out an error in this proof in the
first version of the paper.} $L^{i}=0$ implies that there is $j\in N(i)$
such that $Y^{j}\geq \lambda ,$ hence 
\[
P\left[ L^{i}=0\right] \leq \sum_{j\in N(i)}P\left[ Y^{j}\geq \lambda
\right] \leq \sum_{j\in N(i)}\frac{1}{\lambda }E\left[ Y^{j}\right] =O\left(
\mu \right) , 
\]
where we have used Markov's inequality\footnote{%
Markov's inequality is: $P\left[ Z\geq z\right] \leq \left( 1/z\right)
E\left[ Z\right] $ for a non-negative random variable $Z$ and $z>0.$} and (%
\ref{p-9}) for $j$ (by the induction hypothesis).

\emph{Step 2: (\ref{eq-p1-1}) holds for }$i$. We have $E\left[ \widetilde{Y}%
^{i}\right] =E\left[ Y^{i}\right] $ (recall that $\pi $ is the invariant
distribution), thus 
\[
0=E\left[ \widetilde{Y}^{i}-Y^{i}\right] =\left( \frac{1}{m^{i}}\right)
P\left[ \widetilde{Y}^{i}>Y^{i}\right] +\left( -\frac{1}{m^{i}}\right)
P\left[ \widetilde{Y}^{i}<Y^{i}\right] , 
\]
since the only possible values of $\widetilde{Y}^{i}-Y^{i}$ are $0,1/m^{i}$
and $-1/m^{i}$. Therefore 
\[
P\left[ \widetilde{Y}^{i}>Y^{i}\right] =P\left[ \widetilde{Y}%
^{i}<Y^{i}\right] . 
\]
To get $\widetilde{Y}^{i}>Y^{i}$ we need a $b^{i}$-individual to become non-$%
b^{i}$ (i.e., $\omega _{q(i)}^{i}=b^{i}$ and $\widetilde{\omega }%
_{q(i)}^{i}\neq b^{i})$. This can happen either by selection---which
requires $b^{i}$ not to be a best-reply of $i$ (hence $L^{i}=0$)---or by
mutation---with probability equal to $O\left( \mu \right) $ (by (\ref{eq-mu2}%
)). Thus 
\[
P\left[ \widetilde{Y}^{i}>Y^{i}\right] \leq P\left[ L^{i}=0\right] +O\left(
\mu \right) =O\left( \mu \right) , 
\]
by (\ref{p1-1}) for $i$, proving (\ref{eq-p1-1}) for $i$.

\emph{Step 3: (\ref{p1-2}) holds for }$i$. The case $\widetilde{Y}^{i}<Y^{i}$
occurs when a non-$b^{i}$-action is replaced by $b^{i}$; thus the chosen
individual $q(i)\in M(i)$ is a non-$b^{i}$-individual (i.e., $\omega
_{q(i)}^{i}\neq b^{i})$, which happens with probability $\geq \gamma
_{1}Y^{i}$ by (\ref{eq-M}). For every $\omega \in \Omega $ with $%
L^{i}R^{i}=1 $, the probability $P\left[ \widetilde{\omega }%
_{q(i)}^{i}=b^{i}\mid \omega \right] $ of changing the action to $b^{i}$ is
at least $\beta $ (see (\ref{r1l1})), and we get 
\begin{eqnarray*}
P\left[ \widetilde{Y}^{i}<Y^{i}\right] &\geq &E\left[ \beta \gamma
_{1}Y^{i}\mid L^{i}R^{i}=1\right] P\left[ L^{i}R^{i}=1\right] +0\ P\left[
L^{i}R^{i}\neq 1\right] \\
&=&\beta \gamma _{1}E\left[ Y^{i}L^{i}R^{i}\right] .
\end{eqnarray*}
Using (\ref{eq-p1-1}) for $i$ thus yields (\ref{p1-2}) for $i$.

\emph{Step 4: (\ref{p2-2}) holds for }$i$. Write $E\left[ \widetilde{Y}^{i}%
\widetilde{L}^{i}R^{i}\right] =E\left[ \widetilde{Y}^{i}\widetilde{L}%
^{i}(1-L^{i})R^{i}\right] +E\left[ \widetilde{Y}^{i}\widetilde{L}%
^{i}L^{i}R^{i}\right] $. The first term is $O\left( \mu \right) $ by (\ref
{p1-1}), and the second term is 
\begin{eqnarray*}
E\left[ \widetilde{Y}^{i}\widetilde{L}^{i}L^{i}R^{i}\right] &\leq &E\left[ 
\widetilde{Y}^{i}L^{i}R^{i}\right] =E\left[ Y^{i}L^{i}R^{i}\right] +E\left[
\left( \widetilde{Y}^{i}-Y^{i}\right) L^{i}R^{i}\right] \\
&\leq &E\left[ Y^{i}L^{i}R^{i}\right] +\left( \frac{1}{m^{i}}\right) P\left[ 
\widetilde{Y}^{i}>Y^{i}\right]
\end{eqnarray*}
(since the only positive value of $\widetilde{Y}^{i}-Y^{i}$ is $1/m^{i}$).
Applying (\ref{eq-p1-1}) yields the desired inequality.

\emph{Step 5: (\ref{p3-1}) holds for }$i$\emph{.} We have 
\begin{eqnarray*}
&&E\left[ \left| \widetilde{Y}^{i}\widetilde{L}^{i}-Y^{i}L^{i}\right| \left(
1-R^{i}\right) \right] \\
&\leq &E\left[ \left| \widetilde{Y}^{i}-Y^{i}\right| \widetilde{L}^{i}\left(
1-R^{i}\right) \right] +E\left[ Y^{i}\left| \widetilde{L}^{i}-L^{i}\right|
\left( 1-R^{i}\right) \right] \\
&\leq &E\left[ \left| \widetilde{Y}^{i}-Y^{i}\right| \left( 1-R^{i}\right)
\right] +E\left[ \left| \widetilde{L}^{i}-L^{i}\right| \left( 1-R^{i}\right)
\right] .
\end{eqnarray*}
The first term is bounded by $\left( 1/m^{i}\right) P\left[ \widetilde{Y}%
^{i}\neq Y^{i},R^{i}=0\right] =(1/m^{i})\,O\left( \mu \right) =O\left( \mu
/m\right) $ (see (\ref{r0-all}): $R^{i}=0$ implies that the change from $%
Y^{i}$ to $\widetilde{Y}^{i}$ is by mutation only). For the second term,
note that $\widetilde{L}^{i}\neq L^{i}$ implies that there exists $j\in N(i)$
such that $Y^{j}\geq \lambda -1/m^{j}$ (otherwise $Y^{j}<\lambda -1/m^{j}$
and thus $\widetilde{Y}^{j}<\lambda $ for all $j\in N(i)$, in which case $%
L^{i}=\widetilde{L}^{i}=1$). Choose $j\in N(i)$ to be a last such node; thus 
$Y^{l}<\lambda -1/m^{l}$ for all $l\in N(j)$, hence $L^{j}=1$. Now $%
\widetilde{L}^{i}\neq L^{i}$ implies that $\widetilde{\omega }^{k}\neq
\omega ^{k}$ for some $k\in N(i)\cup \left\{ i\right\} $, which, by (\ref
{r0-all}), has conditional probability equal to $O\left( \mu \right) $ for
every $\omega $ with $R^{i}=0$ and thus \emph{a fortiori} $R^{k}=0$.
Therefore 
\begin{eqnarray*}
E\left[ \left| \widetilde{L}^{i}-L^{i}\right| \left( 1-R^{i}\right) \right]
&\leq &\sum_{j\in N(i)}P\left[ \widetilde{L}^{i}\neq L^{i},Y^{j}\geq \lambda
-\frac{1}{m^{j}},L^{j}=1,R^{i}=0\right] \\
&\leq &O\left( \mu \right) \sum_{j\in N(i)}P\left[ Y^{j}\geq \lambda -\frac{1%
}{m^{j}},L^{j}=1,R^{i}=0\right] \\
&\leq &O\left( \mu \right) \sum_{j\in N(i)}\frac{1}{\lambda -\frac{1}{m^{j}}}%
E\left[ Y^{j}L^{j}\left( 1-R^{j}\right) \right] ,
\end{eqnarray*}
where we have used $1-R^{i}\leq 1-R^{j}$ together with Markov's inequality.
Applying (\ref{p3-4}) for each $j\in N(i)$ completes the proof of (\ref{p3-1}%
) for $i$.

\emph{Step 6: (\ref{p3-3}) holds for }$i$\emph{.} Take $Z=\widetilde{Y}^{i}%
\widetilde{L}^{i}$ in Proposition \ref{p-z}. We have 
\begin{eqnarray*}
&&\left| E\left[ \widetilde{Y}^{i}\widetilde{L}^{i}\left( 1-R^{j,i}\right)
\right] -E\left[ \widetilde{Y}^{i}\widetilde{L}^{i}\left( 1-\widetilde{R}%
^{j,i}\right) \right] \right| \\
&=&\left| E\left[ \widetilde{Y}^{i}\widetilde{L}^{i}\left( 1-R^{j,i}\right)
\right] -E\left[ Y^{i}L^{i}\left( 1-R^{j,i}\right) \right] \right| \\
&\leq &E\left[ \left| \widetilde{Y}^{i}\widetilde{L}^{i}-Y^{i}L^{i}\right|
\left( 1-R^{j,i}\right) \right] \\
&\leq &E\left[ \left| \widetilde{Y}^{i}\widetilde{L}^{i}-Y^{i}L^{i}\right|
\left( 1-R^{i}\right) \right] =O\left( \frac{\mu }{m}\right) ,
\end{eqnarray*}
where we have used the fact that $\pi $ is the invariant distribution, the
inequality $1-R^{j,i}\leq 1-R^{i}$ (since $R^{j,i}=0$ implies $R^{i}=0$),
and finally (\ref{p3-1}) for $i$. Thus each one of the right-most terms in
the inequality obtained from Proposition \ref{p-z} is $O\left( \mu /m\right) 
$, and therefore 
\begin{equation}
\mu E\left[ \widetilde{Y}^{i}\widetilde{L}^{i}\left( 1-R^{i}\right) \right]
\leq O\left( \frac{1}{m}\right) E\left[ \widetilde{Y}^{i}\widetilde{L}%
^{i}R^{i}\right] +O\left( \frac{\mu }{m}\right) .  \label{eq-mu-m}
\end{equation}
Applying (\ref{p2-2}) for $i$ completes the proof.

\emph{Step 7: (\ref{p3-4}) holds for }$i$. It follows immediately from (\ref
{p3-1}) and (\ref{p3-3}) for $i$.

\emph{Step 8: (\ref{p-9}) holds for }$i$. Adding (\ref{p1-2}) and (\ref{p3-4}%
) for $i$ yields 
\[
E\left[ Y^{i}L^{i}\right] =O\left( \mu \right) +O\left( \frac{1}{m}\right)
=O\left( \mu \right) , 
\]
since\footnote{%
This is the only place in the proof where $\mu m\geq \delta $ is used.} $%
1/m\leq \left( 1/\delta \right) \mu .$ Together with 
\[
E\left[ Y^{i}(1-L^{i})\right] \leq P\left[ L^{i}=0\right] =O\left( \mu
\right) 
\]
(which holds by (\ref{p1-1}) for $i$), this completes the proof.%
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion

\textbf{Proof of Theorem \ref{th-main} (Main Theorem). }Inequality (\ref{p-9}%
) of Proposition \ref{p-1} is precisely (\ref{eq-th-e}); applying Markov's
inequality then yields (\ref{eq-th-pi}).%
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion
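Spelled out, the Markov-inequality step is the observation that, for each
node $i$ and every $\varepsilon >0$, 
\[
P\left[ Y^{i}\geq \varepsilon \right] \leq \frac{1}{\varepsilon }E\left[
Y^{i}\right] =\frac{1}{\varepsilon }O\left( \mu \right) , 
\]
since $Y^{i}\geq 0$.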

\section{Discussion\label{s-discussion}}

\subsection{Multiple Agents\label{ss-agents}}

The gene-normal form, which was defined in Subsection \ref{ss-gene-normal},
assumes that the populations at the different nodes are distinct. We shall
now consider the implications of this assumption in the case where a player
in the original game $\Gamma $ consists of more than one agent; that is,
that player moves at more than one node. For concreteness, assume that two
nodes $i$ and $j$ belong to the same player. There are three main
implications:

\begin{enumerate}
\item  The actions at $i$ and at $j$ are determined by different
``characteristics.'' This poses no problem, since different decision nodes
in the tree correspond to different situations, and each one is controlled
by a different ``gene.'' As we have already argued in Subsection \ref
{ss-gene-normal}, if instead the same gene controls both decisions, it means
that the player acts identically at both. The correct description of such an
instance is then to have the two nodes belong to the same ``information
set''---i.e., a game with imperfect information (and possibly imperfect
recall too).

\item  The payoffs of all participants are computed based on independent
draws from the populations; i.e., $u^{k}(\omega _{q}^{k},\omega
^{-k})=u^{k}(\omega _{q}^{k},x^{-k})$, where $x^{-k}$ is the product of $%
x^{l}$ for all $l\neq k$. In particular, the proportion of a pair of actions 
$(a^{i},a^{j})$ is taken to be $x_{a^{i}}^{i}x_{a^{j}}^{j}$. If $i$ and $j$
were two gene loci of the same organism, this would be in general incorrect.
However, an examination of our proofs suggests that a weaker assumption
could be used: Namely, if both $a^{i}$ and $a^{j}$ are present (i.e., if $%
x_{a^{i}}^{i}>0$ and $x_{a^{j}}^{j}>0$), then the combination $(a^{i},a^{j})$
is also present (i.e., its proportion is positive).

\item  At each period, the dynamic process affects the two characteristics
independently. As far as mutations go, this is natural. However, selection
operates at the level of the organism rather than the single gene. When the
proportion of the combination $(a^{i},a^{j})$ is given by the product $%
x_{a^{i}}^{i}x_{a^{j}}^{j}$, it makes no difference. But difficulties arise
once this independence no longer holds. Still, note that $%
x_{b^{i}}^{i}>1-\varepsilon $ and $x_{b^{j}}^{j}>1-\varepsilon $ imply that
the proportion of the pair $(b^{i},b^{j})$ must be at least $1-2\varepsilon $%
, regardless of the degree of interdependence between $i$ and $j$. This
argument suggests that our results may well hold in this more general setup.
\end{enumerate}
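The $1-2\varepsilon $ bound in item 3 is a simple inclusion-exclusion
estimate: writing $z$ for the proportion of the pair $(b^{i},b^{j})$ (a
quantity that the marginals alone do not determine), 
\[
z\geq 1-\left( 1-x_{b^{i}}^{i}\right) -\left( 1-x_{b^{j}}^{j}\right)
>1-2\varepsilon , 
\]
whatever the dependence between the two characteristics.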

\subsection{Other Selection Dynamics\label{ss-selection}}

One straightforward generalization of our dynamics assumes that selection
satisfies $Q\left[ \widetilde{\omega }_{q(i)}^{i}\in B^{i}\mid \omega
\right] \geq \beta $ instead of (\ref{eq-beta}); i.e., at least one better
action has positive probability of being chosen (rather than each one of
them). It seems that the result will remain unchanged.\footnote{%
Consider for simplicity a final node $i$; one shows first that the
proportion of the worst action at $i$ must be small, after which the same is
proven for the second-worst, and so on. Note that the proportion of the
best-reply---which can change only by mutation---is bounded away from zero.}
Another possible generalization, which also appears not to affect the
result, assumes that the probability of each individual in $M(i)$ being
chosen goes to zero as the population size $\left| M(i)\right| $ goes to
infinity (without it necessarily being of the order of $1/\left| M(i)\right| 
$ as in (\ref{eq-M})).\footnote{%
One then needs to work with expressions like $E\left[ p\left( Y^{i}\right)
\right] $ instead of $E\left[ Y^{i}\right] $, where $p\left( Y^{i}\right) $
is the probability that a non-$b^{i}$-individual is chosen when their
proportion in the population is $Y^{i}$.} Other dynamics to consider---like
the ``replicator dynamics''---may have the selection take into account the
actual payoff differences, rather than just their sign.\footnote{%
Large populations may then decrease the effect of selection, since the
difference in payoff due to one individual is small.} Further variants
should also be analyzed; hopefully, a precise characterization of the class
of dynamics that yield the backward induction outcome will be obtained.

One modification\footnote{%
Suggested by Ilan Eshel.} makes selection choose better actions with
probabilities that are proportional to their current proportions in the
population. Thus, for instance, if the best-reply action $c^{i}$ is
currently played by $k$ individuals, then the probability that a chosen non-$%
c^{i}$ individual will switch to $c^{i}$ by selection is $\sigma k/m^{i}$
(rather than $\sigma /\left| B^{i}\right| $). When $k$ is small, this
probability becomes low; in particular, if $c^{i}$ is not currently present
in the population, then selection cannot introduce it.\footnote{%
A requirement suggested by Karl Schlag.} Such dynamics may be more
appropriate in imitation-type models (where the ``visibility'' of an action
depends on its prevalence in the population). Note also the property that $%
x_{a^{i}}^{i}$, the proportion in the population of a better action $a^{i}$,
increases by selection at a rate that is proportional to $x_{a^{i}}^{i}$.

Formally, we weaken (\ref{eq-beta}) to\footnote{%
Notice that (\ref{eq-beta0}) is more general than the description in the
previous paragraph (since we only require $\geq $ rather than $=$).} 
\begin{equation}
Q\left[ \widetilde{\omega }_{q(i)}^{i}=a^{i}\mid \omega \right] \geq \beta
x_{a^{i}}^{i}\left( \omega \right) \text{ for each }a^{i}\in B^{i}.
\label{eq-beta0}
\end{equation}
It turns out that our result continues to hold.

\begin{theorem}
\label{th-beta0}The results of the Main Theorem \ref{th-main} hold also for
dynamic processes satisfying (\ref{eq-independence}), (\ref{eq-M}), (\ref
{eq-q}), (\ref{eq-beta0}), (\ref{eq-mu}) and (\ref{eq-mu2}).
\end{theorem}

%TCIMACRO{\TeXButton{Proof}{\proof}}
%BeginExpansion
\proof%
%EndExpansion
It can easily be checked that selection---i.e., (\ref{eq-beta}) or (\ref
{r1l1})---is used in the proofs of Subsection \ref{ss-proof} only in Step 3
of the Proof of Proposition \ref{p-1}. Replacing (\ref{eq-beta}) by (\ref
{eq-beta0}) implies that $P\left[ \widetilde{\omega }_{q(i)}^{i}=b^{i}\mid
\omega \right] \geq \beta \left( 1-Y^{i}\right) $ for every $\omega \in
\Omega $ with $L^{i}R^{i}=1$ (since the proportion of $b^{i}$ in the
population is $1-Y^{i}$). Therefore the argument of Step 3 yields 
\[
P\left[ \widetilde{Y}^{i}<Y^{i}\right] \geq \beta \gamma _{1}E\left[
Y^{i}\left( 1-Y^{i}\right) L^{i}R^{i}\right] , 
\]
which, by (\ref{eq-p1-1}) for $i,$ implies 
\begin{equation}
E\left[ Y^{i}\left( 1-Y^{i}\right) L^{i}R^{i}\right] =O\left( \mu \right) .
\label{eq-pf-step3-0}
\end{equation}

Next, we claim that the probability that $Y^{i}$ is in a neighborhood of $1$
is low.

\begin{lemma}
\label{l-eta}There exist constants $\eta >0$ and $c>0$ such that 
\[
P\left[ Y^{i}>1-\eta \right] =O\left( e^{-cm}\right) .
\]
\end{lemma}

%TCIMACRO{\TeXButton{Proof}{\proof}}
%BeginExpansion
\proof%
%EndExpansion
We have 
\[
P\left[ \widetilde{Y}^{i}=\frac{k}{m^{i}}\left| Y^{i}=\frac{k+1}{m^{i}}%
\right. \right] \geq \gamma _{1}\frac{k+1}{m^{i}}\alpha _{1}\mu 
\]
by (\ref{eq-M}) and (\ref{eq-mu}). Next, 
\begin{eqnarray*}
P\left[ \widetilde{Y}^{i}=\frac{k+1}{m^{i}}\left| Y^{i}=\frac{k}{m^{i}}%
\right. \right] &\leq &\gamma _{2}\left( 1-\frac{k}{m^{i}}\right) \left(
\alpha _{2}\mu +P\left[ L^{i}=0\right] \right) \\
&\leq &c_{1}\left( 1-\frac{k}{m^{i}}\right) \mu
\end{eqnarray*}
for an appropriate constant $c_{1}>0$, where we have used (\ref{eq-M}) and (%
\ref{eq-mu2}) (selection can increase $Y^{i}$ only when $L^{i}=0$), and then 
$P\left[ L^{i}=0\right] =O\left( \mu \right) $ by (\ref{p1-1}) for $i$
(proved in Step 1).

Using the invariant distribution property yields 
\[
P\left[ Y^{i}=\frac{k+1}{m^{i}}\right] \gamma _{1}\frac{k+1}{m^{i}}\alpha
_{1}\mu \leq P\left[ Y^{i}=\frac{k}{m^{i}}\right] c_{1}\left( 1-\frac{k}{%
m^{i}}\right) \mu ,
\]
or 
\[
P\left[ Y^{i}=\frac{k+1}{m^{i}}\right] \leq c_{2}\left( \frac{m^{i}-k}{k+1}%
\right) P\left[ Y^{i}=\frac{k}{m^{i}}\right] 
\]
(where $c_{2}:=c_{1}/\left( \gamma _{1}\alpha _{1}\right) $). Let $\eta >0$
be small enough so that $c_{2}\left( m^{i}-k\right) /\left( k+1\right) \leq
1/2$ for all\footnote{$\left\lfloor x\right\rfloor $ denotes the largest
integer that is $\leq x$.} $k>k_{0}:=\left\lfloor \left( 1-2\eta \right)
m^{i}\right\rfloor $. Then we get 
\[
P\left[ Y^{i}=\frac{k}{m^{i}}\right] \leq \left( \frac{1}{2}\right)
^{k-k_{0}}
\]
for all $k\geq k_{0}$ and thus 
\[
P\left[ Y^{i}>1-\eta \right] \leq \sum_{k>\left( 1-\eta \right) m^{i}}\left( 
\frac{1}{2}\right) ^{k-k_{0}}\leq \left( \frac{1}{2}\right) ^{\eta m^{i}-1},
\]
as claimed.%
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion
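As a numerical illustration only (no part of the proof depends on it), the
following sketch computes the stationary distribution of the extremal
birth-and-death chain in which the ratio bound above holds with equality,
taking $c_{2}=1$ and $\eta =0.1$ for concreteness, and confirms the $\left(
1/2\right) ^{\eta m^{i}-1}$ tail estimate:

```python
import math

def stationary_tail(m, c2=1.0, eta=0.1):
    """Tail probability P[Y > 1 - eta] of the birth-and-death chain for
    k = m * Y whose stationary distribution satisfies
    pi(k+1)/pi(k) = c2 * (m - k)/(k + 1),
    i.e., the extremal case of the ratio inequality in the lemma."""
    # accumulate log pi(k) to avoid overflow in the product of the ratios
    log_pi = [0.0]
    for k in range(m):
        log_pi.append(log_pi[-1] + math.log(c2 * (m - k) / (k + 1)))
    peak = max(log_pi)
    weights = [math.exp(lp - peak) for lp in log_pi]  # pi up to normalization
    total = sum(weights)
    return sum(w for k, w in enumerate(weights) if k > (1 - eta) * m) / total
```

With $c_{2}=1$ the stationary distribution is binomial, so the tail decays
even faster than the bound requires, and increasing $m$ only sharpens it.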

\textbf{Proof of Theorem \ref{th-beta0} (continued).} (\ref{eq-pf-step3-0})
implies 
\[
\eta E\left[ Y^{i}L^{i}R^{i}\mathbf{1}_{Y^{i}\leq 1-\eta }\right] =O\left(
\mu \right) ,
\]
where $\mathbf{1}_{Y^{i}\leq 1-\eta }$ is the indicator that $Y^{i}\leq
1-\eta $. Lemma \ref{l-eta} yields 
\[
E\left[ Y^{i}L^{i}R^{i}\mathbf{1}_{Y^{i}>1-\eta }\right] \leq P\left[
Y^{i}>1-\eta \right] =O\left( e^{-cm}\right) ,
\]
which is at most $O\left( \mu \right) $ since $1/m\leq \left( 1/\delta
\right) \mu $. Adding the two estimates gives (\ref{p1-2}) for $i$, thus
completing Step 3. The rest of the Proof of Proposition \ref{p-1} is
unchanged.%
%TCIMACRO{\TeXButton{End Proof}{\endproof}}
%BeginExpansion
\endproof%
%EndExpansion
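As a sanity check, again outside the formal argument, one can simulate the
proportional-selection rule (\ref{eq-beta0}) at a single final node with two
actions; the population size, $\beta $, and the mutation rates below are
illustrative choices, not taken from the model's statement:

```python
import random

def avg_nonbr_share(mu, m=100, beta=0.5, steps=200_000, seed=0):
    """Long-run average proportion of non-best-reply play at one final node
    with two actions, under proportional selection plus uniform mutation.
    All parameter values are illustrative."""
    rng = random.Random(seed)
    k = 0          # number of individuals currently NOT playing the best reply b
    total = 0.0
    for _ in range(steps):
        x_b = 1.0 - k / m
        plays_b = rng.random() < x_b              # action of the chosen individual
        if rng.random() < mu:                     # mutation: uniform over the two actions
            new_plays_b = rng.random() < 0.5
        elif not plays_b:                         # selection toward the better action b,
            new_plays_b = rng.random() < beta * x_b   # with prob. proportional to x_b
        else:                                     # already at the best reply: no change
            new_plays_b = True
        if plays_b and not new_plays_b:
            k += 1
        elif new_plays_b and not plays_b:
            k -= 1
        total += k / m
    return total / steps
```

Lowering $\mu $ lowers the long-run average of $Y^{i}$ roughly
proportionally, in line with the $O\left( \mu \right) $ estimate; note also
that once all $m^{i}$ individuals abandon $b^{i}$, only mutation can
reintroduce it, as remarked earlier.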

\subsection{Extensions\label{ss-extensions}}

This work is a first attempt to analyze basic evolutionary models in
extensive form games. A number of directions for further study suggest
themselves:

\begin{enumerate}
\item  \emph{Non-unique backward induction equilibrium}: Analyze the
non-generic case where there is more than one subgame-perfect equilibrium;
for instance, when some of the payoffs are equal. It seems that a subset of $%
BI$, at times a strict subset, is obtained.

\item  \emph{Games with imperfect information}: Here we conjecture that all
evolutionarily stable equilibria will be subgame-perfect, but the converse
will no longer be true; evolutionary dynamics may well pick out certain
refinements rather than others.

\item  \emph{Multiple agents and non-distinct populations}: This is
discussed in Subsection \ref{ss-agents}, which suggests a number of relevant
generalizations.

\item  \emph{Other selection dynamics}: These are discussed in Subsection 
\ref{ss-selection}.
\end{enumerate}

\begin{thebibliography}{99}
\bibitem{}  Aumann, R. J. [1995], ``Backward Induction and Common Knowledge
of Rationality,'' \emph{Games and Economic Behavior} 8, 6--19.

\bibitem{}  Aumann, R. J. [1998], ``On the State of the Art in Game
Theory,'' \emph{Games and Economic Behavior} 24, 181--210.

\bibitem{}  Cressman, R. and K. H. Schlag [1997], ``The Dynamic
(In)Stability of Backwards Induction,'' mimeo.

\bibitem{}  Fudenberg, D. and D. K. Levine [1998], \emph{The Theory of
Learning in Games}, MIT Press.

\bibitem{}  Gale, J., K. Binmore and L. Samuelson [1995], ``Learning to be
Imperfect: The Ultimatum Game,'' \emph{Games and Economic Behavior} 8,
56--90.

\bibitem{}  Hammerstein, P. and R. Selten [1994], ``Game Theory and
Evolutionary Biology,'' in \emph{Handbook of Game Theory, Vol. II}, R. J.
Aumann and S. Hart (eds.), North-Holland, 929--993.

\bibitem{}  Hofbauer, J. and K. Sigmund [1998], \emph{Evolutionary Games
and Population Dynamics}, Cambridge University Press.

\bibitem{}  Kandori, M., G. Mailath and R. Rob [1993], ``Learning, Mutation,
and Long-Run Equilibrium in Games,'' \emph{Econometrica }61, 29--56.

\bibitem{}  Kuhn, H. W. [1953], ``Extensive Games and the Problem of
Information,'' in \emph{Contributions to the Theory of Games II},\emph{\ }H.
W. Kuhn and A. W. Tucker (eds.), \emph{Annals of Mathematics Studies }28,
Princeton University Press, 193--216.

\bibitem{}  N\"{o}ldeke, G. and L. Samuelson [1993], ``An Evolutionary
Analysis of Backward and Forward Induction,'' \emph{Games and Economic
Behavior} 5, 425--454.

\bibitem{}  Samuelson, L. [1997], \emph{Evolutionary Games and Equilibrium
Selection}, MIT Press.

\bibitem{}  Vega-Redondo, F. [1997], \emph{Evolution, Games, and Economic
Behavior}, Oxford University Press.

\bibitem{}  Weibull, J. W. [1995], \emph{Evolutionary Game Theory}, MIT
Press.

\bibitem{}  Young, H. P. [1993], ``The Evolution of Conventions,'' \emph{%
Econometrica} 61, 57--84.

\bibitem{}  Young, H. P. [1998], \emph{Individual Strategy and Social
Structure}, Princeton University Press.
\end{thebibliography}

\end{document}
