Fourth Quarter 2021, Vol. 103, No. 4
Posted 2021-10-18

Stability and Equilibrium Selection in Learning Models: A Note of Caution


Relative to rational expectations models, learning models provide a theory of expectation formation where agents use observed data and a learning rule. Given the possibility of multiple equilibria under rational expectations, the learning literature often uses stability as a criterion to select an equilibrium. This article uses a monetary economy to illustrate that equilibrium selection based on stability is sensitive to specifications of the learning rule. The stability criterion selects qualitatively different equilibria even when the differences in learning specifications are small.

YiLi Chien is a research officer and economist at the Federal Reserve Bank of St. Louis. In-Koo Cho is a professor of economics at Emory University and Hanyang University and a research fellow at the Federal Reserve Bank of St. Louis. B. Ravikumar is a senior vice president, the deputy director of research, and an economist at the Federal Reserve Bank of St. Louis.


Under the rational expectations (RE) hypothesis, the expectations of agents are consistent with and always confirmed by equilibrium outcomes. This hypothesis often significantly simplifies the analysis of complicated economic problems. However, the RE hypothesis is silent on how agents form their expectations and, hence, provides no guidance on selecting an equilibrium when multiple RE equilibria occur. In contrast, a learning model specifies the agent's learning rule for forming expectations and is used to select one from multiple RE equilibria. The advantage of using the learning approach for equilibrium selection is noted by Evans and Honkapohja (2001). The standard criterion for equilibrium selection in the learning approach is stability of the learning dynamics. This consideration is quite intuitive: If a learning equilibrium is not stable, then it is unlikely to be the long-run equilibrium outcome.

Examples of using stability of the learning dynamics to eliminate some RE equilibria include Lucas (1986), Marcet and Sargent (1989), Woodford (1990), and Bullard and Mitra (2002). Lucas (1986) argues that adaptive behavior of economic agents may narrow the set of equilibria in some economic models. By using the stability criterion under learning, Marcet and Sargent (1989) eliminate the hyperinflation equilibrium of Sargent and Wallace (1985). In a monetary model with multiple RE equilibria, Woodford (1990) shows that the economy could converge to a stationary sunspot equilibrium under learning. Bullard and Mitra (2002) argue that the stability under learning criterion is necessary for monetary policy evaluation, especially in situations where multiple RE equilibria could be induced by policy.

The focus of our article is on equilibrium selection using the stability criterion under different specifications of learning. We conduct our exercise using an example in Bullard (1994), which is a simplified version of the model in Sargent and Wallace (1981). Our choice of the Sargent-Wallace framework is deliberate. The framework is simple and admits two steady states under RE, so we can explore the issue of selection. Furthermore, the key features of the model have been used repeatedly in the learning literature; see, for instance, Marcet and Sargent (1989), Bullard (1994), and Marcet and Nicolini (2003). We employ the model to examine equilibrium selection in different learning specifications via the stability criterion, a criterion that is common in the learning literature.

Our model is an overlapping generations endowment economy where money is the only store of value. The optimal decision of agents depends on their inflation forecasts, and so do the equilibrium outcomes. Under RE, the model has two steady states: high inflation and low inflation. To select one of the two steady states, we consider a learning model where agents forecast inflation using a rule that is a convex combination of past expected inflation and actual inflation. We examine the learning dynamics under two specifications: (i) Agents know only the past prices, and (ii) agents know the current price in addition to past prices. The two specifications imply different values for actual inflation in the learning rule. Under (i), agents use last period's inflation rate, whereas under (ii), they use the current inflation rate. Thus, the only difference between the two specifications is that the value of actual inflation is current in one specification and lagged by just one period in the other.
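The difference between the two specifications can be sketched in a few lines of Python. This is only an illustration of the forecasting rule as described above, not the article's model: the function names, the `weight` parameter, and the price series are hypothetical, and prices are taken as given here, whereas in the model they are determined in equilibrium and depend on the forecasts themselves.

```python
def observed_inflation(prices, t, knows_current_price):
    """Gross inflation rate the agent feeds into the learning rule at date t.

    Specification (i): only past prices are known, so the most recent
    observable inflation rate is prices[t-1] / prices[t-2].
    Specification (ii): the current price is also known, so the agent
    uses prices[t] / prices[t-1].
    """
    if knows_current_price:
        return prices[t] / prices[t - 1]
    return prices[t - 1] / prices[t - 2]


def update_forecast(prev_forecast, actual_inflation, weight):
    """Convex combination of past expected inflation and actual inflation.

    `weight` is a hypothetical name for the weight placed on actual inflation.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * prev_forecast + weight * actual_inflation


# Hypothetical, exogenous price path used purely for illustration.
prices = [1.00, 1.10, 1.32]
t = 2
prev_forecast = 1.05
weight = 0.5

forecast_lagged = update_forecast(
    prev_forecast, observed_inflation(prices, t, False), weight
)
forecast_current = update_forecast(
    prev_forecast, observed_inflation(prices, t, True), weight
)
```

The two forecasts differ only in which inflation observation enters the same rule, which is the one-period timing difference described above.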

Our main result is that the stability criterion selects qualitatively different equilibria even when the differences in the learning specifications are minor. In particular, the learning rule using last period's inflation rate implies that the low-inflation RE steady state is the only stable learning equilibrium. Thus, using the stability criterion to select an RE equilibrium implies that the low-inflation steady state would be the long-run outcome. However, under the learning rule using the current inflation rate, both RE steady states are stable. Thus, the stability criterion does not offer useful guidance for equilibrium selection. In other words, our simple model shows that the stability of the learning equilibrium is sensitive to the specification of the learning rule. The learning dynamics may not be robust against seemingly minor differences in the learning rule.

Earlier work by Marcet and Sargent (1989) demonstrated that the stability criterion under RE dynamics selects the equilibrium that validates the "unpleasant monetarist arithmetic" in Sargent and Wallace (1981), whereas the same criterion under least-squares learning selects another equilibrium that invalidates it. Our investigation of the sensitivity issue differs from theirs in an important way: Marcet and Sargent (1989) compare equilibrium selection under two substantially different specifications, RE and learning. Under RE, the decisionmaker is a forward-looking rational agent, while under the learning dynamics, the decisionmaker is a backward-looking boundedly rational agent. We, on the other hand, compare outcomes under two learning specifications. Under both learning specifications, we maintain the assumption that the decisionmaker is a least-squares learner, but we change the timing of the observation of one variable by just one period.

Our exercise is in the same spirit as the exercises by Hansen and Sargent (2007) that examine the robustness of an equilibrium to small changes in specifications. Hansen and Sargent (2007) endow the agents with a set of models instead of just one. The agents have a reference model, entertain a small neighborhood of models around it, and respond by choosing the model that performs the best against the worst possible state. Similarly, Cho and Kasa (2017) consider a set of "nearby" learning models close to a benchmark model, but the agent uses model averaging between the benchmark and the nearby models. In Cho and Kasa (2015), the agents choose a model based on a specification test. In contrast to these articles, our agents do not evaluate multiple models simultaneously and do not choose one based on a pre-specified performance criterion. Instead, our agents consider only one model at a time. We examine the equilibrium selected by the stability criterion in each model.
