This paper studies the relationships between learning about rules of thumb (represented by classifier systems) and dynamic programming. Building on a result about Markovian stochastic approximation algorithms, we characterize all decision functions that can be asymptotically obtained through classifier system learning, provided the asymptotic ordering of the classifiers is strict. We demonstrate in a robust example that the learnable decision function is in general not unique, not characterized by a strict ordering of the classifiers, and may not coincide with the decision function delivered by the solution to the dynamic programming problem even if that function is attainable. As an illustration we consider the puzzle of excess sensitivity of consumption to transitory income: classifier systems can generate such behavior even if one of the available rules of thumb is the decision function solving the dynamic programming problem, since bad decisions in good times can "feel better" than good decisions in bad times.
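The mechanism in the last sentence can be illustrated with a minimal toy simulation. All specifics below are illustrative assumptions, not the paper's model: two i.i.d. transitory-income states, log utility, a constant "DP-style" consumption level, and a consume-all rule of thumb that only applies in good times. Each classifier's strength is nudged toward realized utility by a stochastic-approximation step, so the thumb rule's strength tracks payoffs in good times only and ends up above the DP rule's, even though consuming all of transitory income is the worse decision.

```python
import math
import random

random.seed(1)

# Illustrative toy parameters (assumptions, not taken from the paper):
Y_GOOD, Y_BAD = 1.5, 0.5   # transitory income in good / bad times
C_DP = 1.0                 # smooth "dynamic programming" consumption level
EPS = 0.1                  # exploration probability
GAMMA = 0.05               # stochastic-approximation step size

strength = {"dp": 0.0, "thumb": 0.0}

for t in range(50_000):
    good = random.random() < 0.5
    income = Y_GOOD if good else Y_BAD
    # the consume-all rule of thumb matches (is applicable) only in good times
    candidates = ["dp", "thumb"] if good else ["dp"]
    if random.random() < EPS:
        rule = random.choice(candidates)          # occasional exploration
    else:
        rule = max(candidates, key=lambda r: strength[r])
    c = C_DP if rule == "dp" else income
    # strength update: move the active classifier toward realized log utility
    strength[rule] += GAMMA * (math.log(c) - strength[rule])

# The thumb classifier only ever collects the high good-time payoff log(1.5),
# while the DP classifier's payoff log(1.0) = 0 pools good and bad times, so
# learning ranks the suboptimal rule of thumb first: excess sensitivity of
# consumption to transitory income.
```

The strength comparison conditions on the states in which each classifier fires, which is exactly why a strict asymptotic ordering of classifiers need not reproduce the dynamic programming solution.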
Publication status: Published - 1995
Series: CentER Discussion Paper