Bootstrap Thompson Sampling and sequential decision problems in the behavioral sciences

Dean Eckles*, Maurits Kaptein

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review


Abstract

Behavioral scientists are increasingly able to conduct randomized experiments in settings that enable rapidly updating probabilities of assignment to treatments (i.e., arms). Thus, many behavioral science experiments can be usefully formulated as sequential decision problems. This article reviews versions of the multiarmed bandit problem with an emphasis on behavioral science applications. One popular method for such problems is Thompson sampling, which is appealing because it randomizes assignment and is asymptotically consistent in selecting the best arm. Here, we show the utility of bootstrap Thompson sampling (BTS), which replaces the posterior distribution with the bootstrap distribution and often has computational and practical advantages. We illustrate its robustness to model misspecification, a common concern in behavioral science applications. We also show how BTS can be readily adapted to be robust to dependent data, such as repeated observations of the same units, which are likewise common in behavioral science. We use simulations to illustrate parametric Thompson sampling and BTS for Bernoulli bandits, factorial Gaussian bandits, and bandits with repeated observations of the same units.
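To make the contrast between the two methods concrete, here is a minimal sketch of parametric Thompson sampling with Beta posteriors and of bootstrap Thompson sampling for a Bernoulli bandit. The replicate-based "double-or-nothing" online bootstrap weighting, the function names (thompson_bernoulli, bts_bernoulli), and the arm probabilities in p_true are illustrative assumptions of this sketch, not necessarily the exact scheme or settings used in the article.

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_reward(arm, p_true):
    """Draw a Bernoulli reward for the chosen arm."""
    return rng.binomial(1, p_true[arm])

def thompson_bernoulli(p_true, n_rounds=2000):
    """Parametric Thompson sampling: Beta(1, 1) priors, exact posterior draws."""
    k = len(p_true)
    successes = np.ones(k)
    failures = np.ones(k)
    rewards = []
    for _ in range(n_rounds):
        # Sample one value from each arm's Beta posterior and play the argmax.
        theta = rng.beta(successes, failures)
        arm = int(np.argmax(theta))
        r = draw_reward(arm, p_true)
        successes[arm] += r
        failures[arm] += 1 - r
        rewards.append(r)
    return np.mean(rewards)

def bts_bernoulli(p_true, n_rounds=2000, J=100):
    """Bootstrap Thompson sampling: the posterior is replaced by an online
    bootstrap distribution kept as J replicates per arm (a sketch using
    double-or-nothing weights, an assumption of this example)."""
    k = len(p_true)
    succ = np.ones((k, J))          # replicate-level pseudo-successes
    trials = np.full((k, J), 2.0)   # replicate-level pseudo-trials
    rewards = []
    for _ in range(n_rounds):
        # Pick one bootstrap replicate per arm and play the arm whose
        # sampled replicate has the highest estimated success probability.
        j = rng.integers(J, size=k)
        means = succ[np.arange(k), j] / trials[np.arange(k), j]
        arm = int(np.argmax(means))
        r = draw_reward(arm, p_true)
        # Each replicate of the played arm includes the new observation
        # either 0 or 2 times (double-or-nothing reweighting).
        w = 2 * rng.integers(0, 2, size=J)
        succ[arm] += w * r
        trials[arm] += w
        rewards.append(r)
    return np.mean(rewards)

p_true = [0.10, 0.12, 0.20]  # hypothetical arm success probabilities
print("TS  mean reward:", thompson_bernoulli(p_true))
print("BTS mean reward:", bts_bernoulli(p_true))
```

Because each bootstrap replicate can be updated with simple reweighted counts (or reweighted stochastic-gradient steps for regression models), this kind of scheme avoids conjugate-posterior requirements, which is one reason the bootstrap version can be computationally and practically convenient.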
Original language: English
Number of pages: 12
Journal: SAGE Open
Volume: 9
Issue number: 2
DOIs
Publication status: Published - 2019

Keywords

  • Thompson sampling
  • bootstrapping
  • dependent data
  • model misspecification
  • multiarmed bandits
  • online learning
