StreamingBandit

Experimenting with bandit policies

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

A large number of statistical decision problems in the social sciences and beyond can be framed as a (contextual) multi-armed bandit problem. However, it is notoriously hard to develop and evaluate policies that tackle these types of problem, and to use such policies in applied studies. To address this issue, this paper introduces StreamingBandit, a Python web application for developing and testing bandit policies in field studies. StreamingBandit can sequentially select treatments using (online) policies in real time. Once StreamingBandit is implemented in an applied context, different policies can be tested, altered, nested, and compared. StreamingBandit makes it easy to apply a multitude of bandit policies for sequential allocation in field experiments, and allows for the quick development and re-use of novel policies. In this article, we detail the implementation logic of StreamingBandit and provide several examples of its use.
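
As a rough illustration of what "sequentially selecting treatments using an online policy" means, here is a minimal epsilon-greedy sketch in plain Python. It is not StreamingBandit's interface: the class `EpsilonGreedy`, the function `simulate`, the Bernoulli reward model, and all parameter values are assumptions made for this example only.

```python
import random


class EpsilonGreedy:
    """Illustrative epsilon-greedy policy (hypothetical; not StreamingBandit's API)."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of times each arm was chosen
        self.means = [0.0] * n_arms   # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        """Pick a treatment: explore with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.means)), key=self.means.__getitem__)

    def update(self, arm, reward):
        """Fold one observed reward into the arm's running mean (online update)."""
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(true_rates, n_rounds=5000, seed=0):
    """Run the policy against simulated Bernoulli arms (assumed reward model)."""
    policy = EpsilonGreedy(len(true_rates), epsilon=0.1, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(n_rounds):
        arm = policy.select()                                   # choose a treatment
        reward = 1.0 if env.random() < true_rates[arm] else 0.0 # observe the outcome
        policy.update(arm, reward)                              # learn from it
    return policy


if __name__ == "__main__":
    fitted = simulate([0.2, 0.5, 0.8])
    print("pulls per arm:", fitted.counts)
    print("estimated means:", [round(m, 2) for m in fitted.means])
```

The select/update split mirrors the two moments of a field experiment: a decision when a participant arrives, and a later update once the outcome is observed. The policy stores only a count and a running mean per arm, so it never needs to revisit past data.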
Original language: English
Journal: Journal of Statistical Software
Publication status: Accepted/In press - 2019

Fingerprint

Social sciences
Testing
Experiments
Multi-armed Bandit
Bandit Problems
Python
Field Experiment
Field Study
Social Sciences
Policy
Web Application
Decision problem
Reuse
Logic
Evaluate

Cite this

@article{55c9d68ce1694b668d221776549447f7,
title = "StreamingBandit: Experimenting with bandit policies",
abstract = "A large number of statistical decision problems in the social sciences and beyond can be framed as a (contextual) multi-armed bandit problem. However, it is notoriously hard to develop and evaluate policies that tackle these types of problem, and to use such policies in applied studies. To address this issue, this paper introduces StreamingBandit, a Python web application for developing and testing bandit policies in field studies. StreamingBandit can sequentially select treatments using (online) policies in real time. Once StreamingBandit is implemented in an applied context, different policies can be tested, altered, nested, and compared. StreamingBandit makes it easy to apply a multitude of bandit policies for sequential allocation in field experiments, and allows for the quick development and re-use of novel policies. In this article, we detail the implementation logic of StreamingBandit and provide several examples of its use.",
author = "Kruijswijk, Jules and {van Emden}, Robin and Kaptein, {Maurits C.} and Parvinen, P.",
year = "2019",
language = "English",
journal = "Journal of Statistical Software",
issn = "1548-7660",
publisher = "JOURNAL STATISTICAL SOFTWARE",

}

StreamingBandit: Experimenting with bandit policies. / Kruijswijk, Jules; van Emden, Robin; Kaptein, Maurits C.; Parvinen, P.

In: Journal of Statistical Software, 2019.

TY - JOUR

T1 - StreamingBandit

T2 - Experimenting with bandit policies

AU - Kruijswijk, Jules

AU - van Emden, Robin

AU - Kaptein, Maurits C.

AU - Parvinen, P.

PY - 2019

Y1 - 2019

N2 - A large number of statistical decision problems in the social sciences and beyond can be framed as a (contextual) multi-armed bandit problem. However, it is notoriously hard to develop and evaluate policies that tackle these types of problem, and to use such policies in applied studies. To address this issue, this paper introduces StreamingBandit, a Python web application for developing and testing bandit policies in field studies. StreamingBandit can sequentially select treatments using (online) policies in real time. Once StreamingBandit is implemented in an applied context, different policies can be tested, altered, nested, and compared. StreamingBandit makes it easy to apply a multitude of bandit policies for sequential allocation in field experiments, and allows for the quick development and re-use of novel policies. In this article, we detail the implementation logic of StreamingBandit and provide several examples of its use.

AB - A large number of statistical decision problems in the social sciences and beyond can be framed as a (contextual) multi-armed bandit problem. However, it is notoriously hard to develop and evaluate policies that tackle these types of problem, and to use such policies in applied studies. To address this issue, this paper introduces StreamingBandit, a Python web application for developing and testing bandit policies in field studies. StreamingBandit can sequentially select treatments using (online) policies in real time. Once StreamingBandit is implemented in an applied context, different policies can be tested, altered, nested, and compared. StreamingBandit makes it easy to apply a multitude of bandit policies for sequential allocation in field experiments, and allows for the quick development and re-use of novel policies. In this article, we detail the implementation logic of StreamingBandit and provide several examples of its use.

M3 - Article

JO - Journal of Statistical Software

JF - Journal of Statistical Software

SN - 1548-7660

ER -