Unbiased sampling of users from (online) activity data

Zack Almquist*, Sakshi Arya, Li Zeng, Emma Spiro

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review


Online platforms offer new opportunities to study human behavior. However, while social scientists are often interested in using behavioral trace data (data created by a user over the course of their everyday life) to draw inferences about users, many online platforms only allow data to be sampled based on user activities, leading to data sets that are biased toward highly active users. Here, we introduce a simple method for reweighting activity-based sample statistics in order to provide descriptive (and potentially model-based) estimates of the user population. We illustrate these techniques by applying them to a case study of an online fitness community (Strava) and use the case study to explore basic network properties. Last, we explore the weights' effect on model-based estimates for count data.
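
The reweighting idea can be illustrated with a minimal sketch. It is not taken from the article: it assumes that activities are sampled uniformly, so a user's chance of appearing in the sample is proportional to their total activity count, and it applies a standard inverse-probability (ratio) estimator under that assumption. The function name `reweighted_mean` and the toy data are hypothetical.

```python
def reweighted_mean(values, activity_counts):
    """Estimate a user-level mean from an activity-based sample.

    values[i]          : attribute of the user behind sampled activity i
    activity_counts[i] : that user's total number of activities (must be > 0)

    Assumption (not from the article): each activity is equally likely to be
    sampled, so a user is over-represented in proportion to their activity
    count. Weighting each sampled activity by 1 / activity_count offsets that.
    """
    if len(values) != len(activity_counts):
        raise ValueError("values and activity_counts must have equal length")
    if not values:
        raise ValueError("sample is empty")
    if any(a <= 0 for a in activity_counts):
        raise ValueError("activity counts must be positive")

    weights = [1.0 / a for a in activity_counts]
    weighted_sum = sum(w * y for w, y in zip(weights, values))
    return weighted_sum / sum(weights)


# Hypothetical population: user A logs 1 activity (attribute 10),
# user B logs 9 activities (attribute 20). The true user-level mean is 15.
# Sampling every activity once yields one row for A and nine rows for B.
sample_values = [10] + [20] * 9
sample_counts = [1] + [9] * 9

naive = sum(sample_values) / len(sample_values)            # pulled toward user B
adjusted = reweighted_mean(sample_values, sample_counts)   # close to the user-level mean
```

In this toy case the unweighted mean is dominated by the more active user, while the reweighted estimate recovers the per-user average. Whether the same weights suit a real platform depends on how that platform actually exposes activities for sampling.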
Original language: English
Pages (from-to): 23-38
Journal: Field Methods
Issue number: 1
Publication status: Published - 2019
Externally published: Yes

