Abstract
Nonprobability samples, for example observational studies, online opt-in surveys, or register data, do not come from a sampling design and therefore may suffer from selection bias. To correct for selection bias, Elliott and Valliant (EV) proposed a pseudo-weight estimation method that uses a two-sample setup: a probability sample and a nonprobability sample drawn from the same population that share some common auxiliary variables. By estimating the propensities of inclusion in the nonprobability sample given the two samples, we may correct the selection bias with (pseudo) design-based approaches. This paper expands the original method, allowing for large sampling fractions in either sample or for high expected overlap between selected units in each sample, conditions often present in administrative data sets and more frequently occurring with Big Data.
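
The two-sample pseudo-weighting idea summarised above can be illustrated with a minimal sketch. It covers only the basic setting with small sampling fractions and negligible overlap between the samples, not the extension developed in the paper. It assumes the pseudo-inclusion probability is taken proportional to the probability-sample inclusion probability times the estimated odds of belonging to the nonprobability sample in the stacked data. The simulated population, the single auxiliary variable `x`, the selection mechanism, and every name in the code (`fit_logistic`, `pseudo_w`, and so on) are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite population: one auxiliary variable x and a target y.
N = 200_000
x = rng.normal(size=N)
y = 2.0 + x + rng.normal(size=N)

# Probability (reference) sample: simple random sample, known pi_p = n_p / N.
n_p = 2_000
idx_p = rng.choice(N, size=n_p, replace=False)
pi_p = n_p / N

# Nonprobability sample: inclusion depends on x; the analyst does not know
# this mechanism (assumed logistic here purely to generate data).
true_prop = 1.0 / (1.0 + np.exp(-(-4.5 + 0.8 * x)))
idx_np = np.flatnonzero(rng.random(N) < true_prop)


def fit_logistic(X, z, n_iter=25):
    """Plain Newton-Raphson logistic regression; returns the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (z - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


# Stack both samples: Z = 1 marks nonprobability units, Z = 0 reference units.
x_stack = np.concatenate([x[idx_np], x[idx_p]])
z = np.concatenate([np.ones(idx_np.size), np.zeros(n_p)])
X = np.column_stack([np.ones_like(x_stack), x_stack])
beta = fit_logistic(X, z)

# Pseudo-inclusion probability taken proportional to pi_p * odds(Z = 1 | x);
# the pseudo-weight is its inverse.
X_np = np.column_stack([np.ones(idx_np.size), x[idx_np]])
odds = np.exp(X_np @ beta)
pseudo_w = 1.0 / (pi_p * odds)

# Compare the unweighted mean with a normalised (Hajek-type) weighted mean.
naive_mean = y[idx_np].mean()
weighted_mean = np.sum(pseudo_w * y[idx_np]) / np.sum(pseudo_w)
pop_mean = y.mean()

print(f"population mean      : {pop_mean:.3f}")
print(f"naive nonprob. mean  : {naive_mean:.3f}")
print(f"pseudo-weighted mean : {weighted_mean:.3f}")
```

Because the weighted mean is normalised by the sum of the weights, only the relative size of the pseudo-weights matters in this sketch. The odds-based approximation relies on small inclusion probabilities and on units rarely appearing in both samples, which are the conditions the paper relaxes.
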
Original language | English |
---|---|
Pages (from-to) | 1181-1203 |
Journal | Journal of Survey Statistics and Methodology |
Volume | 11 |
Issue number | 5 |
Publication status | Published - 2023 |
Keywords
- Big Data
- Nonprobability sample
- Propensity score
- Pseudo population bootstrap
- Selection bias