TY - UNPB

T1 - Two-Step Sequential Sampling

AU - Moors, J.J.A.

AU - Strijbosch, L.W.G.

N1 - Pagination: 36

PY - 2000

Y1 - 2000

N2 - Deciding upon the optimal sample size in advance is a difficult problem in general. Often, the investigator regrets not having drawn a larger sample; in many cases additional observations are made. This implies that the actual sample size is no longer deterministic; hence, even if all sample elements are drawn at random, the final sample is not a simple random sample. Although this fact is widely recognized, its consequences are often grossly underrated in our view. Too often, these consequences are ignored: the usual statistical procedures are still applied. This paper shows in detail the dangers of applying standard techniques to extended samples. To allow theoretical derivations, only some elementary situations are considered. More precisely, the following features hold throughout the paper: - the population variable of interest is normally distributed; - estimation concerns population mean and variance; - all sample elements are drawn at random, with replacement; - only standard estimators, like sample mean and sample variance, will be considered. Nevertheless, the results are rather disturbing: standard estimators have sizable biases, their variances are (much) larger than usual, and standard confidence intervals no longer have the prescribed confidence level. Crucial is of course the criterion used to decide whether or not to extend the original sample. Four criteria are applied. In the first three cases, an independent event, the observed sample mean and the observed sample variance, respectively, determine whether or not to double the original sample size. The fourth criterion compares the variances observed in two independent samples; the sample with the higher variance is extended. Only in this fourth case is the size of the extension a random variable. Note that a given criterion is used only once: after the original observations the final sample size is determined; hence the title of our paper.

AB - Deciding upon the optimal sample size in advance is a difficult problem in general. Often, the investigator regrets not having drawn a larger sample; in many cases additional observations are made. This implies that the actual sample size is no longer deterministic; hence, even if all sample elements are drawn at random, the final sample is not a simple random sample. Although this fact is widely recognized, its consequences are often grossly underrated in our view. Too often, these consequences are ignored: the usual statistical procedures are still applied. This paper shows in detail the dangers of applying standard techniques to extended samples. To allow theoretical derivations, only some elementary situations are considered. More precisely, the following features hold throughout the paper: - the population variable of interest is normally distributed; - estimation concerns population mean and variance; - all sample elements are drawn at random, with replacement; - only standard estimators, like sample mean and sample variance, will be considered. Nevertheless, the results are rather disturbing: standard estimators have sizable biases, their variances are (much) larger than usual, and standard confidence intervals no longer have the prescribed confidence level. Crucial is of course the criterion used to decide whether or not to extend the original sample. Four criteria are applied. In the first three cases, an independent event, the observed sample mean and the observed sample variance, respectively, determine whether or not to double the original sample size. The fourth criterion compares the variances observed in two independent samples; the sample with the higher variance is extended. Only in this fourth case is the size of the extension a random variable. Note that a given criterion is used only once: after the original observations the final sample size is determined; hence the title of our paper.

KW - sampling

KW - bias

M3 - Discussion paper

VL - 2000-39

T3 - CentER Discussion Paper

BT - Two-Step Sequential Sampling

PB - Econometrics

CY - Tilburg

ER -