Given n independent random vectors with common density f on ℝ^d, we study the weak convergence of three empirical-measure-based estimators of the convex λ-level set L_λ of f, namely the excess mass set, the minimum volume set and the maximum probability set, all selected from a class of convex sets A that contains L_λ. Since these set-valued estimators approach L_λ, even the formulation of their weak convergence is non-standard. We identify the joint limiting distribution of the symmetric difference of L_λ and each of the three estimators, at rate n^(-1/3). It turns out that the minimum volume set and the maximum probability set estimators are asymptotically indistinguishable, whereas the excess mass set estimator exhibits "richer" limit behavior. Arguments rely on the boundary local empirical process, its cylinder representation, dimension-free concentration around the boundary of L_λ, and the set-valued argmax of a drifted Wiener process.
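To make the three estimators concrete, here is a minimal one-dimensional sketch, assuming the class A is the set of closed intervals, so that "volume" is interval length and the empirical measure P_n(A) is the fraction of sample points in A. The definitions coded here are the usual textbook forms (maximize P_n(A) − λ·length(A); minimize length subject to P_n(A) ≥ p; maximize P_n(A) subject to length(A) ≤ v) and are an assumption, since the abstract does not spell them out. The function names and the calibration inputs `p` and `v` are hypothetical, and the searches are restricted to intervals with sample-point endpoints.

```python
import math
from bisect import bisect_right


def excess_mass_interval(x, lam):
    """Interval [a, b] maximizing P_n([a, b]) - lam * (b - a).

    Brute-force O(n^2) search over pairs of order statistics; illustrative only.
    """
    xs = sorted(x)
    n = len(xs)
    best, best_score = None, -math.inf
    for i in range(n):
        for j in range(i, n):
            score = (j - i + 1) / n - lam * (xs[j] - xs[i])
            if score > best_score:
                best, best_score = (xs[i], xs[j]), score
    return best


def minimum_volume_interval(x, p):
    """Shortest interval with empirical mass at least p (assumes 0 < p <= 1)."""
    xs = sorted(x)
    n = len(xs)
    k = max(1, math.ceil(n * p))  # number of sample points the interval must hold
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def maximum_probability_interval(x, v):
    """Interval of length at most v holding the most sample points (assumes v >= 0)."""
    xs = sorted(x)
    n = len(xs)

    def last_covered(i):
        # Index of the right-most sample point within distance v of xs[i].
        return bisect_right(xs, xs[i] + v) - 1

    start = max(range(n), key=lambda i: last_covered(i) - i)
    return xs[start], xs[last_covered(start)]


sample = [0.0, 1.0, 1.25, 1.5, 2.5, 5.0]
print(excess_mass_interval(sample, lam=0.5))        # (1.0, 1.5)
print(minimum_volume_interval(sample, p=0.5))       # (1.0, 1.5)
print(maximum_probability_interval(sample, v=0.5))  # (1.0, 1.5)
```

On this toy sample all three rules select the same interval. That is a property of the chosen inputs, not an illustration of the asymptotic results stated in the abstract, which concern the n^(-1/3)-scaled symmetric difference with L_λ in general dimension d.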
|Journal||The Annals of Statistics|
|Publication status||Accepted/In press - Dec 2021|