Mathematics & Statistics
Browsing Mathematics & Statistics by Author "Gabriela Gonzalez Martinez"
Bandwidth Selection for Level Set Estimation in the Context of Regression and a Simulation Study for Non Parametric Level Set Estimation When the Density Is Log-Concave (2022-08-08)
Gonzalez Martinez, Gabriela; Jankowski, Hanna
Open Access
Bandwidth selection is critical for kernel estimation because it controls the amount of smoothing applied to a function's estimator. Traditional methods for bandwidth selection optimize a global loss function (e.g., least squares cross validation, asymptotic mean integrated squared error). A global loss function, however, is suboptimal for the level set estimation problem, which is local in nature. For a function $g$, the level set is the set $LS_\lambda = \{x : g(x) \geq \lambda\}$. In the first part of this thesis we study optimal bandwidth selection for the Nadaraya-Watson kernel estimator in one dimension. We present a local loss function as an alternative to the $L_2$ metric and derive an asymptotic approximation of its corresponding risk. The level set optimal bandwidth ($h_{opt}$) is the argument that minimizes this asymptotic approximation. We show that the rate of $h_{opt}$ coincides with the rate of traditional global bandwidth selectors. We then derive an algorithm to obtain the practical bandwidth and study its performance through simulations. Our simulation results show that, in general, for small samples and small levels, the level set optimal bandwidth improves estimation of the level set compared with cross validation bandwidth selection or the local polynomial kernel estimator. We illustrate the new bandwidth selector on a decompression sickness study of the effects of duration and pressure on mortality during a dive. In the second part, motivated by our simulation findings and the relationship of level set estimation to the highest density region (HDR) problem, we study via simulations the properties of a plug-in estimator where the density is estimated with a log-concave mixed model.
We focus in particular on univariate densities and compare this method against a kernel plug-in estimator. The bandwidth for the kernel plug-in estimator is chosen optimally for the HDR problem. We observe through simulations that when the number of components in the model is correctly specified, the log-concave plug-in estimator outperforms the kernel estimator at lower levels and performs similarly at the remaining levels considered. We conclude with an analysis of the daily maximum temperatures in Melbourne, Australia.
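
To make the level set definition from the first part of the abstract concrete, the following is a minimal sketch of a plug-in level set estimate built on a one-dimensional Nadaraya-Watson estimator. It is not the thesis's method: the Gaussian kernel, the simulated data, the fixed bandwidth `h=0.05`, the level `lam=0.5`, and the function names `nadaraya_watson` and `plug_in_level_set` are all assumptions made for illustration, and the bandwidth is hand-picked rather than the level set optimal bandwidth $h_{opt}$.

```python
import numpy as np

def nadaraya_watson(x_eval, x, y, h):
    """Nadaraya-Watson estimate of g(x) = E[Y | X = x] with a Gaussian kernel (assumed)."""
    x_eval = np.asarray(x_eval, dtype=float)
    # Kernel weights: rows index evaluation points, columns index observations.
    u = (x_eval[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)
    # Weighted average of the responses at each evaluation point.
    return (w @ y) / w.sum(axis=1)

def plug_in_level_set(grid, x, y, h, lam):
    """Grid points where the estimated regression function is at least lam."""
    g_hat = nadaraya_watson(grid, x, y, h)
    return grid[g_hat >= lam]

# Simulated data (hypothetical): g(x) = sin(pi * x) observed with Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
y = np.sin(np.pi * x) + rng.normal(scale=0.1, size=200)

grid = np.linspace(0.0, 1.0, 101)
# h is chosen by hand here; the thesis instead selects it by minimizing a local risk.
level_set = plug_in_level_set(grid, x, y, h=0.05, lam=0.5)
print(level_set.min(), level_set.max())
```

In this toy setup the true level set at $\lambda = 0.5$ is the interval $[1/6, 5/6]$, so the printed endpoints give a rough sense of how the bandwidth affects the estimated boundary; rerunning with a larger or smaller `h` shows why a bandwidth tuned to a global loss need not be the best choice near the boundary.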