We propose the randomized Generalized Approximate Cross Validation
(ranGACV) method for choosing multiple smoothing parameters in penalized
likelihood estimates for Bernoulli data. The method is intended for application
with penalized likelihood smoothing spline ANOVA models. In addition we propose
a class of approximate numerical methods for solving the penalized likelihood
variational problem which, in conjunction with the ranGACV method, allows
the application of smoothing spline ANOVA models with Bernoulli data to much
larger data sets than previously possible. These methods are based on choosing
an approximating subset of the natural (representer) basis functions for the
variational problem. Simulation studies with synthetic data, including
synthetic data mimicking demographic risk factor data sets, are used to examine
the properties of the method and to compare the approach with the GRKPACK code
of Wang (1997c). Bayesian “confidence intervals” are obtained for
the fits and are shown in the simulation studies to have the “across the
function” property usually claimed for these confidence intervals.
Finally, the method is applied to an observational data set from the Beaver Dam
Eye Study, with scientifically interesting results.
@article{1015957471,
author = {Lin, Xiwu and Wahba, Grace and Xiang, Dong and Gao, Fangyu and Klein, Ronald and Klein, Barbara},
title = {Smoothing spline {ANOVA} models for large data sets with {B}ernoulli
observations and the randomized {GACV}},
journal = {Ann. Statist.},
volume = {28},
number = {3},
year = {2000},
pages = {1570--1600},
language = {en},
url = {http://dml.mathdoc.fr/item/1015957471}
}
Lin, Xiwu; Wahba, Grace; Xiang, Dong; Gao, Fangyu; Klein, Ronald; Klein, Barbara. Smoothing spline ANOVA models for large data sets with Bernoulli
observations and the randomized GACV. Ann. Statist., Vol. 28 (2000) no. 3, pp. 1570-1600. http://gdmltest.u-ga.fr/item/1015957471/