Subset Selection in Regression - Hardcover

Book 13 of 110: ISSN

Miller, Alan

 
9780412353802: Subset Selection in Regression

Synopsis

Most scientific computing packages contain facilities for stepwise regression and often for 'all subsets' and other techniques for finding 'best-fitting' subsets of regression variables. The application of standard theory can be very misleading in such cases when the model has not been chosen a priori, but from the data. There is widespread awareness that considerable over-fitting occurs and that prediction equations obtained after extensive 'data dredging' often perform poorly when applied to new data.
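The over-fitting described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not an example taken from the book: the response and all candidate predictors are pure noise, the single best-fitting predictor is picked from the data, and its fit is then checked on fresh data. The function names (`fit_line`, `r_squared`, `dredge_noise`) and the sample sizes are hypothetical choices made for illustration.

```python
import random


def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y, intercept, slope):
    """1 - SSE/SST for the given line; can be negative on new data."""
    mean_y = sum(y) / len(y)
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - sse / sst


def dredge_noise(n_obs=30, n_predictors=50, seed=0):
    """Pick the best of many pure-noise predictors, then test it on new data.

    Hypothetical setup: no predictor has any real relation to the response.
    Returns (R^2 on the fitting data, R^2 on new data) for the chosen predictor.
    """
    rng = random.Random(seed)

    def noise():
        return [rng.gauss(0.0, 1.0) for _ in range(n_obs)]

    y_train, y_new = noise(), noise()
    x_train = [noise() for _ in range(n_predictors)]
    x_new = [noise() for _ in range(n_predictors)]

    # Fit every candidate on the training data and keep the best-fitting one.
    fits = [fit_line(x, y_train) for x in x_train]
    train_r2 = [r_squared(x, y_train, a, b) for x, (a, b) in zip(x_train, fits)]
    best = max(range(n_predictors), key=train_r2.__getitem__)

    # Apply the selected prediction equation, unchanged, to new data.
    a, b = fits[best]
    return train_r2[best], r_squared(x_new[best], y_new, a, b)


if __name__ == "__main__":
    train, new = dredge_noise()
    print(f"R^2 on fitting data: {train:.3f}, on new data: {new:.3f}")
```

Because the predictor is chosen for its fit to one particular sample, its apparent R^2 on that sample tends to be well above what the same equation achieves on new data, even though no real relationship exists.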

This monograph relates almost entirely to least-squares methods of finding and fitting subsets of regression variables, though most of the concepts are presented in terms of the interpretation and statistical properties of orthogonal projections. An early chapter introduces these methods, which are still not widely known to users of least-squares methods.

Existing methods are described for testing whether any useful improvement can be obtained by using any of a set of predictors. Spjotvoll's method for comparing two arbitrary subsets of predictor variables is illustrated and described in detail.

When the selected model is the 'best-fitting' in some sense, conventional fitting methods give estimates of regression coefficients which are usually biased in the direction of being too large. The extent of this bias is demonstrated for simple cases. Various ad hoc methods for correcting the bias are discussed (ridge regression, James-Stein shrinkage, jack-knifing, etc.), together with the author's maximum likelihood technique. Areas in which further research is needed are also outlined.
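A minimal sketch of this selection bias in one simple case, under assumptions that are not from the book: several orthogonal predictors share the same true coefficient, so their least-squares estimates can be treated as independent normal draws around that value, and the predictor with the largest estimate in absolute value is the one selected. The function name `mean_selected_estimate` and all default values are hypothetical.

```python
import random
import statistics


def mean_selected_estimate(true_beta=1.0, std_err=0.5, n_candidates=5,
                           n_trials=20000, seed=0):
    """Average coefficient estimate of the predictor chosen as 'best'.

    Assumed setup (illustrative only): each candidate's estimate is an
    independent draw from N(true_beta, std_err^2), and the candidate with
    the largest absolute estimate is selected in every trial.
    """
    rng = random.Random(seed)
    picked = []
    for _ in range(n_trials):
        estimates = [rng.gauss(true_beta, std_err) for _ in range(n_candidates)]
        picked.append(max(estimates, key=abs))
    return statistics.fmean(picked)


if __name__ == "__main__":
    no_selection = mean_selected_estimate(n_candidates=1)
    with_selection = mean_selected_estimate(n_candidates=5)
    print(f"mean estimate, no selection:   {no_selection:.3f}")
    print(f"mean estimate, best of five:   {with_selection:.3f}")
```

With a single candidate there is no selection and the average estimate stays close to the true coefficient. Once the largest of several estimates is kept, the average of the retained estimate moves away from the true value in the direction of being too large, which is the effect the shrinkage-type corrections mentioned above try to offset.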

The information provided in the "Synopsis" section may refer to another edition of this title.


Other popular editions of the same title

9781584881711: Subset Selection in Regression

Featured edition

ISBN 10: 1584881712 ISBN 13: 9781584881711
Publisher: Chapman & Hall/CRC, 2002
Hardcover