[R] cross validation and parameter determination

Ramon Diaz-Uriarte rdiaz at cnio.es
Wed Apr 20 09:56:29 CEST 2005


On Wednesday 20 April 2005 00:17, array chip wrote:
> Hi all,
>
> In Tibshirani's PNAS paper about nearest shrunken
> centroid analysis of microarrays (PNAS vol 99:6567),
> they used cross validation to choose the amount of
> shrinkage used in the model, and then test the
> performance of the model with the cross-validated
> shrinkage in separate independent testing set. If I
> don't have the luxury of having independent testing
> set, can I just use the cross validation performance
> as the performance estimate? In other words, can I use
> the same single cross-validation to both choose the
> value of the parameter (amount of shrinkage in this
> case) and estimate the performance which was based on
> the value of the parameter chosen by the same
> cross-validation? I kind of feel awkward by getting
> both on a single cross validation, because it seems
> like I used the dataset in training set manner. Am I
> wrong/right?


That error rate is probably optimistic because, as you say, the dataset has
in effect been used as a training set: the same cross-validation both chose
the amount of shrinkage and estimated the error.

However, you can easily wrap the whole pam procedure within an outer loop of 
cross-validation or bootstrap. You should then assess the error rate of your 
whole procedure, parameter selection included. (This problem is not that 
different from, say, using knn and selecting k by cross-validation, or 
selecting the number of genes to use by cross-validation.)
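
To make the outer-loop idea concrete, here is a minimal sketch using the knn
analogy rather than pam itself. It is not the pamr API: the synthetic
one-feature data, the candidate values of k, the fold counts, and all function
names (knn_predict, folds, cv_error, select_k, nested_cv_error) are
assumptions made up for illustration. The point is only that the selection of
k is repeated inside every outer training fold, so the held-out fold never
influences the tuning.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def folds(data, n_folds):
    """Yield (train, test) splits; data is assumed to be shuffled already."""
    for i in range(n_folds):
        test = data[i::n_folds]
        train = [p for j, p in enumerate(data) if j % n_folds != i]
        yield train, test

def cv_error(data, k, n_folds):
    """Plain cross-validated error rate of knn for a fixed k."""
    wrong = 0
    for train, test in folds(data, n_folds):
        wrong += sum(knn_predict(train, x, k) != y for x, y in test)
    return wrong / len(data)

def select_k(data, candidates, n_folds):
    """Inner loop: pick the k with the smallest cross-validated error."""
    return min(candidates, key=lambda k: cv_error(data, k, n_folds))

def nested_cv_error(data, candidates, n_folds):
    """Outer loop: re-run the whole selection inside each training fold."""
    wrong = 0
    for train, test in folds(data, n_folds):
        k = select_k(train, candidates, n_folds)  # tuning sees only `train`
        wrong += sum(knn_predict(train, x, k) != y for x, y in test)
    return wrong / len(data)

rng = random.Random(0)
# Synthetic two-class data (an assumption): class 1 is shifted by one SD.
data = [(rng.gauss(y, 1.0), y) for y in [0, 1] * 40]
rng.shuffle(data)
candidates = [1, 3, 5, 7, 9]

best_k = select_k(data, candidates, 5)
single_cv = cv_error(data, best_k, 5)  # same CV that picked k: tends to be optimistic
nested = nested_cv_error(data, candidates, 5)  # error of the whole procedure
print(best_k, single_cv, nested)
```

On any one dataset the two numbers can fall either way; the optimism of the
single cross-validation is a bias on average, not a guarantee for each run.
The nested figure is the one to report as the error rate of the procedure.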

R.
>
> Thanks!

-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)







