[R] how to tell if its better to standardize your data matrix first when you do principal

masterinex xevilgang79 at hotmail.com
Mon Nov 23 02:53:41 CET 2009


Hi Hadley,

I really appreciate the suggestions you gave. They were helpful, but I still
didn't quite get it all. I really want to do a good job, so any further
comments would be very helpful. Please bear with me.





hadley wrote:
> 
> You've asked the same question on stackoverflow.com and received the
> same answer.  This is rude because it duplicates effort.  If you
> urgently need a response to a question, perhaps you should consider
> paying for it.
> 
> Hadley
> 
> On Sun, Nov 22, 2009 at 12:04 PM, masterinex <xevilgang79 at hotmail.com>
> wrote:
>>
>> So in which cases is it better to standardize the data matrix first?
>> Also, is PCA generally used to predict the response variable? Should I
>> keep that variable in my data matrix?
>>
>>
>> Uwe Ligges-3 wrote:
>>>
>>> masterinex wrote:
>>>>
>>>>
>>>> Hi guys,
>>>>
>>>> I'm trying to do principal component analysis in R. There are two ways
>>>> of doing it, I believe. One is to run principal component analysis right
>>>> away; the other is to standardize the matrix first using s = scale(m)
>>>> and then apply principal component analysis.
>>>> How do I tell which result is better? What values in particular should I
>>>> look at? I have already found the eigenvalues and eigenvectors, and the
>>>> proportion of variance for each eigenvector, using both methods.
>>>>
>>>
>>> Generally, it is better to standardize. But in some cases, e.g. when your
>>> variables are in the same units and their scale also indicates their
>>> importance, it might make sense not to do so.
>>> You should think about the analysis: you cannot know which result is
>>> `better' unless you know an interpretation.
>>>
>>>
>>>
>>>> I noticed that the proportion of variance for the first principal
>>>> component was larger without standardizing. Is there a meaning to it?
>>>> Isn't this always the case?
>>>> Lastly, if I am supposed to predict a variable, i.e. weight, should I
>>>> drop that variable from my data matrix when I do principal component
>>>> analysis?
>>>
>>>
>>> This sounds a bit like homework. If that is the case, please ask your
>>> teacher rather than this list.
>>> Anyway, it does not make sense to predict weight using a linear
>>> combination (principal component) that contains weight, does it?
>>>
>>> Uwe Ligges
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/how-to-tell-if-its-better-to-standardize-your-data-matrix-first-when-you-do-principal-tp26462070p26466400.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> http://had.co.nz/
> 
> 
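For anyone following the scaling question in the thread above, here is a minimal sketch of the comparison. It assumes the built-in USArrests data as a stand-in for the poster's matrix m (which is not shown in the thread); the helpers prop_var and share1 and the matrix S are made-up names for illustration only. It runs prcomp() both ways, compares the proportion of variance per component, and then uses a small constructed covariance matrix to suggest that a larger first-component share without standardizing is not guaranteed once there are three or more variables.

```r
# USArrests ships with base R; its columns are on very different scales.
m <- as.matrix(USArrests)

pca_raw    <- prcomp(m, center = TRUE, scale. = FALSE)  # covariance-based PCA
pca_scaled <- prcomp(m, center = TRUE, scale. = TRUE)   # correlation-based PCA

# Proportion of variance explained by each component (hypothetical helper).
prop_var <- function(p) p$sdev^2 / sum(p$sdev^2)
print(round(prop_var(pca_raw), 3))
print(round(prop_var(pca_scaled), 3))

# Column variances and the unscaled first loading vector: without scaling,
# the first component mostly tracks the highest-variance column.
print(apply(m, 2, var))
print(pca_raw$rotation[, 1])

# scale. = TRUE gives the same eigenvalues as the correlation matrix,
# and the same result as calling prcomp() on scale(m).
print(all.equal(pca_scaled$sdev^2, eigen(cor(m))$values))
print(all.equal(pca_scaled$sdev, prcomp(scale(m))$sdev))

# Constructed covariance matrix (not real data): two strongly correlated
# unit-variance variables plus one independent variable with variance 1.9.
S <- matrix(c(1.0, 0.9, 0.0,
              0.9, 1.0, 0.0,
              0.0, 0.0, 1.9), nrow = 3, byrow = TRUE)

# Share of total variance carried by the largest eigenvalue (hypothetical helper).
share1 <- function(M) {
  ev <- eigen(M, symmetric = TRUE)$values
  ev[1] / sum(ev)
}
print(share1(S))           # without standardizing
print(share1(cov2cor(S)))  # after standardizing: larger in this example
```

In USArrests the unscaled first component should be dominated by Assault, simply because that column has by far the largest variance. Whether that is meaningful depends on the interpretation, as Uwe says.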
> 
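On the second question in the thread (whether to keep the response in the matrix), the following is a rough sketch of one common approach, principal component regression, and not something anyone in the thread prescribed. It assumes the built-in mtcars data with wt (weight) as the response, and k = 2 components is an arbitrary choice made only for illustration. The response is dropped before PCA, and the component scores are then used as regressors.

```r
# mtcars ships with base R; treat wt (weight) as the response to predict.
predictors <- mtcars[, setdiff(names(mtcars), "wt")]

# PCA on the predictors only, standardized because the columns have mixed units.
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

# Keep the first k component scores as regressors (k = 2 is arbitrary here).
k <- 2
dat <- data.frame(wt = mtcars$wt, pca$x[, 1:k])
fit <- lm(wt ~ ., data = dat)
print(summary(fit)$r.squared)

# New observations are projected with the same centering, scaling and
# rotation via predict(); the first three rows are reused as a demo.
new_scores <- as.data.frame(
  predict(pca, newdata = predictors[1:3, ])[, 1:k, drop = FALSE]
)
print(predict(fit, newdata = new_scores))
```

Because wt never enters the PCA, none of the components contains the variable being predicted, which is the point Uwe makes above.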

-- 
View this message in context: http://old.nabble.com/how-to-tell-if-its-better-to-standardize-your-data-matrix-first-when-you-do-principal-tp26462070p26471673.html
Sent from the R help mailing list archive at Nabble.com.



