[R] how to tell if its better to standardize your data matrix first when you do principal

Uwe Ligges ligges at statistik.tu-dortmund.de
Mon Nov 23 10:12:50 CET 2009



masterinex wrote:
> Hi Hadley,
> 
> I really appreciate the suggestions you gave. They were helpful, but I still
> didn't quite get it all, and I really want to do a good job, so any
> comments would be helpful. Please understand me.


Well, we try to understand you, but we do not quite get it either. I think
you really need to consult a statistics textbook on PCA if my answer was
not sufficient. Given your questions, I doubt you understand what PCA does
and how it works. It does not predict anything.
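
To make the standardization question concrete, here is a minimal sketch in
R. It uses the built-in USArrests data purely as a stand-in (it is not the
original poster's data), and the object names m, pca_raw, pca_std, prop_raw
and prop_std are made up for the illustration. It runs prcomp() once on the
raw data and once on standardized data and compares the proportion of
variance carried by each component.

```r
# Illustrative data only: USArrests ships with R (package 'datasets').
m <- as.matrix(USArrests)

# PCA on centred but unscaled data, i.e. based on the covariance matrix.
pca_raw <- prcomp(m, center = TRUE, scale. = FALSE)

# PCA on standardized data, i.e. based on the correlation matrix.
# This gives the same component variances as prcomp(scale(m)).
pca_std <- prcomp(m, center = TRUE, scale. = TRUE)

# Proportion of total variance per principal component.
prop_raw <- pca_raw$sdev^2 / sum(pca_raw$sdev^2)
prop_std <- pca_std$sdev^2 / sum(pca_std$sdev^2)

# Variances of the original columns: very unequal variances mean the
# unscaled PCA is driven mostly by the high-variance column.
print(apply(m, 2, var))

# Side-by-side comparison of the two analyses.
props <- rbind(raw = prop_raw, standardized = prop_std)
colnames(props) <- paste0("PC", seq_len(ncol(props)))
print(round(props, 3))
```

In this data set the unstandardized first component takes the larger share,
because one column (Assault) has a much larger variance than the others and
dominates the covariance matrix. That reflects the units of measurement; it
does not make either result `better', and it is not guaranteed to happen for
every data set.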

Uwe Ligges



> hadley wrote:
>> You've asked the same question on stackoverflow.com and received the
>> same answer.  This is rude because it duplicates effort.  If you
>> urgently need a response to a question, perhaps you should consider
>> paying for it.
>>
>> Hadley
>>
>> On Sun, Nov 22, 2009 at 12:04 PM, masterinex <xevilgang79 at hotmail.com>
>> wrote:
>>> So in which cases is it better to standardize the data matrix first?
>>> Also, is PCA generally used to predict the response variable, and should
>>> I keep that variable in my data matrix?
>>>
>>>
>>> Uwe Ligges-3 wrote:
>>>> masterinex wrote:
>>>>>
>>>>> Hi guys,
>>>>>
>>>>> I'm trying to do principal component analysis in R. There are two ways
>>>>> of doing it, I believe. One is doing principal component analysis right
>>>>> away; the other is standardizing the matrix first using s = scale(m)
>>>>> and then applying principal component analysis.
>>>>> How do I tell which result is better? What values in particular should
>>>>> I look at? I already managed to find the eigenvalues and eigenvectors,
>>>>> and the proportion of variance for each eigenvector, using both methods.
>>>>>
>>>> Generally, it is better to standardize. But in some cases, e.g. when all
>>>> variables are measured in the same units and their variances also
>>>> indicate their importance, it might make sense not to do so.
>>>> You should think about the analysis: you cannot know which result is
>>>> `better' unless you have an interpretation.
>>>>
>>>>
>>>>
>>>>> I noticed that the proportion of variance for the first principal
>>>>> component was larger without standardizing. Is there a meaning to it?
>>>>> Isn't this always the case?
>>>>> Lastly, if I am supposed to predict a variable, e.g. weight, should I
>>>>> drop that variable from my data matrix when I do principal component
>>>>> analysis?
>>>>
>>>> This sounds a bit like homework. If that is the case, please ask your
>>>> teacher rather than this list.
>>>> Anyway, it does not make sense to predict weight using a linear
>>>> combination (principal component) that contains weight, does it?
>>>>
>>>> Uwe Ligges
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>
>>
>> -- 
>> http://had.co.nz/
>>
>



