[R] prcomp - arbitrary direction of the returned principal components

Ashim Kapoor
Thu Oct 13 06:28:25 CEST 2022


Dear Aaron,

Many thanks for your reply.

Please allow me to illustrate my query a bit.

I take some data, pass it to prcomp, and extract the x matrix from the result.

From ?prcomp:

       x: if ‘retx’ is true the value of the rotated data (the centred
          (and scaled if requested) data multiplied by the ‘rotation’
          matrix) is returned.  Hence, ‘cov(x)’ is the diagonal matrix
          ‘diag(sdev^2)’.  For the formula method, ‘napredict()’ is
          applied to handle the treatment of values omitted by the
          ‘na.action’.
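
To make the quoted passage concrete, here is a minimal sketch. It uses the built-in USArrests data set purely as a stand-in (an assumption; it is not my market data). It checks that cov(x) is diag(sdev^2), and that flipping the sign of the first rotation column gives scores with the opposite sign but the same variance, which is why the sign is arbitrary.

```r
# Stand-in data: USArrests ships with R (assumption, not the real index data).
pc <- prcomp(USArrests, scale. = TRUE)

# cov(x) is diagonal, with the component variances sdev^2 on the diagonal.
cov_is_diag <- all.equal(unname(cov(pc$x)), diag(pc$sdev^2))

# Flip the sign of the first column of the rotation matrix by hand.
flipped <- pc$rotation
flipped[, 1] <- -flipped[, 1]

# Project the centred and scaled data onto the flipped rotation.
x_flipped <- scale(USArrests) %*% flipped

# The PC1 scores change sign, but their variance is unchanged,
# so both orientations are equally valid results.
scores_negated <- all.equal(unname(x_flipped[, 1]), unname(-pc$x[, 1]))
same_variance  <- all.equal(var(x_flipped[, 1]), pc$sdev[1]^2)
```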

I consider x[,1] as my index. This makes sense as x[,1] is the
projection of the data on the FIRST principal component.
Now this x[,1] can be a high +ve number or a low -ve number. I can't
ignore the sign.

If I ignore the sign by taking the absolute value, the HIGH / LOW
stress values will be indistinguishable.

Hence I do not think using absolute values of x[,1] is the solution.
Yes it will make the results REPRODUCIBLE but that will be at the cost
of losing information.
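
A rough sketch of what I mean, again on the stand-in USArrests data. The helper orient_pc1() and the choice of "Murder" as reference variable are hypothetical, made up for illustration; prcomp itself offers nothing like this as far as I know. The idea sketched is a sign convention: flip PC1 whenever its loading on a chosen reference variable is negative, so the ordering of the scores is kept, unlike with abs().

```r
pc <- prcomp(USArrests, scale. = TRUE)
index_raw <- pc$x[, 1]

# abs() discards the ordering: the most negative and the most positive
# observations both end up looking "high".
index_abs <- abs(index_raw)

# Hypothetical helper: orient PC1 so that its loading on `reference`
# is positive. Only the sign changes; no information is lost.
orient_pc1 <- function(pc, reference) {
  s <- sign(pc$rotation[reference, 1])
  if (s == 0) s <- 1                      # loading exactly zero: leave as is
  pc$rotation[, 1] <- s * pc$rotation[, 1]
  pc$x[, 1]        <- s * pc$x[, 1]
  pc
}

pc_fixed <- orient_pc1(pc, "Murder")
index <- pc_fixed$x[, 1]

# Simulate another build of R returning the opposite sign for PC1.
pc_other <- pc
pc_other$rotation[, 1] <- -pc$rotation[, 1]
pc_other$x[, 1]        <- -pc$x[, 1]

# After orientation both give the same index.
same_index <- identical(orient_pc1(pc_other, "Murder")$x[, 1], index)
```

For a stress index the reference would be whichever input variable is known to rise under stress; that choice is an assumption the analyst has to make and document.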

Any other ideas?

Many thanks,
Ashim

On Wed, Oct 12, 2022 at 5:23 PM Ebert,Timothy Aaron <tebert using ufl.edu> wrote:
>
> Use absolute value
>
> Tim
>
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Ashim Kapoor
> Sent: Wednesday, October 12, 2022 7:48 AM
> To: R Help <r-help using r-project.org>
> Subject: [R] prcomp - arbitrary direction of the returned principal components
>
> Dear R experts,
>
> From ?prcomp,
>
> ---- snip -----
> Note:
>
>      The signs of the columns of the rotation matrix are arbitrary, and
>      so may differ between different programs for PCA, and even between
>      different builds of R.
> ---- snip ------
>
> My problem is that I am building an index based on Principal Components Analysis.
> When the index is high it should indicate stress in the market. Due to the arbitrary sign, sometimes I get an index which is HIGH when there is stress and sometimes I get the OPPOSITE - an index which is LOW when there is stress.
> This program is shared with other people who may have a different build of R.
>
> I can forcefully use a NEGATIVE sign to FLIP the index when it is LOW.
> That works.
>
> Now my query is: just as we call set.seed(1234) to fix the sequence of generated random numbers and make it REPRODUCIBLE, can I do something like:
>
> set.direction.for.vector.in.pca(1234)
>
> Now each time I run prcomp it should choose the SAME (high or low) direction of the principal component on ANY computer having ANY version of R installed.
>
> That's what I want. I don't want the returned principal component to be HIGH (LOW) on my computer and LOW (HIGH) on someone else's computer.
> That would confuse the people the code is shared with.
>
> Is this possible ? How do people deal with this ?
>
> Many thanks,
> Ashim
>