[R] problem with "unique" function

Fri Jul 28 19:09:19 CEST 2017

Thank you Peter. Rounding fixed the problem.

2017-07-28 12:15 GMT-04:00 peter dalgaard <pdalgd at gmail.com>:

> Most likely, previous computations have ended up giving slightly different
> values of say 0.13333. A pragmatic way out is to round to, say, 5 digits
> before applying unique. In this particular case, it seems like all numbers
> are multiples of 1/30, so another idea could be to multiply by 30, round,
> and divide by 30.
>
> -pd
>
> > On 28 Jul 2017, at 17:17 , li li <hannah.hlx at gmail.com> wrote:
> >
> > I have the joint distribution of three discrete random variables z1, z2
> and
> > z3 which is captured by "z"
> > and "prob" as described below.
> >
> > For example, the probability for z1=0.46667, z2=-1 and z3=-1 is
> 2.752e-13.
> > Also, the probability adds up to 1.
> >
> >> head(z)           z1      z2      z3
> > [1,] -0.46667 -1.0000 -1.0000
> > [2,] -0.33333 -0.9333 -0.9333
> > [3,] -0.20000 -0.8667 -0.8667
> > [4,] -0.06667 -0.8000 -0.8000
> > [5,]  0.06667 -0.7333 -0.7333
> > [6,]  0.20000 -0.6667 -0.6667> prob[1:5][1] 2.752e-13 3.210e-12
> > 1.348e-11 2.656e-11 2.656e-11> sum(prob)[1] 1
> >
> >
> > I want to put the distribution into a joint probability table. I use the
> > following code. But the probability no longer adds up to 1.
> >
> >> z1 <- sort(unique(z[,1])); z2 <- sort(unique(z[,2]));  z3 <-
> sort(unique(z[,3]))> P <- array(0, dim=c(length(z1),length(z2),length(z3)),
> dimnames=list(A=z1, H=z2, M=z3))> > for (i in 1:(dim(z)[1])){+     ind <-
> z[i,]+     P[dimnames(P)$A==ind[1], dimnames(P)$H==ind[2],
> dimnames(P)$M==ind[3]] <- prob[i]     + }> sum(P)[1] 37.5
> >
> >
> > The problem is when we look z1 as below, there are lot of repeated
> values.
> >
> >> unique(z[,1]) [1] -0.46667 -0.33333 -0.20000 -0.06667  0.06667
> 0.20000  0.33333  0.46667 -0.53333 -0.40000
> > [11] -0.26667 -0.13333  0.00000  0.13333  0.26667  0.40000  0.53333
> > -0.60000 -0.06667  0.06667
> > [21]  0.60000 -0.66667 -0.13333  0.13333  0.66667 -0.73333 -0.60000
> > -0.20000  0.20000  0.60000
> > [31]  0.73333 -0.80000 -0.53333 -0.26667  0.26667  0.53333  0.80000
> > -0.86667 -0.46667 -0.33333
> > [41]  0.33333  0.46667  0.86667 -0.93333 -0.40000  0.40000  0.93333
> > -1.00000 -0.46667 -0.33333
> > [51]  0.33333  0.46667  1.00000 -0.53333 -0.26667  0.26667  0.53333
> > -0.20000  0.20000 -0.66667
> > [61] -0.13333  0.13333  0.66667 -0.73333 -0.06667  0.06667  0.73333
> > -0.20000  0.20000 -0.13333
> > [71]  0.13333 -0.06667  0.06667 -0.06667  0.06667
> >
> >
> >
> > Is there a way to fix this? Any idea and suggestions? Thanks very much!!
> >
> >   Hanna
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>
>
>
>
>
>
>
>
>
>

	[[alternative HTML version deleted]]