[R] 2 D density plot interpretation and manipulating the data

Abby Spurdle
Sat Oct 10 02:22:25 CEST 2020


> SNP$density <- get_density(SNP$mean, SNP$var)
> summary(SNP$density)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>       0     383     696     738    1170    1789

This doesn't look accurate.
The density values shouldn't all be integers.
And I wouldn't expect the smallest density to be zero, if using a
Gaussian kernel.

Are these values rounded or formatted?
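
For comparison, here is a minimal sketch of what I'd expect from a Gaussian kernel estimate. It uses simulated data (not your SNP data) and assumes that get_density is a wrapper around MASS::kde2d that looks up the estimate at each point's grid cell. I don't know that this is how your get_density is defined, so treat it as an illustration only.

```r
# Sketch with simulated data; x and y stand in for SNP$mean and SNP$var.
library (MASS)  # provides kde2d

set.seed (1)
x <- rnorm (500)
y <- rnorm (500)

# 2D kernel density estimate (Gaussian kernel) on a 50 x 50 grid
fit <- kde2d (x, y, n = 50)

# For each observation, find the grid point at or just below it
ix <- findInterval (x, fit$x)
iy <- findInterval (y, fit$y)

# Density estimate at that grid point, one value per observation
d <- fit$z [cbind (ix, iy)]

summary (d)
min (d) > 0            # smallest density is strictly positive
all (d == round (d))   # densities are not all whole numbers
```

If your values really are whole numbers with a minimum of exactly zero, then they may be counts, or scaled and rounded somewhere along the way.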

(Recombined Excerpts)
> and keep only entries with density > 400
> a=SNP[SNP$density>400,]
> Any idea how do I interpret data points that are left contained within
> the ellipses?

Reiterating, they're contour lines, but they should *not* be ellipses.

You could work out the proportion of "densities" > 400.

    d <- SNP$density
    p.remain <- length (d [d > 400]) / length (d)
    p.remain

Or a more succinct version:

    p.remain <- sum (SNP$density > 400) / nrow (SNP)

Then you can say that you've plotted the proportion <p.remain> of the
data points with the highest densities.
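
You could also go the other way: fix the proportion you want to keep, and derive the cutoff from it, rather than picking 400 by hand. A rough sketch follows, using made-up stand-in values for the densities (the names d, p.keep and cutoff are mine, not from your code):

```r
# Sketch with simulated stand-in values, not the original SNP densities.
set.seed (1)
d <- rexp (1000, rate = 1 / 700)

# Proportion of points to keep (those with the highest densities)
p.keep <- 0.5

# Cutoff such that roughly p.keep of the values lie above it
cutoff <- quantile (d, 1 - p.keep, names = FALSE)

# Check: proportion of values above the cutoff
p.remain <- mean (d > cutoff)
p.remain
```

With your data, that would be something like SNP [SNP$density > cutoff,] in place of the fixed threshold.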


