[R] Scatterplot : smoothing colors according to density of points

jim holtman jholtman at gmail.com
Mon Jun 15 02:08:34 CEST 2015


Check out the 'hexbin' package for making scatter plots in which many
points overlap in a small area.
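
A minimal sketch of what that could look like. The data are simulated
stand-ins for the poster's 3616 rows (an assumption, not the real data), and
the choice of xbins = 40 is arbitrary. Y is binned on a log10 scale because
the original plot uses log="y".

```r
# Simulated data standing in for the poster's data frame (assumption).
set.seed(1)
x <- runif(3616, 0, 100)
y <- 10^rnorm(3616, mean = 1, sd = 1)  # strictly positive, spans several decades

# hexbin is a contributed package; skip the plot if it is not installed.
if (requireNamespace("hexbin", quietly = TRUE)) {
  library(hexbin)
  # Bin the points into hexagons; log10(y) mimics log="y" in a base plot.
  bin <- hexbin(x, log10(y), xbins = 40)
  # Each hexagon is shaded by the number of points that fall inside it.
  plot(bin, colramp = colorRampPalette(c("blue", "yellow", "red")),
       xlab = "X", ylab = "log10(Y)")
}
```

This colors by point count per hexagon, so it shows density rather than an
average of D$C.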


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Tue, Jun 2, 2015 at 9:51 AM, Adams, Jean <jvadams at usgs.gov> wrote:

> Try this.
>
> Jean
>
> D <- structure(list(
>   id = structure(1:6, .Label = c("O13297", "O13329", "O13525",
>     "O13539", "O13541", "O13547"), class = "factor"),
>   X = c(44.444444, 31.272085, 6.865672, 14.176245, 73.275862,
>     28.991597),
>   Y = c(21.6122, 4.0159, 2.43884, 7.81217, 3.59012, 258.999)),
>   .Names = c("id", "X", "Y"), class = "data.frame",
>   row.names = c("1", "2", "3", "4", "5", "6"))
>
> # define the number of colors
> ncol <- 100
> # define the radius of the neighborhood
> distcut <- 30
> pal <- colorRampPalette(c("blue", "yellow", "red"))(ncol)
>
> # calculate the Euclidean distance between all pairs of points, based on
> # X, Y coordinates
> Ddist <- with(D, as.matrix(dist(cbind(X, Y), diag=TRUE, upper=TRUE)))
> # count up the number of neighbors within distcut distance of each point
> D$C <- apply(Ddist<distcut, 2, sum)
> # use this count to define the levels (which will then be used to color
> # the points in the plot)
> D$Clevels <- with(D,
>   cut(C, breaks=seq(min(C), max(C), length.out=ncol+1),
>     labels=FALSE, include.lowest=TRUE))
>
> # plot the data
> with(D, plot(X, Y, col=pal[Clevels], log="y", pch=16))
>
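
Note that the code above colors each point by its number of neighbors,
whereas the original question asks for the average of D$C within a fixed
neighborhood. A hedged sketch of that variant follows. It reuses the same
distance matrix idea and the six rows shown in the original post; the column
name Csmooth is made up for illustration, and distances are taken on the raw
X/Y scale as in the code above, not on the log scale used for plotting.

```r
# Toy data: the six rows printed in the original post, including C.
D <- data.frame(
  X = c(44.444444, 31.272085, 6.865672, 14.176245, 73.275862, 28.991597),
  Y = c(21.6122, 4.0159, 2.43884, 7.81217, 3.59012, 258.999),
  C = c(-0.136651639, -0.117016949, -0.161173913, -0.075756757,
        -0.006988235, -0.013985507))

distcut <- 30  # neighborhood radius, in the units of the X/Y coordinates

# Pairwise Euclidean distances between all points.
Ddist <- as.matrix(dist(D[, c("X", "Y")]))
# 1 where point j lies within distcut of point i (each point includes itself).
near <- (Ddist < distcut) * 1
# Neighborhood mean of C: sum of the neighbors' C divided by the neighbor count.
D$Csmooth <- as.vector(near %*% D$C) / rowSums(near)

# Map the smoothed values onto a 100-color palette and plot.
pal <- colorRampPalette(c("blue", "yellow", "red"))(100)
lev <- cut(D$Csmooth,
           breaks = seq(min(D$Csmooth), max(D$Csmooth), length.out = 101),
           labels = FALSE, include.lowest = TRUE)
plot(D$X, D$Y, col = pal[lev], log = "y", pch = 16)
```

With 3616 points the full distance matrix is still manageable, but it grows
with the square of the number of rows.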
>
>
> On Tue, Jun 2, 2015 at 5:37 AM, Benjamin Dubreuil <
> benjamin.dubreuil at weizmann.ac.il> wrote:
>
> > Hello everyone,
> >
> > I have a data frame D with 4 columns id,X,Y,C.
> > I want to plot a simple scatter plot of D$X vs. D$Y and using D$C values
> > as a color. (id is just a text string not used for the plot)
> >
> > But actually, I don't want to use the raw values of D$C, I would prefer to
> > calculate the average values of D$C according to the density of points in a
> > fixed neighborhood.
> > In other words, I would like to smooth the colors according to the density
> > of points.
> >
> > I am looking for any function or package that could solve this.
> > So far, I've been looking at library MASS and the function kde2d, which can
> > calculate the density of points in 2 directions, but I don't see how I
> > could then use this information to recalculate my D$C values.
> >
> > Here is a piece of the matrix :
> >  > head(D)
> >       id         X         Y            C
> > 1 O13297 44.444444  21.61220 -0.136651639
> > 2 O13329 31.272085   4.01590 -0.117016949
> > 3 O13525  6.865672   2.43884 -0.161173913
> > 4 O13539 14.176245   7.81217 -0.075756757
> > 5 O13541 73.275862   3.59012 -0.006988235
> > 6 O13547 28.991597 258.99900 -0.013985507
> >
> > > dim(D)
> > [1] 3616    4
> >
> > > apply(D[,-1],2,range)
> >                X          Y          C
> > [1,]   0.3378378     0.0003 -0.7382222
> > [2,] 100.0000000 24556.4000  0.5582500
> > (Y is not linear, so I use log='y' in the plot function)
> >
> > I used a palette of 100 colors ranging from Blue to Yellow to red.
> > >pal =  colorRampPalette(c("blue","yellow","red"))(100)
> >
> > To make D$C values correspond to a color, I used a cut with the following
> > breaks (101 breaks from -1.2 to 1.2):
> > > BREAKS
> >   [1] -1.2000 -0.8000 -0.4000 -0.3600 -0.3200 -0.2800 -0.2400 -0.2000 -0.1925
> >  [10] -0.1850 -0.1775 -0.1700 -0.1625 -0.1550 -0.1475 -0.1400 -0.1368 -0.1336
> >  [19] -0.1304 -0.1272 -0.1240 -0.1208 -0.1176 -0.1144 -0.1112 -0.1080 -0.1048
> >  [28] -0.1016 -0.0984 -0.0952 -0.0920 -0.0888 -0.0856 -0.0824 -0.0792 -0.0760
> >  [37] -0.0728 -0.0696 -0.0664 -0.0632 -0.0600 -0.0568 -0.0536 -0.0504 -0.0472
> >  [46] -0.0440 -0.0408 -0.0376 -0.0344 -0.0312 -0.0280 -0.0248 -0.0216 -0.0184
> >  [55] -0.0152 -0.0120 -0.0088 -0.0056 -0.0024  0.0008  0.0040  0.0072  0.0104
> >  [64]  0.0136  0.0168  0.0200  0.0232  0.0264  0.0296  0.0328  0.0360  0.0392
> >  [73]  0.0424  0.0456  0.0488  0.0520  0.0552  0.0584  0.0616  0.0648  0.0680
> >  [82]  0.0712  0.0744  0.0776  0.0808  0.0840  0.0872  0.0904  0.0936  0.0968
> >  [91]  0.1000  0.1250  0.1500  0.1750  0.2000  0.2250  0.2500  0.4875  0.7250
> > [100]  0.9625  1.2000
> > > C.levels = as.numeric(cut(D$C,breaks=BREAKS))
> > >length(C.levels)
> > [1] 3616
> >
> > C.levels ranges from 2 to 98 and then to plot the colors I used
> > pal[C.levels].
> > > plot( x=D$X, y=D$Y, col=pal[ C.levels ],log='y')
> >
> >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >