[R] Manual scaling a scatter plot

Uwe Ligges ligges at statistik.uni-dortmund.de
Fri May 31 23:12:16 CEST 2002

osiander at 24on.cc wrote:
> Hello R-help,
> I have a data.frame in which I want to scatterplot variable1 versus variable2, using a third variable as a factor. For each factor level I want a regression line in the same plot, with a different color and symbol. Var1 and Var2 contain a lot of NAs: sometimes in the dependent variable, sometimes in the predictor, sometimes in both.

So you cannot plot any of those observations containing NAs (see below).

> Three problems occur:
> Making subsets to split the data by level is tedious, and then requires points() and abline(lm()) with different line and point types defined manually for each plot.
> Is there a simpler solution?

It depends on what "manually" means. One straightforward solution is
to write the code along the lines of the following example:

 DF <- data.frame(Var1 = c(runif(11), NA), Var2 = c(NA, runif(10), NA),
    Var3 = factor(rep(1:3, 4)))

 # the plotting symbol is the level label; points with an NA are skipped
 plot(DF$Var1, DF$Var2, pch = as.character(DF$Var3))
 # one regression line per level, coloured by level index
 for(i in 1:nlevels(DF$Var3))
    abline(lm(Var2 ~ Var1, data = DF,
        subset = (Var3 == levels(Var3)[i])), col = i)

Luckily, the default na.action is na.omit, so lm() omits observations
containing NAs.
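
To check this on your own data, something like the following sketch may
help. It reuses made-up data of the same shape as the example above; the
names fit, n_used and n_complete are only illustrative.

```r
set.seed(1)
DF <- data.frame(Var1 = c(runif(11), NA), Var2 = c(NA, runif(10), NA),
                 Var3 = factor(rep(1:3, 4)))

# the session-wide setting; "na.omit" unless it has been changed
getOption("na.action")

fit <- lm(Var2 ~ Var1, data = DF)

# rows actually used in the fit versus rows complete in both variables
n_used     <- nobs(fit)
n_complete <- sum(complete.cases(DF[, c("Var1", "Var2")]))
```

If the two counts agree, lm() has dropped exactly the rows with an NA in
Var1 or Var2 and nothing else.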

> Since the data need different scaling, some data points lie outside the window, depending on which level was plotted first. Is there a way to do the scaling manually?

Why? In R you can specify such things in vectorized form (for example,
one pch or col value per point). For manual scaling, use the arguments
"xlim" and "ylim".
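
A minimal sketch of both ideas, assuming made-up data like in the
example above: the limits are computed once over all levels, and pch and
col are given as vectors with one entry per point. The 5% margin is an
arbitrary choice for illustration.

```r
set.seed(1)
DF <- data.frame(Var1 = c(runif(11), NA), Var2 = c(NA, runif(10), NA),
                 Var3 = factor(rep(1:3, 4)))

# common limits over all levels, ignoring NAs
xr <- range(DF$Var1, na.rm = TRUE)
yr <- range(DF$Var2, na.rm = TRUE)

# widen each range by 5% on both sides (arbitrary margin)
xr <- xr + c(-1, 1) * 0.05 * diff(xr)
yr <- yr + c(-1, 1) * 0.05 * diff(yr)

# one symbol and one colour per point, taken from the level index
plot(DF$Var1, DF$Var2, xlim = xr, ylim = yr,
     pch = as.integer(DF$Var3), col = as.integer(DF$Var3))
```

Because the limits are fixed before anything is drawn, it no longer
matters which level is plotted first.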

> I start with a plot of the variables with the bigger range using the plot(type="n") function. Then I do several plot(add=T).

Perhaps there is a better way with points() or just the way suggested
above ...
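
For the points() route, a sketch under the same assumptions (made-up
data, illustrative names lev and sub) could look like this: set up an
empty frame from the full data, then add points and a line per level.

```r
set.seed(1)
DF <- data.frame(Var1 = c(runif(11), NA), Var2 = c(NA, runif(10), NA),
                 Var3 = factor(rep(1:3, 4)))

# empty frame; the axes are scaled from all the data at once
plot(DF$Var1, DF$Var2, type = "n", xlab = "Var1", ylab = "Var2")

lev <- levels(DF$Var3)
for (i in seq_along(lev)) {
    sub <- DF[DF$Var3 == lev[i], ]
    # points with an NA coordinate are silently skipped
    points(sub$Var1, sub$Var2, pch = i, col = i)
    # lm() drops incomplete rows under the default na.action
    abline(lm(Var2 ~ Var1, data = sub), col = i, lty = i)
}
legend("topright", legend = lev, pch = seq_along(lev),
       col = seq_along(lev), lty = seq_along(lev))
```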

> If the biggest positive value in the predictor (x-axis) coincides with an NA on the y-axis, the auto-scaling still uses that biggest value. The final plot then has an empty right part, which looks unprofessional. Is manual scaling an option here? I cannot easily work with na.omit etc., since there are other variables with valid data that I do not want to lose.

I don't get the point here. Now you are going to predict()? How are you
going to plot data with NAs in one of the plotted variables?
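
If the concern is only that the axis range includes x values whose y is
NA, one possible reading, sketched here as a guess, is to compute the
limits from the pairs that can actually be plotted and leave the
data.frame itself untouched. Var4 is a hypothetical extra variable
standing in for the other columns mentioned.

```r
set.seed(1)
DF <- data.frame(Var1 = c(runif(11), NA), Var2 = c(NA, runif(10), NA),
                 Var3 = factor(rep(1:3, 4)),
                 Var4 = rnorm(12))   # hypothetical extra variable

# rows where both plotted variables are present; other columns are ignored
ok <- complete.cases(DF[, c("Var1", "Var2")])

# limits from the plottable pairs only; DF keeps all its rows
plot(DF$Var1, DF$Var2,
     xlim = range(DF$Var1[ok]), ylim = range(DF$Var2[ok]),
     pch = as.character(DF$Var3))
```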

Uwe Ligges