[R] barplot and missing values?

Sat Jun 4 20:49:42 CEST 2005

On Sat, 2005-06-04 at 15:53 +0100, Dan Bolser wrote:
> On Sat, 4 Jun 2005, Marc Schwartz wrote:
> 
> >On Sat, 2005-06-04 at 14:50 +0100, Dan Bolser wrote:
> >
> ><snip>
> >
> >> This must be because of the "log='y'" option that I am using here.
> >> 
> >> y <- c(2,4,6,8,NA,NA,NA,NA,18)
> >> 
> >> barplot2(y,log='y')
> >> 
> >> Above fails.
> >> 
> >> 
> >> I appreciate that what I am trying to do is somewhat artificial (handle
> >> zero values on a log scale), but it does reflect the data I have.
> >> 
> >> I tried plot(..., type='h'), but that doesn't do the "beside=T" stuff that
> >> I want to do.
> >> 
> >> I am now trying things like...
> >> 
> >> barplot2(
> >>   dat.y.plot + 0.11, # Dirty hack
> >>   offset=-0.1,       #
> >>   xpd=F,             #
> >>   log='y',
> >>   beside=T
> >> )
> >> 
> >> Which looks messy. 
> >> 
> >> Any way to cleanly handle NA values with barplot2 on a log scale
> >> (log='y')?
> >
> ><snip>
> >
> >Dan,
> >
> >You are actually close in the above example, using the 'offset'
> >argument.
> >
> >In this case, you still cannot use "NA"s, since their value is unknown,
> >so you must set these elements to zero. Then, using a small offset value,
> >you can adjust the base value of the y axis so that it is "just above"
> >zero. This should result in a minimal shift of the bar values above
> >their actual values and should not materially affect the plot's
> >representation of the data.
> >
> >Something like the following "should" work:
> >
> >  > y <- c(2, 4, 6, 8, NA, NA, NA, NA, 18)
> >  > y
> >  [1]  2  4  6  8 NA NA NA NA 18
> >   
> >  > y[is.na(y)] <- 0
> >  > y
> >  [1]  2  4  6  8  0  0  0  0 18
> >
> >
> >  barplot2(y, log = "y", offset = 0.01, las = 2)
> >
> >Note also that if you follow the above with:
> >
> >  box()
> >
> >The residual bars from the (0 + 0.01) values are covered with the plot
> >region box, if that is an issue for you.
> 
> 
> Actually it looks a bit strange (I guess you didn't check it?) 

Yes I did...

> - I see
> what is happening. It isn't much different from...

> barplot2(y+0.01, log = "y",las = 1)
> 
> Which is the essence of the fix, but all that bar (on a log scale) between
> 1 and 0.1 and 0.01 is as big as 1 to 10, which is a bit artificial.

That is, of course, the way it should be by default. The space between
0.01:0.1, 0.1:1 and 1:10 should be the same.

The issue is that you want to be able to modify the default y axis
range, given the presence of the offset value as min(y) instead of it
being 0. This results in the "distracting" (not so much artificial)
space from 0.01 to 1, given the way in which the default axis ranges are
created.
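
As a rough illustration of overriding that default range, here is a minimal
sketch in base R. The rule used for the lower limit (the power of ten at or
below the smallest positive value) is only an assumption made for this
example; it is not something barplot2() does by itself.

```r
y <- c(2, 4, 6, 8, 0, 0, 0, 0, 18)

# Keep only the values that can actually be drawn on a log scale
pos <- y[!is.na(y) & y > 0]

# Assumed rule for this sketch: start the axis at the power of ten
# at or below the smallest positive value, so the empty decades
# between the offset and the real data are not shown
lower <- 10^floor(log10(min(pos)))

ylim <- c(lower, max(pos))
ylim
```

For this 'y' the lower limit comes out as 1, so the result could be passed
as 'ylim' together with xpd = FALSE, much as in the call quoted below.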

> My previous fix looks best now I check it with the example ...
> 
> > y
> [1]  2  4  6  8  0  0  0  0 18
> 
> barplot2(
>   y + 0.11,
>   ylim=c(1,max(y)),
>   offset = -0.10,
>   log='y',
>   xpd=F
> )
> box()
> 
> Looks like the above is what I need :)
> 
> Thanks for the help - it's reassuring to see similar fixes :)

This works, but is still a bit kludgy, since it is not "automatic".

I think that the "best" option would be for me to spend some time
improving the default behavior of barplot2() with <=0 and/or NA values
in the presence of a log axis (x or y) so that it is similar to the way
in which plot() handles it:

> y
[1]  2  4  6  8  0  0  0  0 18

# NOTE THE WARNING MESSAGE HERE

> plot(y, log = "y")
Warning message:
4 y values <= 0 omitted from logarithmic plot in: xy.coords(x, y,
xlabel, ylabel, log)

> y[y == 0] <- NA
> y
[1]  2  4  6  8 NA NA NA NA 18

# NO WARNING MESSAGE HERE, BUT DOES NOT PLOT THE "NA"s

> plot(y, log = "y")
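
In the meantime, the same behavior can be approximated before calling the
plotting function. The following is only a sketch: drop_nonpositive() is a
hypothetical helper written for this message, not part of gplots or base R,
and its warning text is merely modeled on the one shown above.

```r
# Hypothetical helper (not part of gplots): set values that cannot
# be shown on a log axis to NA and warn about how many were dropped
drop_nonpositive <- function(h) {
  bad <- !is.na(h) & h <= 0
  if (any(bad)) {
    warning(sum(bad), " values <= 0 omitted from logarithmic plot")
    h[bad] <- NA
  }
  h
}

y <- c(2, 4, 6, 8, 0, 0, 0, 0, 18)

# Issues one warning; existing NAs would pass through untouched
y.log <- drop_nonpositive(y)
y.log
```

The cleaned vector still contains NAs, so for now it is suited to plot(),
which skips them as shown above, rather than to barplot2() with log = "y".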

I'll take a look at that.

Marc