[Rd] [.raster bug {was "str() on raster objects fails .."}

Paul Murrell p.murrell at auckland.ac.nz
Tue Feb 8 21:59:29 CET 2011


Hi

On 7/02/2011 8:36 p.m., Martin Maechler wrote:
>>>>>> Simon Urbanek<simon.urbanek at r-project.org>
>>>>>>      on Sun, 6 Feb 2011 20:53:01 -0500 writes:
>
>      >  On Feb 6, 2011, at 8:10 PM, Paul Murrell wrote:
>
>      >>  Hi
>      >>
>      >>  On 3/02/2011 1:23 p.m., Simon Urbanek wrote:
>      >>>
>      >>>  On Feb 2, 2011, at 7:00 PM, Paul Murrell wrote:
>      >>>
>      >>>>  Hi
>      >>>>
>      >>>>  Martin Maechler wrote:
>      >>>>>  On Wed, Feb 2, 2011 at 23:30, Simon
>      >>>>>  Urbanek<simon.urbanek at r-project.org>  wrote:
>      >>>>>>  On Feb 1, 2011, at 8:16 PM, Paul Murrell wrote:
>      >>>>>>
>      >>>>>>>  Hi
>      >>>>>>>
>      >>>>>>>  On 2/02/2011 2:03 p.m., Henrik Bengtsson wrote:
>      >>>>>>>>  On Tue, Feb 1, 2011 at 4:46 PM, Paul
>      >>>>>>>>  Murrell<p.murrell at auckland.ac.nz>  wrote:
>      >>>>>>>>>  Hi
>      >>>>>>>>>
>      >>>>>>>>>  On 1/02/2011 9:22 p.m., Martin Maechler wrote:
>      >>>>>>>>>>>>>>>  Henrik Bengtsson<hb at biostat.ucsf.edu>  on
>      >>>>>>>>>>>>>>>  Mon, 31 Jan 2011 11:16:59 -0800 writes:
>      >>>>>>>>>>>  Hi, str() on raster objects fails for certain
>      >>>>>>>>>>>  dimensions.  For example:
>      >>>>>>>>>>
>      >>>>>>>>>>>>  str(as.raster(0, nrow=1, ncol=100))
>      >>>>>>>>>>>  'raster' chr [1, 1:100] "#000000" "#000000" "#000000" "#000000" ...
>      >>>>>>>>>>
>      >>>>>>>>>>>>  str(as.raster(0, nrow=1, ncol=101))
>      >>>>>>>>>>>  Error in `[.raster`(object, seq_len(max.len)) : subscript out of bounds
>      >>>>>>>>>>
>      >>>>>>>>>>>  This seems to do with how str() and "[.raster"()
>      >>>>>>>>>>>  is coded; when subsetting as a vector, which
>      >>>>>>>>>>>  str() relies on, "[.raster"() still returns a
>      >>>>>>>>>>>  matrix-like object, e.g.
>      >>>>>>>>>>
>      >>>>>>>>>>>>  img<- as.raster(1:25, max=25, nrow=5, ncol=5);
>      >>>>>>>>>>>>  img[1:2]
>      >>>>>>>>>>>       [,1]      [,2]      [,3]      [,4]      [,5]
>      >>>>>>>>>>>  [1,] "#0A0A0A" "#3D3D3D" "#707070" "#A3A3A3" "#D6D6D6"
>      >>>>>>>>>>>  [2,] "#141414" "#474747" "#7A7A7A" "#ADADAD" "#E0E0E0"
>      >>>>>>>>>>
>      >>>>>>>>>>>  compare with:
>      >>>>>>>>>>
>      >>>>>>>>>>>>  as.matrix(img)[1:2]
>      >>>>>>>>>>>  [1] "#0A0A0A" "#3D3D3D"
>      >>>>>>>>>>
>      >>>>>>>>>>
>      >>>>>>>>>>>  The easy but incomplete fix is to do:
>      >>>>>>>>>>
>      >>>>>>>>>>>  str.raster<- function(object, ...) {
>      >>>>>>>>>>>  str(as.matrix(object), ...); }
>      >>>>>>>>>>
>      >>>>>>>>>>>  Other suggestions?
>      >>>>>>>>>>
>      >>>>>>>>>>  The informal "raster" class is behaving
>      >>>>>>>>>>  ``illogical'' in the following sense:
>      >>>>>>>>>>
>      >>>>>>>>>>>  r<- as.raster(0, nrow=1, ncol=11)
>      >>>>>>>>>>>  r[seq_along(r)]
>      >>>>>>>>>>  Error in `[.raster`(r, seq_along(r)) : subscript
>      >>>>>>>>>>  out of bounds
>      >>>>>>>>>>
>      >>>>>>>>>>  or, here equivalently,
>      >>>>>>>>>>>  r[1:length(r)]
>      >>>>>>>>>>  Error in `[.raster`(r, 1:length(r)) : subscript
>      >>>>>>>>>>  out of bounds
>      >>>>>>>>>>
>      >>>>>>>>>>  When classes do behave in such a way, they
>      >>>>>>>>>>  definitely need their own str() method.
>      >>>>>>>>>>
>      >>>>>>>>>>  However, the bug really is in "[.raster":
>      >>>>>>>>>>  Currently, r[i] is equivalent to r[i,] which is
>      >>>>>>>>>>  not at all matrix-like and its help clearly says
>      >>>>>>>>>>  that subsetting should work as for matrices. A
>      >>>>>>>>>>  recent thread on R-help/R-devel has mentioned the
>      >>>>>>>>>>  fact that "[" methods for matrix-like classes
>      >>>>>>>>>>  need to use both nargs() and missing() and that
>      >>>>>>>>>>  "[.data.frame" has been the example to follow
>      >>>>>>>>>>  "forever", IIRC already in S and S-plus as of 20
>      >>>>>>>>>>  years ago.
>      >>>>>>>>>  The main motivation for non-standard behaviour
>      >>>>>>>>>  here is to make sure that a subset of a raster
>      >>>>>>>>>  object NEVER produces a vector (because the
>      >>>>>>>>>  conversion back to a raster object then produces a
>      >>>>>>>>>  single-column raster and that may be a
>      >>>>>>>>>  "surprise").  Thanks for making the code more
>      >>>>>>>>>  standard and robust.
>      >>>>>>>>>
>      >>>>>>>>>  The r[i] case is still tricky.  The following
>      >>>>>>>>>  behaviour is quite convenient ...
>      >>>>>>>>>
>      >>>>>>>>>  r[r == "black"]<- "white"
>      >>>>>>>>>
>      >>>>>>>>>  ... but the next behaviour is quite jarring (at
>      >>>>>>>>>  least in terms of the raster image that results
>      >>>>>>>>>  from it) ...
>      >>>>>>>>>
>      >>>>>>>>>  r2<- r[1:(nrow(r) + 1)]
>      >>>>>>>>>
>      >>>>>>>>>  So I think there is some justification for further
>      >>>>>>>>>  non-standardness to try to ensure that the subset
>      >>>>>>>>>  of a raster image always produces a sensible
>      >>>>>>>>>  image.  A simple solution would be just to outlaw
>      >>>>>>>>>  r[i] for raster objects and force the user to
>      >>>>>>>>>  write r[i, ] or r[, j], depending on what they
>      >>>>>>>>>  want.
>      >>>>>>>>  FYI, I've tried out Martin's updated version and it
>      >>>>>>>>  seems like a one-column raster matrix is now
>      >>>>>>>>  returned for r[i], e.g.
>      >>>>>>>  Yes, that's what I've been looking at ...
>      >>>>>>>
>      >>>>>>>>>  r<- as.raster(1:8, max=8, nrow=2, ncol=4); r
>      >>>>>>>>       [,1]      [,2]      [,3]      [,4]
>      >>>>>>>>  [1,] "#202020" "#606060" "#9F9F9F" "#DFDFDF"
>      >>>>>>>>  [2,] "#404040" "#808080" "#BFBFBF" "#FFFFFF"
>      >>>>>>>>
>      >>>>>>>>>  r[1:length(r)]
>      >>>>>>>>       [,1]
>      >>>>>>>>  [1,] "#202020"
>      >>>>>>>>  [2,] "#404040"
>      >>>>>>>>  [3,] "#606060"
>      >>>>>>>>  [4,] "#808080"
>      >>>>>>>>  [5,] "#9F9F9F"
>      >>>>>>>>  [6,] "#BFBFBF"
>      >>>>>>>>  [7,] "#DFDFDF"
>      >>>>>>>>  [8,] "#FFFFFF"
>      >>>>>>>  ... and the above is exactly the sort of thing that
>      >>>>>>>  will fry your mind if the image that you are
>      >>>>>>>  subsetting is, for example, a photo.
>      >>>>>>>
>      >>>>>>  Why doesn't raster behave consistently like any matrix
>      >>>>>>  object?
>      >>>>>>  I would expect simply
>      >>>>>>
>      >>>>>>>  r[1:length(r)]
>      >>>>>>  [1] "#202020" "#404040" "#606060" "#808080" "#9F9F9F" "#BFBFBF" "#DFDFDF"
>      >>>>>>  [8] "#FFFFFF"
>      >>>>>>
>      >>>>>>  Where it's obvious what happened. I saw the comment about the
>      >>>>>>  vector but I'm not sure I get it - why don't you want a vector?
>      >>>>>>  The raster is no different than matrices - you still need to
>      >>>>>>  define the dimensions when going back anyway, moreover what you
>      >>>>>>  get now is not consistent at all since the raster never had
>      >>>>>>  that dimension anyway ...
>      >>>>>>
>      >>>>>>  Cheers, Simon
>      >>>>>  I agree that this would be the most "logical" and
>      >>>>>  notably least surprising behavior, which I find the
>      >>>>>  most important argument (I'm sorry my last message was
>      >>>>>  cut off as it was sent accidentally before being
>      >>>>>  finished completely).
>      >>>>
>      >>>>  I think this behaviour might surprise some ...
>      >>>>
>      >>>>  download.file("http://cran.r-project.org/Rlogo.jpg",
>      >>>>                "Rlogo.jpg")
>      >>>>  library(ReadImages)
>      >>>>  logo<- read.jpeg("Rlogo.jpg")
>      >>>>
>      >>>>  rLogo<- as.raster(logo)
>      >>>>  rLogoBit<- rLogo[50:60, 50:60]
>      >>>>
>      >>>>  library(grid)
>      >>>>  # Original image
>      >>>>  grid.raster(rLogoBit)
>      >>>>  grid.newpage()
>      >>>>  # Subset produces a vector
>      >>>>  grid.raster(rLogoBit[1:length(rLogoBit)])
>      >>>>
>      >>>
>      >>>  But this should fail IMHO since you're supplying a vector but
>      >>>  grid.raster (assuming it's the same as rasterImage)
>      >>>  requires a matrix - exactly as you would expect in the
>      >>>  matrix case - if a function requires a matrix and you
>      >>>  pass a vector, it will bark. I think you are explaining
>      >>>  why going to vector *is* desirable ;). In the current
>      >>>  case it simply generates the wrong dimensions instead of
>      >>>  resulting in a vector, right?
>      >>
>      >>  The raster subsetting always produces a raster, but
>      >>  grid.raster() works with vectors anyway because
>      >>  as.raster() has a vector method.
>      >>
>
>     >  Well, isn't that the actual problem? ;) It could make sense but it
>     >  should fail if dimensions are not specified for exactly the reason you
>     >  mentioned - it is fatal if what you have is really an image ...
>
>      >  Cheers, Simon
>
>
>      >>  Anyway, I'm happy to go with things as they now are.  I
>      >>  think at worst it will encourage people to specify two
>      >>  indices when subsetting a raster object, and that's not a
>      >>  bad thing.
>      >>
>      >>  Paul
>
> I (and maybe others) are getting a bit lost ...
>
> AFAIK:
>
> - Simon proposes that     r[i]  should return a simple character vector
>    such that raster images behave more naturally like matrices.
>
> - Paul  seems happy with  r[i]  returning  a (k x 1) raster object
>    -- where  k  is almost completely unrelated to the original
>    dim(r) -- with the argument that raster subsetting must always
>    return a "raster".

Actually, I'd prefer it to return something more sensible, or just fail 
(I don't see why raster images should behave in all ways like matrices) ...

> My vote would be for Simon's proposal, hence raster subsetting
> should return a raster only when  [i,j] or [i,] or [,j]  syntax
> is used.

... but I can also live with (Martin's interpretation of) Simon's proposal.
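
For anyone trying to follow along, here is a rough sketch of what that proposal might look like in code. It uses a hypothetical toy class, "toyimg" (a classed character matrix), and is NOT the actual "[.raster" code in grDevices; the names toyimg(), img, v and s are made up for illustration. It relies on the nargs()/missing() idiom mentioned earlier in the thread to tell x[i] apart from x[i, ].

```r
# Hypothetical toy class "toyimg": a character matrix of colours with a class
# attribute. This only mimics the proposal; it is not grDevices' "raster".
toyimg <- function(x, nrow, ncol) {
    structure(matrix(x, nrow = nrow, ncol = ncol), class = "toyimg")
}

`[.toyimg` <- function(x, i, j, ...) {
    m <- unclass(x)
    # nargs() counts x plus the index slots, including empty ones, so
    # x[i] gives fewer than 3 while x[i, ], x[, j] and x[i, j] give 3.
    if (nargs() < 3L) {
        # Vector-style subsetting: behave like a matrix and return a
        # plain character vector (no dimensions, no class).
        if (missing(i)) {
            return(as.vector(m))
        }
        return(m[i])
    }
    # Matrix-style subsetting: never drop dimensions, keep the class.
    if (missing(i) && missing(j)) {
        res <- m
    } else if (missing(i)) {
        res <- m[, j, drop = FALSE]
    } else if (missing(j)) {
        res <- m[i, , drop = FALSE]
    } else {
        res <- m[i, j, drop = FALSE]
    }
    structure(res, class = "toyimg")
}

img <- toyimg(c("#202020", "#404040", "#606060", "#808080"),
              nrow = 2, ncol = 2)
v <- img[1:length(img)]  # plain character vector
s <- img[1, ]            # still a (1 x 2) "toyimg"
```

With something like this, passing v on to a function that needs an image would require the dimensions to be given again explicitly, which is the point Simon was making.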

Paul

> Martin

-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
paul at stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/


