[R] NAs introduced by coercion warning?

Sundar Dorai-Raj sundar.dorai-raj at pdf.com
Thu Feb 19 18:02:07 CET 2004


Jonathan,
   It's still hard to tell. Try this:

options(warn = 1) # see ?options for explanation

## RUN YOUR CODE
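
With warn = 1 each warning is printed as soon as it is raised, so in a loop it shows up right after the progress output of the iteration that caused it. Below is a minimal sketch of that idea on made-up data. The `chunks` list and the `bad` vector are hypothetical names for illustration only, not part of your code, and the handler is just one possible way to record the offending iterations.

```r
## Hypothetical data: three chunks, one of which holds a non-numeric value
chunks <- list(c("1.5", "2.0"), c("3.1", "oops"), c("4.2", "5.0"))

old <- options(warn = 1)   # print each warning as soon as it is raised

bad <- integer(0)          # indices of chunks that raised a warning
for (j in seq_along(chunks)) {
    cat("chunk", j, "\n")  # any warning appears right after this line
    withCallingHandlers(
        as.numeric(chunks[[j]]),
        ## record the iteration; the warning itself is still printed
        warning = function(w) bad <<- c(bad, j)
    )
}

options(old)               # restore the previous warning setting
bad                        # 2 for this made-up data
```

If the location is still unclear, options(warn = 2) turns warnings into errors, so the loop stops at the first one and traceback() can show where it came from.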

Regards,
Sundar


Jonathan Greenberg wrote:

> It's hard for me to pinpoint where this is happening, since I'm working on an
> image that's about 10000 x 20000 pixels, and 12 bands deep and I'm using a
> set of for-next loops to pull out subsections of data.  I can guarantee the
> input values are all floating point values.
> 
> To be more specific, I have created a classification tree, and I want to
> apply it to that large floating point image (all the band names match up)
> and write the prediction (probability) values to a file.  What happens if a
> decision tree tries to classify a set of input values that are completely
> outside of the range of the input tree?
> 
> Here's the code I was using.  I should mention that this worked on a small
> subset (400 x 400 pixels) that wouldn't have any "weird" values (negative or
> zero).  The output file from this is turning out to be slightly smaller than
> it should be, given the samples, lines, bands and number type, which is why I'm
> wondering if the tree is simply dropping those "bad" values rather than
> giving them some value (e.g. 0):
> 
> ## Creating the tree
> library(tree)
> bands=12
> bandnames<-paste(c("B"),1:bands,sep="")
> treetraindata=read.csv("classtrainshad040205.csv",header=TRUE)
> names(treetraindata)[2:6]<-bandnames[1:5]
> names(treetraindata)[8:14]<-bandnames[6:12]
> treetraindata$Class_Name<-as.factor(treetraindata$Class_Name)
> 
> ## Create an overfit tree
> treetrain<-tree(Class_Name ~ B1 + B2 + B3 +
> B4+B5+B6+B7+B8+B9+B10+B11+B12,treetraindata,mincut=1,minsize=2,mindev=0)
> 
> ## Extracts a slice of data out of an ENVI BSQ file
> envigetslice<-function(fileconnection,samples,lines,bands,interleave,datatype,maxpixels) {
>     currentloc=seek(fileconnection,where=NA,origin="current")
>     ## If data is integer
>     if(datatype==3) {
>         numbersize=2
>         if ((samples*lines)-(currentloc/numbersize) < maxpixels)
>             maxpixels=(samples*lines)-(currentloc/numbersize)
>         envislice <- readBin(fileconnection,integer(),maxpixels,size=numbersize)
>         newloc=seek(fileconnection,where=NA,origin="current")
>         if (bands > 1) {
>             for (i in 1:(bands-1)) {
>                 seek(fileconnection,where=currentloc+(samples*lines*numbersize*i),origin="start")
>                 currentslice <- readBin(fileconnection,integer(),maxpixels,size=numbersize)
>                 envislice=data.frame(envislice,currentslice)
>             }
>         }
>     }
>     ## If data is floating point
>     if(datatype==4) {
>         numbersize=4
>         if ((samples*lines)-(currentloc/numbersize) < maxpixels)
>             maxpixels=(samples*lines)-(currentloc/numbersize)
>         envislice <- readBin(fileconnection,double(),maxpixels,size=numbersize)
>         newloc=seek(fileconnection,where=NA,origin="current")
>         if (bands > 1) {
>             for (i in 1:(bands-1)) {
>                 seek(fileconnection,where=currentloc+(samples*lines*numbersize*i),origin="start")
>                 currentslice <- readBin(fileconnection,double(),maxpixels,size=numbersize)
>                 envislice=data.frame(envislice,currentslice)
>             }
>         }
>     }
>     seek(fileconnection,where=newloc,origin="start")
>     envislice
> }
> 
> ## Read ENVI files in subsets
> ## interleave: 1=bsq
> ## datatype: (follows ENVI format):
> ##    3: long integer
> ##    4:floating point
> 
> 
> ## Apply the classifier
> imageclasstree<-function(infile,outfile,dectree,samples,lines,bands,interleave,datatype,maxpixels) {
> 
> fileconnection<-file(infile,open="rb")
> outfileconnection=file(outfile,open="wb")
> 
> numpixels = samples * lines
> ## Index of the last slice; the loop below counts from 0
> numslices=ceiling(numpixels/maxpixels)-1
> 
> bandnames<-paste(c("B"),1:bands,sep="")
> 
> ## Loop for processing images
> for(j in 0:numslices) {
>     print((j/numslices)*100)
>     envislice<-envigetslice(fileconnection,samples,lines,bands,interleave,datatype,maxpixels)
>     names(envislice)<-bandnames
>     predictslice<-predict(dectree,envislice,type=c("vector"))
>     predictslice<-as.integer(round(as.vector(t(predictslice*10000)),digits=0))
>     predictslice
>     writeBin(predictslice,outfileconnection,size=2)
> }
> close(fileconnection)
> close(outfileconnection)
> }
> 
> imageclasstree("flt4aall","flt4adt", treetrain,11216,18173,12,1,4,25000)
> 
> On 2/18/04 2:25 PM, "Sundar Dorai-Raj" <sundar.dorai-raj at PDF.COM> wrote:
> 
> 
>>
>>Jonathan Greenberg wrote:
>>
>>
>>>I'm running a decision tree on a large dataset, and I'm getting multiple
>>>instances of "NAs introduced by coercion" (> 50).  What does this mean?
>>>
>>>--j
>>>
>>
>>My guess would be you're trying to convert from character to numeric and
>>are unable to do so. As in,
>>
>>
>> > as.numeric("A")
>> [1] NA
>> Warning message:
>> NAs introduced by coercion
>>
>> > as.numeric("1")
>> [1] 1
>>
>>But without more information from you it's impossible to tell.
>>
>>See the posting guide at
>>
>>http://www.R-project.org/posting-guide.html
>>
>>Regards,
>>Sundar
>>
> 
> 
>



