[R] data.frame with NA

PIKAL Petr petr.pikal at precheza.cz
Tue Mar 19 14:23:44 CET 2013


Hi

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Pietro
> Sent: Tuesday, March 19, 2013 12:10 PM
> To: dcarlson at tamu.edu; dcarlson at tamu.edu
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] data.frame with NA
> 
> Yes, colClasses is the solution. Thank you very much.
> However, I found a very strange thing.
> 
> If I use:
> Foglio1 <- read.xlsx2("mydb.xlsx", 1, colClasses=c("Date",
> rep("numeric",14)))
> 
> I get a numeric data frame, as you said.
> 
> I also get NaN (and not NA).
> 
> At this point I use the function:
> Foglio1 = na.locf(Foglio1,fromLast=T) and it works perfectly. All NaNs
> were replaced with the first numeric value, as expected.
> 
> And now the enigma.
> 
> After the na.locf call, Foglio1 becomes all chr again! It seems that
> na.locf converts from num to chr. Even Date is converted to chr.
> I'm reading the help for this function but I can't find any mention of
> this conversion.

I tried na.locf on a data frame (something I had never tested before), and the result stays numeric when the data frame is all numeric. However, with a non-numeric column it changes all values to character.
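
One possible explanation, offered as an assumption rather than something checked against the zoo source: code that treats the whole data frame as a matrix can keep only one type for all cells, so a single Date or character column forces everything to character. Base R's as.matrix shows the effect on made-up data (the names df_num and df_mixed are hypothetical):

```r
# Hypothetical toy data: one all-numeric frame, one with a Date column.
df_num   <- data.frame(a = c(1, NaN, 3), b = c(NaN, 5, 6))
df_mixed <- data.frame(Date = as.Date("2013-03-17") + 0:2,
                       a = c(1, NaN, 3))

# A matrix holds a single type for all of its cells.
m_num   <- as.matrix(df_num)    # stays numeric
m_mixed <- as.matrix(df_mixed)  # Date column forces character

print(is.numeric(m_num))      # TRUE
print(is.character(m_mixed))  # TRUE
```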

To prevent this behaviour, you can apply na.locf column-wise:

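for (i in seq_along(Foglio1))  # na.rm = FALSE keeps each column at its original length
  Foglio1[[i]] <- na.locf(Foglio1[[i]], fromLast = TRUE, na.rm = FALSE)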
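
A minimal sketch of the column-wise approach on made-up data, assuming the zoo package is installed. The data frame toy is hypothetical and only stands in for Foglio1. lapply replaces the explicit loop here, and na.rm = FALSE is passed so that a column with an unfillable NA is not shortened:

```r
library(zoo)

# Hypothetical stand-in for Foglio1: a Date column plus numeric columns with NaN.
toy <- data.frame(Date = as.Date("2013-03-17") + 0:3,
                  e = c(NaN, NaN, 3, 4),
                  n = c(1, NaN, NaN, 4))

# Fill each column on its own, so every column keeps its own class.
# fromLast = TRUE carries the next observed value backwards.
toy[] <- lapply(toy, na.locf, fromLast = TRUE, na.rm = FALSE)

str(toy)  # Date is still Date, e and n are still num
```

Because each column is processed as a plain vector, the Date column is never mixed with the numeric ones.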

Regards
Petr

> 
> It seems that I can't get a numeric data frame without NA or
> NaN in any way!
> OK, I admit that I'm a newbie, but I'm trying every day to gain
> confidence with R.
> 
> Could you please run the na.locf function to see whether it also
> converts everything to chr on your computer?
> 
> Thank you
> 
> 
> 
> At 21.37 18/03/2013, David L Carlson wrote:
> >It appears that you MUST use the colClasses= argument with read.xlsx2:
> >
> >Foglio1 <- read.xlsx2("mydb.xlsx", 1, colClasses=c("Date",
> >rep("numeric", 14)))
> >
> >However, e and n are converted to NaN not NA so you would need to
> >convert those columns (at least, I didn't check for missing values in
> >the other
> >columns):
> >
> > > > Foglio1$e <- ifelse(is.nan(Foglio1$e), NA, Foglio1$e)
> > > > Foglio1$n <- ifelse(is.nan(Foglio1$n), NA, Foglio1$n)
> > > str(Foglio1)
> >'data.frame':   1489 obs. of  15 variables:
> >  $ Date: Date, format: "2001-08-17" "2001-08-20" ...
> >  $ a   : num  202 201 202 201 202 ...
> >  $ b   : num  231 230 230 230 232 ...
> >  $ c   : num  177 179 181 180 182 ...
> >  $ d   : num  277 277 276 276 275 ...
> >  $ e   : num  NA NA NA NA NA NA NA NA NA NA ...
> >  $ f   : num  275 277 279 279 279 ...
> >  $ g   : num  91.7 90.7 90.8 91.1 91 ...
> >  $ h   : num  11446 11258 11280 11396 11127 ...
> >  $ i   : num  388 389 393 392 393 ...
> >  $ l   : num  93.2 94 92.4 93.4 93.1 ...
> >  $ m   : num  128 127 126 129 130 ...
> >  $ n   : num  NA NA NA NA NA NA NA NA NA NA ...
> >  $ o   : num  133 133 133 133 133 ...
> >  $ p   : num  107 107 107 107 107 ...
> >
> >-------
> >David
> >
> >
> > > -----Original Message-----
> > > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> > > project.org] On Behalf Of David L Carlson
> > > Sent: Monday, March 18, 2013 3:22 PM
> > > To: 'Pietro'; 'Berend Hasselman'
> > > Cc: r-help at stat.math.ethz.ch
> > > Subject: Re: [R] data.frame with NA
> > >
> > > Try this
> > >
> > > > Open the spreadsheet in Excel. Select all of the data and click
> > > > Copy. Don't close Excel.
> > >
> > > Open R and type the following command:
> > >
> > > > Foglio1 <- read.table("clipboard-128", header=TRUE, sep="\t")
> > >
> > > Now take a look at the structure of the data.frame
> > >
> > > > str(Foglio1)
> > > 'data.frame':   1489 obs. of  15 variables:
> > > >  $ Date: Factor w/ 1489 levels "1/10/2002","1/10/2003",..: 1275 1291
> > > >  1295 1299 1304 1309 1321 1325 1329 1337 ...
> > >  $ a   : num  202 201 202 201 202 ...
> > >  $ b   : num  231 230 230 230 232 ...
> > >  $ c   : num  177 179 181 180 182 ...
> > >  $ d   : num  277 277 276 276 275 ...
> > >  $ e   : num  NA NA NA NA NA NA NA NA NA NA ...
> > >  $ f   : num  275 277 279 279 279 ...
> > >  $ g   : num  91.7 90.7 90.8 91.1 91 ...
> > >  $ h   : num  11446 11258 11280 11396 11127 ...
> > >  $ i   : num  388 389 393 392 393 ...
> > >  $ l   : num  93.2 94 92.4 93.4 93.1 ...
> > >  $ m   : num  128 127 126 129 130 ...
> > >  $ n   : num  NA NA NA NA NA NA NA NA NA NA ...
> > >  $ o   : num  133 133 133 133 133 ...
> > >  $ p   : num  107 107 107 107 107 ...
> > >
> > > ----------------------------------------------
> > > David L Carlson
> > > Associate Professor of Anthropology
> > > Texas A&M University
> > > College Station, TX 77843-4352
> > >
> > > > -----Original Message-----
> > > > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> > > > project.org] On Behalf Of Pietro
> > > > Sent: Monday, March 18, 2013 1:57 PM
> > > > To: Berend Hasselman
> > > > Cc: r-help at stat.math.ethz.ch
> > > > Subject: Re: [R] data.frame with NA
> > > >
> > > > Yes, it's true Berend!
> > > >
> > > > > What I do is simply use the read.xlsx2 function:
> > > >
> > > > db <- read.xlsx2("c:/mydb.xlsx",1,as.data.frame=T)
> > > >
> > > > This is excel file i use:
> > > > http://dl.dropbox.com/u/102669/mydb.xlsx
> > > >
> > > > I can't find  a way to import as numeric.
> > > > My objective is to be able to work (in R) with my NA's
> > > >
> > > >
> > > > At 18.46 18/03/2013, Berend Hasselman wrote:
> > > >
> > > > >On 18-03-2013, at 16:49, Pete <freerisk3 at gmail.com> wrote:
> > > > >
> > > > > >
> > > > > > I have this little data.frame
> > > > > >
> > > > > > http://dl.dropbox.com/u/102669/nanotna.rdata
> > > > > >
> > > > > > > Two columns contain NA, so the best thing to do is to use the
> > > > > > > na.locf function (with fromLast = T).
> > > > > > >
> > > > > > > But the na.locf function doesn't work because the NAs in my
> > > > > > > data.frame are not recognized as real NAs.
> > > > > > >
> > > > > > > Is there a way to replace the fake NAs with real NAs? In that
> > > > > > > case the na.locf function should work.
> > > > > >
> > > > >
> > > > >Your data are all characters. Do
> > > > >
> > > > >str(db)
> > > > >
> > > > > >to see that. What is probably supposed to be numeric is also character.
> > > > > >Somehow you have managed to read in data that R thinks is all chr.
> > > > > >Your NA are "NA" in reality: a character string "NA".
> > > > > >
> > > > > >You will have to review the method you used to get the data into R.
> > > > > >And make sure that what you want to be numeric is indeed numeric.
> > > > > >Then you can start to think about doing something about the NA's.
> > > > >
> > > > >Berend
> > > >
> > > > ______________________________________________
> > > > R-help at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide
> > > > > http://www.R-project.org/posting-guide.html and provide commented,
> > > > > minimal, self-contained, reproducible code.
> > >
> 


