[R] data.frame with NA

Pietro freerisk3 at gmail.com
Wed Mar 20 11:27:58 CET 2013


Thank you David and thank you Petr

At 14.18 19/03/2013, David L Carlson wrote:
>Try this instead:
>
> > Foglio1[,2:ncol(Foglio1)] <- na.locf(Foglio1[,2:ncol(Foglio1)],fromLast=T)
> > str(Foglio1)
>'data.frame':   1489 obs. of  15 variables:
>  $ Date: Date, format: "2001-08-17" "2001-08-20" ...
>  $ a   : num  202 201 202 201 202 ...
>  $ b   : num  231 230 230 230 232 ...
>  $ c   : num  177 179 181 180 182 ...
>  $ d   : num  277 277 276 276 275 ...
>  $ e   : num  2000 2000 2000 2000 2000 2000 2000 2000 2000 2000 ...
>  $ f   : num  275 277 279 279 279 ...
>  $ g   : num  91.7 90.7 90.8 91.1 91 ...
>  $ h   : num  11446 11258 11280 11396 11127 ...
>  $ i   : num  388 389 393 392 393 ...
>  $ l   : num  93.2 94 92.4 93.4 93.1 ...
>  $ m   : num  128 127 126 129 130 ...
>  $ n   : num  103 103 103 103 103 ...
>  $ o   : num  133 133 133 133 133 ...
>  $ p   : num  107 107 107 107 107 ...
>
>It appears that na.locf() converts the object to a matrix at some point (but
>I haven't checked the source code). The first column (the Date variable) is
>treated as character. As a result, everything gets converted to character.
>The code above skips the first column, which does not have any missing values.
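
A minimal sketch of the coercion described above, on a made-up three-row data frame. The data and the names df, m and num_cols are illustrative assumptions, not the poster's file, and whether na.locf() itself goes through a matrix is not verified here. The sketch only shows that a matrix built from a Date column plus numeric columns ends up as character, and that filling the numeric columns one at a time avoids the problem.

```r
library(zoo)  # provides na.locf()

# Made-up stand-in for Foglio1: one Date column, two numeric columns with gaps.
df <- data.frame(
  Date = as.Date(c("2001-08-17", "2001-08-20", "2001-08-21")),
  e    = c(NA, NA, 2000),
  n    = c(NaN, 103, NA)
)

# A matrix holds one type only, so a Date column next to numeric
# columns forces the whole matrix to character.
m <- as.matrix(df)
is.character(m)

# Fill each numeric column on its own; the Date column is never touched.
# na.rm = FALSE keeps a trailing NA instead of shortening the vector.
num_cols <- names(df)[sapply(df, is.numeric)]
df[num_cols] <- lapply(df[num_cols], na.locf, fromLast = TRUE, na.rm = FALSE)
str(df)
```

Selecting the columns with is.numeric() rather than by position (2:ncol) is a design choice for the sketch; it keeps working if the Date column is not the first one.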
>
>----------------------------------------------
>David L Carlson
>Associate Professor of Anthropology
>Texas A&M University
>College Station, TX 77843-4352
>
>
> > -----Original Message-----
> > From: Pietro [mailto:freerisk3 at gmail.com]
> > Sent: Tuesday, March 19, 2013 6:10 AM
> > To: dcarlson at tamu.edu; dcarlson at tamu.edu
> > Cc: r-help at stat.math.ethz.ch
> > Subject: RE: [R] data.frame with NA
> >
> > Yes, colClasses is the solution. Thank you very much.
> > However, I found a very strange thing.
> >
> > If I use:
> > Foglio1 <- read.xlsx2("mydb.xlsx", 1, colClasses=c("Date",
> > rep("numeric",14)))
> >
> > I get a numeric data frame, as you said.
> >
> > I also get NaN (and not NA).
> >
> > At this point I use the function:
> > Foglio1 = na.locf(Foglio1,fromLast=T) and it works perfectly. All the
> > NaNs were replaced with the first numeric value, as expected.
> >
> > And now the enigma.
> >
> > After the na.locf call, Foglio1 becomes all chr again! It seems that
> > na.locf converts from num to chr. Even Date is converted to chr.
> > I'm reading the help for this function but I can't find any mention
> > of this conversion.
> >
> > It seems that I can't get a numeric data frame without NA or NaN
> > in any way!
> > OK, I admit that I'm a newbie, but I'm trying every day to gain
> > confidence with R.
> >
> > Could I ask you the courtesy of running the na.locf function to see
> > whether it converts everything to chr on your computer too?
> >
> > Thank you
> >
> >
> >
> > At 21.37 18/03/2013, David L Carlson wrote:
> > >It appears that you MUST use the colClasses= argument with read.xlsx2:
> > >
> > >Foglio1 <- read.xlsx2("mydb.xlsx", 1,
> > >    colClasses=c("Date", rep("numeric", 14)))
> > >
> > >However, e and n are converted to NaN, not NA, so you would need to
> > >convert those columns (at least, I didn't check for missing values in
> > >the other columns):
> > >
> > > > Foglio1$e <- ifelse(is.nan(Foglio1$e), NA, Foglio1$e)
> > > > Foglio1$n <- ifelse(is.nan(Foglio1$n), NA, Foglio1$n)
> > > > str(Foglio1)
> > >'data.frame':   1489 obs. of  15 variables:
> > >  $ Date: Date, format: "2001-08-17" "2001-08-20" ...
> > >  $ a   : num  202 201 202 201 202 ...
> > >  $ b   : num  231 230 230 230 232 ...
> > >  $ c   : num  177 179 181 180 182 ...
> > >  $ d   : num  277 277 276 276 275 ...
> > >  $ e   : num  NA NA NA NA NA NA NA NA NA NA ...
> > >  $ f   : num  275 277 279 279 279 ...
> > >  $ g   : num  91.7 90.7 90.8 91.1 91 ...
> > >  $ h   : num  11446 11258 11280 11396 11127 ...
> > >  $ i   : num  388 389 393 392 393 ...
> > >  $ l   : num  93.2 94 92.4 93.4 93.1 ...
> > >  $ m   : num  128 127 126 129 130 ...
> > >  $ n   : num  NA NA NA NA NA NA NA NA NA NA ...
> > >  $ o   : num  133 133 133 133 133 ...
> > >  $ p   : num  107 107 107 107 107 ...
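
The two ifelse() lines above handle e and n by name. A minimal sketch of the same replacement applied to every numeric column at once, assuming a made-up data frame; the helper name nan_to_na and the sample values are hypothetical and not from the original file.

```r
# Made-up stand-in for Foglio1 after read.xlsx2() with colClasses.
df <- data.frame(
  Date = as.Date(c("2001-08-17", "2001-08-20")),
  a    = c(202, 201),
  e    = c(NaN, NaN),
  n    = c(NaN, 103)
)

# Hypothetical helper: turn NaN into NA in numeric vectors,
# return anything else (such as a Date column) unchanged.
nan_to_na <- function(x) {
  if (is.numeric(x)) x[is.nan(x)] <- NA
  x
}

# df[] <- keeps the data.frame structure while replacing every column.
df[] <- lapply(df, nan_to_na)
str(df)
```

Note that is.na() is TRUE for NaN as well, so this step matters mainly when later code tests is.nan() or when a plain NA is wanted in the output.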
> > >
> > >-------
> > >David
> > >
> > >
> > > > -----Original Message-----
> > > > From: r-help-bounces at r-project.org
> > > > [mailto:r-help-bounces at r-project.org] On Behalf Of David L Carlson
> > > > Sent: Monday, March 18, 2013 3:22 PM
> > > > To: 'Pietro'; 'Berend Hasselman'
> > > > Cc: r-help at stat.math.ethz.ch
> > > > Subject: Re: [R] data.frame with NA
> > > >
> > > > Try this
> > > >
> > > > Open the spreadsheet in Excel. Select all of the data and click Copy.
> > > > Don't close Excel.
> > > >
> > > > Open R and type the following command:
> > > >
> > > > > Foglio1 <- read.table("clipboard-128", header=TRUE, sep="\t")
> > > >
> > > > Now take a look at the structure of the data.frame
> > > >
> > > > > str(Foglio1)
> > > > 'data.frame':   1489 obs. of  15 variables:
> > > >  $ Date: Factor w/ 1489 levels "1/10/2002","1/10/2003",..: 1275 1291 1295 1299 1304 1309 1321 1325 1329 1337 ...
> > > >  $ a   : num  202 201 202 201 202 ...
> > > >  $ b   : num  231 230 230 230 232 ...
> > > >  $ c   : num  177 179 181 180 182 ...
> > > >  $ d   : num  277 277 276 276 275 ...
> > > >  $ e   : num  NA NA NA NA NA NA NA NA NA NA ...
> > > >  $ f   : num  275 277 279 279 279 ...
> > > >  $ g   : num  91.7 90.7 90.8 91.1 91 ...
> > > >  $ h   : num  11446 11258 11280 11396 11127 ...
> > > >  $ i   : num  388 389 393 392 393 ...
> > > >  $ l   : num  93.2 94 92.4 93.4 93.1 ...
> > > >  $ m   : num  128 127 126 129 130 ...
> > > >  $ n   : num  NA NA NA NA NA NA NA NA NA NA ...
> > > >  $ o   : num  133 133 133 133 133 ...
> > > >  $ p   : num  107 107 107 107 107 ...
> > > >
> > > > ----------------------------------------------
> > > > David L Carlson
> > > > Associate Professor of Anthropology
> > > > Texas A&M University
> > > > College Station, TX 77843-4352
> > > >
> > > > > -----Original Message-----
> > > > > > From: r-help-bounces at r-project.org
> > > > > > [mailto:r-help-bounces at r-project.org] On Behalf Of Pietro
> > > > > Sent: Monday, March 18, 2013 1:57 PM
> > > > > To: Berend Hasselman
> > > > > Cc: r-help at stat.math.ethz.ch
> > > > > Subject: Re: [R] data.frame with NA
> > > > >
> > > > > > Yes, it's true, Berend!
> > > > > >
> > > > > > What I do is simply use the read.xlsx2 function:
> > > > > >
> > > > > > db <- read.xlsx2("c:/mydb.xlsx",1,as.data.frame=T)
> > > > > >
> > > > > > This is the Excel file I use:
> > > > > > http://dl.dropbox.com/u/102669/mydb.xlsx
> > > > > >
> > > > > > I can't find a way to import the data as numeric.
> > > > > > My objective is to be able to work (in R) with my NAs.
> > > > >
> > > > >
> > > > > At 18.46 18/03/2013, Berend Hasselman wrote:
> > > > >
> > > > > >On 18-03-2013, at 16:49, Pete <freerisk3 at gmail.com> wrote:
> > > > > >
> > > > > > >
> > > > > > > I have this little data.frame
> > > > > > >
> > > > > > > http://dl.dropbox.com/u/102669/nanotna.rdata
> > > > > > >
> > > > > > > > Two columns contain NA, so the best thing to do is to use
> > > > > > > > the na.locf function (with fromLast = T).
> > > > > > > >
> > > > > > > > But the na.locf function doesn't work because the NAs in my
> > > > > > > > data.frame are not recognized as real NAs.
> > > > > > > >
> > > > > > > > Is there a way to substitute the fake NAs with real NAs? In
> > > > > > > > that case the na.locf function should work.
> > > > > > >
> > > > > >
> > > > > >Your data are all characters. Do
> > > > > >
> > > > > >str(db)
> > > > > >
> > > > > > >to see that. What is probably supposed to be numeric is also
> > > > > > >character. Somehow you have managed to read in data that R
> > > > > > >thinks is all chr.
> > > > > > >Your NAs are in reality "NA": the character string "NA".
> > > > > > >
> > > > > > >You will have to review the method you used to get the data
> > > > > > >into R, and make sure that what you want to be numeric is
> > > > > > >indeed numeric.
> > > > > > >Then you can start to think about doing something about the NAs.
> > > > > >
> > > > > >Berend
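
A minimal sketch of the situation described above and one way out of it, assuming a made-up all-character data frame named db. The values are illustrative, not the poster's data. as.numeric() turns the string "NA" into a real NA, after which is.na() and functions such as na.locf() can see it.

```r
# Made-up stand-in for an import where every column arrived as character.
db <- data.frame(
  Date = c("2001-08-17", "2001-08-20", "2001-08-21"),
  a    = c("202", "201", "202"),
  e    = c("NA", "NA", "2000"),
  stringsAsFactors = FALSE
)
str(db)           # every column is chr
is.na(db$e)       # all FALSE: "NA" here is just a two-letter string

# Convert everything except the first column to numeric,
# then parse the first column as a Date.
db[-1]  <- lapply(db[-1], as.numeric)
db$Date <- as.Date(db$Date)

str(db)
is.na(db$e)       # the former "NA" strings are now real missing values
```

Fixing the import itself, so the columns arrive as numeric in the first place, remains the cleaner route; the conversion above is only a fallback for data that is already loaded.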
> > > > >
> > > > > ______________________________________________
> > > > > R-help at r-project.org mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > > PLEASE do read the posting guide
> > > > > > http://www.R-project.org/posting-guide.html
> > > > > > and provide commented, minimal, self-contained, reproducible code.
> > > >
> > > > ______________________________________________
> > > > R-help at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide http://www.R-project.org/posting-
> > > > guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
