[R] Validating data type

John Kane jrkrideau at inbox.com
Fri Aug 30 17:57:59 CEST 2013


It sounds like that column of data is not of type "date" at all. You cannot have one element of a column different from the rest of the column.  In a data.frame you can have different types of data in different columns but not in the same column.

Where mydata is your data.frame, run:

str(mydata)

This will give you a listing of the type of data in each column in your data.frame.  
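As a rough illustration of what to look for, here is a small sketch with made-up data (csv_text, id and visit_date are invented for this example and are not from your file). The column that looks like dates comes through as character:

```r
# Hypothetical data: 'visit_date' looks like dates but is read in as text
csv_text <- "id,visit_date
1,2013-08-01
2,2013-08-15
3,not recorded"
mydata <- read.csv(text = csv_text, stringsAsFactors = FALSE)

str(mydata)                # lists the type of every column
class(mydata$visit_date)   # character here; factor if stringsAsFactors = TRUE
sapply(mydata, class)      # compact summary of all column classes
```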

My guess would be that R has read in that column as character or factor.  Just because it looks like a date on the screen does not mean it is one. You probably will have to convert it to a date. 

See ?as.Date for one way to do this.
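A minimal sketch of how that conversion can also point out the offending rows, assuming the dates are meant to be in year-month-day form (the data, the column name visit_date and the format string are assumptions for illustration; adjust the format to match your file). When a format is given, as.Date() returns NA for any value it cannot parse, so the NA positions are the rows to inspect:

```r
# Hypothetical column of date strings; rows 3 and 4 do not match the format
mydata <- data.frame(
  id = 1:5,
  visit_date = c("2013-08-01", "2013-08-15", "15/08/2013", "unknown", "2013-08-29"),
  stringsAsFactors = FALSE
)

# as.character() makes this work whether the column is character or factor
parsed <- as.Date(as.character(mydata$visit_date), format = "%Y-%m-%d")

# Rows that held a value but failed to convert to a Date
bad_rows <- which(is.na(parsed) & !is.na(mydata$visit_date))
bad_rows             # row numbers of the non-conforming values
mydata[bad_rows, ]   # the offending rows themselves
```

Once the bad values have been corrected in the source file, the parsed vector can replace the original column.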

You might also want to have a look at the lubridate package.
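A short sketch of the lubridate route, assuming the package is installed and that the values are in year, month, day order (the vector x is made up for illustration). ymd() accepts several separators and leaves NA where a value cannot be parsed:

```r
# Assumes lubridate is installed: install.packages("lubridate")
if (requireNamespace("lubridate", quietly = TRUE)) {
  x <- c("2013-08-01", "2013/08/15", "unknown")   # hypothetical values
  # quiet = TRUE suppresses the "failed to parse" warning
  parsed <- lubridate::ymd(x, quiet = TRUE)
  print(which(is.na(parsed)))   # positions of values that did not parse
}
```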

For further reference:
https://github.com/hadley/devtools/wiki/Reproducibility
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

In particular, for a question like yours, supplying some sample data using dput() would have really helped. If you are still having a problem, run dput(mydata) and paste the output into the email. The reader can then paste the data into their own R session and see exactly what you are working with. For large datasets a sample is usually enough, so dput(head(mydata, 100)), for example, will supply the first 100 rows of data.
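To make the dput() round trip concrete, here is a sketch with a made-up data.frame (the columns id and value are invented for this example). The capture.output()/parse() step only simulates the reader pasting the text back into R:

```r
# Hypothetical data.frame standing in for your real data
mydata <- data.frame(id = 1:200, value = seq(0.5, 100, by = 0.5))

# Prints a text representation of the first 100 rows; this is what
# you would paste into the email
dput(head(mydata, 100))

# Simulate the reader's side: turn the printed text back into an object
txt <- paste(capture.output(dput(head(mydata, 100))), collapse = "\n")
copy <- eval(parse(text = txt))

# Compare the rebuilt copy with the original sample
all.equal(copy, head(mydata, 100))
```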

Below is an example of a data.frame in dput format. Just copy and paste it into R and you will have a new data.frame called dat1.

John Kane
Kingston ON Canada

##====================dput file===================
dat1  <-  structure(list(xx = structure(c(5L, 6L, 10L, 9L, 17L, 10L, 15L, 
16L, 5L, 14L, 5L, 7L, 17L, 6L, 11L, 8L, 5L, 3L, 1L, 17L, 7L, 
10L, 5L, 15L, 15L, 16L, 17L, 14L, 8L, 13L, 12L, 13L, 18L, 9L, 
5L, 2L, 1L, 16L, 1L, 1L, 1L, 16L, 4L, 10L, 1L, 18L, 18L, 14L, 
13L, 4L), .Label = c("a", "b", "d", "f", "g", "h", "i", "j", 
"k", "l", "m", "n", "o", "p", "q", "r", "s", "t"), class = "factor"), 
    yy = c(0.332304663767243, -1.77867401940838, 0.828612337938625, 
    0.481702424196278, 0.0825987297345907, -1.40224568135063, 
    -0.243388884456876, 0.0865304079310024, -0.124012796374592, 
    -0.0107544463484595, -0.542307211820575, 0.0129727866797914, 
    -0.478553152291621, -1.63895681984396, 0.0911014618211326, 
    -0.890215628553797, -1.42140590396317, 0.202337039384179, 
    1.30089052407852, 0.07517013402338, -0.807355878474237, 1.12978841894929, 
    0.154740986108198, 0.21209595540936, 0.65345449749952, 0.533479658343466, 
    0.665882552612018, -0.604444572360781, -0.0971202279326936, 
    -0.862179166296771, -0.977706435316816, 0.559634439503645, 
    0.0320050874597674, -1.65502174652502, 0.853046541850183, 
    -0.801904205812903, -0.820335448022446, -0.912451936657161, 
    0.222469916395761, 0.0168002536713376, -0.218537143966283, 
    1.00191128410043, -0.430912734152427, -1.1327880971227, -0.664284053548425, 
    1.3082467197158, 1.46148850229679, -1.11954785811615, -1.61706514557631, 
    0.604530320200236)), .Names = c("xx", "yy"), row.names = c(NA, 
-50L), class = "data.frame")
##===================end dput file


> -----Original Message-----
> From: jeffjohn at worldvision.org
> Sent: Thu, 29 Aug 2013 20:29:54 -0700
> To: r-help at r-project.org
> Subject: [R] Validating data type
> 
> 
> 
> I'm very new to R. I have a data file that I have read in via read.csv. I
> expect one of the "columns" to be of type date for example. However at
> least one value in that column is not of date type. I know this because
> another program I am trying to process the file with is erroring, yet it
> doesn't tell me what row/value is erroring. Does R have a way to: treat
> column x as date type, and print out all values/row numbers that do not
> conform to that type for that specified column?
> 
> Many thanks!
> Jeff
