[R] Data Clean of Character variables

Uwe Ligges ligges at statistik.tu-dortmund.de
Wed Jun 3 08:45:41 CEST 2009



Chris Anderson wrote:
> I am new to R, and I need to do some simple data cleaning.
> In the example below I have 164 rows of data with a missing value. I want to convert these missing to unknown while keeping the other values as is. 
> 
> summary(ABIClinical$HasSkinBreakdown)
>              no unknown     yes 
>     164     914     163     178 
> 
> When I did the following code: 
>  ABIClinical$HasSkinBreakdown<-ifelse(ABIClinical$HasSkinBreakdown=='',"unknown",ABIClinical$HasSkinBreakdown)


If it is already a factor (and "unknown" is already one of its levels, as your summary shows), you can say:

ABIClinical$HasSkinBreakdown[ABIClinical$HasSkinBreakdown==""] <- "unknown"
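A minimal sketch of why the two approaches differ, using a made-up toy factor `x` in place of ABIClinical$HasSkinBreakdown (the values are invented for illustration). It assumes the column really is a factor in which "unknown" is already a level:

```r
# Toy stand-in for ABIClinical$HasSkinBreakdown (hypothetical values).
x <- factor(c("", "no", "yes", "unknown", "", "no"))

# ifelse() does not return a factor here: the result is a plain
# character vector, so summary() only reports Length/Class/Mode.
y <- ifelse(x == "", "unknown", x)
class(y)

# Indexed assignment keeps the factor intact. This works because
# "unknown" is already one of the levels of x.
x[x == ""] <- "unknown"

# Optional: drop the now-unused "" level so it no longer shows up
# as an empty column in summary().
x <- droplevels(x)
summary(x)
```

If "unknown" were not yet a level, the assignment would produce NA with a warning; in that case renaming the empty level with levels(x)[levels(x) == ""] <- "unknown" is one alternative.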

You have presumably already considered whether it makes sense to code 
these values as "unknown" rather than NA...
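For comparison, a small sketch of the NA coding, again on a made-up toy factor. Treating the existing "unknown" level as missing too is an assumption made only for this illustration, not something stated in the question:

```r
# Toy stand-in for ABIClinical$HasSkinBreakdown (hypothetical values).
x <- factor(c("", "no", "yes", "unknown", "", "no"))

# Mark both the empty string and "unknown" as missing (assumption:
# the two mean the same thing in this data set).
x[x %in% c("", "unknown")] <- NA

# Remove the levels that are no longer used.
x <- droplevels(x)

# summary() and table() report the missing values separately.
summary(x)
table(x, useNA = "ifany")
```

With NA, functions such as is.na() and the na.rm / na.action arguments of many modelling and summary functions handle the missing entries directly.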

Uwe Ligges





> I get the following:
> summary(ABIClinical$HasSkinBreakdown)
>    Length     Class      Mode 
>      1419 character character 
> 
> How do I get results as in my first statement, but with the missing values converted to "unknown"?
>  
> 
> 
> Chris Anderson
> 707.315.8486
> www.sassydeals4u.com
> 



