[R] simple recoding problem, but a trouble !

David Winsemius dwinsemius at comcast.net
Sat Feb 19 16:28:11 CET 2011


On Feb 19, 2011, at 8:40 AM, Umesh Rosyara wrote:

> Just a correction. My expected output data frame was somehow collapsed
> into a single column. The correct one is:
>
> marker1a	 marker1b	 marker2a	 marker2b	
> 1	 1	 1	 1	
> 1	 3	 1	 3	
> 3	 3	 3	 3	
> 3	 3	 3	 3	
> 1	 3	 1	 3	
> 1	 3	 1	 3	


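# Split each genotype into its letters, then map each letter to its
# position in the lookup vector: "A" -> 1, "C" -> 3.
func <- function(x) {
    sapply(strsplit(x, ""), match, table = c("A", NA, "C"))
}
t(apply(dfr, 1, func))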

      [,1] [,2] [,3] [,4]
[1,]    1    1    1    1
[2,]    1    3    1    3
[3,]    3    3    3    3
[4,]    3    3    3    3
[5,]    1    3    1    3
[6,]    1    3    1    3


It's a matrix rather than a data frame and doesn't have colnames, but
that should be trivial to fix.
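
One possible way to do that fix is sketched below. It is only a sketch under stated assumptions: the data are rebuilt as character columns (stringsAsFactors = FALSE), as.matrix() is added so that func() is certain to receive strings, and the column names are generated by appending "a" and "b" to each marker name, which is a naming scheme assumed from the expected output rather than anything given in the thread.

```r
# Rebuild the example data from the original post (as characters, not factors)
marker1 <- c("AA", "AC", "CC", "CC", "AC", "AC")
marker2 <- c("AA", "AC", "CC", "CC", "AC", "AC")
dfr <- data.frame(marker1, marker2, stringsAsFactors = FALSE)

# match() returns the position of each letter in `table`,
# so "A" -> 1 and "C" -> 3; the NA in slot 2 is only a placeholder
func <- function(x) sapply(strsplit(x, ""), match, table = c("A", NA, "C"))

# as.matrix() makes sure func() sees character strings
m <- t(apply(as.matrix(dfr), 1, func))

# Assumed naming scheme: each marker name with "a" / "b" appended
colnames(m) <- paste0(rep(names(dfr), each = 2), c("a", "b"))

# Convert the integer matrix to a data frame with numeric columns
out <- as.data.frame(m)
str(out)
```

Because the columns of out are integers, they can be used directly in arithmetic or passed to other software without any further conversion.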

>
> Thanks;
>
> Umesh R
>
>  _____
>
> From: Umesh Rosyara [mailto:rosyaraur at gmail.com]
> Sent: Friday, February 18, 2011 10:09 PM
> To: 'Joshua Wiley'
> Cc: 'r-help at r-project.org'
> Subject: RE: [R] recoding a data in different way: please help
>
>
> Hi Josh and R community members
>
> Thank you for quick response. I am impressed with the help.
>
> To solve my problem, I tried the recode option and ran into the following
> problem, which led me to abandon it. Thank you for reminding me of the
> option again; it might help me solve my problem in a different way.
>
> marker1 <- c("AA", "AC", "CC", "CC", "AC", "AC")
>
> marker2 <- c("AA", "AC", "CC", "CC", "AC", "AC")
>
> dfr <- data.frame(cbind(marker1, marker2))
>
> Objective: replace A with 1, C with 3, and split AA into 1 1 (two
> numeric columns). So the intended output for the above data frame is:
>
>
>
> marker1a
> marker1b
> marker2a
> marker2b
>
> 1
> 1
> 1
> 1
>
> 1
> 3
> 1
> 3
>
> 3
> 3
> 3
> 3
>
> 3
> 3
> 3
> 3
>
> 1
> 3
> 1
> 3
>
> 1
> 3
> 1
> 3
>
> I tried the following:
>
> for(i in 1:length(dfr))
>   {
>     dfr[[i]]=recode (dfr[[i]],"c('AA')= '1,1'; c('AC')= '1,3'; c('CA')= '1,3';  c('CC')= '3,3' ")
> }
>
> write.table(dfr,"dfr.out", sep=" ,", col.names = T)
> dfn=read.table("dfr.out",header=T, sep="," )
>
> # just trying to cheat R; unfortunately the marker1 and marker2 columns
> # remained non-numeric, even when opened in Excel !!
>
>
> Unfortunately I got the following result !
>
>   marker1 marker2
> 1     1,1      1,1
> 2     1,2      1,2
> 3     2,2      2,2
> 4     2,2      2,2
> 5     1,2      1,2
> 6     1,2      1,2
>
>
> Sorry to bother all of you, but simple things seem complicated to me
> these days.
>
> Thank you so much
> Umesh R
>
>
>  _____
>
> From: Joshua Wiley [mailto:jwiley.psych at gmail.com]
> Sent: Friday, February 18, 2011 12:15 AM
> Cc: r-help at r-project.org
> Subject: Re: [R] recoding a data in different way: please help
>
>
>
> Dear Umesh,
>
> I could not figure out exactly what your recoding scheme was, so I do
> not have a specific solution for you.  That said, the following
> functions may help you get started.
>
> ?ifelse # vectorized and different from using if () statements
> ?if #
> ?Logic ## logical operators for your tests
> ## if you install and load the "car" package by John Fox
> ?recode # a function for recoding in package "car"
>
> I am sure it is possible to string together some massive series of if
> statements and then use a for loop, but that is probably the messiest
> and slowest possible way.  I suspect there will be faster, neater
> options, but I cannot say for certain without having a better feel for
> how all the conditions work.
>
> Best regards,
>
> Josh
>
> On Thu, Feb 17, 2011 at 6:21 PM, Umesh Rosyara <rosyaraur at gmail.com> wrote:
>> Dear R users
>>
>> The following question looks simple, but I have spent a lot of time
>> trying to solve it. I would highly appreciate your help.
>>
>> I have the following family dataset:
>>
>> Here we have individuals, their two parents, and their marker scores
>> (marker1, marker2, ... and so on). 0 means that the parent information
>> is not available.
>>
>>
>> Individual      Parent1  Parent2         mark1   mark2
>> 1        0       0       12      11
>> 2        0       0       11      22
>> 3        0       0       13      22
>> 4        0       0       13      11
>> 5        1       2       11      12
>> 6        1       2       12      12
>> 7        3       4       11      12
>> 8        3       4       13      12
>> 9        1       4       11      12
>> 10       1       4       11      12
>>
>> I want to recode mark1, mark2, ... and so on by looking at each
>> individual's parents (Parent1 and Parent2).
>>
>> For example
>>
>> Take the case of Individual 5, whose Parent1 is 1 (mark1 score 12) and
>> Parent2 is 2 (mark1 score 11). Individual 5 has mark1 score 11. Suppose
>> I have the following conditions to recode Individual 5's mark1 score:
>>
>> For the mark1 variable:
>>   If Parent1's score is "11" and Parent2's score is "22", recode
>>   Individual 5's score as "12"=1, else 0.
>>   If Parent1's score is "12" and Parent2's score is "22", recode
>>   Individual 5's score as "22"=1, "12"=0.5, else 0.
>>   ......................... more conditions
>>
>> Similarly, the pointer should move from individual 5 to n individuals
>> at the end of the file.
>>
>> Thank you in advance
>>
>> Umesh R
>>
>>
>>
>>
>>
>>       [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
>

David Winsemius, MD
West Hartford, CT


