[R] simple recoding problem, but a trouble !

David Winsemius dwinsemius at comcast.net
Sun Feb 20 06:03:55 CET 2011


On Feb 19, 2011, at 10:28 PM, David Winsemius wrote:

>
> On Feb 19, 2011, at 10:19 PM, Umesh Rosyara wrote:
>
>> Thank you David
>>
>> I was able to create a data frame and restore the names with the following:
>>
>> dfr1 <- data.frame(t( apply(dfr, 1, func) ))
>> names(dfr1) <- c("marker1a", "marker1b", "marker2a",
>>                  "marker2b", "marker3a", "marker3b")
>> Still, I wonder if there is an easier way to restore the names; in
>> situations where there are 1000's of variables, making the list as
>> above might be tidious.
>
> Well, we wouldn't want life to be tidious, now, would we?
>
> > rep(names(dfr), each=2)
> [1] "marker1" "marker1" "marker2" "marker2"
> > rep(letters[1:2], each=2)
> [1] "a" "a" "b" "b"
> > paste(rep(names(dfr), each=2), rep(letters[1:2], each=2), sep="")
> [1] "marker1a" "marker1a" "marker2b" "marker2b"

Thanks, Jorge. Should have been one of these:

paste(rep(names(dfr), each=2), letters[1:2], sep="")
# [1] "marker1a" "marker1b" "marker2a" "marker2b"

paste(rep(names(dfr), each=2), rep(letters[1:2], 2), sep="")
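
For the many-marker case, the pieces of this thread can be put together end to end. What follows is only a sketch, assuming every marker yields exactly two allele columns in the order that apply() returns them. The name `new_names` and the use of stringsAsFactors = FALSE are introduced here for illustration and were not in the original posts.

```r
# Rebuild the example data from the thread (character columns, set explicitly).
dfr <- data.frame(marker1 = c("AA", "AC", "CC", "CC", "AC", "AC"),
                  marker2 = c("AA", "AC", "CC", "CC", "AC", "AC"),
                  stringsAsFactors = FALSE)

# Recoding function from earlier in the thread: A -> 1, C -> 3.
func <- function(x) {
  sapply(strsplit(x, ""), match, table = c("A", NA, "C"))
}

# One row per individual, two columns per marker.
dfr1 <- data.frame(t(apply(dfr, 1, func)))

# rep(..., each = 2) doubles each marker name; c("a", "b") is recycled
# along the doubled vector, giving marker1a, marker1b, marker2a, ...
new_names <- paste(rep(names(dfr), each = 2), c("a", "b"), sep = "")

# Guard against a mismatch before overwriting the names.
stopifnot(length(new_names) == ncol(dfr1))
names(dfr1) <- new_names
dfr1
```

Because the names are derived from names(dfr), the same lines should work unchanged whether there are two markers or thousands, provided the two-columns-per-marker assumption holds.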



>
> -- 
> David.
>
>
>> Thank you for solving my problem. I appreciate it.
>> Umesh R
>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>> Sent: Saturday, February 19, 2011 10:28 AM
>> To: Umesh Rosyara
>> Cc: 'Joshua Wiley'; r-help at r-project.org
>> Subject: Re: [R] simple recoding problem, but a trouble !
>>
>>
>> On Feb 19, 2011, at 8:40 AM, Umesh Rosyara wrote:
>>
>> > Just a correction. My expected output data frame was somehow collapsed
>> > into a single column. The correct one is:
>> >
>> > marker1a       marker1b        marker2a        marker2b
>> > 1      1       1       1
>> > 1      3       1       3
>> > 3      3       3       3
>> > 3      3       3       3
>> > 1      3       1       3
>> > 1      3       1       3
>>
>>
>> func <- function(x) {sapply( strsplit(x, ""),
>>                                     match, table= c("A", NA, "C"))}
>> t( apply(dfr, 1, func) )
>>
>>      [,1] [,2] [,3] [,4]
>> [1,]    1    1    1    1
>> [2,]    1    3    1    3
>> [3,]    3    3    3    3
>> [4,]    3    3    3    3
>> [5,]    1    3    1    3
>> [6,]    1    3    1    3
>>
>>
>> It's a matrix rather than a data frame and doesn't have colnames, but
>> that should be trivial to fix.
>>
>> >
>> > Thanks;
>> >
>> > Umesh R
>> >
>> >  _____
>> >
>> > From: Umesh Rosyara [mailto:rosyaraur at gmail.com]
>> > Sent: Friday, February 18, 2011 10:09 PM
>> > To: 'Joshua Wiley'
>> > Cc: 'r-help at r-project.org'
>> > Subject: RE: [R] recoding a data in different way: please help
>> >
>> >
>> > Hi Josh and R community members
>> >
>> > Thank you for quick response. I am impressed with the help.
>> >
>> > To solve my problem, I tried the recode option and ran into the
>> > following problem, which made me abandon it. Thank you for reminding
>> > me of the option; it might help solve my problem in a different way.
>> >
>> > marker1 <- c("AA", "AC", "CC", "CC", "AC", "AC")
>> >
>> > marker2 <- c("AA", "AC", "CC", "CC", "AC", "AC")
>> >
>> > dfr <- data.frame(cbind(marker1, marker2))
>> >
>> > Objective: replace A with 1 and C with 3, and split each genotype
>> > (e.g. AA into 1 1) across two numeric columns. So the intended output
>> > for the above data frame is:
>> >
>> >
>> >
>> > marker1a
>> > marker1b
>> > marker2a
>> > marker2b
>> >
>> > 1
>> > 1
>> > 1
>> > 1
>> >
>> > 1
>> > 3
>> > 1
>> > 3
>> >
>> > 3
>> > 3
>> > 3
>> > 3
>> >
>> > 3
>> > 3
>> > 3
>> > 3
>> >
>> > 1
>> > 3
>> > 1
>> > 3
>> >
>> > 1
>> > 3
>> > 1
>> > 3
>> >
>> > I tried the following:
>> >
>> > for (i in 1:length(dfr))
>> >   {
>> >     dfr[[i]] = recode(dfr[[i]], "c('AA')= '1,1'; c('AC')= '1,3'; c('CA')= '1,3'; c('CC')= '3,3' ")
>> > }
>> >
>> > write.table(dfr,"dfr.out", sep=" ,", col.names = T)
>> > dfn=read.table("dfr.out",header=T, sep="," )
>> >
>> > # Just trying to cheat R; unfortunately the marker1 and marker2 columns
>> > # remained non-numeric, even when opened in Excel!
>> >
>> >
>> > Unfortunately I got the following result !
>> >
>> >   marker1 marker2
>> > 1     1,1      1,1
>> > 2     1,2      1,2
>> > 3     2,2      2,2
>> > 4     2,2      2,2
>> > 5     1,2      1,2
>> > 6     1,2      1,2
>> >
>> >
>> > Sorry to bother all of you, but simple things are becoming complicated
>> > for me these days.
>> >
>> > Thank you so much
>> > Umesh R
>> >
>> >
>> >  _____
>> >
>> > From: Joshua Wiley [mailto:jwiley.psych at gmail.com]
>> > Sent: Friday, February 18, 2011 12:15 AM
>> > Cc: r-help at r-project.org
>> > Subject: Re: [R] recoding a data in different way: please help
>> >
>> >
>> >
>> > Dear Umesh,
>> >
>> > I could not figure out exactly what your recoding scheme was, so I do
>> > not have a specific solution for you.  That said, the following
>> > functions may help you get started.
>> >
>> > ?ifelse # vectorized and different from using if () statements
>> > ?"if" # control flow; the quotes are needed for reserved words
>> > ?Logic ## logical operators for your tests
>> > ## if you install and load the "car" package by John Fox
>> > ?recode # a function for recoding in package "car"
>> >
>> > I am sure it is possible to string together some massive series of if
>> > statements and then use a for loop, but that is probably the messiest
>> > and slowest possible way.  I suspect there will be faster, neater
>> > options, but I cannot say for certain without having a better feel for
>> > how all the conditions work.
>> >
>> > Best regards,
>> >
>> > Josh
>> >
>> > On Thu, Feb 17, 2011 at 6:21 PM, Umesh Rosyara <rosyaraur at gmail.com> wrote:
>> >> Dear R users
>> >>
>> >> The following question looks simple, but I have spent a lot of time
>> >> trying to solve it. I would highly appreciate your help.
>> >>
>> >> I have the following dataset from a family dataset:
>> >>
>> >> Here we have individuals, their two parents, and their marker scores
>> >> (marker1, marker2, ... and so on). 0 means that parent information is
>> >> not available.
>> >>
>> >>
>> >> Individual      Parent1  Parent2         mark1   mark2
>> >> 1        0       0       12      11
>> >> 2        0       0       11      22
>> >> 3        0       0       13      22
>> >> 4        0       0       13      11
>> >> 5        1       2       11      12
>> >> 6        1       2       12      12
>> >> 7        3       4       11      12
>> >> 8        3       4       13      12
>> >> 9        1       4       11      12
>> >> 10       1       4       11      12
>> >>
>> >> I want to recode the mark1, mark2, ... columns by looking at each
>> >> individual's parents (Parent1 and Parent2).
>> >>
>> >> For example
>> >>
>> >> Take the case of Individual 5, whose Parent1 is 1 (has mark1 score 12)
>> >> and whose Parent2 is 2 (has mark1 score 11). Individual 5 has mark1
>> >> score 11. Suppose I have the following conditions to recode
>> >> Individual 5's mark1 score:
>> >>
>> >> For the mark1 variable:
>> >>   If Parent1 scores "11" and Parent2 scores "22", recode Individual 5's
>> >>   score as "12"=1, else 0.
>> >>   If Parent1 scores "12" and Parent2 scores "22", recode Individual 5's
>> >>   score as "22"=1, "12"=0.5, else 0.
>> >>   ......................... more conditions
>> >>
>> >> Similarly, the pointer should move from Individual 5 through the n
>> >> individuals to the end of the file.
>> >>
>> >> Thank you in advance
>> >>
>> >> Umesh R
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>       [[alternative HTML version deleted]]
>> >>
>> >> ______________________________________________
>> >> R-help at r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >>
>> >
>> >
>> >
>> > --
>> > Joshua Wiley
>> > Ph.D. Student, Health Psychology
>> > University of California, Los Angeles
>> > http://www.joshuawiley.com/
>> >
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>
> David Winsemius, MD
> West Hartford, CT

David Winsemius, MD
West Hartford, CT


