[R] complex transformation of data

Den d.kazakiewicz at gmail.com
Sat Jan 22 00:10:45 CET 2011


That's great! It's working! Thank you so much!
It is pure magic that makes my head spin.
aggregate(.~ id, lapply(df, as.character), FUN =
function(x)paste(sort(x), collapse = ''), na.action = na.pass)
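
A minimal reproducible sketch of that call, assuming `df` looks like the SOURCE DATA quoted further down, cut to three cycle columns (the data frame is retyped from that table for illustration, it is not the real data):

```r
# Assumed stand-in for the poster's data: ids 1-3, three cycles, NA = no drug.
df <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  cycle1 = c("c", "m", "f", "m", "f", "c", "a", "c", "f", NA),
  cycle2 = c("c", "m", "f", "m", "f", "c", "a", "c", "f", NA),
  cycle3 = c("c", "m", "f", "m", "f", "c", NA,  "c", "f", "m"),
  stringsAsFactors = FALSE
)

# Same call as above: per id and per cycle column, sort the drug letters
# and glue them into one string; na.pass keeps the rows that contain NA.
res <- aggregate(. ~ id, lapply(df, as.character),
                 FUN = function(x) paste(sort(x), collapse = ""),
                 na.action = na.pass)
print(res)
```

With this toy input every id collapses to one row, and id 3 should come out as acf, acf, cfm, which is what RESULT DATA1 asks for.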

1. The help says:
 Note that ‘paste()’ coerces ‘NA_character_’, the character missing
value, to ‘"NA"’
And at the same time:
 ‘na.pass’ returns the object unchanged.
I am happy that I don't have NAs in my data. I just don't understand
how it happened.
2. I can't see the real difference between 'FUN = function(x) paste(x)'
and 'FUN = paste'. However, the former works perfectly while the latter
simply does not.
3. Finally, all the help says about the LHS in formulas like '.~id' is
that its name is "dot notation", and not a single word more. Thus, I
have no clue what the dot in that formula really means.
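
A sketch that may help with the three questions above, assuming base R defaults (`sort()` uses `na.last = NA`, and the formula method of `aggregate()` uses `na.action = na.omit`). The data frames `toy` and `nums` are made up for illustration:

```r
x <- c("a", "c", "f", NA)                      # cycle1 values for id 3

# 1. sort() drops NA by default, so paste() never sees the missing value.
with_sort    <- paste(sort(x), collapse = "")  # "acf"
without_sort <- paste(x, collapse = "")        # "acfNA"

# na.pass only controls which rows reach FUN. The default, na.omit,
# removes every row that has an NA in any column before FUN is called.
toy <- data.frame(id = 3, cycle1 = x, cycle3 = c(NA, "c", "f", "m"),
                  stringsAsFactors = FALSE)
omit <- aggregate(. ~ id, toy, FUN = paste, collapse = "")
pass <- aggregate(. ~ id, toy, FUN = paste, collapse = "",
                  na.action = na.pass)

# 2. So the difference is not the function(x) wrapper as such but the
#    sort() inside it: FUN = paste with na.pass yields "acfNA" / "NAcfm".

# 3. The dot on the left-hand side stands for "all columns of the data
#    that are not on the right-hand side".
nums     <- data.frame(id = c(1, 1, 2), u = 1:3, v = 4:6)
dot      <- aggregate(. ~ id, nums, FUN = sum)
explicit <- aggregate(cbind(u, v) ~ id, nums, FUN = sum)

print(omit); print(pass); print(dot); print(explicit)
```

Here `omit` loses the letter 'a' because the rows holding NA are dropped, `pass` keeps them but pastes them as "NA", and `dot` and `explicit` should show the same sums.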


Conclusion:
1. It's magic.
2. You definitely saved my investigation. (When I started, I had no
idea it would be so difficult to arrange those chemotherapy cycles in
a data frame, although I dare to call myself a pharmacoepidemiologist
(which sounds rather funny after this story).)
3. THANK YOU!!!!!!

Sincerely yours 
Denis Kazakiewicz
Belarus 


On Fri, 21/01/2011 at 18:37 -0200, Henrique Dallazuanna wrote:
> Just change the FUN function:
> 
> aggregate(.~ id, lapply(df, as.character), FUN =
> function(x)paste(sort(x), collapse = ''), na.action = na.pass)
> 
> On Fri, Jan 21, 2011 at 6:27 PM, Den <d.kazakiewicz at gmail.com> wrote:
>         
>         Thank you for your efforts.
>         Although it is still not working, it feels like getting closer
>         and closer.
>         
>         id cycle1 cycle2 cycle3
>         1  1    cmf    cmf    cmf
>         2  2    mfc    mfc    mfc
>         
>         3  3  acfNA  acfNA  NAcfm
>         
>         I really appreciate transformation from subsets ("c","m","f")
>         to "cmf".
>         That was critical for me.
>         Hopefully, I'll figure  out the rest later with ddply from
>         plyr package.
>         At least this is my idea for now.
>         
>         
>         
>         On Fri, 21/01/2011 at 18:00 -0200, Henrique Dallazuanna wrote:
>         
>         > correction:
>         > aggregate(.~ id, lapply(df, as.character), FUN = paste,
>         > collapse = "", na.action = na.pass)
>         >
>         > On Fri, Jan 21, 2011 at 5:56 PM, Henrique Dallazuanna
>         > <wwwhsd at gmail.com> wrote:
>         >         Try this:
>         >
>         >         aggregate(.~ id, lapply(replace(df, is.na(df), ''),
>         >         as.character), FUN = paste, collapse = "",
>         >         na.action = na.pass)
>         >
>         >
>         >
>         >         On Fri, Jan 21, 2011 at 5:45 PM, Den <d.kazakiewicz at gmail.com> wrote:
>         >                 Dear Henrique
>         >                 Thank you again for helping me
>         >                 Unfortunately, your code does not seem to be working
>         >
>         >                 > aggregate(.~ id, lapply(df, as.character),
>         >                 FUN = paste, collapse = "")
>         >                  id cycle1 cycle2 cycle3
>         >                 1  1    cmf    cmf    cmf
>         >                 2  2    mfc    mfc    mfc
>         >                 3  3     cf     cf     cf
>         >
>         >                 (letter 'a' missing in
>         >                 df[3, c("cycle1", "cycle2")])
>         >
>         >                 You suggested a very interesting approach,
>         >                 however. Those '.~ id' and 'as.character'
>         >                 gave me hope for success.
>         >                 With very best regards
>         >                 Denis
>         >
>         >
>         >                 On Fri, 21/01/2011 at 14:16 -0200, Henrique Dallazuanna wrote:
>         >
>         >                 > Try this:
>         >                 >
>         >                 > aggregate(.~ id, lapply(test, as.character),
>         >                 > FUN = paste, collapse = "")
>         >                 >
>         >                 > On Fri, Jan 21, 2011 at 10:25 AM, Den
>         >                 <d.kazakiewicz at gmail.com> wrote:
>         >                 >         Dear [R] people
>         >                 >         Could you please help with the following
>         >                 >         data transformation?
>         >                 >         Any suggestions, hints, references and
>         >                 >         even guessing on performing any of the
>         >                 >         following steps are highly appreciated.
>         >                 >         Those transformations are crucial for
>         >                 >         my work.
>         >                 >
>         >                 >         (n_, _n, j_, k_ signify numbers)
>         >                 >
>         >                 >         SOURCE DATA:
>         >                 >         id      cycle1  cycle2  cycle3  …       cycle_n
>         >                 >         1       c       c       c               c
>         >                 >         1       m       m       m               m
>         >                 >         1       f       f       f               f
>         >                 >         2       m       m       m               NA
>         >                 >         2       f       f       f               NA
>         >                 >         2       c       c       c               NA
>         >                 >         3       a       a       NA              NA
>         >                 >         3       c       c       c               NA
>         >                 >         3       f       f       f               NA
>         >                 >         3       NA      NA      m               NA
>         >                 >         ...........................................
>         >                 >
>         >                 >
>         >                 >
>         >                 >         RESULT DATA1:
>         >                 >         id      cyc1    cyc2    cyc3    …       cyc_n
>         >                 >         1       cfm     cfm     cfm             cfm
>         >                 >         2       cfm     cfm     cfm             NA
>         >                 >         3       acf     acf     cfm             NA
>         >                 >         ...........................................
>         >                 >
>         >                 >
>         >                 >         RESULT DATA2:
>         >                 >         id      treatment
>         >                 >         1       n_cfm
>         >                 >         2       j_cfm
>         >                 >         3       2acf->k_cfm
>         >                 >         ...................
>         >                 >
>         >                 >
>         >                 >         RESULT DATA3:
>         >                 >         id      regimen numOfCycles
>         >                 >         1       cfm     n_
>         >                 >         2       cfm     j_
>         >                 >         3       acf->cfm        {2+k_}
>         >                 >         .............................
>         >                 >
>         >                 >
>         >                 >
>         >                 >         Thank you
>         >                 >         Denis
>         >                 >
>         >                 >
>         >
>         >                 >
>         >                 >
>         >                 >
>         >                 > --
>         >                 > Henrique Dallazuanna
>         >                 > Curitiba-Paraná-Brasil
>         >                 > 25° 25' 40" S 49° 16' 22" O


