[R] Replace NAs in split lists

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Mon Jan 8 16:45:26 CET 2018


"Enforce" is overstating it... results will differ if there are no non-NA values for a given ID, and there is a potential further discrepancy if there are multiple non-NA values. But these issues were not identified by the OP, so may not be relevant in their case. 
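
A minimal sketch of both discrepancies, using base R only. The toy data frame df and the helper locf() are hypothetical and not from the thread: locf() is a stand-in for last-observation-carried-forward on a single vector, so the example does not depend on zoo being installed. ID "a" has two non-NA values, ID "b" has none, and ID "c" has exactly one.

```r
# Toy data (hypothetical, not the thread's df1).
df <- data.frame(
  ID    = c("a", "a", "a", "b", "b", "c", "c"),
  Value = c(NA, 3, 7, NA, NA, 5, NA),
  stringsAsFactors = FALSE
)

# Hypothetical base-R stand-in for last-observation-carried-forward on a
# single vector; leading NAs stay NA.
locf <- function(x) {
  idx <- cummax(seq_along(x) * !is.na(x))  # position of latest non-NA, 0 if none yet
  c(NA, x)[idx + 1]
}

# Approach 1: sort by ID then Value (order() puts NAs last), then carry forward.
s <- df[order(df$ID, df$Value), ]
s$Filled <- locf(s$Value)

# Approach 2: within each ID, replace NAs by the first non-NA value.
df$Filled <- ave(df$Value, df$ID, FUN = function(v) {
  ifelse(is.na(v), v[!is.na(v)][1], v)
})

print(s)   # the NA in "a" becomes 7 (largest in "a"); "b" also gets 7, leaked from "a"
print(df)  # the NA in "a" becomes 3 (first non-NA); "b" stays NA
```

Only ID "c", with a single non-NA value, is filled the same way by both approaches.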
-- 
Sent from my phone. Please excuse my brevity.

On January 8, 2018 6:41:33 AM PST, Eric Berger <ericjberger at gmail.com> wrote:
>You can enforce these assumptions by sorting on multiple columns, which
>leads to
>
>na.locf(df1[ order(df1$ID,df1$Value), ])
>
>
>
>On Mon, Jan 8, 2018 at 4:19 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
>> Yes, you are right if the IDs are always sequentially adjacent and the
>> first non-NA value appears in the first record for each ID.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On January 8, 2018 2:29:40 AM PST, PIKAL Petr <petr.pikal at precheza.cz> wrote:
>> >Hi
>> >
>> >With the example, na.locf seems to be the easiest way.
>> >> library(zoo)
>> >
>> >> na.locf(df1)
>> >  ID ID_2 Firist Value
>> >1  a   aa   TRUE     2
>> >2  a   ab  FALSE     2
>> >3  a   ac  FALSE     2
>> >4  b   aa   TRUE     5
>> >5  b   ab  FALSE     5
>> >
>> >Cheers
>> >Petr
>> >
>> >> -----Original Message-----
>> >> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jeff
>> >> Newmiller
>> >> Sent: Monday, January 8, 2018 9:13 AM
>> >> To: r-help at r-project.org; Ek Esawi <esawiek at gmail.com>
>> >> Subject: Re: [R] Replace NAs in split lists
>> >>
>> >> Upon closer examination I see that you are not using the split version
>> >> of df1 as I usually would, so here is a reproducible example:
>> >>
>> >> #----
>> >> df1 <- read.table( text=
>> >> "ID ID_2 Firist Value
>> >> 1  a   aa   TRUE     2
>> >> 2  a   ab  FALSE    NA
>> >> 3  a   ac  FALSE    NA
>> >> 4  b   aa   TRUE     5
>> >> 5  b   ab  FALSE    NA
>> >> ", header=TRUE, as.is=TRUE )
>> >>
>> >> sdf <- split( df1, df1$ID )
>> >> # note the extra [ 1 ] in case you have more than one non-NA value per ID
>> >> sdf2 <- lapply( sdf
>> >>                , function( z ) {
>> >>                   z$Value <- ifelse( is.na( z$Value )
>> >>                                    , z$Value[ !is.na( z$Value ) ][ 1 ]
>> >>                                    , z$Value
>> >>                                    )
>> >>                   z
>> >>                  }
>> >>                )
>> >> df2 <- do.call( rbind, sdf2 )
>> >> df2
>> >> #>     ID ID_2 Firist Value
>> >> #> a.1  a   aa   TRUE     2
>> >> #> a.2  a   ab  FALSE     2
>> >> #> a.3  a   ac  FALSE     2
>> >> #> b.4  b   aa   TRUE     5
>> >> #> b.5  b   ab  FALSE     5
>> >>
>> >> # or using tidyverse methods
>> >>
>> >> library(dplyr)
>> >> #>
>> >> #> Attaching package: 'dplyr'
>> >> #> The following objects are masked from 'package:stats':
>> >> #>
>> >> #>     filter, lag
>> >> #> The following objects are masked from 'package:base':
>> >> #>
>> >> #>     intersect, setdiff, setequal, union
>> >> df3 <- (   df1
>> >>         %>% group_by( ID )
>> >>         %>% do({
>> >>                mutate( .
>> >>                      , Value = ifelse( is.na( Value )
>> >>                                      , Value[ !is.na( Value ) ][ 1 ]
>> >>                                      , Value
>> >>                                      )
>> >>                      )
>> >>             })
>> >>         %>% ungroup
>> >>         )
>> >> df3
>> >> #> # A tibble: 5 x 4
>> >> #>   ID    ID_2  Firist Value
>> >> #>   <chr> <chr> <lgl>  <int>
>> >> #> 1 a     aa    T          2
>> >> #> 2 a     ab    F          2
>> >> #> 3 a     ac    F          2
>> >> #> 4 b     aa    T          5
>> >> #> 5 b     ab    F          5
>> >> #----
>> >>
>> >> On Sun, 7 Jan 2018, Jeff Newmiller wrote:
>> >>
>> >> > Why do you want to modify df1?
>> >> >
>> >> > Why not just reassemble the parts as a new data frame and use that
>> >> > going forward in your calculations? That is generally the preferred
>> >> > approach in R so you can re-do your calculations easily if you find a
>> >> > mistake later.
>> >> > --
>> >> > Sent from my phone. Please excuse my brevity.
>> >> >
>> >> > On January 7, 2018 7:35:59 PM PST, Ek Esawi <esawiek at gmail.com> wrote:
>> >> >> I just came up with a solution right after I posted the question, but
>> >> >> I figured there must be a better and shorter one than my solution:
>> >> >> sdf1[[1]][1,4]<-lapplyresults[[1]]
>> >> >> sdf1[[2]][1,4]<-lapplyresults[[2]]
>> >> >>
>> >> >> EK
>> >> >>
>> >> >> On Sun, Jan 7, 2018 at 10:13 PM, Ek Esawi <esawiek at gmail.com> wrote:
>> >> >>> Hi all--
>> >> >>>
>> >> >>> I stumbled on this problem online. I did not like the solution given
>> >> >>> there, which was a long UDF. I wondered why split and l/sapply could
>> >> >>> not work here. My aim is to split the data frame, use l/sapply to make
>> >> >>> changes on the split lists, and combine the split lists into a new
>> >> >>> data frame with the desired changes/output.
>> >> >>>
>> >> >>> The data frame shown below has a column named ID which has two
>> >> >>> values, a and b; I want to replace the NAs in the Value column by 2,
>> >> >>> which is the only numeric entry, for ID=a and by 5 for ID=b.
>> >> >>>
>> >> >>> I worked out the solution but could not put the results back into
>> >> >>> the split lists.
>> >> >>>
>> >> >>> Original data frame, df1
>> >> >>>   ID ID_2 Firist Value
>> >> >>> 1  a   aa   TRUE     2
>> >> >>> 2  a   ab  FALSE    NA
>> >> >>> 3  a   ac  FALSE    NA
>> >> >>> 4  b   aa   TRUE     5
>> >> >>> 5  b   ab  FALSE    NA
>> >> >>> sdf1
>> >> >>> $a
>> >> >>>   ID ID_2 Firist Value
>> >> >>> 1  a   aa   TRUE     2
>> >> >>> 2  a   ab  FALSE    NA
>> >> >>> 3  a   ac  FALSE    NA
>> >> >>> $b
>> >> >>>   ID ID_2 Firist Value
>> >> >>> 4  b   aa   TRUE     5
>> >> >>> 5  b   ab  FALSE    NA
>> >> >>> Desired results
>> >> >>> $a
>> >> >>>   ID ID_2 Firist Value
>> >> >>> 1  a   aa   TRUE     2
>> >> >>> 2  a   ab  FALSE     2
>> >> >>> 3  a   ac  FALSE     2
>> >> >>>
>> >> >>> $b
>> >> >>>   ID ID_2 Firist Value
>> >> >>> 4  b   aa   TRUE     5
>> >> >>> 5  b   ab  FALSE     5
>> >> >>>
>> >> >>> My code
>> >> >>>
>> >> >>> sdf <- split(df1, df1$ID)
>> >> >>> lapply(sdf, function(z) ifelse(is.na(z$Value), z$Value[!is.na(z$Value)], z$Value))
>> >> >>> result:
>> >> >>> $ a: num [1:3] 2 2 2
>> >> >>> $ b: num [1:2] 5 5
>> >> >>>
>> >> >>> How could I put these two lists back in the split data frame, sdf1?
>> >> >>> Then I could use do.call to reassemble a data frame from the split
>> >> >>> lists.
>> >> >>>
>> >> >>> Thanks,
>> >> >>> EK
>> >> >>
>> >> >> ______________________________________________
>> >> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >> PLEASE do read the posting guide
>> >> >> http://www.R-project.org/posting-guide.html
>> >> >> and provide commented, minimal, self-contained, reproducible code.
>> >> >
>> >>
>> >>
>> >> ---------------------------------------------------------------------------
>> >> Jeff Newmiller                        The     .....       .....  Go Live...
>> >> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>> >>                                        Live:   OO#.. Dead: OO#..  Playing
>> >> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> >> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> >>
>> >
>>