[R] Replace NAs in split lists

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Mon Jan 8 15:19:40 CET 2018


Yes, you are right if the rows for each ID are always adjacent and the first non-NA value appears in the first record for each ID.
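
To make that caveat concrete, here is a minimal sketch in base R. The toy data and the helper `locf` are made up for illustration (the helper only mimics carrying the last observation forward, in the spirit of the na.locf suggestion quoted below, so that the example runs without zoo). When the first record for an ID is NA, an ungrouped fill borrows the value from the previous ID, while a per-ID fill does not.

```r
# Toy data (hypothetical): the first record for ID "b" is NA.
df <- data.frame(ID    = c("a", "a", "a", "b", "b"),
                 Value = c(2, NA, NA, NA, 5),
                 stringsAsFactors = FALSE)

# Hypothetical helper: base-R stand-in for last-observation-carried-forward.
locf <- function(x) {
  idx <- cumsum(!is.na(x))          # 0 while only leading NAs have been seen
  c(NA, x[!is.na(x)])[idx + 1]      # leading NAs stay NA
}

# Ungrouped fill: row 4 (ID "b") inherits the 2 from ID "a".
ungrouped <- locf(df$Value)

# Grouped fill: each NA takes the first non-NA value within its own ID.
grouped <- ave(df$Value, df$ID,
               FUN = function(v) ifelse(is.na(v), v[!is.na(v)][1], v))

print(cbind(df, ungrouped, grouped))
```

With the original example data the two approaches agree, because every ID starts with its non-NA value; they only diverge in cases like the one above.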
-- 
Sent from my phone. Please excuse my brevity.

On January 8, 2018 2:29:40 AM PST, PIKAL Petr <petr.pikal at precheza.cz> wrote:
>Hi
>
>With the example, na.locf seems to be the easiest way.
>> library(zoo)
>
>> na.locf(df1)
>  ID ID_2 Firist Value
>1  a   aa   TRUE     2
>2  a   ab  FALSE     2
>3  a   ac  FALSE     2
>4  b   aa   TRUE     5
>5  b   ab  FALSE     5
>
>Cheers
>Petr
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jeff
>> Newmiller
>> Sent: Monday, January 8, 2018 9:13 AM
>> To: r-help at r-project.org; Ek Esawi <esawiek at gmail.com>
>> Subject: Re: [R] Replace NAs in split lists
>>
>> Upon closer examination I see that you are not using the split version of
>> df1 as I usually would, so here is a reproducible example:
>>
>> #----
>> df1 <- read.table( text=
>> "ID ID_2 Firist Value
>> 1  a   aa   TRUE     2
>> 2  a   ab  FALSE    NA
>> 3  a   ac  FALSE    NA
>> 4  b   aa   TRUE     5
>> 5  b   ab  FALSE    NA
>> ", header=TRUE, as.is=TRUE )
>>
>> sdf <- split( df1, df1$ID )
>> # note the extra [ 1 ] in case you have more than one non-NA value per ID
>> sdf2 <- lapply( sdf
>>                , function( z ) {
>>                   z$Value <- ifelse( is.na( z$Value )
>>                                    , z$Value[ !is.na( z$Value ) ][ 1 ]
>>                                    , z$Value
>>                                    )
>>                   z
>>                  }
>>                )
>> df2 <- do.call( rbind, sdf2 )
>> df2
>> #>     ID ID_2 Firist Value
>> #> a.1  a   aa   TRUE     2
>> #> a.2  a   ab  FALSE     2
>> #> a.3  a   ac  FALSE     2
>> #> b.4  b   aa   TRUE     5
>> #> b.5  b   ab  FALSE     5
>>
>> # or using tidyverse methods
>>
>> library(dplyr)
>> #>
>> #> Attaching package: 'dplyr'
>> #> The following objects are masked from 'package:stats':
>> #>
>> #>     filter, lag
>> #> The following objects are masked from 'package:base':
>> #>
>> #>     intersect, setdiff, setequal, union
>> df3 <- (   df1
>>         %>% group_by( ID )
>>         %>% do({
>>                mutate( .
>>                      , Value = ifelse( is.na( Value )
>>                                      , Value[ !is.na( Value ) ][ 1 ]
>>                                      , Value
>>                                      )
>>                      )
>>             })
>>         %>% ungroup
>>         )
>> df3
>> #> # A tibble: 5 x 4
>> #>   ID    ID_2  Firist Value
>> #>   <chr> <chr> <lgl>  <int>
>> #> 1 a     aa    T          2
>> #> 2 a     ab    F          2
>> #> 3 a     ac    F          2
>> #> 4 b     aa    T          5
>> #> 5 b     ab    F          5
>> #----
>>
>> On Sun, 7 Jan 2018, Jeff Newmiller wrote:
>>
>> > Why do you want to modify df1?
>> >
>> > Why not just reassemble the parts as a new data frame and use that
>> > going forward in your calculations? That is generally the preferred
>> > approach in R so you can re-do your calculations easily if you find a
>> > mistake later.
>> > --
>> > Sent from my phone. Please excuse my brevity.
>> >
>> > On January 7, 2018 7:35:59 PM PST, Ek Esawi <esawiek at gmail.com> wrote:
>> >> I just came up with a solution right after I posted the question, but
>> >> I figured there must be a better and shorter one than my solution:
>> >> sdf1[[1]][1,4]<-lapplyresults[[1]]
>> >> sdf1[[2]][1,4]<-lapplyresults[[2]]
>> >>
>> >> EK
>> >>
>> >> On Sun, Jan 7, 2018 at 10:13 PM, Ek Esawi <esawiek at gmail.com> wrote:
>> >>> Hi all--
>> >>>
>> >>> I stumbled on this problem online. I did not like the solution given
>> >>> there, which was a long user-defined function. I wondered why split and
>> >>> lapply/sapply could not work here. My aim is to split the data frame, use
>> >>> lapply/sapply to make changes on the split lists, and combine the split
>> >>> lists into a new data frame with the desired changes/output.
>> >>>
>> >>> The data frame shown below has a column named ID which has 2 values,
>> >>> a and b; I want to replace the NAs in the Value column by 2, which
>> >>> is the only numeric entry, for ID=a and by 5 for ID=b.
>> >>>
>> >>> I worked out the solution but could not put the results back into the
>> >>> split lists.
>> >>>
>> >>> Original dataframe , df1
>> >>>   ID ID_2 Firist Value
>> >>> 1  a   aa   TRUE     2
>> >>> 2  a   ab  FALSE    NA
>> >>> 3  a   ac  FALSE    NA
>> >>> 4  b   aa   TRUE     5
>> >>> 5  b   ab  FALSE    NA
>> >>> Sdf1
>> >>> $a
>> >>> ID ID_2 Firist Value
>> >>> 1  a   aa   TRUE     2
>> >>> 2  a   ab  FALSE    NA
>> >>> 3  a   ac  FALSE    NA
>> >>> $b
>> >>>   ID ID_2 Firist Value
>> >>> 4  b   aa   TRUE     5
>> >>> 5  b   ab  FALSE    NA
>> >>> Desired results
>> >>> $a
>> >>>   ID ID_2 Firist Value
>> >>> 1  a   aa   TRUE     2
>> >>> 2  a   ab  FALSE     2
>> >>> 3  a   ac  FALSE     2
>> >>>
>> >>> $b
>> >>>   ID ID_2 Firist Value
>> >>> 4  b   aa   TRUE     5
>> >>> 5  b   ab  FALSE     5
>> >>>
>> >>> My code
>> >>>
>> >>> sdf <- split(df1,df1$ID)
>> >>> lapply(sdf, function(z) ifelse(is.na(z$Value),z$Value[!is.na(z$Value)],z$Value))
>> >>> result:
>> >>> $ a: num [1:3] 2 2 2
>> >>> $ b: num [1:2] 5 5
>> >>>
>> >>> How could I put these two lists back in the split data frame, sdf1?
>> >>> Then I could use do.call to reassemble a data frame from the split
>> >>> lists.
>> >>>
>> >>> Thanks,
>> >>> EK
>> >>
>> >> ______________________________________________
>> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >
>> >
>>
>>
>> ---------------------------------------------------------------------------
>> Jeff Newmiller
>> DCN:<jdnewmil at dcn.davis.ca.us>
>> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
>>
>


