[R] a difficult situation, how to do this using base function.

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Sat Jul 22 02:38:12 CEST 2017


This is a plain text email list. Please learn how to explain this to your 
email client because what YOU saw before you sent it is not what WE saw 
after it bounced through the mailing list, and that can lead to 
misunderstandings.

If at all possible, you should keep the range.coordinates in a separate 
table keyed to the rows of your main table, or replace the 
range.coordinates column with a list of tables. The code below parses your 
string format into tables on the fly, but that is inefficient and fragile.
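
As a rough sketch of the separate-table idea, the strings could be parsed 
once into a long table with one row per range. The helper name parseRanges 
and the column names row, start and end are made up for illustration, and 
this assumes every range is written as "start-end" with non-negative 
numbers.

```r
# Hypothetical helper (not part of the code in this post): parse
# "start-end,start-end" strings once into a long table of ranges.
parseRanges <- function( rc ) {
  # one character vector of "start-end" pieces per input string
  pieces <- strsplit( rc, ",", fixed = TRUE )
  # split every piece into its two bounds
  bounds <- strsplit( unlist( pieces ), "-", fixed = TRUE )
  data.frame( row   = rep( seq_along( pieces ), lengths( pieces ) )
            , start = as.numeric( vapply( bounds, `[`, "", 1 ) )
            , end   = as.numeric( vapply( bounds, `[`, "", 2 ) )
            )
}

ranges <- parseRanges( c( "1000-1050", "5000-5050,6000-6180" ) )
print( ranges )
```

The row column ties each range back to the row of the main table it came 
from, so the strings never need to be parsed again.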

######
DF <- data.frame( match.start=c( 5, 10, 100, 200 )
                 , range.coordinates = c( "1000-1050"
                                        , "1500-1555"
                                        , "5000-5050,6000-6180"
                                        , "100-150,200-260,600-900"
                                        )
                 , stringsAsFactors = FALSE
                 )
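lookupFunction <- function( ms, rc ) {
   # parse "a-b,c-d" into a numeric data frame: V1 = range start, V2 = range end
   rcdf <- as.data.frame( lapply( as.data.frame( t( as.data.frame(
     strsplit( strsplit( rc, ",", fixed=TRUE )[[ 1 ]], "-" ) ) ),
     stringsAsFactors=FALSE ), as.numeric ) )
   # V3 = accumulated length up to the end of each range
   rcdf$V3 <- with( rcdf, cumsum( V2-V1 ) )
   # rcq = accumulated length before each range begins
   rcq <- c( 0, rcdf$V3 )
   # idx = index of the range that ms falls into
   idx <- findInterval( ms, rcdf$V3 ) + 1
   # start of that range plus the offset of ms within it
   rcdf$V1[ idx ] + ms - rcq[ idx ]
}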

DF$match.start.updated <-
    unlist( lapply( seq.int( nrow( DF ) )
                  , function( i ) {
                       lookupFunction( DF$match.start[ i ]
                                     , DF$range.coordinates[ i ]
                                     )
                    }
                  )
         )
######
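
If the ranges were already stored as numbers, for example as a list of 
small tables, the lookup would not need any string parsing. The following 
is only a sketch under that assumption; lookupInRanges, rangeTables and 
matchStart are illustrative names, not part of the code above.

```r
# Sketch: same arithmetic as the lookup above, but on numeric
# start/end vectors that have already been parsed.
lookupInRanges <- function( ms, start, end ) {
  cumlen <- cumsum( end - start )         # accumulated range lengths
  before <- c( 0, cumlen )                # accumulated length before each range
  idx <- findInterval( ms, cumlen ) + 1   # which range ms falls into
  start[ idx ] + ms - before[ idx ]
}

# Assumed layout: one small table of ranges per row of the main table
rangeTables <- list( data.frame( start = 1000, end = 1050 )
                   , data.frame( start = c( 5000, 6000 )
                               , end   = c( 5050, 6180 ) )
                   )
matchStart <- c( 5, 100 )

updated <- mapply( function( ms, tbl ) lookupInRanges( ms, tbl$start, tbl$end )
                 , matchStart
                 , rangeTables
                 )
print( updated )   # 1005 6050
```

A match.start beyond the total accumulated length yields NA, because idx 
then points past the last range.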

On Fri, 21 Jul 2017, Stephen HonKit Wong wrote:

> Hello,
>
> I have the following data frame with many rows.
> data.frame(match.start=c(5,10,100,200),range.coordinates=c("1000-1050","1500-1555","5000-5050,6000-6180","100-150,200-260,600-900"))
>
> match.start   range.coordinates
>           5   1000-1050
>          10   1500-1555
>         100   5000-5050,6000-6180
>         200   100-150,200-260,600-900
>
> For each row, I want to compare the value in column "match.start" (e.g. 100
> on the 3rd row) with the accumulated range lengths (e.g. for
> 5000-5050,6000-6180 the accumulated lengths are 50 and 230). Here 100 is
> past the first range (50) but within the second (230), so the updated match
> start is 6000 + (100 - 50) = 6050. The result goes in a third column.
>
> match.start   range.coordinates         match.start.updated
>           5   1000-1050                                1005
>          10   1500-1555                                1510
>         100   5000-5050,6000-6180                      6050
>         200   100-150,200-260,600-900                   690
>
> Many thanks.
>
> 	[[alternative HTML version deleted]]

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


