[R] dataframe operation

Gabor Grothendieck ggrothendieck at gmail.com
Thu Jan 25 15:32:00 CET 2007


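From an offline conversation with Indermaur, it seems that the elements
of b are supposed to correspond to the rows rather than the columns.
In that case there is a simpler solution, which keeps each NA in place
and puts b[i] into every non-NA cell of row i (all columns must be numeric):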

0 * DF + b
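
As a minimal sketch of how that one-liner behaves, here it is applied to the
sample data from the original question. The construction of DF and the name
res are assumptions for illustration (the poster called the data frame "a").

```r
# Sample data from the original question, rebuilt by hand.
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))
b <- c(100, 200, 300)

# 0 * DF is 0 wherever a value is present and NA wherever it is missing.
# Adding b recycles it down each column, so b[i] ends up in row i.
res <- 0 * DF + b
print(res)
#   column1 column2 column3
# 1     100     100     100
# 2      NA     200     200
# 3     300      NA     300
```

This relies on length(b) being equal to nrow(DF) and on every non-missing
value being finite, since 0 * Inf gives NaN rather than 0.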

On 1/24/07, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> Here is a slight variation on Marc's idea:
>
> isna <- is.na(DF)
> DF[] <- replace(100 * col(isna), isna, NA)
>
> On 1/24/07, Marc Schwartz <marc_schwartz at comcast.net> wrote:
> > On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote:
> > > On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote:
> > > > On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote:
> > > > > hi
> > > > > i have a dataframe "a" which looks like:
> > > > >
> > > > > column1, column2, column3
> > > > > 10,12, 0
> > > > > NA, 0,1
> > > > > 12,NA,50
> > > > >
> > > > > i want to replace all values in column1 to column3 which do not contain "NA" with values of vector "b" (100,200,300).
> > > > >
> > > > > any idea how i can do it?
> > > > >
> > > > > i appreciate any hint
> > > > > regards
> > > > > lukas
> > > > >
> > > >
> > > > Here is one possibility:
> > > >
> > > > > sapply(seq(along = colnames(DF)),
> > > >          function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
> > > >      [,1] [,2] [,3]
> > > > [1,]   10   12    0
> > > > [2,]  100    0    1
> > > > [3,]   12  200   50
> > > >
> > > >
> > > > Note that the returned object will be a matrix, so if you need a data
> > > > frame, just coerce the result with as.data.frame().
> > >
> > > OK....that's what I get for pulling the trigger too fast.
> > >
> > > Just reverse the logic in the function:
> > >
> > > > sapply(seq(along = colnames(DF)),
> > >          function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
> > >      [,1] [,2] [,3]
> > > [1,]  100  200  300
> > > [2,]   NA  200  300
> > > [3,]  100   NA  300
> > >
> > >
> > > I misread the query initially.
> >
> > Here is another possibility, which may be faster depending upon the
> > actual size and dims of your initial data frame.
> >
> > Preallocate a matrix of replacement values:
> >
> > Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)),
> >              ncol = ncol(DF))
> >
> > > Mat
> >      [,1] [,2] [,3]
> > [1,]  100  200  300
> > [2,]  100  200  300
> > [3,]  100  200  300
> >
> >
> > Now do the replacement:
> >
> > > ifelse(!is.na(DF), Mat, NA)
> >   column1 column2 column3
> > 1     100     200     300
> > 2      NA     200     300
> > 3     100      NA     300
> >
> >
> > In doing some testing, the above may be about 10 times faster than using
> > sapply() in my first solution, again depending upon the structure of
> > your DF.
> >
> > HTH,
> >
> > Marc
> >
>
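
For anyone checking the column-wise variants quoted above, the following
sketch runs all three on the sample data and confirms they give the same
values. DF is rebuilt by hand from the question, and the names by_sapply,
by_matrix and by_col are illustrative only. seq_along() is used here in place
of seq(along = ...), which is an assumption about preference, not a change in
behavior.

```r
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# Corrected sapply() version: handles one column at a time.
by_sapply <- sapply(seq_along(DF),
                    function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))

# Preallocated-matrix version: build all replacement values up front.
Mat <- matrix(rep(seq_along(DF) * 100, each = nrow(DF)), ncol = ncol(DF))
by_matrix <- ifelse(!is.na(DF), Mat, NA)

# col() version, written into a copy so that DF itself is left unchanged.
isna <- is.na(DF)
by_col <- DF
by_col[] <- replace(100 * col(isna), isna, NA)

# Compare values only: the results differ in class and dimnames.
stopifnot(isTRUE(all.equal(unname(by_sapply), unname(by_matrix))))
stopifnot(isTRUE(all.equal(unname(as.matrix(by_col)), unname(by_matrix))))
```

Only the last variant returns a data frame; the other two return matrices,
so wrap them in as.data.frame() if a data frame is needed.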