[R] Populating then sorting a matrix and/or data.frame

Peter Langfelder peter.langfelder at gmail.com
Fri Nov 12 01:01:11 CET 2010


I see 4 ways to write the code:

1. make the frame very long at the start and use my code - this is
practical if you know that your data frame will not be longer than a
certain number of rows, be it a million;

2a. use something like

result1 = data.frame(a=a, b=b, c=c, d=d)

within the loop to create a 1x4 data frame that you can rbind to
results within the loop;

2b. make the code a bit more intelligent, for example by allocating
blocks of say n=1000 at a time as needed and rbind-ing them to result;

3. fill up results with characters using your rbind(results,
c(a,b,c,d)), then use something like

results[, c(2:4)] = apply(apply(results[, c(2:4)], 2, as.character), 2,
as.numeric)

to convert the characters in columns 2:4 to numbers (this construct
also works with factors)
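
For illustration, here is a minimal sketch of option 3 on made-up data.
The loop, the column contents and the name "converted" are assumptions
for the example, not your actual code. It uses lapply over the columns
rather than the nested apply; this does the same conversion and also
behaves when results has only one row.

```r
# Grow a character matrix row by row; c() coerces everything to character.
results = NULL
for (i in 1:3) {
  results = rbind(results, c(paste0("id", i), i, i^2, i / 2))
}
results = as.data.frame(results, stringsAsFactors = FALSE)

# Convert columns 2:4 to numeric. Going through as.character first means
# the same line also works if the columns are factors.
converted = results
converted[2:4] = lapply(converted[2:4],
                        function(x) as.numeric(as.character(x)))
str(converted)
```

The first column stays character; only columns 2:4 become numeric.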

The difference between 2a and 2b is that 2b may be faster if the final
number of rows is large, because 2a grows 4 objects by 1 row on every
pass through the loop, which is quite slow. The same holds for
solution 3. In that sense solution 1 may be less wasteful than
solutions 2a or 3 although it may not look like that.
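
A rough sketch of how 1 and 2b can be combined: allocate a block, fill
it in, add another block only when the current one is full, and drop
the unused rows at the end. The block size, the helper emptyBlock, the
counter nUsed and the values filled in are all made up for the example.

```r
block = 1000   # hypothetical block size

# Helper (made up for this sketch): an empty block of n rows.
emptyBlock = function(n) {
  data.frame(a = character(n), b = numeric(n), c = numeric(n),
             d = numeric(n), stringsAsFactors = FALSE)
}

result = emptyBlock(block)
nUsed = 0
for (i in 1:2500) {          # stand-in for the real loop
  if (nUsed == nrow(result)) {
    # Out of room: grow by a whole block, not by a single row.
    result = rbind(result, emptyBlock(block))
  }
  nUsed = nUsed + 1
  result$a[nUsed] = paste0("id", i)
  result$b[nUsed] = i
  result$c[nUsed] = i^2
  result$d[nUsed] = i / 2
}

# Drop the rows that were allocated but never filled.
result = result[seq_len(nUsed), ]
```

Here rbind is called only twice for 2500 rows, and the numeric columns
stay numeric throughout because nothing passes through c().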

Peter



On Thu, Nov 11, 2010 at 3:38 PM, Noah Silverman <noah at smartmediacorp.com> wrote:
> That makes perfect sense.  All of my numbers are being coerced into
> strings by the c() function.  Subsequently, my data.frame contains all
> strings.
>
> I can't know the length of the data.frame ahead of time, so can't
> predefine it like your example.
> One thought would be to make it arbitrarily long filled with 0 and
> delete off the unused rows.  But this seems rather wasteful.
>
> -N


