jim holtman
jholtman at gmail.com
Fri Jun 13 18:45:23 CEST 2008
What is the structure of 'd.frame' and 'seqFile'? Run Rprof so that
we can see which of the functions it is spending its time in. What
happens if x$index is not in seqFile$index? Are the values in the
'index' unique in both structures? Subsetting a data frame can be
expensive when compared to using a matrix. Could you use a matrix
instead of a data frame? Are all the columns the same mode? Again,
either a subset of the data or the output of 'str' on the data
objects being used would help us understand what they are.
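
To make the suggestions above concrete, here is a rough sketch of what
I have in mind. The data are made up: 'frames', 'seqMat' and 'mergemat'
are hypothetical names, and the column layout is only a guess at what
the real objects look like. The matrix variant assumes every column is
numeric, which may not hold for the actual data.

```r
# Hypothetical stand-ins for the poster's objects; the real structures are unknown.
set.seed(1)
seqFile <- data.frame(index = 1:40919, a = rnorm(40919), b = rnorm(40919))
frames <- lapply(1:20, function(i)
    data.frame(index = sample(50000, 500), value = rnorm(500)))

# Approach from the thread: row-subset a data frame inside every call.
mergefunc <- function(x, seqFile) {
    cbind(x, seqFile[match(x$index, seqFile$index), ])
}

# Matrix variant: convert once, outside the loop (assumes all columns are numeric).
seqMat <- as.matrix(seqFile)
mergemat <- function(x, seqMat) {
    rows <- match(x$index, seqMat[, "index"])  # NA where x$index has no match
    cbind(as.matrix(x), seqMat[rows, , drop = FALSE])
}

# Profile the data frame version to see which functions take the time.
prof <- tempfile()
Rprof(prof)
LIX <- lapply(frames, mergefunc, seqFile = seqFile)
Rprof(NULL)
# summaryRprof() may stop with an error if the run was too short to collect samples.
res <- try(summaryRprof(prof), silent = TRUE)
if (!inherits(res, "try-error")) print(head(res$by.self))

# Same join using the matrix; unmatched indices give rows of NA.
LIXm <- lapply(frames, mergemat, seqMat = seqMat)

# Compare timings of the two versions on the made-up data.
print(system.time(lapply(frames, mergefunc, seqFile = seqFile)))
print(system.time(lapply(frames, mergemat, seqMat = seqMat)))
```

Note that rows of x whose index is missing from seqFile come back
filled with NA in both versions, so check whether that is what you
want before comparing timings on the real data.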
On Fri, Jun 13, 2008 at 12:03 PM, Lana Schaffer wrote:
> Jim,
> My code is this:
> mergefunc <- function(x, seqFile) {
>   # merge(seqFile, x)
>   cbind(x, seqFile[match(as.vector(x$index), as.vector(seqFile$index)), ])
> }
> LIX <- lapply(d.frame[[1]], mergefunc,seqFile=seqFile)
> Each matrix/data.frame takes 0.2 seconds and then to do this
> 1240 times takes ~4 minutes.
> Thanks,
> Lana
>
From: jim holtman
Sent: Thursday, June 12, 2008 6:40 PM
> To: Lana Schaffer
> Cc: r-help at r-project.org
>
> It would be nice if you at least included the code that you are using
> and a subset of the data. Have you run Rprof to determine which of the
> functions is consuming the time?
>
On Thu, Jun 12, 2008 at 3:25 PM, Lana Schaffer wrote:
>> Greetings,
>> I am doing matching/merge for a table (40919x3) to data which is in
>> the form of a list of 1268 data.frames. Using lapply this is taking
>> ~5 minutes. I know that the match/merge functions are time consuming,
>> so is there an alternative way to accomplish this goal? Is lapply
>> not efficient?
>>
>> Lana Schaffer
>> Biostatistics/Informatics
>> The Scripps Research Institute
>> DNA Array Core Facility
>> La Jolla, CA 92037
>> (858) 784-2263
>> (858) 784-2994
>> schaffer at scripps.edu
