[R] Performance issue with attributes

luke-tierney at uiowa.edu
Tue Mar 11 15:31:22 CET 2014


You can also upgrade to R-devel or to R 3.1.0 due out in a month or so
-- those will run this code much more efficiently.
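
For reference, here is a minimal base-R sketch of setting the same kind of
attributes one column at a time through Dataset1[[j]] rather than
Dataset1[, colIndex]. The small data frame and the `extra` list are
stand-ins invented for illustration; only the attribute names come from the
original post. It has not been timed against the original code, and how much
copying it triggers depends on the R version.

```r
# Small stand-in for the large data frame from the original post
Dataset1 <- data.frame(a = 1:5, b = letters[1:5], stringsAsFactors = FALSE)

# Extra attributes to attach to every column (names taken from the post;
# the NULL-valued ones are left out because a NULL attribute is not stored)
extra <- list(usermissing = FALSE, split = FALSE, levelLabels = "")

for (j in seq_along(Dataset1)) {
  for (nm in names(extra)) {
    # attr<- adds one attribute and keeps the ones the column already has
    attr(Dataset1[[j]], nm) <- extra[[nm]]
  }
}

attributes(Dataset1$a)   # inspect the attributes now attached to column a
```

Run at top level, this modifies Dataset1 itself; inside a function it would
modify a local copy unless the data frame is returned.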

Using setattr is OK if you really know what you are doing, but if you
are not careful it can modify objects you do not intend to modify.
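
A small sketch of the kind of surprise meant here, assuming the data.table
package is installed. The vectors x, y and z are made up for illustration.
setattr() changes its argument in place, so any other name bound to the same
object sees the change too.

```r
library(data.table)  # setattr() comes from the data.table package

x <- c(1, 2, 3)
y <- x                     # y and x refer to the same object for now

attr(y, "split") <- TRUE   # base R: y is copied first, x is left alone
is.null(attr(x, "split"))  # x has no "split" attribute at this point

z <- x                     # z refers to the same object as x
setattr(z, "split", TRUE)  # modifies that object in place, no copy is made
attr(x, "split")           # x now carries the attribute as well
```

With the base R assignment the change stays local to y; with setattr() it
leaks into x even though x was never mentioned in the call.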

Best,

luke

On Tue, 11 Mar 2014, Smart Guy wrote:

> Apologies for the late reply. I was out on vacation.
> I tried setattr() from the data.table package and it worked like magic.
>
> Thanks a lot for the help. setattr() really is faster than "attributes".
>
> Regards,
> SG
>
>
> On 22 February 2014 12:29, Philippe Grosjean <phgrosjean at sciviews.org> wrote:
>
>> You can use setattr() in the data.table package. It can also be used on
>> data.frames or other objects.
>> Best,
>>
>> Philippe Grosjean
>>
>>
>> On 22 Feb 2014, at 03:13, Smart Guy <smartguy3k at gmail.com> wrote:
>>
>>> Hi All
>>>
>>> I am having a problem running the 'attributes' command to set an attribute
>>> on each column of a large dataset. The dataset has 80 columns and 312407
>>> rows. It's taking more than 60 seconds to set simple attributes like
>>> split=TRUE, usermissing=FALSE.
>>>
>>> Here is the source code, assuming Dataset1 is the large dataset:
>>>
>>> myfunction <- function()
>>> {
>>>   cat("Before for loop:")
>>>   print(Sys.time())
>>>   for (colIndex in 1:80)
>>>   {
>>>     cat("Before Attr", colIndex)
>>>     print(Sys.time())
>>>
>>>     # append the extra attributes to those already on this column
>>>     attributes(Dataset1[, colIndex]) <- c(attributes(Dataset1[, colIndex]),
>>>       list(coldesc = c(), usermissing = c(FALSE), missingvalues = NULL,
>>>            split = c(FALSE), levelLabels = c("")))
>>>
>>>     cat("After Attr:")
>>>     print(Sys.time())
>>>   }
>>>   cat("After for loop:")
>>>   print(Sys.time())
>>> }
>>>
>>> It's my feeling that R is passing all 312407 rows to set 'attributes' on a
>>> column.
>>>
>>> Is there a more efficient way to do this?
>>>
>>>
>>> Thanks,
>>> SG

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu



