[R] How to build a large identity matrix faster?

R. Michael Weylandt michael.weylandt at gmail.com
Sat Jun 9 00:32:29 CEST 2012


On Fri, Jun 8, 2012 at 5:31 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> For the matter and hand, you can probably get to it by simply trying
> base:::diag. In this case, it's not too hard because what you're
> seeing is the S4 generic that the Matrix package is defining _over_
> the regular base function generic.

Sorry -- the regular base function is not a generic in this case.
Everything else still holds though. (Same tricks to find things work
with print which becomes both an S3 and S4 generic on loading Matrix)

>
> More generally, going down the rabbit hole of S4:
>
> As it suggests, first try
>
> showMethods("diag")
>
> and you'll see a long list of types. The parallel to the *.default
> method is the one with signature "ANY" so you can try that:
>
> getMethod("diag", "ANY")
>
> which gets you where you need to be.
>
> Hope this helps,
> Michael
>
> On Fri, Jun 8, 2012 at 5:11 PM, Spencer Graves
> <spencer.graves at structuremonitoring.com> wrote:
>>      How can one get the source code for diag?  I tried the following:
>>
>>
>>> diag
>> standardGeneric for "diag" defined from package "base"
>>
>> function (x = 1, nrow, ncol)
>> standardGeneric("diag")
>> <environment: 0x0000000009dc1ab0>
>> Methods may be defined for arguments: x, nrow, ncol
>> Use  showMethods("diag")  for currently available ones.
>>
>>
>>      How can I look at the code past the methods dispatch?
>>
>>
>>> methods('diag')
>> [1] diag.panel.splom
>> Warning message:
>> In methods("diag") : function 'diag' appears not to be generic
>>
>>
>>      So "diag" is an S4 generic.  I tried the following:
>>
>>
>>> dumpMethods('diag', file='diag.R')
>>> readLines('diag.R')
>> character(0)
>>
>>
>>      More generally, what do you recommend I read to learn about S4
>> generics?  I've read fair portions of Chambers (1998, 2008), which produced
>> more frustration than enlightenment for me.
>>
>>
>>      Thanks,
>>      Spencer
>>
>>
>> On 6/8/2012 12:07 PM, Uwe Ligges wrote:
>>>
>>> I quickly looked at it, and the difference comes from:
>>>
>>> n <- 5e3
>>> system.time(x <- array(0, c(n, n))) # from diag()
>>> system.time(x <- matrix(0, n, n))   # from Rdiag()
>>>
>>> Replaced in R-devel.
>>>
>>> Best,
>>> Uwe Ligges
>>>
>>>
>>>
>>> On 07.06.2012 12:11, Spencer Graves wrote:
>>>>
>>>> On 6/7/2012 2:27 AM, Rui Barradas wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> To my great surprise, on my system, Windows 7, R 15.0, 32 bits, an R
>>>>> version is faster!
>>>>
>>>>
>>>> I was also surprised, Windows 7, R 2.15.0, 64-bit
>>>>
>>>>
>>>> > rbind(diag=t1, Rdiag=t2, ratio=t1/t2)
>>>> user.self sys.self elapsed user.child sys.child
>>>> diag 0.72 0.080000 0.81 NA NA
>>>> Rdiag 0.09 0.030000 0.12 NA NA
>>>> ratio 8.00 2.666667 6.75 NA NA
>>>> >
>>>> > sessionInfo()
>>>> R version 2.15.0 (2012-03-30)
>>>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>>>>
>>>> locale:
>>>> [1] LC_COLLATE=English_United States.1252
>>>> [2] LC_CTYPE=English_United States.1252
>>>> [3] LC_MONETARY=English_United States.1252
>>>> [4] LC_NUMERIC=C
>>>> [5] LC_TIME=English_United States.1252
>>>>
>>>> attached base packages:
>>>> [1] splines stats graphics grDevices utils datasets methods
>>>> [8] base
>>>>
>>>> other attached packages:
>>>> [1] fda_2.2.9 Matrix_1.0-6 lattice_0.20-6 zoo_1.7-7
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] grid_2.15.0 tools_2.15.0
>>>> >
>>>>
>>>>
>>>> Spencer
>>>>>
>>>>>
>>>>>
>>>>> Rdiag <- function(n){
>>>>> m <- matrix(0, nrow=n, ncol=n)
>>>>> m[matrix(rep(seq_len(n), 2), ncol=2)] <- 1
>>>>> m
>>>>> }
>>>>>
>>>>> Rdiag(4)
>>>>>
>>>>> n <- 5e3
>>>>> t1 <- system.time(d1 <- diag(n))
>>>>> t2 <- system.time(d2 <- Rdiag(n))
>>>>> all.equal(d1, d2)
>>>>> rbind(diag=t1, Rdiag=t2, ratio=t1/t2)
>>>>>
>>>>>
>>>>> Anyway, why don't you create it once, save a copy and use it many times?
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Rui Barradas
>>>>>
>>>>> Em 07-06-2012 08:55, Ceci Tam escreveu:
>>>>>>
>>>>>> Hello, I am trying to build a large size identity matrix using
>>>>>> diag(). The
>>>>>> size is around 23000 and I've tried diag(23000), that took a long time.
>>>>>> Since I have to use this operation several times in my program, the
>>>>>> running
>>>>>> time is too long to be tolerable. Are there any alternative for
>>>>>> diag(N)?
>>>>>> Thanks
>>>>>>
>>>>>> Cheers,
>>>>>> yct
>>>>>>
>>>>>> [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list