[R] Matrix Size

Thu Jul 15 11:07:19 CEST 2010

paul s wrote:
> 
> On 07/14/2010 06:15 PM, Peter Dalgaard wrote:
>> A quick calculation reveals that a matrix of that size requires about
>> 2.7 TERAbytes of storage, so I'm a bit confused as to how you might
>> expect to fit it into 16GB of RAM...
>>
>> However, even with terabytes of memory, you would be running into the
>> (current) limitation that a single vector in R can have at most 2^31-1 =
>> ca. 2 trillion elements.
> 
> thank you for also confirming what Douglas had written.

(except that it should of course be billions, not trillions: 2^31-1 is
about 2.1 billion. I got the DK/UK difference of whether a billion is a
thousand or a million millions bass ackward there.)
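
For anyone who wants to check the figures, here is a minimal sketch of the
arithmetic in plain Python. The variable names are only illustrative, and it
assumes 8 bytes per element (double precision, R's default numeric type) and
decimal terabytes:

```python
# Largest number of elements in a single R vector under the limit
# discussed above: 2^31 - 1.
max_len = 2**31 - 1
print(f"{max_len:,}")

# Assumption: every element is stored as an 8-byte double.
bytes_per_double = 8

# Memory taken by one dense double vector of maximal length, in GiB.
max_vector_gib = max_len * bytes_per_double / 2**30
print(max_vector_gib)

# Assumption: "about 2.7 terabytes" means 2.7e12 bytes.
quoted_bytes = 2.7e12
implied_elements = quoted_bytes / bytes_per_double

# How many times the implied element count exceeds the vector limit.
ratio = implied_elements / max_len
print(implied_elements, ratio)
```

So the element count implied by 2.7 terabytes is two orders of magnitude
above the vector length limit, and a single maximal-length vector of doubles
would by itself take roughly 16 GiB.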

>> Yes, you could be doing it wrong, but what is "it"?
> 
> we are trying to create a hedonic index: http://tinyurl.com/2fnl3jf

That link is to a description of Linear Regression. I'm sure you didn't
intend to instruct an audience of statisticians about that....

Assuming that you intended

http://en.wikipedia.org/wiki/Hedonic_index

it would help to know the number of periods, the number of prices, and
the dimension and nature of the set of characteristics. (There's a typo,
right? c[it] inside the sum could just be taken outside, so it should
likely be something with "j".)

>> If the matrix is sparse, there are sparse matrix tools around...
> 
> interesting yet again! just read what this was and it seems like it 
> could be. also part of the matrix could be a diagonal matrix.
> 
> 1 0 0 0 0 0 0 0 0 0 0 1
> 1 0 0 0 0 0 1 0 0 0 0 0
> 1 0 0 0 0 0 0 0 1 0 0 0
> 0 1 0 0 0 0 0 1 0 0 0 0
> 0 1 0 0 0 0 0 0 0 1 0 0
> 0 1 0 0 0 0 1 0 0 0 0 0
> 0 0 1 0 0 0 0 0 0 0 0 1
> 0 0 1 0 0 0 0 0 0 1 0 0
> 0 0 1 0 0 0 0 0 1 0 0 0
> 0 0 1 0 0 0 0 0 0 0 1 0
> 0 0 1 0 0 0 0 0 0 0 0 1
> 0 0 0 1 0 0 0 0 0 0 1 0
> 0 0 0 1 0 0 0 0 1 0 0 0
> 0 0 0 1 0 0 0 1 0 0 0 0
> 0 0 0 1 0 0 0 0 0 0 1 0
> 
> if it is a sparse matrix how would i test? just a smaller subset of data 
> that i have run regression on producing similar coefficients?

Hmm, do you really have only 2 nonzero entries per row? If so, then a
sparse matrix representation would involve "only" 4 million entries, and
it's beginning to look like the matrices involved in mixed-effects
models with crossed random factors, so that techniques like minimal
fill-in Cholesky decomposition would apply.
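
To make the storage argument concrete, here is a rough sketch using the
15 x 12 example quoted above. It is plain Python, chosen only to keep the
example free of dependencies (in R, the Matrix package provides sparse
matrix classes for this purpose). The names rows, triplets and xtx are made
up for the illustration. The sketch only shows the triplet storage and how
the cross-product X'X can be accumulated from the nonzero positions; it does
not attempt the Cholesky step.

```python
from collections import Counter

# Nonzero column positions (1-based) of each row of the 15 x 12 example
# quoted above; every nonzero entry equals 1.
rows = [
    (1, 12), (1, 7), (1, 9),
    (2, 8), (2, 10), (2, 7),
    (3, 12), (3, 10), (3, 9), (3, 11), (3, 12),
    (4, 11), (4, 9), (4, 8), (4, 11),
]
n_rows, n_cols = len(rows), 12

# Triplet (row, column, value) form: only the nonzero cells are stored.
triplets = [(i, j, 1) for i, cols in enumerate(rows, start=1) for j in cols]
nnz = len(triplets)              # number of stored entries
dense_cells = n_rows * n_cols    # cells a dense matrix would hold

# Cross-product X'X accumulated row by row, without ever forming dense X.
# Entry (j, k) counts the rows in which columns j and k are both 1.
xtx = Counter()
for cols in rows:
    for j in cols:
        for k in cols:
            xtx[(j, k)] += 1

print(nnz, dense_cells)
print(len(xtx), n_cols * n_cols)
```

With two nonzero entries per row the stored size grows with the number of
rows only, not with rows times columns, which is where the "only" 4 million
entries comes from.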

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com