[R] Matrix scalar operation that saves memory?

Richard O'Keefe r@oknz @end|ng |rom gm@||@com
Thu Apr 13 13:57:04 CEST 2023


"wear your disc quite badly"?
If you can afford a computer with 512 GB of memory,
you can afford to pay $100 for a 2 TB external SSD,
use it as scratch space, and throw it away after a
month of use.  A hard drive is expected to last for
more than 40,000 hours of constant use.  Are you
sure that your own disc is so fragile?  Hard drives
are pretty cheap these days.  You could afford to
pay $50 for a 2 TB external hard drive, use it as
scratch space, and throw it away.

I've been around long enough to remember when the idea
of processing a 1000x1000 matrix in memory was greeted
with hysterical laughter and a recommendation to stop
smoking whatever I was smoking.   (But not old enough
to remember shuffling matrices around on tape.  Shudder.)

If you want to work on two 300GB matrices using a machine
with 512GB of RAM, you are going to be using a disc or
SSD, like it or not.  You can leave it up to the paging
subsystem of your OS, which will do its poor best, or you
can explicitly schedule reads and writes in your program,
and if you use asynchronous I/O it might overlap quite nicely.
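
For what explicitly scheduled reads and writes could look like, here is a
minimal sketch. It assumes the matrix has been dumped to a file as raw
doubles in column-major order; transform_file() and its chunk_elems
parameter are hypothetical names made up for this illustration, and the
I/O here is plain synchronous readBin()/writeBin(), not asynchronous.

```r
# Stream a file of raw doubles through a fixed-size buffer, applying f to
# each chunk and writing the result to a second file.
# Hypothetical helper for illustration; not part of any package.
transform_file <- function(infile, outfile, f, chunk_elems = 1e6) {
  inp <- file(infile, "rb")
  on.exit(close(inp), add = TRUE)
  out <- file(outfile, "wb")
  on.exit(close(out), add = TRUE)
  repeat {
    x <- readBin(inp, what = "double", n = chunk_elems)
    if (length(x) == 0L) break      # end of file reached
    writeBin(f(x), out)
  }
  invisible(outfile)
}

# Small stand-in for the real matrix, just to exercise the helper.
src <- tempfile()
dst <- tempfile()
m <- matrix(as.numeric(1:12), nrow = 3)
writeBin(as.vector(m), src)

transform_file(src, dst, function(x) 100 - x, chunk_elems = 5)
res <- matrix(readBin(dst, what = "double", n = length(m)), nrow = nrow(m))
```

Only chunk_elems values are held in memory at any time, so the buffer size
can be tuned to whatever the machine has to spare.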

Assuming for the sake of arithmetic that your matrix
elements are complex numbers represented as pairs of
double precision floats, that's 16 bytes per element,
or 300e9/16 = 1.875e10 elements = n^2 elements where n = 136,930.
Other than adding, subtracting, and multiplying by a scalar,
there's not much you can do with an nxn matrix that won't take
time proportional to n^3.
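
The arithmetic above can be checked in a few lines of R. This assumes
"300 GB" means 300e9 bytes and 16 bytes per element; the variable names
are only for illustration.

```r
bytes_per_matrix  <- 300e9   # assumption: 300 GB taken as 300 * 10^9 bytes
bytes_per_element <- 16      # complex number: two 8-byte doubles

n_elements <- bytes_per_matrix / bytes_per_element   # 1.875e10
n <- floor(sqrt(n_elements))                         # 136930

# Order of magnitude of an n^3 algorithm on a matrix this size
# (roughly 2.6e15 elementary operations, before any constant factor).
cubic_ops <- n^3
```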

Is there any way you can divide the matrix into (possibly
overlapping) blocks and do the work on a cluster?  Or a
block at a time?
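
For the original `mat <- 100 - mat` case, "a block at a time" might look
like the sketch below. The block width block_cols is an arbitrary
illustrative choice, and the small matrix stands in for the real one. The
loop is written at top level on purpose: as far as I understand R's
copy-on-modify rules, modifying mat inside a function it was passed to
would normally duplicate the whole object, and the update can only happen
in place if nothing else refers to mat.

```r
mat <- matrix(as.numeric(1:20), nrow = 4)   # small stand-in for the real matrix
block_cols <- 2L                            # columns per block (illustrative)

for (start in seq(1L, ncol(mat), by = block_cols)) {
  cols <- start:min(start + block_cols - 1L, ncol(mat))
  # The temporary on the right-hand side is only nrow(mat) * length(cols)
  # elements, not a second full-size matrix.
  mat[, cols] <- 100 - mat[, cols, drop = FALSE]
}
```

Columns are used for the blocks because R stores matrices in column-major
order, so each block is a contiguous stretch of memory.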


On Wed, 12 Apr 2023 at 15:54, Shunran Zhang <
szhang using ngs.gen-info.osaka-u.ac.jp> wrote:

> Thanks for the info.
>
> For the data type, my matrix as of now is indeed a matrix in a perfect
> square shape filled in a previous segment of code, but I believe I could
> extract one row/column at a time to do some processing. I can also
> change that previous part of code to change the data type of it to
> something else if that helps.
>
> Saving it to a file for manipulation and reading it back seems to be
> quite IO intensive - writing 600G of data and reading 300G back from a
> hard drive would make the code extremely heavy as well as wear my disk
> quite badly.
>
> For now I'll try the row-by-row method and hope it works...
>
> Sincerely,
> S. Zhang
>
>
> On 2023/04/12 12:39, avi.e.gross using gmail.com wrote:
> > The example given does not leave room for even a single copy of your
> > matrix so, yes, you need alternatives.
> >
> > Your example was fairly trivial as all you wanted to do is subtract each
> > value from 100 and replace it. Obviously something like squaring a matrix
> > has no trivial way to do without multiple copies out there that won't
> > fit.
> >
> > One technique that might work is a nested loop that changes one cell of
> > the matrix at a time and in-place. A variant of this might be a single
> > loop that changes a single row (or column) at a time and in place.
> >
> > Another odd concept is to save your matrix in a file with some format you
> > can read back in such as a line or row at a time, and then do the
> > subtraction from 100 and write it back to disk in another file. If you
> > need it again, I assume you can read it in but perhaps you should
> > consider how to increase some aspects of your "memory".
> >
> > Is your matrix a real matrix type or something like a list of lists or a
> > data.frame? You may do better with some data structures that are more
> > efficient than others.
> >
> > Some OS allow you to use virtual memory that is mapped in and out from
> > the disk that allows larger things to be done, albeit often much slower.
> > I also note that you can remove some things you are not using and hope
> > garbage collection happens soon enough.
> >
> > -----Original Message-----
> > From: R-help <r-help-bounces using r-project.org> On Behalf Of Shunran Zhang
> > Sent: Tuesday, April 11, 2023 10:21 PM
> > To: r-help using r-project.org
> > Subject: [R] Matrix scalar operation that saves memory?
> >
> > Hi all,
> >
> > I am currently working with a quite large matrix that takes 300G of
> > memory. My computer only has 512G of memory. I need to do a few
> > arithmetic operations on it with a scalar value. My current code looks
> > like this:
> >
> > mat <- 100 - mat
> >
> > However, such code quickly uses up all of the remaining memory and gets
> > the R script killed by the OOM killer.
> >
> > Is there any more memory-efficient way of doing such an operation?
> >
> > Thanks,
> >
> > S. Zhang
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
