[R] Removing columns from big.matrix which have only one value

Bert Gunter bgunter.4567 at gmail.com
Sat Apr 21 22:54:33 CEST 2018


Maybe base R's unique() function would be useful? It uses hashing, I
believe.
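
A minimal sketch of how unique() might be applied here, under some
assumptions: the helper name varying_cols and the chunk size are
hypothetical, and the idea of reading the matrix in column blocks is one
possible approach, not something from the original post. Each block is
pulled into an ordinary matrix with `[`, so only `chunk` columns are held
in RAM at a time. If the bigmemory package is installed, deepcopy() with
its cols argument is then used to build a reduced big.matrix.

```r
# Hypothetical helper: flag columns that hold more than one distinct value.
# Works on a base matrix or a big.matrix, since both support x[, j] and ncol().
varying_cols <- function(x, chunk = 10000L) {
  nc <- ncol(x)
  if (nc == 0L) return(logical(0))
  keep <- logical(nc)
  for (s in seq(1L, nc, by = chunk)) {
    idx <- s:min(s + chunk - 1L, nc)
    block <- x[, idx, drop = FALSE]   # ordinary in-memory matrix for this block
    keep[idx] <- apply(block, 2L, function(col) length(unique(col)) > 1L)
  }
  keep
}

# Small example mirroring the data in the question.
set.seed(1)
m <- matrix(sample(0:1, 100 * 1000, replace = TRUE), 100, 1000)
m <- cbind(m, 1)                      # constant column that should be dropped
keep <- varying_cols(m, chunk = 250L)

# Same call on a big.matrix, only if bigmemory is available.
if (requireNamespace("bigmemory", quietly = TRUE)) {
  bm <- bigmemory::as.big.matrix(m)
  keep_bm <- varying_cols(bm, chunk = 250L)
  reduced <- bigmemory::deepcopy(bm, cols = which(keep_bm))
}
```

The blocks are independent of one another, so the loop could in principle
be split across workers, although that is not shown here and the cost of
each block would need to be measured on the real data.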

Bert

On Sat, Apr 21, 2018, 12:17 PM Jack Arnestad <jackarnestad using gmail.com> wrote:

> I have a very large binary matrix, stored as a big.matrix to conserve
> memory (it is over 2 GB otherwise: 5 million columns and 100 rows).
>
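> library(bigmemory)  # provides as.big.matrix()
> r <- 100    # rows
> c <- 10000  # columns (the real data has 5 million)
> m4 <- matrix(sample(0:1, r*c, replace=TRUE), r, c)
> m4 <- cbind(m4, 1)  # append a constant column that should be removed
> m4 <- as.big.matrix(m4)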
>
> I need to remove every column which has only one unique value (in this
> case, only 0s or only 1s). Because of the number of columns, I want to be
> able to do this in parallel.
>
> How can I accomplish this while keeping the data compressed as a
> big.matrix? I can convert it into a df and loop over the columns looking
> for the number of unique values, but this takes too much RAM.
>
> Thanks!
