[R] Cube of Matrices or list of Matrices
Ben Tupper
btupper at bigelow.org
Tue Jan 20 03:11:38 CET 2015
Hi,
On Jan 19, 2015, at 5:17 PM, Karim Mezhoud <kmezhoud at gmail.com> wrote:
> Thanks Ben.
> I need to learn more about apply. Do you have a link or tutorial about apply? The R documentation is very short.
>
> How can I obtain:
> z <- list (Col1, Col2, Col3, Col4......)?
>
This may not be the most efficient way, and there is certainly no error checking, but you can wrap one lapply within another as shown below. The inner lapply iterates over your list of input matrices, extracting the specified column from each list element. The outer lapply iterates over the column numbers you want to extract.
getMatrices <- function(colNums, dataList = x){
   # the number of rows required
   n <- max(sapply(dataList, nrow))
   lapply(colNums, function(x, dat, n) {  # iterate along requested columns
      do.call(cbind, lapply(dat, getColumn, x, len = n))  # iterate along input data list
   }, dataList, n)
}
getMatrices(c(1,3), dataList = x)
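To try this end to end without scrolling through the quoted messages, here is a minimal self-contained sketch. It redefines the toy data and getColumn from earlier in the thread, then builds one NA-padded matrix per column, which is roughly the z <- list(Col1, Col2, ...) asked about. The fixed row counts and the Col1, Col2, ... names are assumptions made for illustration, and the argument names differ slightly from the code above.

```r
# Toy data: five matrices with 3 columns each and differing row counts.
# The row counts are fixed here (an assumption) so the result is reproducible.
nCol <- 3
nRow <- c(3, 8, 5, 7, 9)
x <- lapply(nRow, function(nr, nc) {
  matrix(nr:(nr + nc * nr - 1), ncol = nc, nrow = nr)
}, nCol)

# Extract one column and pad it with NA up to length 'len'
getColumn <- function(m, colNum, len = nrow(m)) {
  y <- m[, colNum]
  length(y) <- len   # extending a vector's length fills the tail with NA
  y
}

# For each requested column, bind the padded columns from every matrix
getMatrices <- function(colNums, dataList) {
  n <- max(sapply(dataList, nrow))   # rows needed to hold the longest column
  lapply(colNums, function(k) {
    do.call(cbind, lapply(dataList, getColumn, k, len = n))
  })
}

# One matrix per column number, named Col1, Col2, Col3 (hypothetical names)
z <- getMatrices(seq_len(nCol), dataList = x)
names(z) <- paste0("Col", seq_len(nCol))
dim(z$Col1)
```

Each element of z has one column per input matrix and as many rows as the largest input, so with real data z[[k]] would hold gene k across all diseases.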
If we are lucky, one of the plyr package users might show us how to do the same with a one-liner.
There are endless resources online; here are some gems.
http://www.r-project.org/doc/bib/R-books.html
http://www.rseek.org/
http://www.burns-stat.com/documents/
http://www.r-bloggers.com/
Also, I found "Data Manipulation with R" ( http://www.r-project.org/doc/bib/R-books_bib.html#R:Spector:2008 ) helpful.
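As a quick orientation on the apply family itself, the sketch below contrasts apply, lapply, sapply and vapply on tiny inputs. The variable names are made up for illustration, and this is only a starting point, not a substitute for the references above.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns, filled column-wise

# apply() works over the margins of a matrix: 1 = rows, 2 = columns
rowTotals <- apply(m, 1, sum)
colTotals <- apply(m, 2, sum)

# lapply() iterates over a list (or vector) and always returns a list
lens <- lapply(list(a = 1:3, b = 1:5), length)

# sapply() is like lapply() but simplifies the result when it can,
# here to a named integer vector
lensVec <- sapply(list(a = 1:3, b = 1:5), length)

# vapply() is like sapply() but you declare the expected type and length
# of each result, so a surprise in the data raises an error
lensChk <- vapply(list(a = 1:3, b = 1:5), length, integer(1))
```

Because lapply always returns a list, it pairs naturally with do.call(cbind, ...) when the pieces need to be bound into a matrix afterwards.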
Ben
> Thanks
>
> Ô__
> c/ /'_;~~~~kmezhoud
> (*) \(*) ⴽⴰⵔⵉⵎ ⵎⴻⵣⵀⵓⴷ
> http://bioinformatics.tn/
>
>
>
> On Mon, Jan 19, 2015 at 8:22 PM, Ben Tupper <btupper at bigelow.org> wrote:
> Hi again,
>
> On Jan 19, 2015, at 1:53 PM, Karim Mezhoud <kmezhoud at gmail.com> wrote:
>
>> Yes Many thanks.
>> That is my request using lapply.
>>
>> do.call(cbind,col1)
>>
>> converts col1 to a matrix but does not fill empty values with NA.
>>
>> Even for
>>
>> matrix(unlist(col1), ncol=5,byrow = FALSE)
>>
>>
>> How can I get col1 as an object of class matrix, with empty values filled with NA?
>>
>
> Perhaps it is best to determine the maximum number of rows required first, then force each extracted column to have that length.
>
> # make a list of matrices, each with nCol columns and differing
> # number of rows
> nCol <- 3
> nRow <- sample(3:10, 5)
> x <- lapply(nRow, function(x, nc) {matrix(x:(x + nc*x - 1), ncol = nc, nrow = x)}, nCol)
> x
>
> # make a simple function to get a single column from a matrix
> getColumn <- function(x, colNum, len = nrow(x)) {
>    y <- x[,colNum]
>    length(y) <- len
>    y
> }
>
> # what is the maximum number of rows
> n <- max(sapply(x, nrow))
>
> # use the function to get the column from each matrix
> col1 <- lapply(x, getColumn, 1, len = n)
> col1
>
> do.call(cbind, col1)
>      [,1] [,2] [,3] [,4] [,5]
> [1,]    3    8    5    7    9
> [2,]    4    9    6    8   10
> [3,]    5   10    7    9   11
> [4,]   NA   11    8   10   12
> [5,]   NA   12    9   11   13
> [6,]   NA   13   NA   12   14
> [7,]   NA   14   NA   13   15
> [8,]   NA   15   NA   NA   16
> [9,]   NA   NA   NA   NA   17
>
> Ben
>
>> Thanks
>> Karim
>>
>>
>> Ô__
>> c/ /'_;~~~~kmezhoud
>> (*) \(*) ⴽⴰⵔⵉⵎ ⵎⴻⵣⵀⵓⴷ
>> http://bioinformatics.tn/
>>
>>
>>
>> On Mon, Jan 19, 2015 at 4:36 PM, Ben Tupper <ben.bighair at gmail.com> wrote:
>> Hi,
>>
>> On Jan 18, 2015, at 4:36 PM, Karim Mezhoud <kmezhoud at gmail.com> wrote:
>>
>> > Dear All,
>> > I am trying to get the correlation between Diseases (80) in columns and
>> > samples in rows (UNEQUAL) using gene expression (at least 1000, numeric). For
>> > this I can use the CORREP package with the cor.unbalanced function.
>> >
>> > But before I can get this final matrix I need to load and store the
>> > expression of 1000 genes for every Disease (80). Every disease has a
>> > different number of samples (between 50 - 500).
>> >
>> > Is it possible to get a cube of matrices with equal columns but unequal
>> > rows? I think NO, and I can't use the array function.
>> >
>> > I am trying to get a list of matrices having the same number of columns but
>> > a different number of rows, as
>> >
>> > Cubist <- vector("list", 1)
>> > Cubist$Expression <- vector("list", 1)
>> >
>> >
>> > for (i in 1:80){
>> >
>> > mat <- getGeneExpression(Disease[i])  # getGeneExpression() stands in for the loading step
>> > Cubist$Expression[[Disease[i]]] <- mat
>> >
>> > }
>> >
>> > At this step I have:
>> > length(Cubist$Expression)
>> > #80
>> > dim(Cubist$Expression$Disease1)
>> > #526 1000
>> > dim(Cubist$Expression$Disease2)
>> > #106 1000
>> >
>> > names(Cubist$Expression$Disease1[4])
>> > #ABD
>> >
>> > names(Cubist$Expression$Disease2[4])
>> > #ABD
>> >
>> > Now I need to build the final matrices for every gene (1000) that I will
>> > use with the CORREP function.
>> >
>> > Is there a way to extract directly the first column (first gene) for all
>> > Diseases (80) from Cubist$Expression? or
>> >
>>
>> I don't understand most of your question, but the above seems to be straightforward. Here's a toy example:
>>
>> # make a list of matrices, each with nCol columns and differing
>> # number of rows, nRow
>> nCol <- 3
>> nRow <- sample(3:10, 5)
>> x <- lapply(nRow, function(x, nc) {matrix(x:(x + nc*x - 1), ncol = nc, nrow = x)}, nCol)
>> x
>>
>> # make a simple function to get a single column from a matrix
>> getColumn <- function(x, colNum) {
>>    return(x[,colNum])
>> }
>>
>> # use the function to get the column from each matrix
>> col1 <- lapply(x, getColumn, 1)
>> col1
>>
>> Does that help answer this part of your question? If not, you may need to create a very small example of your data and post it here using the head() and dput() functions.
>>
>> Ben
>>
>>
>>
>> > I need to build 1000 matrices with 80 columns and unequal rows?
>> >
>> > Cublist$Diseases <- vector("list", 1)
>> >
>> > for (k in 1:1000){
>> > for (i in 1:80){
>> >
>> > Cublist$Diseases[[gene[k] ]] <- Cubist$Expression[[Diseases[i] ]][k]
>> > }
>> >
>> > }
>> >
>> > This double loop is time consuming... Is there a way to do this faster?
>> >
>> > Thanks,
>> > karim
>> > Ô__
>> > c/ /'_;~~~~kmezhoud
>> > (*) \(*) ⴽⴰⵔⵉⵎ ⵎⴻⵣⵀⵓⴷ
>> > http://bioinformatics.tn/
>> >
>>
>>
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org
Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org