Hi anonymous guest,
On 03/06/2014 01:43 PM, guest [guest] wrote:
> Dear R user,
Note that this is the Bioconductor mailing list. Your question looks
like a general R question, not a Bioconductor-specific one.
> I have a matrix like:
> ID group1 group2 group3
> s1 0 2 3
> s2 1 0 4
> s1 3 4 1
> s4 2 2 0
> I would like to sum the values with same ID to have the matrix as below:
> ID group1 group2 group3
> s1 3 6 4
> s2 1 0 4
> s4 2 2 0
> I checked aggregate() may help to complete this job, but unfortunately I have the error message when I do this.
>> all.data <- read.csv("test.csv")
Note that 'all.data' is a data.frame, not a matrix.
>> aggregate(group1 ~ ID, data=all.data, FUN=sum)
> Error in eval(expr, envir, enclos) : object 'ID' not found
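The message means that no object called exactly 'ID' is visible to the
formula, so it is worth checking names(all.data) first. A minimal sketch,
assuming the file really has a header line with the columns ID, group1,
group2 and group3 (the inline text below is only a stand-in for test.csv,
whose actual contents are not shown in this thread):

```r
# Stand-in for test.csv: the real file layout is an assumption.
all.data <- read.csv(text = "ID,group1,group2,group3
s1,0,2,3
s2,1,0,4
s1,3,4,1
s4,2,2,0")

# Check what the columns are actually called before using them in a formula.
names(all.data)

# '.' on the left-hand side means "all columns other than ID":
# sum each of them within each ID.
res <- aggregate(. ~ ID, data = all.data, FUN = sum)
res
```

If names(all.data) shows something other than ID for the first column,
use that name in the formula or rename the column.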
Trying with a matrix:
  m <- matrix(sample(12L), ncol=3)
  ID <- c("s1", "s2", "s1", "s4")
  rownames(m) <- ID
  colnames(m) <- paste0("group", 1:3)
Then:
  > m
     group1 group2 group3
  s1      1      9      7
  s2     11     12     10
  s1      2      5      6
  s4      8      3      4
  > aggregate(group1 ~ ID, data=m, FUN=sum)
    ID group1
  1 s1      3
  2 s2     11
  3 s4      8
aggregate() will probably be too slow anyway on a matrix with a very
large number of rows (hundreds of thousands or more). Here is a faster
solution that leverages the IRanges infrastructure:
  library(IRanges)
  m2 <- apply(m, 2, function(x) sum(splitAsList(x, ID)))
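For comparison, base R's rowsum() also sums the rows of a matrix within
each level of a grouping vector, without any extra package. A small sketch
using the values from the original question (it has not been benchmarked
here against the IRanges approach, and the name 'm3' is only illustrative):

```r
# Matrix from the original question, filled column by column.
m <- matrix(c(0, 1, 3, 2,
              2, 0, 4, 2,
              3, 4, 1, 0), ncol = 3)
ID <- c("s1", "s2", "s1", "s4")
dimnames(m) <- list(ID, paste0("group", 1:3))

# Sum the rows that share the same ID, one result row per distinct ID.
m3 <- rowsum(m, group = ID)
m3
```

The result is again a matrix, with one row per distinct ID (sorted by
default) and the same columns as the input.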
Cheers,
H.
PS: IRanges is a Bioconductor package.
> Please help me to generate the sum for the matrix. It's been appreciated for any help.
>
> Thanks a lot
