[R] Re: log2() and -min() very quick question

Mon Jun 13 18:14:56 CEST 2011

Hi

r-help-bounces at r-project.org wrote on 13.06.2011 17:59:03:

> Ben Ganzfried <ben.ganzfried at gmail.com>
> Sent by: r-help-bounces at r-project.org
> 13.06.2011 17:59
>
> To: r-help at r-project.org
> Cc:
> Subject: [R] log2() and -min() very quick question
> 
> I'm looking over good-code a post-doc in my lab wrote and trying to learn
> how it works.  I came across the following:
> rel.abundance <- as.matrix(read.delim("rel.abundance.csv",row.names=1,as.is=TRUE))
> rel.abundance <- log2(rel.abundance-min(rel.abundance)+1)
> 
> I'm not sure what the second line is doing.  I ran each line in R and
> couldn't see a noticeable difference in the output.  I assume log2() takes
> the log base 2 of the values?  I'm not clear what -min(rel.abundance) is
> doing either...my hunch would be that it would take the smallest value in
> each row?

No. If rel.abundance is a matrix, min(rel.abundance) returns the overall
minimum of the whole matrix, not one minimum per row:

> mat<-matrix(1:12, 3,4)
> min(mat)
[1] 1

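so
log2(rel.abundance-min(rel.abundance)+1)

subtracts the minimum value from every number, then adds 1 to every number, 
takes the base-2 logarithm of each result, and returns a matrix with the 
same dimensions as the input matrix. After the shift the smallest value is 
exactly 1, so its logarithm is 0 and no element produces -Inf or NaN.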
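
A minimal sketch of those steps on a small made-up matrix may help. The name m and its values are only for illustration; they are not taken from your file.

```r
# Hypothetical example matrix; the values are made up for illustration.
m <- matrix(c(0, 1, 3, 0, 7, 23), nrow = 2)

min(m)             # overall minimum of the whole matrix (a single number)
apply(m, 1, min)   # per-row minima, in case that is what you expected

# Shift so the smallest value becomes 0, add 1 so it becomes 1, take log2.
res <- log2(m - min(m) + 1)

dim(res)           # same dimensions as m
res
```

If the minimum of your data is 0, as the many zeros in your sample row suggest, the line reduces to log2(x + 1). Zeros then stay 0 and ones stay 1, which may be why you could not see a noticeable difference at first glance; only the larger values change visibly.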

> I'd really like to figure out:
> 1) What's actually going on?
> 2) Is there a good way to run a command over a large dataset in R and better
> be able to tell what is going on?  More specifically, when I run each line
> in R it looks something like this (w/ dif. values per row):
> Archaea|Euryarchaeota|Methanobacteria|Methanobacteriales|
> 
Methanobacteriaceae|Methanobrevibacter,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 
0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,3,0,0,0,0,0,
> 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,
> 0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,23,0,3,0,0,0
> 
> 
> There are a lot of cells w/ values per row, which is one reason why I think
> it is difficult to detect a pattern....

There are summary and structure commands,

summary(data) or str(data)

which give you an overview of your data without printing every value.
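
For a wide matrix that is mostly zeros, a few more base R calls can be easier to read than whole rows. This is only a sketch: the matrix x below is simulated as a stand-in for your data, and its size and values are assumptions.

```r
# Hypothetical stand-in for a large, mostly-zero count matrix.
set.seed(1)
x <- matrix(rpois(2000, lambda = 0.05), nrow = 10)

dim(x)                  # number of rows and columns
range(x)                # smallest and largest value
table(x)                # how often each distinct value occurs
x[1:3, 1:5]             # look at a small corner instead of whole rows
summary(as.vector(x))   # one overall summary rather than one per column
```

Running the same calls before and after the log2() line shows what the transformation changed without scrolling through every cell.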

Regards
Petr

> 
> Thanks in advance!
> 
> Ben