[BioC] How to read in log2 ratio data

Thomas Hampton thomas.h.hampton at dartmouth.edu
Wed Apr 6 16:46:53 CEST 2011


Hi Peter,

Jim (as always) made a great point regarding what sort of a question you
are asking.

In any case, here are three thoughts.

First, there are a zillion ways to read data into R. This simple step
can be unbelievably easy (or not). Look at the R function read.table.
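
As a minimal sketch of that first step, assuming a tab-delimited text
file with a header row, probe IDs in the first column and one column of
log2 ratios per slide (the file contents, probe names and slide names
below are made up for illustration, not taken from your data):

```r
# Hypothetical input: tab-delimited, header row, probe IDs in column 1,
# one column of log2(Hy3/Hy5) ratios per slide. A temporary file stands
# in for the real txt-file.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "ProbeID\tslide1\tslide2\tslide3",
  "probe_001\t0.12\t-0.40\t1.05",
  "probe_002\t-1.30\t0.08\tNA",
  "probe_003\t0.55\t0.61\t-0.02"
), path)

# row.names = 1 moves the probe ID column into the row names, so only
# numeric columns remain in the data frame.
ratios <- read.table(path, header = TRUE, sep = "\t", row.names = 1,
                     check.names = FALSE, stringsAsFactors = FALSE)

# Convert to a numeric matrix (rows = probes, columns = slides), which
# is the shape most downstream functions work with.
log_ratios <- as.matrix(ratios)

dim(log_ratios)
str(log_ratios)
```

With a real file you would pass its path instead of the temporary one;
if the separator or missing-value code differs, the sep and na.strings
arguments of read.table are the places to adjust.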

Second, determining what is differentially expressed largely depends on
how you want to define "differentially expressed". Back in the day,
people often defined it as a change of two-fold or more, and in practice
much more sophisticated definitions still include fold change as one
criterion, because no one really cares about minute changes, no matter
how statistically significant they are. On the other hand, no one cares
about mean differences that appear to be driven by a single oddball
observation. Last but not least, there is the business of adjusting
p values for multiple hypothesis tests. Limma handles all these
considerations well, but it is still up to you to decide how you want to
define differential expression.
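
To make one such definition concrete, here is a rough sketch in base R
that combines a two-fold cutoff with adjusted p values. It uses
simulated data and an ordinary per-gene t-test, not limma's moderated
statistics, and the thresholds (two-fold, 0.05) are assumptions chosen
only for illustration:

```r
set.seed(1)
n_genes <- 200
group <- factor(rep(c("A", "B"), each = 8))

# Simulated log2 ratios: rows = genes, columns = subjects.
x <- matrix(rnorm(n_genes * 16, sd = 0.5), nrow = n_genes,
            dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
# Shift the first 10 genes in group B so there is something to find.
x[1:10, group == "B"] <- x[1:10, group == "B"] + 2

# Difference in group means; on the log2 scale, 1 unit is two-fold.
log_fc <- rowMeans(x[, group == "B"]) - rowMeans(x[, group == "A"])

# Ordinary Welch t-test per gene (limma would use a moderated t).
p_raw <- apply(x, 1, function(g) {
  t.test(g[group == "B"], g[group == "A"])$p.value
})

# Benjamini-Hochberg adjustment for testing many genes at once.
p_adj <- p.adjust(p_raw, method = "BH")

# One possible definition: at least two-fold AND adjusted p below 0.05.
is_de <- abs(log_fc) >= 1 & p_adj < 0.05
names(which(is_de))
```

In limma itself, topTable takes lfc and p.value arguments that apply the
same two kinds of cutoff to the moderated results.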

Third, if these were my data, I would first want to establish that
subjects in the same experimental group are more similar to each other
than to subjects from different groups. One way to do this is to identify
genes that show high variability across samples and then see whether
samples from subjects in the same group cluster together based on these
genes. If they don't, one generally adopts a more skeptical view of the
experiment: biological variability may overwhelm whatever treatment
effect you have set out to find.
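
A sketch of that check in base R, again on simulated data. The number of
genes kept (50), the Euclidean distance and the average linkage are all
assumptions; other reasonable choices exist:

```r
set.seed(2)
group <- factor(rep(c("A", "B"), each = 8))

# Simulated log2 ratios: 500 genes (rows) by 16 samples (columns).
x <- matrix(rnorm(500 * 16, sd = 0.3), nrow = 500,
            dimnames = list(NULL, paste0(group, "_", 1:8)))
# Simulated group effect in the first 40 genes.
x[1:40, group == "B"] <- x[1:40, group == "B"] + 1.5

# Rank genes by variance across all samples; keep the 50 most variable.
gene_var <- apply(x, 1, var)
top <- order(gene_var, decreasing = TRUE)[1:50]

# Cluster samples (columns) on those genes, using Euclidean distance.
d <- dist(t(x[top, ]))
hc <- hclust(d, method = "average")

# Cut the tree into two clusters and compare with the known groups.
cluster <- cutree(hc, k = 2)
table(group, cluster)
# plot(hc) draws the dendrogram for a visual check.
```

If the table shows each group falling mostly into its own cluster, the
group effect is visible in the most variable genes; a scrambled table is
the warning sign described above.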

Good luck,

Tom


On Apr 6, 2011, at 4:25 AM, Peter Davidsen wrote:

> Hi all,
>
> I would like to conduct a time-course analysis using the limma package
> on my chip data (run as dual-color). I have two classes/groups with 8
> subjects in each. Each 'experimental unit' has been measured at three
> different time points.
> However, I already have all the data as lowess normalized log-ratios =>
> log2(Hy3/Hy5). How do I read in my txt-file with my log2 ratio data
> into R? And how do I define a vector/data frame?
>
> I have arranged the data so I have probe ID in the first column (row 2
> to 200) and individual slide data in the following columns (that is,
> slide 1 data in column 2, and slide 2 data in column 3 and so on...).
> I have 48 slides in total.
>
> The main question I want to answer is which genes are differentially
> expressed between the two groups of subjects - at time point 1, 2, and
> 3, respectively.
>
> Cheers,
> Peter


