[BioC] take the average log fc for each gene

James W. MacDonald jmacdon at uw.edu
Mon Aug 12 21:14:42 CEST 2013


Hi Helen,

On 8/12/2013 2:55 PM, Helen Smith wrote:
> Hi All.
>
> Apologies as I'm just getting to grips with R.
>
> I have a set of genes and log fold changes.
> Because the genes were mapped from Affymetrix probes, and up to 11 probes represent one gene, I have several different log fold changes for each gene. I would like to average the log fc for each gene that appears more than once in the list.

This part isn't quite clear to me. I can't tell whether you have already
summarized the data using RMA or something similar and now want to
collapse it over duplicated genes, or whether you are doing something at
the probe level.

I think you are trying to do the latter, but will you please let us know 
before we talk about your code below?
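
In the meantime, if it turns out to be the simpler case (one log fold
change per row, with the gene ID in column 1 and the log fc in column 2,
as your read.table() call suggests), here is a minimal sketch of
averaging over duplicated genes. The toy data frame, its column names and
its values are made up for illustration; the only assumption about your
file is that two-column layout.

```r
# Toy stand-in for read.table("test.txt"); genes and values are invented
dat <- data.frame(Gene = c("A", "B", "A", "C", "B", "A"),
                  fc   = c(1.0, -0.5, 2.0, 0.3, -1.5, 3.0),
                  stringsAsFactors = FALSE)

# Mean log fc per gene: tapply() splits fc by Gene and applies mean(),
# returning a named vector with one entry per unique gene
mGene <- tapply(dat$fc, dat$Gene, mean)
print(mGene)

# The same result as a two-column data frame
avg <- aggregate(fc ~ Gene, data = dat, FUN = mean)
print(avg)
```

Either call replaces both loops. As an aside, the "incorrect number of
dimensions" error arises because fc <- dat[,2] is a plain vector, so it
can be indexed as fc[rid] but not as fc[rid, j].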

Best,

Jim



>
> I used the script below but as I'm a bit of a novice it isn't working too well and I get the error message stated at the bottom:
>
> dat<-read.table("test.txt")
> dim(dat)
> dat[1:664,1:2]
>
> Gene<-dat[,1]
> fc<-dat[,2]
>
> LogFC<-matrix(NA,664,1)
> for(i in 1:664){
>                  for(j in 1:1){
>                                  LogFC[i]<-fc[i]
>                                  }
>                  }
> fc[1:664,1]
>
> ####Take Average logfc of multiples of the same gene####
> gid<-unique(Gene)
> length(gid)
> mGene<-matrix(NA,640,1)
> mGene
> for(i in 1:640){
>                                  rid<-which(Gene==gid[i])
>                                  for(j in 1:1){
>                                                  mGene[i]<-mean(fc[rid,j])
>                                  }
>                  }
> ####I get the error message "Error in fc[rid, j] : incorrect number of dimensions"
>
> Any help would be much appreciated,
>
> Many thanks everyone!

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


