[BioC] problem with data processing in R

Thomas Girke thomas.girke at ucr.edu
Thu Dec 10 23:50:31 CET 2009


I am not sure if I understand every part of your problem correctly,
but here is an example of how something like this could be done in R.
Its main idea is to keep the entire data set in one matrix and use
the cell note feature of heatmap.2 for sample tracking.

## Sample matrix for demo purposes
y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""), paste("t", 1:5, sep="")))

## Sort each row by its values
mydata <- t(apply(y, 1, sort))

## Obtain sample labels (column titles) for sorted rows
mysamples <- t(apply(y, 1, function(x) names(sort(x))))
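
As a quick sanity check on the tracking idea, the sketch below rebuilds the
same three objects and confirms that each label in mysamples points back to
the value stored at the same position in mydata. The seed and the helper
name "lookup" are my own additions for illustration, not part of the
original example.

```r
## Rebuild the demo objects (the seed is arbitrary, only for reproducibility)
set.seed(42)
y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""), paste("t", 1:5, sep="")))
mydata <- t(apply(y, 1, sort))
mysamples <- t(apply(y, 1, function(x) names(sort(x))))

## For every gene, fetch the original values in the order given by the tracked labels
lookup <- t(sapply(rownames(y), function(g) y[g, mysamples[g, ]]))

## The looked-up values should equal the sorted matrix cell by cell
stopifnot(all(lookup == mydata))
```

If the stopifnot() call passes silently, the cell notes can be trusted to
identify the sample behind each sorted value.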

## Plot heatmap where the sample labels are given as cell notes for tracking purposes
library(gplots)
heatmap.2(mydata, dendrogram="none", Rowv=FALSE, Colv=FALSE, col=redgreen(75), scale="row", trace="none", key=TRUE, cellnote=mysamples)
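
For the other reading of the question (one heatmap per ID, rows ordered by
their maximum, panels stacked with heights proportional to the number of
rows), something along the following lines might work. This is only a
sketch under several assumptions: the data frame "df", its column names and
the three simulated IDs are made up to stand in for the real file, and it
uses image() from base graphics instead of heatmap.2, because heatmap.2
sets up its own layout() internally and so cannot easily be stacked in one
device.

```r
## Simulated stand-in for the real file: an ID column plus 20 value columns.
## All names and block sizes here are hypothetical.
set.seed(1)
ids <- rep(c("id1", "id2", "id3"), times = c(12, 5, 8))
df <- data.frame(id = ids,
                 matrix(rnorm(length(ids) * 20), ncol = 20,
                        dimnames = list(NULL, paste("x", 1:20, sep = ""))))

## Break the table into one numeric matrix per ID
blocks <- lapply(split(df[, -1], df$id), as.matrix)

## Within each block, order the rows by their maximum value (largest first)
blocks <- lapply(blocks, function(m)
  m[order(apply(m, 1, max), decreasing = TRUE), , drop = FALSE])

## One panel per ID, stacked vertically; relative heights follow the row counts
nr <- sapply(blocks, nrow)
layout(matrix(seq_along(blocks), ncol = 1), heights = nr)
par(mar = c(0.5, 4, 0.5, 1))

## Shared colour scale so that the panels are comparable
zlim <- range(unlist(blocks))
for (id in names(blocks)) {
  m <- blocks[[id]]
  ## image() puts matrix rows along the x axis, so transpose the matrix
  ## and reverse the row order to keep the first row at the top
  image(t(m[nrow(m):1, , drop = FALSE]), axes = FALSE, zlim = zlim,
        col = heat.colors(75), ylab = id)
}
```

With 40 IDs the default device will probably be too small for the margins,
so a tall pdf() or png() device is likely needed. For real data, df would
come from something like read.table() on the combined file.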

Thomas


On Thu, Dec 10, 2009 at 03:13:31PM +0100, Maxim wrote:
> Hi,
> 
> 
> I'm stuck with parsing data into R for heatmap representation.
> 
> 
> The data looks like:
> 
> 1 id1 x1 x2 x3 .... x20
> 2 id1 x1 x2 x3 .... x20
> 3 id1 x1 x2 x3 .... x20
> 4 id1 x1 x2 x3 .... x20
> .........
> 348 id2 x1 x2 x3 .... x20
> 349 id2 x1 x2 x3 .... x20
> 350 id2 x1 x2 x3 .... x20
> 351 id2 x1 x2 x3 .... x20
> .........
> 
> 
> The data is sorted by ID (id1, id2, ..., id40) and I'd like to produce 40
> heatmaps from it, one heatmap per ID. The data to be plotted are the 20
> values (x1 to x20). The number of rows differs between IDs. In the end I'd
> like to have the 40 heatmaps stacked on top of each other, sorted by ID,
> with heatmap heights proportional to the amount (number of rows) of data.
> Unfortunately, the individual data lines have to be sorted with respect to
> the maximum of the values x1 to x20 in each row. Actually, this is not that
> important, as I guess this might be easier to realize in the upstream Perl
> scripts producing the data.
> 
> 
> The data is available as data per ID in individual files or as a sorted file
> with the complete dataset (as shown above).
> 
> 
> Is it possible in R to break a file as above into distinct blocks (depending
> on ID) and then to process it (sorting according to maximum, heatmap)?
> 
> 
> Which commands do I have to issue for the manipulation of the data.frame? I
> tried the
> 
> 
> I'd be glad if someone could help me find the right direction to solve
> my problem!
> 
> 
> Best regards
> 
> 
> Maxim
> 