[BioC] FlowViz graphics problems

Thu May 21 20:50:27 CEST 2009

Anja Schiel wrote:
> Dear Deepayan, 
>
> Thanks a lot for your help! I very much appreciate that you take the
> time to help me. Below are some comments and another question, if you
> have time.....
>
> On Wed, 2009-05-20 at 14:13 -0700, Deepayan Sarkar wrote:
>   
>> On Wed, May 20, 2009 at 6:06 AM, Anja Schiel <a.e.schiel at medisin.uio.no> wrote:
>>     
>>> Hi,
>>>
>>> I am currently testing flowCore and flowViz and have encountered some
>>> problems.
>>>
>>> I am running :
>>> R version 2.9.0 (2009-04-17)
>>> i486-pc-linux-gnu
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] flowViz_1.8.0    lattice_0.17-25  flowCore_1.10.0  rrcov_0.5-01
>>> [5] pcaPP_1.6        mvtnorm_0.9-5    robustbase_0.4-5 Biobase_2.4.1
>>>
>>> loaded via a namespace (and not attached):
>>>  [1] feature_1.2.3      graph_1.22.2       grid_2.9.0         KernSmooth_2.22-22
>>>  [5] ks_1.6.3           latticeExtra_0.5-4 MASS_7.2-47        RColorBrewer_1.0-2
>>>  [9] stats4_2.9.0       tools_2.9.0
>>>
>>>
>>> I have noticed that when I use
>>> xyplot(`SSC-H` ~ `FSC-H`, data = fs.trans[[1]], filter = eGate)
>>> I get a plot with the Gate defined by eGate plotted, but when I try to
>>> do the same with
>>> flowPlot(fs.trans[[1]], filter = eGate)
>>> the gate is not drawn. Since the default settings seem to be filter =
>>> NULL (and I pass eGate to filter) and showFilter = TRUE I am wondering
>>> if this is a glitch in the system or if my command is wrong.
>>>       
>> The 'flowPlot' function is not really maintained any more; the method
>> for "flowFrame" does have a 'filter' argument, but it is never used in
>> the actual function definition. xyplot() should be able to do
>> everything flowPlot does. If not, please let us know.
>>
>>     
>>> Second I am somewhat confused about the plot function. When I transform
>>> my FL-H signals with
>>> fs.trans <- transform('FL1-H' = asinh, 'FL2-H' = asinh) %on% fs
>>> and then run
>>> plot (fs.trans[[7]], 'FL1-H', breaks=256)
>>> I get a histogram with all my data crammed into the left corner due to
>>> the y-axis scale that seems to be extremely large.
>>>       
>> The following seems to work for me:
>>
>> data(GvHD)
>> fs <- GvHD
>> fs.trans <- transform('FL1-H' = asinh, 'FL2-H' = asinh) %on% fs
>> plot(fs[[7]], "FL1-H") ## most of the data in left-most bin
>> plot(fs.trans[[7]], "FL1-H") ## much more spread out
>>
>> So we need a reproducible example to figure out why you are seeing
>> different behavior.
>>
>>     
> It seems that my original FACS data must be different than the one used
> for GvHD. I do see a relatively normal histogram with the GvHD data set,
> as you point out. But when I use my own files the y-axis is set to
> 30 000 and not, as in the example, to 3000.
>
> By the way, the problem becomes more obvious if I actually set the
> breaks. If I set breaks=256 then my graph gets extremely 'small'; if
> I do the same with the GvHD data the effect is not the same.
>
> I have attached 3 png files to show you what I get.
>
> I could send you some of my original files (like one with no signal in
> FL1-H and one with a signal), but I am not sure if I can just attach
> those to an e-mail. Maybe I can send you a zip archive?
>   
Hi Anja,
maybe I can chip in here:
It seems that your data has tons of values on the lower measurement 
margin, quite a common problem for flow data. The more sophisticated 
plotting functions in flowViz (e.g. densityplot) try to ignore these 
artefactual values, the simple histogram in the plot function does not. 
You have a couple of options here:
1.) remove those values before plotting. There is the boundaryFilter 
function which should help you do that. Please note that margin events 
might not be particularly informative in one channel, but they might 
have perfectly fine values in others, so blindly removing them for other 
purposes than visualization is usually not a good idea.
2.) Play around with ylim once this is fixed. This however will only 
clip the extremely large bin at around 0 and the picture might not be 
particularly nice.
3.) If you have multiple FCS files in a flowSet you could use the 
densityplot function. As mentioned before, this should ignore the margin 
events, although they are still indicated in the plot by little bars, as 
far as I remember. I guess there should be a densityplot method for 
flowFrames as well, and I will talk to Deepayan to add this for the next 
release.
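
To make option 1 a bit more concrete, here is a minimal base-R sketch on
simulated values (not real FCS data; the event counts and the log-normal
signal are made up for illustration). It only shows why a pile of margin
events at the lower limit dominates the tallest histogram bin, and how
dropping them brings the y-axis back to a usable range. With a real
flowFrame, boundaryFilter would do the removal step; that call is not
shown here.

```r
# Simulated channel: 25000 margin events stuck at 0 plus 5000 real events.
set.seed(1)
margin <- rep(0, 25000)
signal <- rlnorm(5000, meanlog = 4, sdlog = 1)
x <- asinh(c(margin, signal))   # same transform as in the thread

# Histogram of everything: the first bin swallows all margin events.
h_all <- hist(x, breaks = 256, plot = FALSE)

# Drop the values sitting exactly on the lower margin (visualization only).
keep <- x > min(x)
h_clean <- hist(x[keep], breaks = 256, plot = FALSE)

max(h_all$counts)     # dominated by the margin bin
max(h_clean$counts)   # much smaller, so the real distribution is visible
```

The difference between the two maxima is what sets the y-axis scale in the
plain histogram.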
>   
>>> Also the axis changes
>>> between the files. I have tried to figure out how this function works
>>> (checked the normal and lattice information), but I am clearly not
>>> understanding what is the underlying set of data points that determines
>>> the y-axis scale. I would like to know how to reduce the y-axis scale
>>> and keep it constant between different files (at least if this is not
>>> something totally stupid to try).
>>>       
>> I'm not sure what you mean. Different calls with different flow frames
>> will have different scales, based on the data for that frame. You
>> should be able to explicitly specify 'xlim' and 'ylim' to be the same
>> in all calls. This doesn't work now, and that's a bug.  We will fix it
>> soon.
>>
>>     
> Well that is related to the problem above. I tried to pass xlim and ylim
> to the plot but nothing happened. I didn't get an error message either
> so I thought I was doing it wrong (but if I define xlim and ylim in
> xyplot it does work). So I was a bit confused. But if it is a bug then
> in theory what I tried was correct and once it is fixed it should work.
> In principle my idea was that I could just force the scale to be smaller
> on the plot to make my data look better. It might also be necessary if I
> need files for presentations or publications as it is usually expected
> that all axes are of the same scale.
That is a reason why I prefer density plots. They are already normalized 
(the area under each curve is 1), so getting comparable axes is much 
easier. Hard to do that on a frequency histogram when your sample sizes 
differ a lot...
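
A small base-R sketch of that point, on simulated data (the sample sizes
and the normal distribution are arbitrary choices for illustration): two
samples from the same distribution but with very different event counts
give very different frequency histograms, while their density-scaled
heights are directly comparable.

```r
# Two simulated samples from the same distribution, different sizes.
set.seed(2)
small <- rnorm(2000, mean = 5)
large <- rnorm(50000, mean = 5)

# Common breaks covering both samples.
all_values <- c(small, large)
breaks <- seq(floor(min(all_values)), ceiling(max(all_values)), by = 0.25)

h_small <- hist(small, breaks = breaks, plot = FALSE)
h_large <- hist(large, breaks = breaks, plot = FALSE)

# Frequency scale: peak heights differ roughly by the ratio of sample sizes.
c(small = max(h_small$counts), large = max(h_large$counts))

# Density scale: each histogram integrates to 1, peaks are comparable.
c(small = max(h_small$density), large = max(h_large$density))
```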
>  
>
>   
>> You could always use densityplot() instead to compare multiple FCS files.
>>
>> densityplot(~`FL1-H` | names, data = fs.trans[1:5])
>>
>>     
> And in fact this is probably a better way to combine several 'signals'
> in one plot to be honest.
>
>   
>>> Third, I have created densityplots and noticed that the order of files
>>> is not like the order in the phenoData info. In phenoData the files are
>>> ordered according to their file-names (or more precisely by the trailing
>>> numbers given by CellQuest), while they are plotted in some kind of
>>> alphabetical order in densityplot. Is it possible to pass an argument to
>>> densityplot that will plot the files in the file-names order?
>>>       
>> Yes, that's the default for factor levels (see ?factor) when the file
>> names get converted to a factor. You can control the order by
>> specifying the levels explicitly. For example, compare:
>>
>> densityplot(factor(name, levels = rev(unique(name))) ~`FL1-H`, data = fs.trans[1:5])
>>
>> and
>>
>> densityplot(factor(name, levels = unique(name)) ~`FL1-H`, data = fs.trans[1:5])
>>
>>     
> Exactly what I needed! Perfect.
>
>   
>>> And is it also possible to have the plot in black and white and not in color?
>>>       
>> Yes, e.g.
>>
>> densityplot( ~`FL1-H`, data = fs.trans[1:5], par.settings = standard.theme(color = FALSE))
>>
>> See ?trellis.device and ?flowViz.par.get for more details.
>>
>>     
> I admit that the higher plot functions are still a bit of a miracle to
> me but I think I start getting how to change some higher level
> functions. 
>   
If you are not afraid of a lattice overdose, I strongly recommend 
Deepayan's Springer book...
>   
>>> I have also tried different gates and managed to create ellipsoid,
>>> rectangular and n2Filter, but failed to produce a polygon gate. Could
>>> anyone provide me with a simple example of how to do that?
>>>       
>> See the example in ?polygonGate.
>>
>>     
> Ok, I missed this example, my fault. I have now managed to get a
> polygon gate!
>
>   
>>> And a last question, how could I produce a densityplot where I have an
>>> overlay instead of shingles for several files in one figure (such as is
>>> often used for publications, to show the shift from unstained, isotype
>>> control to specific staining).
>>>       
>> Unfortunately that's not yet supported by  the "flowSet" densityplot
>> method. You could however use the underlying lattice functions
>> directly to get what you want:
>>
>> tmpe <-
>>     fsApply(fs.trans[1:6],
>>             function(x) exprs(x)[, "FL1-H"],
>>             simplify = FALSE)
>>
>> densityplot(~data, do.call(make.groups, tmpe), groups = which,
>>             plot.points = FALSE, auto.key = list(columns = 3))
>>
>>     
> This works nicely too. Thanks.
>
>   
>> -Deepayan
>>     
>
> I have another question now. I was wondering if I can use four
> rectangular gates at the same time. This would be a bit like having
> quadrant statistics in other FACS software. I tried this by creating
> four gates with
> UL <- rectangleGate(filterId='UL', 'SSC-H' = c(400, 1000), 'FSC-H' = c(0, 500))
> LL <- rectangleGate(filterId='LL', 'SSC-H' = c(0, 400), 'FSC-H' = c(0, 500))
> UR <- rectangleGate(filterId='UR', 'SSC-H' = c(400, 1000), 'FSC-H' = c(500, 1000))
> LR <- rectangleGate(filterId='LR', 'SSC-H' = c(0, 400), 'FSC-H' = c(500, 1000))
> Now I can create subsets for each of these gates and get the percentages
> gated and create results with
> result_LL <- filter(fs.trans[1:4], LL)
> Percent.LL <- lapply(result_LL, summary)
> Percent.LL
> I have used FSC and SSC for this example but obviously this is something
> I eventually want to do with two fluorescent channels to identify double
> positive populations.
>
> I was wondering if I can now create a graph in which all 4 gates are
> plotted and the percentage in each gate is also plotted?
>   
In this case you could directly use the quadGate class (see ?quadGate). 
flowViz knows how to plot those.
data(GvHD)
foo <- GvHD[[1]]
qg <- quadGate("FSC-H"=500, "SSC-H"=400)
xyplot(`FSC-H` ~ `SSC-H`, foo, filter=qg)
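
For the percentages themselves, here is a minimal base-R sketch of what a
quadrant split at the same two cut points computes. It runs on simulated,
uniformly distributed values rather than on a flowFrame, and the names
fsc, ssc, fsc_cut and ssc_cut are placeholders for illustration; it is not
how quadGate is implemented.

```r
# Simulated scatter values standing in for two channels of one frame.
set.seed(3)
n <- 10000
fsc <- runif(n, 0, 1000)
ssc <- runif(n, 0, 1000)

# Same cut points as in the quadGate call above.
fsc_cut <- 500
ssc_cut <- 400

# Label each event: U/L for upper/lower in SSC, R/L for right/left in FSC.
quadrant <- paste0(ifelse(ssc >= ssc_cut, "U", "L"),
                   ifelse(fsc >= fsc_cut, "R", "L"))

# Percentage of events per quadrant; the four values add up to 100.
pct <- 100 * table(quadrant) / n
round(pct, 1)
```

The same labelling works for two fluorescence channels when looking for
double-positive populations.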

Adding additional gates to a plot is also possible using the glpolygon 
or glpoints methods for trellis-type plots and the gpoints or glines 
methods for base graphics plots. For the trellis plots you could either 
adjust the panel function to deal with multiple gates and call glpolygon 
in there, or you could use trellis.focus() to get to a particular panel 
in your plot (by clicking on it, or you get it for free if there is only 
one...) and now you can interactively add whatever you like. 
trellis.unfocus() will get rid of the red boundary after you are done.

xyplot(`FSC-H` ~ `SSC-H`, foo)
trellis.focus()
glpolygon(UL, gpar=list(gate=list(col="black", fill="red", alpha=0.2)))
glpolygon(LL, gpar=list(gate=list(col="black", fill="blue", alpha=0.2)))
glpolygon(UR, gpar=list(gate=list(col="black", fill="green", alpha=0.2)))
glpolygon(LR, gpar=list(gate=list(col="black", fill="black", alpha=0.2)))
trellis.unfocus()
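
The non-interactive alternative, drawing several gates from inside the
panel function, could look roughly like the sketch below. To keep it
self-contained it uses only lattice: simulated data, plain rectangles
with the same limits as the four gates in the question, and panel.rect in
place of glpolygon. The objects d, gates and fills are made up for this
example.

```r
library(lattice)

# Simulated scatter data standing in for one flowFrame.
set.seed(4)
d <- data.frame(fsc = runif(500, 0, 1000), ssc = runif(500, 0, 1000))

# Rectangle limits matching the UL/LL/UR/LR gates (x = FSC-H, y = SSC-H).
gates <- list(
  UL = c(xleft = 0,   ybottom = 400, xright = 500,  ytop = 1000),
  LL = c(xleft = 0,   ybottom = 0,   xright = 500,  ytop = 400),
  UR = c(xleft = 500, ybottom = 400, xright = 1000, ytop = 1000),
  LR = c(xleft = 500, ybottom = 0,   xright = 1000, ytop = 400)
)
fills <- c(UL = "red", LL = "blue", UR = "green", LR = "black")

# Custom panel function: draw the points first, then every gate on top.
p <- xyplot(ssc ~ fsc, data = d,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              for (id in names(gates)) {
                g <- gates[[id]]
                panel.rect(g[["xleft"]], g[["ybottom"]],
                           g[["xright"]], g[["ytop"]],
                           col = fills[[id]], border = "black", alpha = 0.2)
              }
            })
print(p)
```

Because the gates are drawn by the panel function, they reappear in every
panel and every time the plot is redrawn, with no need for
trellis.focus().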

Florian

> I have failed in plotting all 4 gates and I looked at the filterSet
> function but I am not sure if making a filter set is the right way to do
> this? And in fact I am not sure that this is possible at all. But from
> what I have figured out about lattice I thought that it is possible to
> 'add' further information to a graph after it is created. Maybe you can
> point me in the right direction how to do this?
>
> And I would like to thank you for the time and effort you have put into
> making this package for Flow-data. I have been searching for
> Flow-software working in Linux some time now and really this is the
> first time I have come across something that allows me to get some nice
> output and at the same time control over what I am doing with my data. I
> have used R mainly for microarray data in the past, so my learning curve
> wasn't that steep this time, so this might not be true for first time
> users. But I can only recommend taking the time to learn how to use
> flowViz to anyone looking for Linux based Flow-software.
>
> Thanks,
>
> Anja
>
>   
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
Florian Hahne, PhD
Computational Biology Program
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
PO Box 19024
Seattle, Washington 98109-1024
206-667-3148
fhahne at fhcrc.org