[BioC] Median-Interquartile Normalization

Thu Nov 8 12:10:26 CET 2007

Dear BioC Users,

Does anyone have a reference for (or a good definition of) median-interquartile normalization?  Is it the same as interquartile normalization? (If not, I would like a reference/definition of both)  I have looked for such a definition in the literature, but I haven't found a satisfactory one.

Thank you,
Monnie McGee

Monnie McGee, Ph.D.
Associate Professor
Department of Statistical Science
Southern Methodist University
Ph: 214-768-2462
Fax: 214-768-4035
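
No definition is given in this thread, so the following is only one plausible reading, stated as an assumption rather than a reference definition: each array (column) is centred at its median and divided by its interquartile range, so that every array ends up with median 0 and IQR 1. The function name `median_iqr_normalize` and the simulated data are hypothetical.

```r
# Assumed reading of "median-interquartile normalization":
# subtract each column's median, then divide by that column's IQR.
median_iqr_normalize <- function(x) {
  col_median <- apply(x, 2, median, na.rm = TRUE)
  col_iqr <- apply(x, 2, IQR, na.rm = TRUE)
  centred <- sweep(x, 2, col_median, "-")
  sweep(centred, 2, col_iqr, "/")
}

# Simulated log-intensities for two arrays with different location and spread
set.seed(1)
x <- cbind(array1 = rnorm(100, mean = 5, sd = 1),
           array2 = rnorm(100, mean = 8, sd = 3))
z <- median_iqr_normalize(x)

# After normalization both arrays share the same median and IQR
print(apply(z, 2, median))
print(apply(z, 2, IQR))
```

Under this reading, "interquartile normalization" on its own would presumably be the scaling step without the centring, but that too is a guess.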

-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch on behalf of bioconductor-request at stat.math.ethz.ch
Sent: Thu 11/8/2007 5:00 AM
To: bioconductor at stat.math.ethz.ch
Subject: Bioconductor Digest, Vol 57, Issue 8

Today's Topics:

   1. Re: Limma and Imagene flags (Leonardo Rocha)
   2. Re: Problem running eBayes/lmFit (smohapat at vbi.vt.edu)
   3. Combine dataset from different chip (Tom)
   4. Re: Problem running eBayes/lmFit (elliott harrison)
   5. Re: Affymetrix reannotation (Samuel Wuest)
   6. Limma: how to combine duplicateCorrelation, dyeeffect and
      arrayweights? (dorthe.belgardt at medisin.uio.no)
   7. Re: how to build a GOstats-compatible annotation package for
      plasmodium falciparum? (Marc Carlson)
   8. Re: Affymetrix reannotation (James W. MacDonald)
   9. GOTERM (Mete Civelek)
  10. Question about a packages not yet in Bioconductor:
      AffyProbeMiner in R260 under windows (phguardiol at aol.com)
  11. Re: GOTERM (Robert Gentleman)
  12. Re: Question about a packages not yet in Bioconductor:
      AffyProbeMiner in R260 under windows (Robert Gentleman)
  13. Re : Question about reannotated Affy annotation files -
      Question about AffyProbeMiner in R260 under windows
      (phguardiol at aol.com)
  14. Re: Re : Question about reannotated Affy annotation files -
      Question about AffyProbeMiner in R260 under windows (Jarno Tuimala)
  15. optimizazion cluster in Heatmap (Alessandro Fazio)

----------------------------------------------------------------------

Message: 1
Date: Wed, 7 Nov 2007 21:02:22 -0200
From: "Leonardo Rocha" <leobernardesrocha at gmail.com>
Subject: Re: [BioC] Limma and Imagene flags
To: <bioconductor at stat.math.ethz.ch>
Message-ID: <002001c82192$434cfd40$2d6691c8 at Biela>
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
	reply-type=original

Hello Sally,

You should try the following commands:

myfun <- function(x) as.numeric(x$Flag ==0)

RG <- read.imagene(filenames$target, wt.fun=myfun)
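
A minimal sketch of what this weight function does, using a made-up data frame in place of a real Imagene file (the `spots` object and its flag values are hypothetical). Spots with Flag 0 get weight 1 and are kept; any non-zero flag gets weight 0.

```r
# Weight function from the message above: 1 for Flag == 0, otherwise 0
myfun <- function(x) as.numeric(x$Flag == 0)

# Hypothetical stand-in for the spot table that read.imagene passes to wt.fun
spots <- data.frame(Flag = c(0, 1, 0, 3))

w <- myfun(spots)
print(w)
```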

I hope it helps.

Good luck.

Leonardo

----- Original Message ----- 
From: "Sally" <sagoldes at shaw.ca>
To: <bioconductor at stat.math.ethz.ch>
Sent: Tuesday, November 06, 2007 9:36 PM
Subject: Re: [BioC] Limma and Imagene flags

> From reading the Limma User's Guide it says that a flag value of 0 means
> that Limma considers this a "bad spot" and removes data flagged 0 from
> subsequent preprocessing.  But in Imagene a flag value of 0 means that
> this is a "good spot".  How do you get around this?
>
>
> Sally
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: 
> http://news.gmane.org/gmane.science.biology.informatics.conductor

------------------------------

Message: 2
Date: Wed, 7 Nov 2007 07:07:02 -0500 (EST)
From: smohapat at vbi.vt.edu
Subject: Re: [BioC] Problem running eBayes/lmFit
To: "elliott harrison" <e.harrison at epistem.co.uk>
Cc: bioconductor at stat.math.ethz.ch
Message-ID:
	<50466.71.171.41.163.1194437222.squirrel at webmail.vbi.vt.edu>
Content-Type: text/plain;charset=iso-8859-1

Hello Elliott:

I am trying to understand the problem here. From the design:

>   A9802 A9811 A9813 A9842 SAMPLE1 SAMPLE2 SAMPLE3 SAMPLE4
> 1     0     0     0     0       1       0       0       0
> 2     0     0     0     0       0       1       0       0
> 3     0     0     0     0       0       0       1       0
> 4     0     0     0     0       0       0       0       1
> 5     1     0     0     0       0       0       0       0
> 6     0     1     0     0       0       0       0       0
> 7     0     0     1     0       0       0       0       0
> 8     0     0     0     1       0       0       0       0

I understand that there is one sample of A9802(#5) and another of SAMPLE1
(#1).

In the contrasts, these two groups are compared (OneVOne):
> makeContrasts(OneVOne="A9802-SAMPLE1",OneVTwo="A9802-A9811",
>               TwoVOne="SAMPLE1-SAMPLE2",levels=design)
>

I guess that because there is only one sample in each group (eight
coefficients fitted to eight arrays), it is not possible to estimate the
variance, hence the error message.

This is how I understood Gordon's earlier post
(https://stat.ethz.ch/pipermail/bioconductor/2005-May/009056.html):

------------
The "no residual degrees of freedom" message occurs because you have
filtered out so many spots
(by setting the weight to 0) that you have no more than one spot left for
any of the probes.
Hence there is no replication left in your experiment.  No estimate of
variability can be made and
no statistical analysis can be done.
-------------
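
The replication argument can be checked with base R alone. This sketch rebuilds the 8 x 8 design shown above (one coefficient per array) and computes the residual degrees of freedom as the number of arrays minus the rank of the design. The variable names are illustrative, the replicated design is hypothetical, and limma itself is not called.

```r
# One array per group, in the row order of the design shown above
samples <- c("SAMPLE1", "SAMPLE2", "SAMPLE3", "SAMPLE4",
             "A9802", "A9811", "A9813", "A9842")
groups <- factor(samples,
                 levels = c("A9802", "A9811", "A9813", "A9842",
                            "SAMPLE1", "SAMPLE2", "SAMPLE3", "SAMPLE4"))
design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)

# Residual df = number of arrays - rank of the design matrix
residual_df <- nrow(design) - qr(design)$rank
print(residual_df)

# Hypothetical alternative: four groups with two arrays each
groups_rep <- factor(rep(c("G1", "G2", "G3", "G4"), each = 2))
design_rep <- model.matrix(~ 0 + groups_rep)
residual_df_rep <- nrow(design_rep) - qr(design_rep)$rank
print(residual_df_rep)
```

With no residual degrees of freedom there is nothing left from which to estimate a variance, which is consistent with the error message quoted below.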

If anyone knows more clearly, please elaborate.

Saroj

> The matrix seems to be doing what I want
>
>
>> cont.matrix
> Contrasts
> Levels    OneVOne OneVTwo TwoVOne
> A9802         1       1       0
> A9811         0      -1       0
> A9813         0       0       0
> A9842         0       0       0
> SAMPLE1      -1       0       1
> SAMPLE2       0       0      -1
> SAMPLE3       0       0       0
> SAMPLE4       0       0       0
>
>
>> fit2  <- contrasts.fit(fit, cont.matrix)
>> fit2  <- eBayes(fit2)
> Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim =
> stdev.coef.lim) : No residual degrees of freedom in linear model fits
>
>
> I've found a post that says this error message occurs because all data
> is weighted out. I've checked the data after it is loaded and after
> backgroundCorrect, and it does not appear to be. Beyond that, it doesn't look
> like the normalizeBetweenArrays of RTotalbg$R RTRN has any weights. So I
> must not be setting up the design matrix correctly?
>
> Any and all clues as to where I'm going wrong greatly appreciated.
>
>
>
> Elliott Harrison

------------------------------

Message: 3
Date: Wed, 7 Nov 2007 12:04:07 +0000 (UTC)
From: Tom <tomnovy at email.it>
Subject: [BioC] Combine dataset from different chip
To: bioconductor at stat.math.ethz.ch
Message-ID: <loom.20071107T114711-812 at post.gmane.org>
Content-Type: text/plain; charset=us-ascii

Hi,
I'm trying to merge data that come from two different microarray chips (hgu133a
and HThgu133a).
I have seen that a lot of probe sets are common to the two chips.

Do you have any suggestions?

------------------------------

Message: 4
Date: Wed, 7 Nov 2007 12:15:43 -0000
From: "elliott harrison" <e.harrison at epistem.co.uk>
Subject: Re: [BioC] Problem running eBayes/lmFit
To: <smohapat at vbi.vt.edu>
Cc: bioconductor at stat.math.ethz.ch
Message-ID:
	<DFDB9D8E7F453A4D9C29C66DE3410D835B1DA7 at server.epistem.local>
Content-Type: text/plain;	charset="us-ascii"

Hi Saroj,

I see, so multiple arrays in each group are needed.
So I'll need to do some simpler test between the 2 arrays?
Any suggestions?

Thanks

Elliott 


------------------------------

Message: 5
Date: Wed, 07 Nov 2007 15:45:20 +0100
From: "Samuel Wuest" <swuest at botinst.uzh.ch>
Subject: Re: [BioC] Affymetrix reannotation
To: Marc Carlson <mcarlson at fhcrc.org>
Cc: Bioconductor at stat.math.ethz.ch
Message-ID: <web-10826981 at idmailbe2b.unizh.ch>
Content-Type: text/plain;charset=iso-8859-1;format="flowed"

Thanks a lot,

I am familiar with the ath1121501 package. What I don't know is whether it
contains information at the probe level rather than at the probeset level
(e.g. whether a designed probe still targets an updated gene model, or which
probesets are obsolete and can be considered outdated).

And: How can I retrieve such information without writing a blast-script to 
check every single probe against the most updated cDNA database?

Thanks for any help on this.
Samuel

On Tue, 06 Nov 2007 10:31:38 -0800
  Marc Carlson <mcarlson at fhcrc.org> wrote:
> Samuel Wuest wrote:
>> Hi,
>>
>> I am trying to assess a modified version of the Affymetrix present/absent
>> call algorithm on my data, which have been generated from two-round
>> amplified RNA hybridized to the Arabidopsis ATH1 GeneChip.
>>
>> I would need some negative control probesets, and thought of probesets
>> that were designed based on wrong gene annotations (similar to what was
>> used in the PANP method from Peter Warren et al.).
>> Therefore, I wondered whether someone does a reannotation of Affy
>> probesets on a regular basis? (For example, version 7 of the Arabidopsis
>> Genome Annotation has been released recently, and some probesets on the
>> AffyChip most probably no longer match any gene model.)
>>
>> If yes, where could I find the updated datafiles? Is there anything in the
>> BioC annotation package for the ATH1Chip?
>>
>> Thanks for any help.
>> Best, Sam
>>
>> -----------------------------------------------------
>> Dipl. Bot. Samuel Wust
>> Dep. of Developmental Genetics
>> Institute for Plant Biology
>> University of Zuerich
>> Zollikerstrasse 107
>> CH - 8008 Zürich
>> Phone: +41-(0)44 634 82 42
>> Mobile: + 41 (0)76 501 69 22
>> Email: swuest at botinst.uzh.ch
>>
> 
> We provide a "standard" annotation package for this chip.  You can find
> it here:
> 
> http://bioconductor.org/packages/2.1/data/annotation/html/ath1121501.db.html
> 
> 
>    Marc
>

------------------------------

Message: 6
Date: Wed, 7 Nov 2007 17:21:09 +0100 (CET)
From: dorthe.belgardt at medisin.uio.no
Subject: [BioC] Limma: how to combine duplicateCorrelation, dyeeffect
	and arrayweights?
To: bioconductor at stat.math.ethz.ch
Message-ID: <1667.129.240.47.68.1194452469.squirrel at webmail.uio.no>
Content-Type: text/plain;charset=iso-8859-1

Hi,

I am quite unsure whether some parts of the analysis I did in Limma are really
correct and I would highly appreciate it if someone could take a look and
give advice. My main concern is that I may not be using
duplicateCorrelation, the dye effect, arrayWeights and spot weights correctly.

The arrays I use are printed in duplicate with a spacing of 15000 (so
30000 features in total), and I did the image processing in GenePix Pro 6.1.
There I flagged as bad all spots close to background signal and all spots with
an Rgn R2 < 0.5, and only 30% of my data remain unflagged.

And this is what I did using Limma:

> targets=readTargets("Targets_basicSat.txt")
> targets
   SlideNumber          FileName Cy3 Cy5
1            1 3096_basicSat.gpr ref   A
2            2 3079_basicSat.gpr   A ref
3            3 3089_basicSat.gpr ref   A
4            4 3081_basicSat.gpr   A ref
5            5 3071_basicSat.gpr ref   B
6            6 3082_basicSat.gpr   B ref
7            7 3085_basicSat.gpr ref   B
8            8 8268_basicSat.gpr   B ref
9            9 7829_basicSat.gpr ref   C
10          10 3086_basicSat.gpr   C ref
11          11 7823_basicSat.gpr ref   C
12          12 7826_basicSat.gpr   C ref
13          13 3090_basicSat.gpr ref   D
14          14 3091_basicSat.gpr   D ref
15          15 3092_basicSat.gpr ref   D
16          16 7827_basicSat.gpr   D ref

Every other slide is a dyeswapped technical replicate and per "group"
(A,B,C,D) there are 2 biological replicates.

> K=read.maimages(targets$FileName, source="genepix.median",
wt.fun=wtflags(0))
> types=readSpotTypes("SpottypesGAPDH.txt")
> Status=controlStatus(types, K)
> K$genes$Status=Status
> K3=backgroundCorrect(K, method="normexp", offset=50)
> K3=normalizeWithinArrays(K3, method="median")
> K3a=normalizeBetweenArrays(K3, method="quantile")
> design=modelMatrix(targets, ref="ref")
> design
      A  B  C  D
 [1,]  1  0  0  0
 [2,] -1  0  0  0
 [3,]  1  0  0  0
 [4,] -1  0  0  0
 [5,]  0  1  0  0
 [6,]  0 -1  0  0
 [7,]  0  1  0  0
 [8,]  0 -1  0  0
 [9,]  0  0  1  0
[10,]  0  0 -1  0
[11,]  0  0  1  0
[12,]  0  0 -1  0
[13,]  0  0  0  1
[14,]  0  0  0 -1
[15,]  0  0  0  1
[16,]  0  0  0 -1

Since I am expecting a non-negligible dye effect, I created another
design matrix and the following contrast matrix:

>design1=cbind(DyeEffect=1, design)
>design.cont=makeContrasts("A", "B", "A-B", levels=design1)
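
To make the structure of these two matrices concrete, here is a base-R sketch that rebuilds the dye-swap design above by hand, adds the constant dye-effect column, and writes out the three contrasts as plain coefficient vectors. It does not call limma; the object names `dye_sign` and `contrast_by_hand` are illustrative, and the hand-built contrast matrix is assumed to match what `makeContrasts` would produce.

```r
# 16 arrays, 4 per group; +1 when the sample is in Cy5, -1 when dye-swapped
groups <- rep(c("A", "B", "C", "D"), each = 4)
dye_sign <- rep(c(1, -1), times = 8)
design <- sapply(c("A", "B", "C", "D"),
                 function(g) dye_sign * (groups == g))

# Same construction as in the message: a constant column for the dye effect
design1 <- cbind(DyeEffect = 1, design)

# The dye-effect column is not a combination of the group columns,
# so the design has full column rank and leaves residual df for estimation
design_rank <- qr(design1)$rank
residual_df <- nrow(design1) - design_rank
print(c(rank = design_rank, residual_df = residual_df))

# Contrasts written out as coefficient vectors over the columns of design1
contrast_by_hand <- cbind(
  "A"   = c(0, 1,  0, 0, 0),
  "B"   = c(0, 0,  1, 0, 0),
  "A-B" = c(0, 1, -1, 0, 0)
)
rownames(contrast_by_hand) <- colnames(design1)
print(contrast_by_hand)
```

None of the contrasts puts weight on the DyeEffect coefficient, so the dye effect is estimated and adjusted for but not itself tested.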

Next I estimate the correlation of within-array-duplicates:

>cor=duplicateCorrelation(K3b, design=design1, ndups=2, spacing=15000,
weights=K3b$weights)

My first question is: is it correct to use the design matrix that includes
the dye effect (design1 in this case) here?

When fitting the linear model, I also want to use arrayweights, combined
with spotweights. So I gave following commands:

> aw=arrayWeights(K3b, design=design1)
> w=matvec(K3b$weights, aw)
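
As far as I understand limma's `matvec(M, v)`, it multiplies column j of `M` by `v[j]`, so each array's spot weights are scaled by that array's weight. The base-R sketch below does the same arithmetic with invented numbers (`spot_w` and `aw` are hypothetical, not output of `arrayWeights`). It also shows that a spot weight of 0 stays 0 whatever the array weight, whereas a small weight such as 0.1 keeps the spot in the fit with low influence.

```r
# Hypothetical spot weights: rows are spots, columns are arrays
spot_w <- matrix(c(1, 0, 1,
                   1, 1, 0.1),
                 nrow = 3,
                 dimnames = list(paste0("spot", 1:3), c("array1", "array2")))

# Hypothetical array weights, one per column of spot_w
aw <- c(array1 = 2, array2 = 0.5)

# Scale each column of spot weights by its array weight
w <- sweep(spot_w, 2, aw, "*")
print(w)
```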

Again the question: is it correct to use the design1 matrix, which includes
the dye effect, here?

Then I fit the linear model:

>fit=lmFit(K3b, design=design1, ndups=2, spacing=15000, cor=cor$consensus,
weights=w)
>fit1=contrasts.fit(fit, design.cont)
>eb=eBayes(fit1)

Another thing I am worried about is that taking into account the dye effect
plus array weights plus spot weights might be a bit "too much", in a
way "overtransforming" my data, especially since approx. 70% of my data
have a spot weight of zero. Might it be better to use a spot weight of 0.1
for bad spots, so that I do not lose the data completely?

My apologies for this long email. I tried hard to find the answers
myself by reading the limma guide and lots of other documents I found by
googling, but I still feel quite "stuck" in my analysis process.

Thanks very much for any kind of help in advance!
Best regards
Dorthe

-- 
Dorthe Belgardt
Institute of Basic Medical Sciences
Department of Physiology
P.O. Box 1103 Blindern
0317 Oslo
Norway

------------------------------

Message: 7
Date: Wed, 07 Nov 2007 10:47:54 -0800
From: Marc Carlson <mcarlson at fhcrc.org>
Subject: Re: [BioC] how to build a GOstats-compatible annotation
	package for plasmodium falciparum?
To: Paul Shannon <pshannon at systemsbiology.org>
Cc: bioc <bioconductor at stat.math.ethz.ch>
Message-ID: <4732085A.9040907 at fhcrc.org>
Content-Type: text/plain; charset=ISO-8859-1

Paul Shannon wrote:
> Hi Marc,
>
> It sounds like you have your hands full!
>
> Would it be crazy if we were to undertake this ourselves, at the SBRI?
> I am thinking of an organism-based package (like YEAST) rather than
> a chip-specific package.
>
> Does the new pipeline create annotation packages which work with
> GOstats and GSEA?
>
>  - Paul
>
>
> On Nov 6, 2007, at 4:41 PM, Marc Carlson wrote:
>
>> Well I plan to do this.  But it's just not going to happen overnight.  I
>> am booked right now with a homology implementation and I also have to
>> finish getting part of the promised pipeline in place for our other
>> annotation package collaborators so that they can update their packages
>> to the newer format for the upcoming release.  This means that I may not
>> get to start this for a couple of months, but it has been added to my
>> todo list.
>>
>> For now, I recommend that you try to use the AnnBuilder package.  We are
>> planning to retire this package along with the style of annotation
>> package that it spawns so this is definitely NOT a good long term
>> solution and is to be used for the short term ONLY.  But I think that it
>> will probably get you the quick fix that you need.
>>
>> http://bioconductor.org/packages/2.1/bioc/html/AnnBuilder.html
>>
>>
>>     Marc

Yes, I am a very busy guy.  I would love to collaborate with you on
this.  But I don't think that making a new package from scratch would be
a very efficient use of your time.  That is, there are good reasons why
it's going to take me a little while to get it to you.  There are a lot
of things to do.  I agree that an organism-based package is what is
called for here and that is what I was planning to work on.

As for your immediate needs, all you should need for GOstats or GSEA is
an environment which you could make for yourself from the appropriate
information.  I can give you those parts in an unformatted form if you
want them.   To format them into a proper environment you should only
need to wrap them up in one.

#Let's suppose that we rip off some of the info from the YEAST package to
#see how this would work:
library(YEAST)
res=mget(ls(YEASTGO), YEASTGO)

#Then we could quickly make a couple of quick fakey environments:
MYGO=new.env(parent=emptyenv())
for (nm in names(res)) MYGO[[nm]] <- res[[nm]]

MYENTREZID <- new.env()
for (nm in ls(MYGO)) MYENTREZID[[nm]] <- paste("fauxId", nm)

#Then we could package them up into a local environment:
MYpkg <- new.env(parent=emptyenv())
MYpkg[["MYGO"]] <- MYGO
MYpkg[["MYENTREZID"]] <- MYENTREZID

#And attach them
attach(MYpkg, 2, "package:MY")

#At this point we should be able to do with these environments whatever
#we need to.
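
For anyone without the YEAST package installed, the same pattern can be tried with base R only. This is a sketch under stated assumptions: the gene names and the small `res` list are invented stand-ins for the real gene-to-GO mappings, and nothing here checks what GOstats or GSEA actually require of the environments.

```r
# Hypothetical stand-in for mget(ls(YEASTGO), YEASTGO): gene id -> GO ids
res <- list(gene1 = c("GO:0000075", "GO:0000375"),
            gene2 = "GO:0001503")

# Copy the list into an environment, one binding per gene
MYGO <- new.env(parent = emptyenv())
for (nm in names(res)) MYGO[[nm]] <- res[[nm]]

# Fake identifier mapping built from the keys of MYGO
MYENTREZID <- new.env(parent = emptyenv())
for (nm in ls(MYGO)) MYENTREZID[[nm]] <- paste("fauxId", nm)

# Look up several keys at once, as annotation code typically does
looked_up <- mget(c("gene1", "gene2"), envir = MYGO)
print(looked_up)
print(MYENTREZID[["gene1"]])
```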

I have the relevant information here from NCBI for falciparum to make
both of these environments (for real).  If you send me a personal email,
I can arrange to get it to you...

    Marc

------------------------------

Message: 8
Date: Wed, 07 Nov 2007 20:39:47 -0500
From: "James W. MacDonald" <jmacdon at med.umich.edu>
Subject: Re: [BioC] Affymetrix reannotation
To: Samuel Wuest <swuest at botinst.uzh.ch>
Cc: Bioconductor at stat.math.ethz.ch
Message-ID: <473268E3.5030007 at med.umich.edu>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Samuel,

If you just want to ensure the probes in each probeset actually map to 
the current genome build, then you might consider the MBNI re-mapped 
cdfs. For the ath1121501 chip there are two you could consider; the 
ath1atrefseqcdf or ath1attaircdf packages.

The probes were first blasted against this version of the genome 
(1con.01222004, file date is 08/10/2006) -- not sure what that means, as 
I am not a plant guy. The probes that mapped to a single unique sequence 
on the genome were then annotated to genes using either TAIR or RefSeq 
(this is part of the package name above), so you are assured that any 
probes that are outdated have been removed.

More info is available here (note these are version 10):

http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download.asp#v10

You can get the package(s) in one of two ways. First, you can get them 
automagically by doing something like:

dat <- ReadAffy(filenames=list.celfiles(), cdfname="ath1atrefseqcdf")

or you can download/install using biocLite() and then specify in your 
call to ReadAffy() (or justRMA()/justGCRMA() if you are going that route).

Best,

Jim


-- 
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
1500 E Medical Center Drive
Ann Arbor MI 48109
734-647-5623

------------------------------

Message: 9
Date: Wed, 07 Nov 2007 21:25:12 -0500
From: Mete Civelek <mete at seas.upenn.edu>
Subject: [BioC] GOTERM
To: Bioconductor at stat.math.ethz.ch
Message-ID:
	<6.2.0.14.1.20071107211626.049b0750 at mete.mail.seas.upenn.edu>
Content-Type: text/plain; charset="us-ascii"; format=flowed

Hi All,

I have a simple problem that I can't seem to solve. I am using R 2.4.0 on 
windows. I am trying to get the GO Terms of a list of GO IDs using the 
following code. The list of GOIDs is in a file labelled SDGO.txt, which is 
a single column text file with no header.

 >library(GO)

 >SDGO<-read.table("SDGO.txt", header=F)

 >summary(SDGO)
           V1
  GO:0000075:  1
  GO:0000375:  1
  GO:0000377:  1
  GO:0000398:  1
  GO:0001503:  1
  GO:0001505:  1
  (Other)   :134

 >apply(SDGO, 1, GOTERM)
Error in get(x, envir, mode, inherits) : variable "GOTERM" of mode 
"function" was not found

I am sure I am way off in this code since I am a beginner with R and
Bioconductor, but I would appreciate any help.

Best,

Mete

------------------------------

Message: 10
Date: Thu, 08 Nov 2007 01:28:43 -0500
From: phguardiol at aol.com
Subject: [BioC] Question about a packages not yet in Bioconductor:
	AffyProbeMiner in R260 under windows
To: Bioconductor at stat.math.ethz.ch
Message-ID: <8C9EFE7BE8C9965-BA8-39AB at mblk-d19.sysops.aol.com>
Content-Type: text/plain

Dear colleagues,

I'm trying to use the AffyProbeMiner packages available from http://gauss.dbb.georgetown.edu/liblab/affyprobeminer/gene.html that can replace the usual CDF probes and annotation files available in BioC.

I'm using R 2.6.0 under WinXPPro SP2.
I have downloaded the files available on the webpage above into the R library folder.
These are gz.rar compressed files and are not recognized by R for Windows via Packages -> Install packages from local zip files.
I have used WinRar to uncompress these files and have copied the folders (ex: hgu133ageneccds) located in the uncompressed folders (ex: hgu133ageneccds_1.1.0) into the R library folder.

Then if I type:
> library(hgu133ageneccds)

I obtain the following error:
Error in library(hgu133ageneccds) : 
  'hgu133ageneccdscdf' is not a valid package -- installed < 2.0.0?
The same is true for all of these files.

Is it a problem of compatibility with R (files too old?)? Should I use a different way to install these packages in R 2.6.0?

Is there a plan to include these files and their updates in BioC metadata?

Thanks for your help.
I hope this request is not out of scope for this list, since it is not a Bioconductor package.

Philippe Guardiola, MD


------------------------------

Message: 11
Date: Wed, 07 Nov 2007 23:21:54 -0800
From: Robert Gentleman <rgentlem at fhcrc.org>
Subject: Re: [BioC] GOTERM
To: Mete Civelek <mete at seas.upenn.edu>
Cc: Bioconductor at stat.math.ethz.ch
Message-ID: <4732B912.7040404 at fhcrc.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

First, your R, and hence your Bioconductor packages are a year out of 
date, so you should update them.

Mete Civelek wrote:
> Hi All,
> 
> I have a simple problem that I can't seem to solve. I am using R 2.4.0 on 
> windows. I am trying to get the GO Terms of a list of GO IDs using the 
> following code. The list of GOIDs is in a file labelled SDGO.txt, which is 
> a single column text file with no header.
> 
>  >library(GO)
> 
>  >SDGO<-read.table("SDGO.txt", header=F)
> 
>  >summary(SDGO)
>            V1
>   GO:0000075:  1
>   GO:0000375:  1
>   GO:0000377:  1
>   GO:0000398:  1
>   GO:0001503:  1
>   GO:0001505:  1
>   (Other)   :134
> 
>  >apply(SDGO, 1, GOTERM)
> Error in get(x, envir, mode, inherits) : variable "GOTERM" of mode 
> "function" was not found

  why do you think that is what you should do?

  something more like
   mget(SDGO[,1], GOTERM)

  will be more appropriate

  or sapply(SDGO[,1], getGOTerm)
  would be another alternative
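
  The error above arises because GOTERM is an environment to look keys up in, not a function to call. A base-R sketch of the lookup, using a mock environment in place of the real GOTERM (the object `GOTERM_mock` and the placeholder term strings are hypothetical). Note the `as.character()`: `mget()` expects a character vector, and in R versions of this era `read.table()` returns such a column as a factor.

```r
# Mock stand-in for the GOTERM environment: GO id -> term
GOTERM_mock <- new.env(parent = emptyenv())
assign("GO:0000075", "placeholder term for GO:0000075", envir = GOTERM_mock)
assign("GO:0000375", "placeholder term for GO:0000375", envir = GOTERM_mock)

# Mimic the single-column table read from SDGO.txt (column stored as a factor)
SDGO <- data.frame(V1 = factor(c("GO:0000075", "GO:0000375")))

# An environment is not a function, which is why apply(SDGO, 1, GOTERM) fails
print(is.function(GOTERM_mock))

# Look the ids up as keys instead
terms <- mget(as.character(SDGO[, 1]), envir = GOTERM_mock)
print(terms)
```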

> 
> I am sure I am way off in this code since I am a beginner of R and 
> Bioconductor but I will appreciate any help?

> 
> Best,
> 
> Mete
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org

------------------------------

Message: 12
Date: Wed, 07 Nov 2007 23:27:37 -0800
From: Robert Gentleman <rgentlem at fhcrc.org>
Subject: Re: [BioC] Question about a packages not yet in Bioconductor:
	AffyProbeMiner in R260 under windows
To: phguardiol at aol.com
Cc: Bioconductor at stat.math.ethz.ch
Message-ID: <4732BA69.9060701 at fhcrc.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

It looks like you are using windows and they do not provide packages for 
windows users. The reason they are not recognized by R is that they 
won't work.

Your options are to
1) install enough tools on your computer to build packages (details can 
be found on CRAN)
2) use the remapped probe packages that are released through the 
Bioconductor project for which we do have windows packages

I have no idea what the intentions of the AffyProbeMiner authors are; they
have never submitted anything, and until they do, it won't even be considered.

  best wishes
    Robert


-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org

------------------------------

Message: 13
Date: Thu, 08 Nov 2007 03:33:33 -0500
From: phguardiol at aol.com
Subject: [BioC] Re : Question about reannotated Affy annotation files
	- Question about AffyProbeMiner in R260 under windows
To: Bioconductor at stat.math.ethz.ch
Message-ID: <8C9EFF92E9FB66A-754-23AD at FRR1-L18.sis.aol.com>
Content-Type: text/plain

Thanks for this information.

Regarding your suggestion of using the remapped probe packages that are released through the 
Bioconductor project for which we do have Windows packages:
1- Are those the ones named CustomCDF?
2- How to use them is not clear to me (if these are the ones to be used). There are multiple files for a given chip, and what they represent, how they have been built, and how to use them (for instance, for Affy chips using gcrma) is not clearly explained (unless I have missed something somewhere?). Would it be possible to get a little help on this from the BioC group?
Thanks again
Philippe Guardiola, MD

-----Original Message-----
From: Robert Gentleman <rgentlem at fhcrc.org>
To: phguardiol at aol.com
Cc: Bioconductor at stat.math.ethz.ch
Sent: Thu 8 November 2007 8:27
Subject: Re: [BioC] Question about a packages not yet in Bioconductor: AffyProbeMiner in R260 under windows

It looks like you are using windows and they do not provide packages for 
windows users. The reason they are not recognized by R is that they 
won't work.
Your options are to
1) install enough tools on your computer to build packages (details can 
be found on CRAN)
2) use the remapped probe packages that are released through the 
Bioconductor project for which we do have windows packages
I have no idea what the intentions of AffyProbeMiner are, they have 
never submitted anything, and until they do, it won't even be considered,
  best wishes
   Robert

	[[alternative HTML version deleted]]

------------------------------

Message: 14
Date: Thu, 8 Nov 2007 10:56:25 +0200 (EET)
From: Jarno Tuimala <jtuimala at csc.fi>
Subject: Re: [BioC] Re : Question about reannotated Affy annotation
	files - Question about AffyProbeMiner in R260 under windows
To: phguardiol at aol.com
Cc: Bioconductor at stat.math.ethz.ch
Message-ID: <Pine.LNX.4.62.0711081042300.25104 at sampo3.csc.fi>
Content-Type: text/plain; charset="iso-8859-1"

Hi!

The methodology for generating these remapped probe packages is described 
in NAR and on the web. See:

http://nar.oxfordjournals.org/cgi/content/full/33/20/e175
http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/cdfreadme.htm

There is also an example of how to use these packages at the end of 
the readme page.

In addition to Bioconductor, you can download the probe packages from:

http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download_v10.asp

If you check the list on this site, you'll see that there are several 
packages for the same chip. These packages are based on different genome 
features (gene or transcript) or on different databases (Unigene, RefSeq, 
Entrez Gene, Ensembl, Vega).

Note that in order to be able to use GCRMA, you need to install both CDF 
and probe packages for your chip.
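
As a rough illustration of the point about needing both packages, here is a minimal sketch in R. The function names custom_pkg_names and gcrma_custom are hypothetical, the example cdfname "HGU133A_Hs_ENTREZG" is an assumption, and the name-derivation rule only approximates the usual lower-case naming convention; please check the readme page for the exact package names for your chip.

```r
# Derive the CDF and probe package names that a custom cdfname would map to.
# Assumption: names are the cdfname lower-cased with non-alphanumerics removed,
# followed by "cdf" or "probe".
custom_pkg_names <- function(cdfname) {
  base <- tolower(gsub("[^[:alnum:]]", "", cdfname))
  c(cdf = paste0(base, "cdf"), probe = paste0(base, "probe"))
}

# Run GCRMA with a custom CDF after checking that everything needed is installed.
gcrma_custom <- function(cel_dir, cdfname) {
  needed <- c("affy", "gcrma", custom_pkg_names(cdfname))
  ok <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
  if (!all(ok)) {
    stop("Please install: ", paste(needed[!ok], collapse = ", "))
  }
  # cdfname makes the AffyBatch use the custom CDF instead of the default one
  abatch <- affy::ReadAffy(celfile.path = cel_dir, cdfname = cdfname)
  gcrma::gcrma(abatch)
}

# Example call (directory and cdfname are assumptions):
# eset <- gcrma_custom("path/to/celfiles", "HGU133A_Hs_ENTREZG")
```

The up-front check is there because a missing probe package would otherwise only show up once GCRMA tries to compute probe affinities.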

Best wishes,
Jarno

On Thu, 8 Nov 2007 phguardiol at aol.com wrote:

>
> Thanks for these information.
>
> Regarding your proposal of using the remapped probe packages that are released through the
> Bioconductor project for which we do have windows packages:
> 1- Are those which are named CustomCDF ?
> 2- The way to use these is not clear for me (if these are the one to 
> be used) there are multiple files for a given chip and what they 
> represent, how they have been built, and how to use these (for 
> instance for affy chips using gcrma) is not clearly explained 
> (unless I have missed something somewhere..?). Could it be possible 
> to obtain a little bit of help on this from the BioC group....
> Thanks again
> Philippe Guardiola, MD

-----------------------------------------------------------------------------
Jarno Tuimala, FT, bioinformatiikan asiantuntija, CSC, PL 405, 02101 Espoo 
puh.: (09) 457 2226, fax: (09) 457 2302, s-posti: jarno.tuimala at csc.fi
CSC on tieteen tietotekniikan keskus, http://www.csc.fi/molbio

Jarno Tuimala, PhD, bioinformatics, CSC, P.O.Box 405, FI-02101 Espoo, Finland 
tel.: +358 9 457 2226, fax: +358 9 457 2302, e-mail: jarno.tuimala at csc.fi
CSC is the Finnish IT Center for Science, http://www.csc.fi/molbio
-----------------------------------------------------------------------------

------------------------------

Message: 15
Date: Thu, 8 Nov 2007 09:10:50 +0000
From: Alessandro Fazio <fazioalessandro at hotmail.com>
Subject: [BioC] optimizazion cluster in Heatmap
To: Bioconductor <bioconductor at stat.math.ethz.ch>
Message-ID: <BLU120-W177771F390B42893B313E4AA8B0 at phx.gbl>
Content-Type: text/plain

Hello everybody,

I have a problem clustering with heatmap. Briefly, I want a heatmap with a fixed column order and a row order determined by the dendrogram produced by the clustering method. However, the row ordering does not seem optimal: genes with similar expression profiles are not grouped together but spread out.

This is the code I used:

> mydist <- function(x) cor.dist(x)
> myhclust <- function(x) hclust(x, method='average')
> heatmap(exprs(exampleSet),Colv=NA, dist=mydist, hclust=myhclust)

> sessionInfo()
R version 2.5.0 (2007-04-23)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] 'splines'   'tools'     'stats'     'graphics'  'grDevices' 'utils'
[7] 'datasets'  'methods'   'base'

other attached packages:
    bioDist          GO  genefilter    survival         ALL   Rgraphviz geneplotter
    '1.8.0'    '1.16.0'    '1.14.1'      '2.31'     '1.4.3'    '1.14.1'    '1.14.0'
    lattice    annotate     Biobase        RBGL       graph
   '0.15-4'    '1.14.1'    '1.14.0'    '1.12.0'    '1.14.2'

Any idea what I should do to get a good row ordering?

Thank you in advance.

Regards,
Alessandro
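
For anyone who wants to reproduce the setup without the Bioconductor packages, the following is a small self-contained sketch in base R. It assumes a toy matrix in place of exprs(exampleSet) and uses 1 - Pearson correlation as a stand-in for cor.dist, which may not match the defaults in bioDist; the names cor_dist and avg_hclust are hypothetical.

```r
set.seed(1)

# Toy expression matrix: 6 "genes" x 8 "samples", two groups with opposite trends
trend <- seq(-1, 1, length.out = 8)
x <- rbind(
  t(replicate(3,  trend + rnorm(8, sd = 0.1))),
  t(replicate(3, -trend + rnorm(8, sd = 0.1)))
)
rownames(x) <- paste0("gene", 1:6)

# Row-wise correlation distance (assumption: 1 - Pearson correlation)
cor_dist <- function(m) as.dist(1 - cor(t(m)))

# Average-linkage hierarchical clustering, as in the post
avg_hclust <- function(d) hclust(d, method = "average")

hc <- avg_hclust(cor_dist(x))
groups <- cutree(hc, k = 2)   # cluster membership of each row
row_order <- hc$order         # row order implied by the dendrogram

# Same settings as the heatmap call in the post, with full argument names:
# heatmap(x, Colv = NA, distfun = cor_dist, hclustfun = avg_hclust)
```

Inspecting groups and row_order on a small matrix like this makes it easier to separate what the distance and linkage choices do from what the plotting step does.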
_________________________________________________________________
[[replacing trailing spam]]

	[[alternative HTML version deleted]]

------------------------------

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor

End of Bioconductor Digest, Vol 57, Issue 8
*******************************************