[BioC] hgu133plus2cdfSYMBOL not found

Tony McBryan tony at mcbryan.co.uk
Tue Dec 15 10:14:53 CET 2009


Hello Philip,

Thank you very much for your help.

After some experimentation, the following code excerpt works when placed 
at the head of my previous script:

---
get.annotation <- function (x, cdfname, verbose = FALSE)
{
    # Load the .db annotation package, e.g. "hgu133plus2.db"
    library(paste(cdfname, ".db", sep = ""), character.only = TRUE)

    # Look up each mapping object by name, e.g. hgu133plus2SYMBOL
    symb <- simpleaffy:::.strip.list(mget(x, envir = get(paste(cdfname, "SYMBOL", sep = ""))))
    desc <- simpleaffy:::.strip.list(mget(x, envir = get(paste(cdfname, "GENENAME", sep = ""))))
    accno <- simpleaffy:::.strip.list(mget(x, envir = get(paste(cdfname, "ACCNUM", sep = ""))))
    uni <- simpleaffy:::.strip.list(mget(x, envir = get(paste(cdfname, "UNIGENE", sep = ""))))

    ok <- (symb != "NoAnno") & (desc != "NoAnno") & (accno != "NoAnno") &
        (uni != "NoAnno")
    names(ok) <- x
    # ok is a vector, so test it with any() and report the probesets that failed
    if (verbose && any(!ok, na.rm = TRUE)) {
        warning(paste("value for '", names(ok)[which(!ok)], "' not found", sep = ""),
            call. = FALSE)
    }
    # Spreadsheet HYPERLINK formulas pointing at the NCBI entries
    acc.lnk <- paste("=HYPERLINK(\"http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=nucleotide&term=",
        accno, "\",\"", accno, "\")", sep = "")
    acc.lnk[!ok] <- "NoAnno"
    uni.lnk <- paste("=HYPERLINK(\"http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=unigene&term=",
        uni, "&dopt=unigene\",\"", uni, "\")", sep = "")
    uni.lnk[!ok] <- "NoAnno"
    res <- cbind(symb, acc.lnk, uni.lnk, desc)
    res[res == "NoAnno"] <- "No Annotation Found"
    colnames(res) <- c("gene name", "accession", "unigene", "description")
    return(res)
}
#<environment: namespace:simpleaffy>

name = "get.annotation"
env = getNamespace("simpleaffy")
pkgName = "simpleaffy"
value = get.annotation
unlockBinding(name, env);
assignInNamespace(name, value, ns=pkgName, envir=env);
assign(name, value, envir=env);
lockBinding(name, env);
---
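
As a rough illustration of the naming involved, the sketch below uses only base R string handling, so it runs without any annotation package installed. The variable names pkg.name, symbol.map and bad.map are made up for this example; the chip name and the resulting strings are the ones that appear in this thread.

```r
# Chip name as obtained with cleancdfname(cdfName(raw.data), addcdf = FALSE)
cdfname <- "hgu133plus2"

# Package that library() has to load in the patched function
pkg.name <- paste(cdfname, ".db", sep = "")

# Mapping object that get() looks up once the package is attached
symbol.map <- paste(cdfname, "SYMBOL", sep = "")

# With the "cdf" suffix left on, the lookup name becomes the one from the
# original error message
bad.map <- paste(paste(cdfname, "cdf", sep = ""), "SYMBOL", sep = "")

print(c(pkg.name, symbol.map, bad.map))
```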

The additional lines at the bottom of the excerpt (from name = 
"get.annotation" through lockBinding) put the newly defined 
get.annotation function into the simpleaffy namespace.
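
For anyone unfamiliar with the unlock / assign / lock pattern, here is a minimal sketch on a toy environment rather than on the real simpleaffy namespace. The names toy.env and greet are hypothetical and exist only for this illustration; the sketch assumes nothing beyond base R.

```r
# Toy stand-in for a package namespace ('toy.env' and 'greet' are made up)
toy.env <- new.env()
assign("greet", function() "old", envir = toy.env)
lockBinding("greet", toy.env)    # namespace bindings are normally locked

# Assigning to a locked binding raises an error, hence the unlock step
locked.error <- tryCatch({
    assign("greet", function() "new", envir = toy.env)
    FALSE
}, error = function(e) TRUE)

# Unlock, replace the function, then restore the lock
unlockBinding("greet", toy.env)
assign("greet", function() "new", envir = toy.env)
lockBinding("greet", toy.env)

result <- get("greet", envir = toy.env)()
print(result)
```

Re-locking afterwards leaves the environment in the same protected state it was in before the replacement.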



Groot, Philip de wrote:
> Dear Tony,
>  
> Yes, I was afraid of this. simpleaffy has not been updated to handle the .db annotation packages properly. In the past (5 R versions ago), an annotation library was loaded with the command:
>  
> library(hgu133plus2)
>  
> However, nowadays it should be loaded by:
> library(hgu133plus2.db)
>  
> Unfortunately, things go wrong in the get.annotation function (simpleaffy), which reads like this:
>   
>> get.annotation
>>     
> function (x, cdfname, verbose = FALSE) 
> {
>     library(cdfname, character.only = TRUE)
>     symb <- .strip.list(mget(x, envir = get(paste(cdfname, "SYMBOL", 
>         sep = "")), ifnotfound = list(.if.probeset.not.found)))
>     desc <- .strip.list(mget(x, envir = get(paste(cdfname, "GENENAME", 
>         sep = "")), ifnotfound = list(.if.probeset.not.found)))
>     accno <- .strip.list(mget(x, envir = get(paste(cdfname, "ACCNUM", 
>         sep = "")), ifnotfound = list(.if.probeset.not.found)))
>     uni <- .strip.list(mget(x, envir = get(paste(cdfname, "UNIGENE", 
>         sep = "")), ifnotfound = list(.if.probeset.not.found)))
>     ok <- (symb != "NoAnno") & (desc != "NoAnno") & (accno != 
>         "NoAnno") & (uni != "NoAnno")
>     names(ok) <- x
>     if (!ok && verbose) {
>         warning(paste("value for '", names(ok)[ok], "' not found", 
>             sep = ""), call. = FALSE)
>     }
>     acc.lnk <- paste("=HYPERLINK(\"http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=nucleotide&term=", 
>         accno, "\",\"", accno, "\")", sep = "")
>     acc.lnk[!ok] <- "NoAnno"
>     uni.lnk <- paste("=HYPERLINK(\"http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=unigene&term=", 
>         uni, "&dopt=unigene\",\"", uni, "\")", sep = "")
>     uni.lnk[!ok] <- "NoAnno"
>     res <- cbind(symb, acc.lnk, uni.lnk, desc)
>     res[res == "NoAnno"] <- "No Annotation Found"
>     colnames(res) <- c("gene name", "accession", "unigene", "description")
>     return(res)
> }
> <environment: namespace:simpleaffy>
>
> The top line of this function loads the library (without the .db extension), and hence either loading the library or getting the annotation will fail. Copying this text into a new function and deleting the top line will fix your problem, provided that you load the .db library first.
>  
> Regards,
>  
>  
> Dr. Philip de Groot Ph.D.
> Bioinformatics Researcher
>
> Wageningen University / TIFN
> Nutrigenomics Consortium
> Nutrition, Metabolism & Genomics Group
> Division of Human Nutrition
> PO Box 8129, 6700 EV Wageningen
> Visiting Address: Erfelijkheidsleer: De Valk, Building 304
> Dreijenweg 2, 6703 HA  Wageningen
> Room: 0052a
> T: +31-317-485786
> F: +31-317-483342
> E-mail:   Philip.deGroot at wur.nl
> Internet: http://www.nutrigenomicsconsortium.nl
>              http://humannutrition.wur.nl
>              https://madmax.bioinformatics.nl
>  
>  
>  
>
> ________________________________
>
> From: Tony McBryan [mailto:tony at mcbryan.co.uk]
> Sent: Mon 14-12-2009 12:35
> To: Groot, Philip de; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] hgu133plus2cdfSYMBOL not found
>
>
>
> Hello Philip,
>
> I have applied the change as you suggested but I'm afraid it still fails
> at the same location; although the content of the error message has
> changed slightly and now reads:
>
> ---
> Error in library(cdfname, character.only = TRUE) :
>   there is no package called 'hgu133plus2'
> ---
>
> Thank you for your previous extremely quick response,
>
> Tony
>
> Groot, Philip de wrote:
>   
>> Hello Tony,
>>
>> object 'hgu133plus2cdfSYMBOL'  should be object 'hgu133plus2SYMBOL' .
>>
>> Please use:
>> summary <- results.summary(pc,cleancdfname(cdfName(raw.data), addcdf=FALSE))
>>                                                               ^^^^^^^^^^^^
>>
>> Regards,
>>
>> Dr. Philip de Groot Ph.D.
>> Bioinformatics Researcher
>>
>> ________________________________
>>
>> From: Tony McBryan [mailto:tony at mcbryan.co.uk]
>> Sent: Mon 14-12-2009 11:49
>> To: bioconductor at stat.math.ethz.ch
>> Subject: [BioC] hgu133plus2cdfSYMBOL not found
>>
>>
>>
>> Hello list,
>>
>> I'm having trouble using part of the affy packages within Bioconductor.
>> I have a dozen U133Plus2 microarrays I'm doing a differential analysis
>> on. Everything works fine until the pairwise comparison stage where I am
>> unable to perform a summary (results.summary()) of the results of the
>> comparison. The script terminates with the error message:
>>
>> ---
>> Error in get(paste(cdfname, "SYMBOL", sep = "")) :
>> object 'hgu133plus2cdfSYMBOL' not found
>> ---
>>
>> as a result of the call to:
>>
>> summary <- results.summary(pc,cleancdfname(cdfName(raw.data)))
>>
>> where pc is the result of "pc <- pairwise.comparison(x.rma, "treatment",
>> spots=raw.data )".
>>
>> The only result I could find on Google was a previous posting to this
>> list [1] from 2006, which stated that the "hgu133plus2" package was
>> required; however, this seems unavailable from the package repositories:
>>
>> ---
>>  > source("http://www.bioconductor.org/biocLite.R")
>>  > biocLite("hgu133plus2")
>> Using R version 2.10.0, biocinstall version 2.5.8.
>> Installing Bioconductor version 2.5 packages:
>> [1] "hgu133plus2"
>> Please wait...
>>
>> Warning in install.packages(pkgs = pkgs, repos = repos, ...) :
>> argument 'lib' is missing: using
>> '/home/mcbryan/R/x86_64-pc-linux-gnu-library/2.10'
>> Warning message:
>> In getDependencies(pkgs, dependencies, available, lib) :
>> package 'hgu133plus2' is not available
>> ---
>>
>> I have, however, attached the "hgu133plus2.db" (which seems to be
>> hgu133plus2's replacement?), "hgu133plus2cdf" and "hgu133plus2probe"
>> packages, with no additional success.
>>
>> Vital stats: 64bit Linux (Ubuntu 9.04), R2.10.0 (should be latest from
>> "deb http://cran.uk.r-project.org/bin/linux/ubuntu karmic/" repository).
>> Bioc installed using Bioclite (and all packages updated to latest
>> versions using "update.packages(repos=biocinstallRepos(), ask=FALSE)").
>>
>> I have attached a short script below which reproduces the error for me
>> as well as the output of running that script (including sessionInfo()).
>>
>> If anyone could provide a prod in the right direction it would very much
>> be appreciated.
>>
>> Tony McBryan
>> Beatson Institute for Cancer Research, UK
>>
>> [1] https://stat.ethz.ch/pipermail/bioconductor/2006-January/011746.html
>>
>>
>> ===
>>
>> runme.R:
>>
>> ## load Bioconductor relevant package
>> ## limma: define experimental design and perform ratio statistics
>> ## affy: CEL file manipulations
>> ## affyQCReport: Quality Control Report
>>
>> library(limma)
>> library(simpleaffy)
>> library(affy)
>> library(affyQCReport)
>>
>> library(hgu133plus2.db)
>> library(hgu133plus2cdf)
>> library(hgu133plus2probe)
>>
>> ## set locations for input and output data files
>>
>> qcDirectory <- "output/"
>> outputDirectory <- "output/"
>>
>> ## read target file detailing input file names into an object
>> ## start with defining the name and location of the target file
>> ## then read data into the targets object
>>
>> ## read in CEL files listed in covdesc into raw affy object
>>
>> raw.data <- read.affy()
>>
>> ## compile Affy QC Report
>> # Temporarily commented out
>> # qcFile <- paste(qcDirectory, "affy_qc_report.pdf", sep="")
>> # QCReport(raw.data, file=qcFile)
>>
>> ## MAS5: Present/Marginal/Absent analysis
>> ## writing the output to a file
>>
>> x.mas <- call.exprs(raw.data,"mas5")
>> outputFile <- paste(outputDirectory, "MAS5_output.csv", sep="")
>> write.exprs(x.mas, outputFile, sep="\t")
>>
>> ## RMA: probe summary intensity value
>>
>> x.rma <- call.exprs(raw.data,"rma")
>> outputFile <- paste(outputDirectory, "RMA_output.csv", sep="")
>> write.exprs(x.rma, outputFile, sep="\t")
>>
>> ## Pairwise comparison
>>
>> pc <- pairwise.comparison(x.rma, "treatment", spots=raw.data )
>>
>> ## ---
>> ## Fails on next line
>> ## ---
>> summary <- results.summary(pc,cleancdfname(cdfName(raw.data)))
>>
>> ## This is what I wanted to do next
>> summaryfile <- paste(outputDirectory, "spreadsheet.csv", sep="")
>> write.annotation(file=summaryfile,summary)
>>
>>
>> ===
>>
>> Result of source("runme.R"):
>>
>> R version 2.10.0 (2009-10-26)
>> Copyright (C) 2009 The R Foundation for Statistical Computing
>> ISBN 3-900051-07-0
>>
>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>> You are welcome to redistribute it under certain conditions.
>> Type 'license()' or 'licence()' for distribution details.
>>
>> Natural language support but running in an English locale
>>
>> R is a collaborative project with many contributors.
>> Type 'contributors()' for more information and
>> 'citation()' on how to cite R or R packages in publications.
>>
>> Type 'demo()' for some demos, 'help()' for on-line help, or
>> 'help.start()' for an HTML browser interface to help.
>> Type 'q()' to quit R.
>>
>>  > source("runme.R")
>> Loading required package: affy
>> Loading required package: Biobase
>>
>> Welcome to Bioconductor
>>
>> Vignettes contain introductory material. To view, type
>> 'openVignette()'. To cite Bioconductor, see
>> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>>
>> Loading required package: genefilter
>> Loading required package: gcrma
>> Loading required package: xtable
>> Loading required package: affyPLM
>> Loading required package: preprocessCore
>>
>> Attaching package: 'affyPLM'
>>
>>
>> The following object(s) are masked from package:stats :
>>
>> resid,
>> residuals,
>> weights
>>
>> Loading required package: RColorBrewer
>> Loading required package: lattice
>> Loading required package: AnnotationDbi
>> Loading required package: org.Hs.eg.db
>> Loading required package: DBI
>> Background correcting
>> Normalizing
>> Calculating Expression
>> Error in get(paste(cdfname, "SYMBOL", sep = "")) :
>> object 'hgu133plus2cdfSYMBOL' not found
>>  > sessionInfo()
>> R version 2.10.0 (2009-10-26)
>> x86_64-pc-linux-gnu
>>
>> locale:
>> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
>> [5] LC_MONETARY=C LC_MESSAGES=en_GB.UTF-8
>> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] hgu133plus2probe_2.5.0 hgu133plus2cdf_2.5.0 hgu133plus2.db_2.3.5
>> [4] org.Hs.eg.db_2.3.6 RSQLite_0.7-3 DBI_0.2-4
>> [7] AnnotationDbi_1.8.1 affyQCReport_1.24.0 lattice_0.17-26
>> [10] RColorBrewer_1.0-2 affyPLM_1.22.0 preprocessCore_1.8.0
>> [13] xtable_1.5-6 simpleaffy_2.22.0 gcrma_2.18.0
>> [16] genefilter_1.28.2 affy_1.24.2 Biobase_2.6.1
>> [19] limma_3.2.1
>>
>> loaded via a namespace (and not attached):
>> [1] affyio_1.14.0 annotate_1.24.0 Biostrings_2.14.8 grid_2.10.0
>> [5] IRanges_1.4.9 splines_2.10.0 survival_2.35-7 tools_2.10.0
>>