[BioC] New Package: annaffy

Laurent Gautier laurent@cbs.dtu.dk
Fri, 2 Aug 2002 12:18:46 +0200


--Q68bSM7Ycu6FN28Q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Aug 01, 2002 at 05:29:15PM -0700, Colin A. Smith wrote:
> This summer I've written a new package that makes extensive use of 
> the Bioconductor data packages to create annotated HTML and text 
> files. It will take a vector of Affymetrix probe ids and generate a 
> table with quite a bit of useful annotation data. It was mostly 
> inspired by Robert Gentleman's challenge found in the "ll.htmlpage" 
> documentation:


It looks very nice... thanks...


> 
> "Details: A simple markup is used to provide clickable entries. 
> Anyone that is energetic could greatly improve this."
> 
> The usefulness of the package essentially boils down to the function 
> "aaf.summary". For an overview of what the package does, see the 
> documentation for that function. For a quick example, run 
> example(aaf.summary). There may be novel uses for the other functions 
> in the package but they're mainly just slaves to the master function. 
> In this version I haven't officially declared any of the functions 
> private but I think I'll do that at some point soon.
> 
> annaffy depends on three Bioconductor packages: Biobase (only for the 
> multiget function), GO, and KEGG (for second level annotation). Also, 
> to really be useful for anything, it needs one of the other full 
> annotation data packages available on the Bioconductor web site. I've 
> tested it with the rgu34a package but I haven't gotten around to the 
> other data packages yet.
> 
> To my knowledge, annaffy is Bioconductor compliant with the exception 
> that I haven't written a vignette yet. It does pass R CMD check. 
> There are most certainly still bugs, typos, and misspellings in the 
> code/documentation. There is one big optimization I know of that 
> still needs to be done. I would greatly appreciate any 
> comments/suggestions/complaints. I haven't implemented functions that 
> handle all the annotation data provided with the annotation packages. 
> I am especially interested in finding out which of the other data 
> sources would be useful and worth implementing.
> 


I thought of a small option for aaf.summary (it displays the
generated output in a browser) and couldn't help adding it.
The code and the documentation are attached, in case anyone
has a use for them.
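
The display step comes down to turning the output filename into a file URL and handing it to a browser. Below is a minimal sketch of that idea, not the attached code itself: `file_url` is a hypothetical helper name, the "/tmp" directory is only an example, and it assumes the filename is relative to the given directory. It uses base R's browseURL() rather than a hand-built shell command.

```r
# Hypothetical helper (not part of annaffy): build a file:// URL for a
# file given relative to a directory.
file_url <- function(filename, dir = getwd()) {
  path <- file.path(dir, filename)
  # Use forward slashes so Windows-style paths also give a valid URL
  path <- gsub("\\\\", "/", path)
  # Absolute Unix paths already start with "/"; Windows drive paths do not
  prefix <- if (substr(path, 1, 1) == "/") "file://" else "file:///"
  paste(prefix, path, sep = "")
}

url <- file_url("affy.html", dir = "/tmp")
print(url)

# Only try to open a browser in an interactive session
if (interactive()) browseURL(url)
```

Leaving the choice of browser to browseURL() avoids the "/dev/null" question on the Windows and Mac ports.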



L.

> -Colin
> 
> Source package:
> 
> http://homepages.nyu.edu/~cas277/annaffy_0.5.tar.gz
> 
> HTML page produced by example(aaf.summary):
> 
> http://homepages.nyu.edu/~cas277/affy.html
> -- 
> 
> Colin Alexander Smith
> cas277@nyu.edu
> 
> PGP Public Key:
>  http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x0B159DFF
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> http://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

--Q68bSM7Ycu6FN28Q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="aaf.summary.R"

aaf.summary <- function (probeids, chip, filename, userdata, columns, sort, 
                         decreasing = FALSE, format = "html", 
                         title = "BioConductor Affymetrix Probe Listing",
                         browser = getOption("browser"),
                         display = FALSE) {
  
  anncols <- aaf.handler()
  if (missing(userdata))
    usercols <- NULL
  else {
    userdata <- aaf.checkTable(userdata, length(probeids))
    usercols <- names(userdata)
  }
  
  if (missing(columns))
    columns <- c(anncols, usercols)
  else
    for(item in columns)
      if (is.null(indexTo(c(anncols, usercols), item)))
        stop(paste("Column \"", item, "\" does not exist", sep = ""))
  
  if (! missing(sort))
    for(item in sort)
      if (is.null(indexTo(c(anncols, usercols), item)))
        stop(paste("Sort column \"", item, "\" does not exist", 
                   sep = ""))
  
  table <- aaf.table(probeids, chip, columns)
  if (! missing(userdata))
    table <- c(table, userdata)
  if (! missing(sort))
    table <- sortTable(table, sort, decreasing)
  
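  ## Fail loudly on an unknown format instead of silently returning a string
  switch(format,
         html = aaf.HTML(table, columns, filename, title),
         text = aaf.text(table, columns, filename),
         stop(paste("Unsupported format \"", format, "\"", sep = "")))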

  
  if (display) {
    ## Assumes 'filename' is relative to the current working directory.
    url <- paste("file://", file.path(getwd(), filename), sep = "")
    ## browseURL() takes care of the platform differences (Unix, Windows,
    ## Mac) that a hand-built shell command with "/dev/null" cannot handle,
    ## and it does not require changing the global 'browser' option.
    browseURL(url, browser = browser)
  }
}

--Q68bSM7Ycu6FN28Q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="aaf.summary.Rd"

\name{aaf.summary}
\alias{aaf.summary}
\title{Summarize annotation for Affymetrix probes}
\description{
  Given a set of Affymetrix probe ids and a standard BioConductor
  Affymetrix annotation package, output a summary file.
}

\usage{
aaf.summary(probeids, chip, filename, userdata, columns, sort, 
            decreasing = FALSE, format = "html", 
            title = "BioConductor Affymetrix Probe Listing",
            browser = getOption("browser"), display = FALSE)
}
\arguments{
  \item{probeids}{character vector containing probe ids}
  \item{chip}{name of chip, see details}
  \item{filename}{filename for output file}
  \item{userdata}{named list of vectors the same length as \code{probeids}}
  \item{columns}{character vector with column names to be output}
  \item{sort}{character vector with order of columns used for sorting}
  \item{decreasing}{sort by decreasing value (\code{TRUE}) or increasing value (\code{FALSE})}
  \item{format}{\code{html} or \code{text}, format of output file, either HTML 
    or tab-delimited text (excluding any hypertext links)}
  \item{title}{title of HTML page}
  \item{browser}{the browser used to display the results (see
    \code{display} below)}
  \item{display}{logical indicating whether to open the output file in
    a browser}
}
\details{
  The core workings of this function depend on an (informal) protocol
  used in creating the BioConductor Affymetrix annotation data
  packages. Based on currently published (and unpublished) data packages,
  the current protocol includes the following features:

  The package is named after the chip, \code{<chip name>} \cr
  The package contains datasets named \code{<chip name><data type>}

  This function will map the following datasets into the corresponding
  column names:

  \code{Probe} - a column containing the contents of \code{probeids} \cr
  \code{<chip name>SYMBOL} - \code{Symbol} - Gene symbol \cr
  \code{<chip name>GENENAME} - \code{Description} - Gene description \cr
  \code{<chip name>ACCNUM} - \code{GenBank} - GenBank accession number
  including a hypertext link \cr
  \code{<chip name>LOCUSID} - \code{LocusLink} - LocusLink ID including a
  hypertext link \cr
  \code{<chip name>UNIGENE} - \code{UniGene} - UniGene cluster ID including a
  hypertext link \cr
  \code{<chip name>PMID} - \code{PubMed} - Number of PubMed abstracts
  including a hypertext link to the corresponding abstracts \cr
  \code{<chip name>GO} - \code{Gene Ontology} - Gene Ontology classes including
  hypertext link and descriptive JavaScript rollover \cr
  \code{<chip name>PATH} - \code{Pathway} - KEGG Pathway names including
  hypertext link \cr

  \code{userdata} allows users to provide their own data to be
  displayed in the table. For clarity, the vectors in the list should
  be named. However, generic names will be generated if they are omitted.

  \code{columns} changes the order in which the columns are
  displayed. Additionally, columns may be omitted in the same way. User
  columns must be included in this vector for their display. If
  \code{columns} is omitted, all of the above columns will be displayed
  followed by the user columns.
}

\value{
  No value is returned. The function is evaluated solely for the side
  effect of producing the file \code{filename}.
}
\note{
  Written at the NASA Center for Computational Astrobiology \cr
  \url{http://cca.arc.nasa.gov/}
}
\author{Colin A. Smith, \email{cas277@nyu.edu}}

\examples{
probes <- ls(annaffySYMBOL)
aaf.summary(probes, "annaffy", "affy.html")
}
\keyword{ file }

--Q68bSM7Ycu6FN28Q--