[BioC] forge BSgenome data package

Hervé Pagès hpages at fhcrc.org
Thu Jan 27 00:54:58 CET 2011


Hi Steve,

Since we've started this on the list, let's keep it there.

Please provide your seed file so I can get a better picture of what you
are doing. As I already mentioned before, you need to provide 1 FASTA
file per chromosome. But your cflo_v3.3.fold.fa file seems to contain
24026 sequences! So you have 2 options: (1) either you split it into 1
file per sequence, (2) either you store it as a multiple sequence in
your BSgenome data package.

Option 1: It's easy to split a FASTA file into 1 file per FASTA
record using the Biostrings package. You can do something like:

   library(Biostrings)
   x <- read.DNAStringSet("cflo_v3.3.fold.fa")
   for (i in seq_len(length(x))) {
     filepath <- paste(names(x)[i], ".fa", sep="")
     write.XStringSet(x[i], filepath)
   }

The problem is that then you will need to list *all* the sequences in
the seqnames field of your seed file. Another problem with this is that
it's probably not the right thing to do. See below.

Option 2: First some background. As you've probably noticed, the
sequence objects stored in a BSgenome data package are divided in 2
groups: the group of single sequences and the group of multiple
sequences. If you've ever displayed a BSgenome object that contains
upstream sequences, you can't have missed it. For example:

 > library(BSgenome.Celegans.UCSC.ce2)
 > Celegans
Worm genome
|
| organism: Caenorhabditis elegans (Worm)
| provider: UCSC
| provider version: ce2
| release date: Mar. 2004
| release name: WormBase v. WS120
|
| single sequences (see '?seqnames'):
|   chrI    chrII   chrIII  chrIV   chrV    chrX    chrM
|
| multiple sequences (see '?mseqnames'):
|   upstream1000  upstream2000  upstream5000
|
| (use the '$' or '[[' operator to access a given sequence)

Objects in the 1st group are DNAString (or MaskedDNAString) objects.
Each DNAString (or MaskedDNAString) object can only contain 1 sequence.
Objects in the 2nd group are DNAStringSet objects. Each DNAStringSet
object can contain an (almost) arbitrary number of sequences.
The 2 groups are respectively defined thru the seqnames and mseqnames
fields of the seed file. I'm just paraphrasing the existing
documentation here: all this is documented in the man page for
BSgenome objects and in the BSgenomeForge vignette.

Typically the 1st group will contain chromosome sequences, or, more
generally (and to use Ensembl terminology), the "top-level" sequences
of the genome. This can include the Mitochondrial DNA (often denoted
chrM), the 2-micron plasmid, the haplotype sequences, and other exotic
DNA material that is considered to be at the "top-level". The number
of top-level sequences varies from one genome to the other but it
remains generally small (< 50).

Typically the 2nd group will contain 1 or more big collections of
sequences. Each collection itself can have thousands of sequences.
But the number of collections should preferably remain small
(i.e. < 50). So I guess this would be the place of choice for your
cflo_v3.3.fold object (which seems to be a collection of 24026
scaffolds).

Now for the problem with your gap file, I can't really help without
seeing your seed file first. However, I'm a little bit puzzled that
you have a file containing assembly gaps for your scaffolds: that
doesn't seem to make a lot of sense to me. Anyway, you can't put
masks on a multiple sequence object so, if you choose option 2, you
will have to set nmask_per_seq to 0.

Also, as a general rule when forging a BSgenome data package, you
should ask yourself whether it is worth the extra effort to deal
with the masks at all. A BSgenome data package doesn't *have* to
contain masks.

Cheers,
H.


On 01/26/2011 02:53 PM, Steve Shen wrote:
> Hi Herve,
>
> Thank you so much for your help. I was finally able to have a working
> seed file. However, there seem to be still long way to go. I guess that
> most likely these errors are due to genome itself and gap file. I just
> wonder if you could kindly give me some hints what do these error
> message mean. Is gap file required for building up data package? Why do
> package only use first sequence and others are truncated? Is this
> because of two many sequences? What steps can I take in order to make it
> work with GenomicRange packages? I really appreciate your help. Thanks
> so much,
>
> Steve
>
>  > forgeBSgenomeDataPkg("Cflo_seed.R")
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
>    Vignettes contain introductory material. To view, type
> 'openVignette()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>
> Attaching package: 'Biobase'
>
> The following object(s) are masked from 'package:IRanges':
>
>      updateObject
>
> Creating package in ./BSgenome.CFlo.NYU.v3
> Saving 'seqlengths' object to compressed data file
> './BSgenome.CFlo.NYU.v3/inst/extdata/seqlengths.rda'... DONE
> Loading FASTA file '/home/steve/Data/Genomes/cflo_v3.3.fold.fa' in
> 'cflo_v3.3.fold' object... DONE
> Saving 'cflo_v3.3.fold' object to compressed data file
> './BSgenome.CFlo.NYU.v3/inst/extdata/cflo_v3.3.fold.rda'... DONE
> Error in .guessGapFileCOL2CLASS(file) :
>    unable to guess the column names in "gap" file
> './cflo_v3.3.fold_gap.txt', sorry
> In addition: Warning messages:
> 1: In FUN("cflo_v3.3.fold"[[1L]], ...) :
>    In file '/home/steve/Data/Genomes/cflo_v3.3.fold.fa': 24026 sequences
> found, using first sequence only
> 2: In FUN("cflo_v3.3.fold"[[1L]], ...) :
>    In file '/home/steve/Data/Genomes/cflo_v3.3.fold.fa': sequence
> description
> "scaffold9scaffold12scaffold16scaffold2scaffold4scaffold20scaffold10scaffold11scaffold24scaffold1scaffold3scaffold6scaffold15scaffold28scaffold30scaffold22scaffold18scaffold21scaffold7scaffold32scaffold36scaffold13scaffold23scaffold39scaffold19scaffold41scaffold29scaffold37scaffold33scaffold45scaffold38scaffold17scaffold5scaffold46scaffold48scaffold31scaffold25scaffold51scaffold49scaffold44scaffold47scaffold54scaffold43scaffold55scaffold60scaffold35scaffold42scaffold63scaffold53scaffold50scaffold40scaffold67scaffold61scaffold69scaffold34scaffold70scaffold27scaffold73scaffold8scaffold75scaffold71scaffold74scaffold14scaffold66scaffold65scaffold80scaffold62scaffold76scaffold79scaffold85scaffold78scaffold86scaffold88scaffold81scaffold87scaffold26scaffold92scaffold93scaffold94scaffold59scaffold52scaffold84scaffold77scaffold89scaffold100scaffold97scaffold99scaffold64scaffold103scaffold90scaffold106scaffold56scaffold98
> [... truncated]
> 3: In .forgeSeqFile(name, prefix, suffix, seqs_srcdir, seqs_destdir,  :
>    file contains 24026 sequences, using the first sequence only
> 4: In file(file, "rt") :
>    cannot open file './cflo_v3.3.fold_gap.txt': No such file or directory
> 5: In file(file, "rt") :
>    cannot open file './cflo_v3.3.fold_gap.txt': No such file or directory
>
>
> 2011/1/20 Hervé Pagès <hpages at fhcrc.org <mailto:hpages at fhcrc.org>>
>
>     Steve,
>
>
>     On 01/19/2011 04:37 PM, Steve Shen wrote:
>
>         Hi Herve,
>
>         Thank you very much. I tried carefully with correct path and
>         file name
>         by triggering auto-complete feature. But it didn't help either.
>         It may
>         be something wrong with my seed file, so I also appended the
>         file below.
>         Please have a look. Thanks in advance,
>
>         Steve
>
>          > forgeBSgenomeDataPkg("/home/steve/Data/Genomes/seed")
>         Error in as.list(.readSeedFile(x, verbose = verbose)) :
>            error in evaluating the argument 'x' in selecting a method for
>         function 'as.list'
>          > forgeBSgenomeDataPkg("seed")
>         Error in as.list(.readSeedFile(x, verbose = verbose)) :
>            error in evaluating the argument 'x' in selecting a method for
>         function 'as.list'
>
>
>     An easy way to check if your seed file is a valid DCF file is:
>
>      > read.dcf("seed")
>       Error in read.dcf("seed") :
>         Line starting 'I'm a broken DCF fil ...' is malformed!
>
>     Again, the error message you get when calling forgeBSgenomeDataPkg() on
>     a broken DCF file should return exactly that, but, unfortunately, it
>     doesn't with the current version of BSgenome. Updated versions of
>     BSgenome (1.18.3 in release and 1.19.3 in devel) are correcting this
>     and will become available tomorrow around noon, and midnight,
>     respectively (Seattle time).
>
>
>
>         Seed file starts:
>
>         Package: BSgenome.CFlo.YU.v3
>         Title: Camponotus floridanus (Ants) full genome (YU version V3.3)
>         Description: Camponotus floridanus (Ants) full genome as
>         provided by YU
>         (V3.3, Jan. 2011)
>
>       ^^^^^^^^^^^^^^^^^
>     Is this on a line of its own in your file? This might be the problem.
>
>     Please consult the help for the read.dcf() function (?read.dcf from
>     your R session) for how to format a DCF file. Note that a 'tag:value'
>     should either be placed on a single line, or, if the value spans
>     several lines, then the extra lines (aka "continuation lines") must
>     start with a whitespace. Like this:
>
>
>     Description: Camponotus floridanus (Ants) full genome as provided
>         by YU (V3.3, Jan. 2011)
>
>     (I'm using 4 whitespaces here but 1 is enough. Using a <TAB> would
>     also be OK.)
>
>
>         Version: 1.0.0
>         organism: camponotus floridanus
>         species: Ant
>         provider: YU
>         provider_version: Assembly V3.3
>         release_date: Jan, 2011
>         release_name: Ant Genome Reference Consortium
>         source_url: NA
>         organism_biocview: C_flo
>         BSgenomeObjname: CFlo
>         seqnames: "cflo_v3.3.fold"
>         mseqnames: character(0)
>         nmask_per_seq:2
>
>         seqs_srcdir: /home/steve/Data/Genomes/
>
>
>     So you have a FASTA file named "cflo_v3.3.fold.fa" located in
>     /home/steve/Data/Genomes/ that contains a *single* FASTA record
>     that corresponds to the cflo_v3.3.fold sequence? Please double check
>     this. (You can see the list of records contained in a FASTA file with
>     the fasta.info <http://fasta.info>() function from Biostrings.)
>
>     H.
>
>         # masks_srcdir: /home/steve/Data/Masks_src
>         # AGAPSfiles_name: cflo_v3.3.fa.masked
>
>
>         2011/1/19 Hervé Pagès <hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org> <mailto:hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org>>>
>
>             Hi Steve,
>
>
>             On 01/19/2011 01:04 PM, Steve Shen wrote:
>
>                 Hi Herve,
>
>                 Thank you so much for your quick reply. Your vignette is
>         pretty
>                 clear. I
>                 think the problem is on my side. I just lack of
>         experience on
>                 dealing
>                 with this issue and maybe the file format doesn't fit. I
>         sort of
>                 gave up
>                 using the seed file but just using low level commands. I
>         still have
>                 errors. I really have no idea what is going on. Your
>         help will
>                 be much
>                 appreciated.
>
>                 Best,
>                 Steve
>
>                 1. with seed file, I got
>          > forgeBSgenomeDataPkg("./cflo_seed.R")
>                 Error in as.list(.readSeedFile(x, verbose = verbose)) :
>                    error in evaluating the argument 'x' in selecting a
>         method for
>                 function 'as.list'
>
>
>             I agree that the error message is kind of obscure but the
>         problem
>             is probably that the specified path to your seed file is
>         invalid.
>             I'm about to commit a change to the BSgenomeForge code that will
>             produce this error message instead:
>
>          > forgeBSgenomeDataPkg("aaaaa")
>             Error in .readSeedFile(x, verbose = verbose) :
>               seed file 'aaaaa' not found
>
>             An easy way to make sure the path is valid is to press <TAB>
>         when
>             you are in the middle of typing the path: this will trigger the
>             auto-completion feature.
>
>
>
>                 2. with command forgeSeqlengthsFile,
>          > forgeSeqlengthsFile("cflo_v3.3.fold", prefix="",
>                 suffix=".fa",seqs_srcdir=".", seqs_destdir=".",
>         verbose=TRUE)
>                 Saving 'seqlengths' object to compressed data file
>         './seqlengths.rda'...
>                 DONE
>                 Warning messages:
>                 1: In FUN("cflo_v3.3.fold"[[1L]], ...) :
>                    In file './cflo_v3.3.fold.fa': 24026 sequences found,
>         using first
>                 sequence only
>                 2: In FUN("cflo_v3.3.fold"[[1L]], ...) :
>                    In file './cflo_v3.3.fold.fa': sequence description
>         "scaffold9scaffold12scaffold16scaffold2scaffold4scaffold20scaffold10scaffold11scaffold24scaffold1scaffold3scaffold6scaffold15scaffold28scaffold30scaffold22scaffold18scaffold21scaffold7scaffold32scaffold36scaffold13scaffold23scaffold39scaffold19scaffold41scaffold29scaffold37scaffold33scaffold45scaffold38scaffold17scaffold5scaffold46scaffold48scaffold31scaffold25scaffold51scaffold49scaffold44scaffold47scaffold54scaffold43scaffold55scaffold60scaffold35scaffold42scaffold63scaffold53scaffold50scaffold40scaffold67scaffold61scaffold69scaffold34scaffold70scaffold27scaffold73scaffold8scaffold75scaffold71scaffold74scaffold14scaffold66scaffold65scaffold80scaffold62scaffold76scaffold79scaffold85scaffold78scaffold86scaffold88scaffold81scaffold87scaffold26scaffold92scaffold93scaffold94scaffold59scaffold52scaffold84scaffold77scaffold89scaffold100scaffold97scaffold99scaffold64scaffold103scaffold90scaffold106scaffold56scaffold98scaffold109scaffold68sc
>                 [... truncated]
>
>                 3. with forgeSeqfiles,
>          > forgeSeqFiles("cflo_v3.3.fold", mseqnames=NULL, prefix="",
>                 suffix=".fa",seqs_srcdir=".", seqs_destdir="./BSgenome",
>                 verbose=TRUE)
>                 Loading FASTA file './cflo_v3.3.fold.fa' in 'cflo_v3.3.fold'
>                 object... DONE
>                 Saving 'cflo_v3.3.fold' object to compressed data file
>         './BSgenome/cflo_v3.3.fold.rda'... DONE
>                 Warning message:
>                 In .forgeSeqFile(name, prefix, suffix, seqs_srcdir,
>         seqs_destdir,  :
>                    file contains 24026 sequences, using the first
>         sequence only
>
>
>             Sorry but I can only try to help if you stick to the
>         vignette and
>             use a seed file. Using low-level functions like
>         forgeSeqlengthsFile()
>             or forgeSeqFiles() is undocumented/unsupported. The reason
>         for this
>             is that this route is *much* more complicated and error-prone
>             than using a seed file! This is exactly why seed files where
>             invented: to make the whole process much easier for you, and to
>             make troubleshooting much easier for us.
>
>             Cheers,
>             H.
>
>
>
>
>                 2011/1/19 Hervé Pagès <hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>
>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>>>>
>
>
>                     Hi Steve,
>
>                     There are several things that look wrong with your
>         seed file.
>
>                     1. The # must be the first character in a line to
>         make it a
>                        line of comment (ignored).
>                        For example, those 2 lines will certainly not be
>         interpreted
>                        as you might expect:
>
>
>                          mseqnames: NA #paste("upstream", c("1000", "2000",
>         "5000"), sep="")
>
>                          source_url: NA #
>         http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/
>
>                     2. If you don't have multiple sequences, specify:
>
>                          mseqnames: character(0)
>
>                     3. As explained in the BSgenomeForge vignette, the
>         default for
>                        seqfiles_prefix is .fa so this won't work:
>
>                          seqnames: "clof_v3.fa"
>
>                        Unless your file is named clof_v3.fa.fa?
>                        If your file is named clof_v3.fa, then you should
>         specify:
>
>                          seqnames: "clof_v3"
>
>                     4. This doesn't look like a valid file of assembly gaps:
>
>
>                          AGAPSfiles_name: clof_v3.fa.masked
>
>                        Please refer to the BSgenomeForge vignette for
>         what kind
>                 of masks
>                        and what file formats are supported.
>
>                     5. Having this
>
>
>                          PkgExamples: Hsapiens
>                                 seqlengths(Hsapiens)
>                                 Hsapiens$chr1  # same as Hsapiens[["chr1"]]
>
>                        in a seed file for clof obviously doesn't make
>         sense and your
>                        package won't pass 'R CMD check' because it will
>         contain
>                 broken
>                        examples.
>
>                     There might be other problems with your seed file.
>         All you
>                 need to do
>                     is read and follow carefully the instructions
>         described in the
>                     BSgenomeForge vignette. Let me know if things are
>         not clear
>                 in the
>                     vignette and I'll try to improve it. Thanks!
>
>                     H.
>
>
>
>                     On 01/18/2011 05:32 PM, Steve Shen wrote:
>
>                         Hi list,
>
>                         This wasn't sent out with a none registered ids.
>         I have a
>                         problem with
>                         forging a new BSgenome data package. The
>         sequence data
>                 file is
>                         clof.v3.fa
>                         and the mask file is clof.v3.fa.masked. Below
>         are seed file,
>                         command, error
>                         and sessioninfo. Your help will be much appreciated.
>
>                         Thanks a lot,
>                         Steve
>
>                         Package: BSgenome.Clof.yu.v3
>                         Title: Clof (insects) full genome (version 3)
>                         Description: Clof (insects) full genome as
>         provided by
>                 YU (V3.3,
>                         Jan. 2011)
>                         # and will store in Biostrings objects.
>                         Version: 1.0.0
>                         organism: clof
>                         species: insect
>                         provider: YU
>                         provider_version: Assembly V3.3
>                         release_date: Jan, 2011
>                         release_name: insects Genome Reference Consortium
>                         source_url: NA #
>         http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/
>                         organism_biocview: Clof
>                         BSgenomeObjname: Clof
>                         seqnames: "clof_v3.fa"
>                         mseqnames: NA #paste("upstream", c("1000", "2000",
>         "5000"), sep="")
>                         nmask_per_seq: 2
>                         #SrcDataFiles1: sequences: chromFa.zip,
>         upstream1000.zip,
>                         upstream2000.zip,
>                         upstream5000.zip
>                              #from
>         http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/
>                         #SrcDataFiles2: AGAPS masks:
>         http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gap.txt.gz
>                              #RM masks:
>         http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromOut.tar.gz
>                              #TRF masks:
>         http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromTrf.tar.gz
>                         PkgExamples: Hsapiens
>                              seqlengths(Hsapiens)
>                              Hsapiens$chr1  # same as Hsapiens[["chr1"]]
>                         seqs_srcdir: /home/steve/Data/Genomes
>                         masks_srcdir: /home/steve/Data/Genomes
>                         AGAPSfiles_name: clof_v3.fa.masked
>
>                         The command, error and sessionInfo are below
>
>                             forgeBSgenomeDataPkg("Clof_seed.R",
>         seqs_srcdir=".",
>                             masks_srcdir=".",
>
>                         destdir=".", verbose=TRUE)
>                         Loading required package: Biobase
>
>                         Welcome to Bioconductor
>
>                            Vignettes contain introductory material. To
>         view, type
>         'openVignette()'. To cite Bioconductor, see
>         'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>
>                         Attaching package: 'Biobase'
>
>                         The following object(s) are masked from
>         'package:IRanges':
>
>                              updateObject
>
>                         Error in forgeBSgenomeDataPkg(y, seqs_srcdir =
>         seqs_srcdir,
>                         masks_srcdir =
>                         masks_srcdir,  :
>                            values for symbols NMASKPERSEQ are not single
>         strings
>                         In addition: Warning message:
>                         In storage.mode(x$nmask_per_seq)<- "integer" : NAs
>                 introduced by
>                         coercion
>
>                             sessionInfo()
>
>                         R version 2.12.1 (2010-12-16)
>                         Platform: x86_64-pc-linux-gnu (64-bit)
>
>                         locale:
>                           [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>                           [3] LC_TIME=en_US.UTF-8
>           LC_COLLATE=en_US.UTF-8
>                           [5] LC_MONETARY=C
>           LC_MESSAGES=en_US.UTF-8
>                           [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>                           [9] LC_ADDRESS=C               LC_TELEPHONE=C
>                         [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
>                         attached base packages:
>                         [1] stats     graphics  grDevices utils     datasets
>                   methods   base
>
>                         other attached packages:
>                         [1] Biobase_2.6.1       BSgenome_1.18.2
>                 Biostrings_2.18.0
>                         [4] GenomicRanges_1.2.1 IRanges_1.8.8
>
>                         loaded via a namespace (and not attached):
>                         [1] tools_2.12.1
>
>                                 [[alternative HTML version deleted]]
>
>                         _______________________________________________
>                         Bioconductor mailing list
>         Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>         <mailto:Bioconductor at r-project.org
>         <mailto:Bioconductor at r-project.org>>
>         <mailto:Bioconductor at r-project.org
>         <mailto:Bioconductor at r-project.org>
>         <mailto:Bioconductor at r-project.org
>         <mailto:Bioconductor at r-project.org>>>
>
>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>                         Search the archives:
>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>
>                     --
>                     Hervé Pagès
>
>                     Program in Computational Biology
>                     Division of Public Health Sciences
>                     Fred Hutchinson Cancer Research Center
>                     1100 Fairview Ave. N, M2-B876
>                     P.O. Box 19024
>                     Seattle, WA 98109-1024
>
>                     E-mail: hpages at fhcrc.org <mailto:hpages at fhcrc.org>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>>>
>
>
>                     Phone:  (206) 667-5791
>                     Fax:    (206) 667-1319
>
>
>
>
>             --
>             Hervé Pagès
>
>             Program in Computational Biology
>             Division of Public Health Sciences
>             Fred Hutchinson Cancer Research Center
>             1100 Fairview Ave. N, M2-B876
>             P.O. Box 19024
>             Seattle, WA 98109-1024
>
>             E-mail: hpages at fhcrc.org <mailto:hpages at fhcrc.org>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>>
>             Phone:  (206) 667-5791
>             Fax:    (206) 667-1319
>
>
>
>
>     --
>     Hervé Pagès
>
>     Program in Computational Biology
>     Division of Public Health Sciences
>     Fred Hutchinson Cancer Research Center
>     1100 Fairview Ave. N, M2-B876
>     P.O. Box 19024
>     Seattle, WA 98109-1024
>
>     E-mail: hpages at fhcrc.org <mailto:hpages at fhcrc.org>
>     Phone:  (206) 667-5791
>     Fax:    (206) 667-1319
>
>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioconductor mailing list