[BioC] gwSnpTests in GGtools

francesca casalino francy.casalino at gmail.com
Wed Nov 30 13:53:55 CET 2011


Thank you so much! It is much clearer now.
Also, the code you suggested for splitting the data into chromosomes
works perfectly.
Thanks again!

2011/11/29 Vincent Carey <stvjc at channing.harvard.edu>:
>
>
> On Tue, Nov 29, 2011 at 10:04 AM, francesca casalino
> <francy.casalino at gmail.com> wrote:
>>
>> Thank you so much for all your valuable information and help.
>>
>> The gwSnpTests now works, since I specified a function as you suggested:
>>
>>    f1 = gwSnpTests(probeId("10023813203")~GENDER, ss)
>>
>> However I still find myself having to limit the chromosome data at the
>> plink import stage to chromosome 19, and can't seem to be able to do
>> this after the creation of the smlSet. As you say, the gwSnpTeststakes
>> uses all the SNPs in the smList(smlSet) to run the association, but I
>> am not getting a list organized by chromosome, and I think my problem
>> is the original sml_Set list, which I created doing this:
>>
>>    ss<- make_smlSet(es, list("1"=snp.matrix$genotypes))
>>
>
> The above command will create an smlSet with smList of length one.  If you
> want to be
> able to separate out chromosomes of SNP, you should take
> snp.matrix$genotypes and do a
> little more work.  The columns of this entity are SNP loci.  You might be
> able to do the following.
> Make a table, say CMAP, that has column 1 the snp IDs (usually rs numbers,
> perhaps something else) and column
> 2 the chromosome to which each SNP belongs.  For simplicity make sure that
> each column is a character vector.
>
> then create the list SMAP = split(CMAP[,1], CMAP[,2])
>
> this will be a list with as many elements as there are chromosomes.  The
> contents of each list element are the
> snp residing on the corresponding chromosome, and the names(SMAP) is a
> vector of chromosome names
>
> Now, SLIST = lapply(SMAP, function(x) snp.matrix$genotypes[, x])  # errors
> can occur if SMAP has elements not in colnames ...
> names(SLIST) = names(SMAP)
>
> will create a list of SnpMatrix instances with one SnpMatrix per
> chromosome.  you can use SLIST as
> the second argument to make_smlSet, and you can use chrnum() to select
> specific element used in testing
> with gwSnpTests.
>
>>
>> While I have SNPs from all chromosomes in the snp.matrix$genotypes, I
>> guess I am indicating wrongly here that I only have one chromosome. Is
>> it possible to create an smlSet with all the chromosomes, and then
>> select only some using:
>>    chr19_20 <- getSS("GGtools", c("19", "20"))
>>
>> Also when I try to find the genesymbol APOE (which is present in the
>> original plink files) I get:
>>
>>    f2 = gwSnpTests(genesym("APOE")~GENDER, ss)
>>    character(0)
>
>
> This is a convenience feature that will succeed if you have annotation(ss) =
> "[name of a bioconductor .db chip annotation package]" and the symbol you
> ask for is mapped.
>
>>
>>    Failed with error:  ‘'package' must be of length 1’
>>    Error in revmap(get(paste(gsub(".db", "", annpack), "SYMBOL", sep =
>> ""))) :
>>     error in evaluating the argument 'x' in selecting a method for
>> function 'revmap': Error in get(paste(gsub(".db", "", annpack),
>> "SYMBOL", sep = "")) :
>>     object 'SYMBOL' not found
>>
>> Which could still be due to creating an incorrect sml_Set.
>>
>
> These error messages can be clarified, sorry about this.
>
>>
>> Thank you once more for your all your advices!
>> -f
>>
>> 2011/11/29 Vincent Carey <stvjc at channing.harvard.edu>:
>> >
>> >
>> > On Tue, Nov 29, 2011 at 5:20 AM, francy [guest] <guest at bioconductor.org>
>> > wrote:
>> >>
>> >>
>> >> Hi all,
>> >>
>> >> I am trying to find eQTLs in or around a particular gene with probe ID=
>> >> "10023813203" (gene is APOE). I have first selected the SNPs on only my
>> >> chromosome of interest (chr19), then imported the plink files only for
>> >> this
>> >> chromosome doing this:
>> >>
>> >>    snp.matrix<-read.plink("plink.bed", "plink.bim",
>> >> "plink.fam",select.snps=chr19)
>> >>
>> >> I was able to create an expression set (called 'es'), and a sml_Set by
>> >> doing this:
>> >>
>> >>    ss<- make_smlSet(es, list("1"=snp.matrix$genotypes))
>> >>
>> >> but I can't seem to go beyond and use this sml_Set to perform the
>> >> association.
>> >>
>> >> When I try this or other combinations of this (e.g. using 'APOE')
>> >>    f1 = gwSnpTests(probeId("10023813203"), ss)
>> >>
>> >> Error in function (classes, fdef, mtable)  :
>> >>  unable to find an inherited method for function "gwSnpTests", for
>> >> signature "character", "smlSet", "missing", "missing"
>> >>
>> >
>> > When you encounter an error of this sort, please check what signatures
>> > are
>> > supported:
>> >
>> >> showMethods("gwSnpTests")
>> > Function: gwSnpTests (package GGtools)
>> > sym="formula", sms="smlSet", cnum="cnumOrMissing", cs="missing"
>> > sym="formula", sms="smlSet", cnum="snpdepth", cs="missing"
>> >
>> > This shows that the first argument should be a formula, and suggests
>> > that
>> > you can have only two arguments if you like.
>> >
>> > If you change your call to
>> >
>> >  f1 = gwSnpTests(probeId("10023813203")~1, ss)
>> >
>> > I would expect it to succeed.  Note the example code, from R 2.14, which
>> > you
>> > really should be using at this time:
>> >
>> >> hmceuB36.2021 <- getSS("GGtools", c("20", "21"))    # construct smlSet
>> >> from prepackaged data
>> >>  hmFou = hmceuB36.2021[, which(hmceuB36.2021$isFounder)]  # filter
>> >> samples
>> >> to 'founders'
>> >>  f1 = gwSnpTests(genesym("CPNE1")~male, hmFou, chrnum(20))  # execute
>> >> simple set of tests
>> >> f1                   # there are 119921 tests, so have a concise report
>> > gwSnpScreenResult for gene  CPNE1  [probe  GI_23397697-A ]
>> >> topSnps(f1)   # get top results
>> >                   p.val
>> > rs17093026 3.735759e-10
>> > rs1118233  1.272958e-09
>> > rs12480408 1.360682e-09
>> > rs6060535  1.360682e-09
>> > rs11696527 1.360682e-09
>> > rs6058303  1.360682e-09
>> > rs6060578  1.360682e-09
>> > rs2425078  1.360682e-09
>> > rs1970357  1.360682e-09
>> > rs7273815  1.806197e-09
>> >
>> >
>> >
>> >
>> >>
>> >> My first question is, why is the 'gwSnpTests' not working?
>> >
>> >
>> > Please use a supported call sequence.
>> >
>> >>
>> >> My second question is, do I have to select the chromosome I am
>> >> interested
>> >> in before creating the sml test? I would have liked to select
>> >> chromosome 19
>> >> after so that I could analyse more than this one chromosome if I wanted
>> >> to…Is this possible?? It seems as the snp.matrix must be in the form
>> >> of a
>> >> list, so maybe I have to create a list of all the chromosomes?
>> >>
>> >> THANK YOU VERY VERY MUCH FOR ANY HELP YOU COULD GIVE ME!! I really
>> >> appreciate it!
>> >>
>> >
>> > The signatures show that you can omit the chromosome if you wish.  In
>> > this
>> > case,
>> >
>> >>  f2 = gwSnpTests(genesym("CPNE1")~male, hmFou)
>> >> f2
>> > gwSnpScreenResult for gene  CPNE1  [probe  GI_23397697-A ]
>> >> topSnps(f2)
>> > $`20`
>> >                   p.val
>> > rs17093026 3.735759e-10
>> > rs1118233  1.272958e-09
>> > rs12480408 1.360682e-09
>> > rs6060535  1.360682e-09
>> > rs11696527 1.360682e-09
>> > rs6058303  1.360682e-09
>> > rs6060578  1.360682e-09
>> > rs2425078  1.360682e-09
>> > rs1970357  1.360682e-09
>> > rs7273815  1.806197e-09
>> >
>> > $`21`
>> >                   p.val
>> > rs2823672  3.024310e-05
>> > rs4257464  4.340207e-05
>> > rs16994832 4.340207e-05
>> > rs2823676  4.340207e-05
>> > rs2823677  4.340207e-05
>> > rs8131686  4.340207e-05
>> > rs2823683  4.340207e-05
>> > rs238983   4.340207e-05
>> > rs2828436  4.340207e-05
>> > rs2828438  4.340207e-05
>> >
>> > gwSnpTests will operate on all the SNPs in the smList(smlSet) and return
>> > lists organized by
>> > chromosome.  Hence the "gw" -- it is possible to compute a genome-wide
>> > search for eQTL if the
>> > smlSet contains SNP from all chromosomes.  A few years ago this was
>> > quite
>> > reasonable when we
>> > handled, say, 4 million SNP.  Then it became less reasonable when we
>> > started
>> > to work with 8 million
>> > SNP.  So the infrastructure changed to deemphasize working with all
>> > chromosomes at once -- thus the
>> > introduction of getSS() to construct the smlSet for a selected set
>> > (typically only one) of chromosomes of SNP.
>> >
>> > Concisely computing and managing results from transcriptome x genome
>> > searches is addressed by
>> > the eqtlTests function and by genewiseFDRtab ... these functions are
>> > under
>> > development to simplify these
>> > tasks, which can be arduous as large imputed SNP panels come into play.
>> > Thus it is relevant to work with
>> > the devel branch as you start hitting limits.  I will provide more news
>> > as
>> > work proceeds.  As the initial discussion
>> > of this topic occurred on biostar list, I will note for other readers
>> > that a
>> > tutorial on using GGtools with R 2.14 is
>> > available at ismb11gg.wordpress.com, and that the ggtut experimental
>> > data
>> > package underlies the tutorial.  Of
>> > note is that ggtut will not pass check with R devel, because some
>> > serialized
>> > objects conflict with revised class
>> > definitions.  This will be sorted before too long.
>> >
>> >
>> >>
>> >>  -- output of sessionInfo():
>> >
>> >
>> > I strongly advise you to upgrade to R 2.14.  My sessionInfo for the runs
>> > above is
>> >
>> > R version 2.14.0 Patched (2011-11-09 r57622)
>> > Platform: x86_64-unknown-linux-gnu (64-bit)
>> >
>> > locale:
>> >  [1] LC_CTYPE=en_US.iso88591       LC_NUMERIC=C
>> >  [3] LC_TIME=en_US.iso88591        LC_COLLATE=en_US.iso88591
>> >  [5] LC_MONETARY=en_US.iso88591    LC_MESSAGES=en_US.iso88591
>> >  [7] LC_PAPER=C                    LC_NAME=C
>> >  [9] LC_ADDRESS=C                  LC_TELEPHONE=C
>> > [11] LC_MEASUREMENT=en_US.iso88591 LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] splines   stats     graphics  grDevices datasets  tools     utils
>> > [8] methods   base
>> >
>> > other attached packages:
>> >  [1] illuminaHumanv1.db_1.12.1 GGtools_4.1.8
>> >  [3] ff_2.2-3                  bit_1.1-7
>> >  [5] GenomicRanges_1.6.2       org.Hs.eg.db_2.6.4
>> >  [7] rtracklayer_1.14.2        RCurl_1.7-0
>> >  [9] bitops_1.0-4.1            IRanges_1.12.1
>> > [11] annotate_1.32.0           AnnotationDbi_1.16.2
>> > [13] GGBase_3.15.2             genefilter_1.36.0
>> > [15] RSQLite_0.10.0            DBI_0.2-5
>> > [17] snpStats_1.4.0            Matrix_1.0-1
>> > [19] lattice_0.20-0            survival_2.36-10
>> > [21] BiocGenerics_0.1.0        Biobase_2.14.0
>> > [23] weaver_1.20.0             codetools_0.2-8
>> > [25] digest_0.5.1              BiocInstaller_1.2.1
>> >
>> > loaded via a namespace (and not attached):
>> > [1] Biostrings_2.22.0 BSgenome_1.22.0   grid_2.14.0
>> > Rsamtools_1.7.1
>> > [5] XML_3.4-3         xtable_1.6-0      zlibbioc_1.0.0
>> >
>> >>
>> >>
>> >> > sessionInfo()
>> >> R version 2.13.1 (2011-07-08)
>> >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>> >>
>> >> locale:
>> >> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>> >>
>> >> attached base packages:
>> >> [1] tools     splines   stats     graphics  grDevices utils
>> >> datasets
>> >> [8] methods   base
>> >>
>> >> other attached packages:
>> >>  [1] GGtools_3.10.2       ff_2.2-3             bit_1.1-7
>> >>  [4] GenomicRanges_1.4.8  org.Hs.eg.db_2.5.0   rtracklayer_1.12.5
>> >>  [7] RCurl_1.6-10         bitops_1.0-4.1       IRanges_1.10.6
>> >> [10] annotate_1.30.1      AnnotationDbi_1.14.1 GGBase_3.12.0
>> >> [13] RSQLite_0.10.0       DBI_0.2-5            snpStats_1.2.1
>> >> [16] Matrix_0.999375-50   lattice_0.19-33      survival_2.36-9
>> >> [19] Biobase_2.12.2
>> >>
>> >> loaded via a namespace (and not attached):
>> >> [1] Biostrings_2.20.4 BSgenome_1.20.1   grid_2.13.1       XML_3.4-3
>> >> [5] xtable_1.5-6
>> >>
>> >>
>> >> --
>> >> Sent via the guest posting facility at bioconductor.org.
>> >>
>> >> _______________________________________________
>> >> Bioconductor mailing list
>> >> Bioconductor at r-project.org
>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >> Search the archives:
>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >
>> >
>
>



More information about the Bioconductor mailing list