[BioC] Problems selecting rows from dataframe (exprs) of GNF Atlas data....

Bas Jansen bjhjansen at gmail.com
Tue Jan 3 15:37:28 CET 2012


Hi Axel, hi Sebastian:

Thanks for the cookie, Axel. Anyway, I have done the following:

> exprs <- as.data.frame(exprs(eset))
> rownames(exprs)
    [1] "200000_s_at"                 "200001_at"
    [3] "200002_at"                   "200003_s_at"
    [5] "200004_at"                   "200005_at"
    [7] "200006_at"                   "200007_at"
    [9] "200008_s_at"                 "200009_at"
   [11] "200010_at"                   "200011_s_at"
   [13] "200012_x_at"                 "200013_at"
   [15] "200014_s_at"                 "200015_s_at"
   [17] "200016_x_at"                 "200017_at"
etc.

So I would argue that the 'numbers' are recognized as rownames here,
but I cannot select them as indicated in a previous email. Strange,
isn't it?
I still need to try Sebastian's suggestions though, so let's not run
off the cliff just yet. Below the sessionInfo.
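
A minimal sketch of how such a mismatch could be tracked down, on toy data rather than the real GNF Atlas set. The names expr_df and wanted, the values, and the mistyped ID are made up for illustration; setdiff() and intersect() are base R.

```r
# Toy stand-in for the expression data frame; IDs and values are made up.
expr_df <- data.frame(
  GSM1 = c(0, 0, 5.8),
  GSM2 = c(0, 0, 7.7),
  row.names = c("200000_s_at", "200001_at", "gnf1h00499_at")
)

# Hypothetical vector of probes to extract; the last ID is deliberately
# mistyped so that it has no exact match among the row names.
wanted <- c("200000_s_at", "gnf1h00499_at", "gnf1h500_at")

# IDs that have no exact match among the row names.
missing_ids <- setdiff(wanted, rownames(expr_df))

# Select only the IDs that do exist, in the order requested.
found <- expr_df[intersect(wanted, rownames(expr_df)), , drop = FALSE]
```

If missing_ids comes back non-empty, those IDs differ from the row names in some way (a typo, stray whitespace, different case), which would also explain an empty result from %in%.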

Kind regards,
Bas

> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C/UTF-8/C/C/C/C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] fortunes_1.4-2 Biobase_2.14.0

loaded via a namespace (and not attached):
[1] tcltk_2.14.0 tools_2.14.0


On Tue, Jan 3, 2012 at 2:33 PM,  <axel.klenk at actelion.com> wrote:
> Dear Bas,
>
> I think you'll need to show us your original code, in particular what
> your 'exprs' is and how you have obtained it. If you have "extracted the
> expression values" from an ExpressionSet ES like
>
> x <- exprs(ES)
>
> then x is a matrix and not a data.frame -- but then your output would
> look slightly different. If you have done something like
>
> x <- data.frame(exprs(ES))
>
> I can reproduce your output, including rows that are all NA -- for
> rownames that do not exist.
>
> So: how did you create 'exprs' and are you sure your rownames are ok?
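
For reference, a small sketch of the difference Axel describes, on a made-up matrix standing in for exprs(ES). The probe IDs, sample names and values are assumptions for illustration only.

```r
# Toy stand-in for exprs(ES): a numeric matrix with probe IDs as row names.
m <- matrix(
  1:6, nrow = 3,
  dimnames = list(c("200000_s_at", "200001_at", "gnf1h00499_at"),
                  c("GSM1", "GSM2"))
)
df <- data.frame(m)

# A matrix refuses an unknown row name outright; capture the error message.
matrix_result <- tryCatch(
  m[c("gnf1h00499_at", "no_such_probe"), ],
  error = function(e) conditionMessage(e)
)

# A data frame does not complain and returns an all-NA row instead.
df_result <- df[c("gnf1h00499_at", "no_such_probe"), ]
```

So an all-NA row in data frame output is a hint that the requested row name was not found, not that the probe has missing values.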
>
> Cheers,
>
>  - axel
>
>
> BTW: try
>
> install.packages("fortunes")
> library("fortunes")
> fortune("dog")
>
> to see why 'exprs' may not be a good name for your object... :-)
>
>
>
> Axel Klenk
> Research Informatician
> Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil /
> Switzerland
>
>
>
>
> From: Bas Jansen <bjhjansen at gmail.com>
> To: Sebastian Thieme <thieme at mi.fu-berlin.de>
> Cc: bioconductor at r-project.org
> Date: 03.01.2012 13:48
> Subject: Re: [BioC] Problems selecting rows from dataframe (exprs) of GNF Atlas data....
> Sent by: bioconductor-bounces at r-project.org
>
>
>
> Dear Sebastian:
>
> Thanks for your swift reply. It works, but only for the probe IDs that
> start with a letter (only ~15 out of the > 100 probe IDs I want to
> investigate). Those that start with a number report back with
> "<0 rows> (or 0-length row.names)". The motto for the New Year seems to
> be 'Solve a problem, only to find new ones'. Phew.
>
> Kind regards,
> Bas
>
> On Tue, Jan 3, 2012 at 11:19 AM, Sebastian Thieme
> <thieme at mi.fu-berlin.de> wrote:
>> Hello,
>>
>> happy new year too =)
>>
>> you can use exprs[rownames(exprs) %in% "gnf1h00499_at", ] or
>> exprs[rownames(exprs) %in% vectorOfNames, ], where vectorOfNames is a
>> list or a vector of the names you are looking for. It is important
>> that the object you are searching in comes first (on the left of
>> %in%). If you want to request a large number of names, use lists
>> instead of data frames.
>>
>> best
>>
>> Basti
>>
>> 2012/1/3 Bas Jansen <bjhjansen at gmail.com>:
>>> Dear fellow Bioconductor users:
>>>
>>> Happy New Year!
>>> At the moment I am analyzing the GNF Atlas data. I retrieved the data
>>> from the Gene Expression Omnibus using the package GEOquery, converted
>>> it to an ExpressionSet and extracted the expression values. So now I
>>> have a data frame from which I would like to extract the expression
>>> values of > 100 probe IDs for 79 tissues. Thing is, if I use a single
>>> probe ID, things go fine. However, whenever I use a vector of several
>>> probe IDs, things go awry.
>>>
>>> See below:
>>>
>>> ***
>>>> exprs[c("gnf1h00499_at"),]
>>>               GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
>>> gnf1h00499_at 5.770829 7.708739 5.161888 7.459432 6.332708 6.902074 4.472488
>>> (abbreviated for reasons of clarity)
>>> ***
>>>
>>> As stated above: whenever I use a vector of probe IDs (say, 2
>>> probe IDs), things go awry:
>>>
>>> ***
>>>> exprs[c("gnf1h00499_at","gnf1h500_at"),]
>>>               GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
>>> gnf1h00499_at 5.770829 7.708739 5.161888 7.459432 6.332708 6.902074 4.472488
>>> NA                  NA       NA       NA       NA       NA       NA       NA
>>> etc.
>>> ***
>>>
>>> The gnf1h00500 probe is reported as NA, and I'm pretty sure it has
>>> real expression values associated with it.
>>> The following just works fine:
>>>
>>> ***
>>>> exprs[c(1:20,30:70),]
>>>             GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
>>> 200000_s_at        0        0        0        0        0        0        0
>>> 200001_at          0        0        0        0        0        0        0
>>> 200002_at          0        0        0        0        0        0        0
>>> 200003_s_at        0        0        0        0        0        0        0
>  0
>>> etc.
>>> ***
>>>
>>> So, how do I select rows on the basis of probe IDs? Or better yet:
>>> what am I overlooking????
>>>
>>> Thanks & kind regards,
>>> Bas
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>


