[BioC] Problems selecting rows from dataframe (exprs) of GNF Atlas data....

Bas Jansen bjhjansen at gmail.com
Tue Jan 3 13:47:36 CET 2012


Dear Sebastian:

Thanks for your swift reply. It works, but only for the probe IDs that
start with a letter (only ~15 of the > 100 probe IDs I want to
investigate). Those that start with a digit come back with "<0
rows> (or 0-length row.names)". The motto for the New Year seems to be
'Solve a problem, only to find new ones'. Phew.
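
A zero-row result from the %in% filter means that none of the requested
strings matches a row name exactly. One thing that might be worth checking
is which IDs are missing, and whether they differ from the row names only
by stray whitespace. The sketch below uses a small made-up data frame
(exprs_demo) and a made-up ID vector (wanted) instead of the real GNF data.
The trailing-space problem it simulates is only an assumption about what
could be going wrong, not a diagnosis.

```r
# Toy stand-in for the real expression data frame (values are made up).
exprs_demo <- data.frame(
  GSM_A = c(0, 0, 5.8),
  GSM_B = c(0, 0, 7.7),
  row.names = c("200000_s_at", "200001_at", "gnf1h00499_at")
)

# Hypothetical request vector: the IDs starting with a digit carry a
# trailing space, as can happen when IDs are pasted or read from a file.
wanted <- c("200000_s_at ", "200001_at ", "gnf1h00499_at")

# Which requested IDs have no exact match among the row names?
missing_ids <- setdiff(wanted, rownames(exprs_demo))
print(missing_ids)

# nchar() helps to spot hidden characters via unexpected string lengths.
print(data.frame(id = wanted, n_chars = nchar(wanted)))

# Strip surrounding whitespace and check again.
wanted_clean <- trimws(wanted)
still_missing <- setdiff(wanted_clean, rownames(exprs_demo))

# Select the rows with the cleaned IDs.
hits <- exprs_demo[rownames(exprs_demo) %in% wanted_clean, ]
print(hits)
```

If still_missing is not empty after cleaning, the remaining IDs are probably
spelled differently from the row names, or absent from the data altogether.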

Kind regards,
Bas

On Tue, Jan 3, 2012 at 11:19 AM, Sebastian Thieme
<thieme at mi.fu-berlin.de> wrote:
> Hello,
>
> happy new year too =)
>
> you can use exprs[rownames(exprs) %in% "gnf1h00499_at", ] or
> exprs[rownames(exprs) %in% vectorOfNames, ], where vectorOfNames is a
> list or a vector of the names you are looking for. It is important
> that the row names of the object you are searching in are the first
> (left-hand) argument of %in%. If you want to look up a large number of
> names, use lists instead of data frames.
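
To illustrate the difference between the %in% filter and position-based
lookup, here is a minimal sketch on a made-up data frame. The names
exprs_demo, by_in, by_match, pos and not_found are hypothetical, and the
third requested ID is a deliberate typo. The %in% filter returns matching
rows in the data frame's own order and silently drops IDs that are absent,
whereas match() keeps the order of the request and marks absent IDs as NA.

```r
# Toy data frame standing in for the real expression values (made up).
exprs_demo <- data.frame(
  GSM_A = c(1.1, 2.2, 3.3),
  GSM_B = c(4.4, 5.5, 6.6),
  row.names = c("200000_s_at", "gnf1h00499_at", "gnf1h00500_at")
)

# Hypothetical request: two real IDs (in reverse order) and one typo.
vectorOfNames <- c("gnf1h00500_at", "200000_s_at", "gnf1h500_at")

# Logical filter: keeps matching rows in the data frame's own order and
# silently ignores IDs that are not present.
by_in <- exprs_demo[rownames(exprs_demo) %in% vectorOfNames, ]

# match() gives row positions in the order of the request; NA marks IDs
# with no exact match, so they can be reported and dropped explicitly.
pos <- match(vectorOfNames, rownames(exprs_demo))
not_found <- vectorOfNames[is.na(pos)]
by_match <- exprs_demo[pos[!is.na(pos)], ]

print(by_in)
print(not_found)
print(by_match)
```

Reporting not_found explicitly makes it easier to see whether an NA row or
an empty result comes from an ID that simply does not match a row name.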
>
> best
>
> Basti
>
> 2012/1/3 Bas Jansen <bjhjansen at gmail.com>:
>> Dear fellow Bioconductor users:
>>
>> Happy New Year!
>> At the moment I am analyzing the GNF Atlas data. I retrieved the data
>> from the Gene Expression Omnibus using the package GEOquery, converted
>> it to an ExpressionSet and extracted the expression values. So now I
>> have a data frame from which I would like to extract the expression
>> values of > 100 probe IDs for 79 tissues. Thing is, if I use a single
>> probe ID, things go fine. However, whenever I use a vector of several
>> probe IDs, things go awry.
>>
>> See below:
>>
>> ***
>>> exprs[c("gnf1h00499_at"),]
>>              GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
>> gnf1h00499_at 5.770829 7.708739 5.161888 7.459432 6.332708 6.902074 4.472488
>> (abbreviated for reasons of clarity)
>> ***
>>
>> As stated above: whenever I use a vector of several probe IDs (say,
>> 2 probe IDs), things go awry:
>>
>> ***
>>> exprs[c("gnf1h00499_at","gnf1h500_at"),]
>>              GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
>> gnf1h00499_at 5.770829 7.708739 5.161888 7.459432 6.332708 6.902074 4.472488
>> NA                  NA       NA       NA       NA       NA       NA       NA
>> etc.
>> ***
>>
>> The gnf1h00500 probe is reported as NA, and I'm pretty sure it has
>> real expression values associated with it.
>> Selecting rows by position works just fine:
>>
>> ***
>>> exprs[c(1:20,30:70),]
>>            GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
>> 200000_s_at        0        0        0        0        0        0        0
>> 200001_at          0        0        0        0        0        0        0
>> 200002_at          0        0        0        0        0        0        0
>> 200003_s_at        0        0        0        0        0        0        0
>> etc.
>> ***
>>
>> So, how do I select rows on the basis of probe IDs? Or better yet:
>> what am I overlooking????
>>
>> Thanks & kind regards,
>> Bas
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


