[BioC] [beadarray] Converting Probe IDs to ILMN Gene Names...

Matthew Ritchie mritchie at wehi.EDU.AU
Wed Sep 22 05:14:28 CEST 2010


Yes - try reading the help page

?match

for further details.  If there are any control probes in your summary
object, these will not match up (match() returns NA for them), since
they aren't annotated as regular transcripts.
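
As a rough sketch of how this behaves, here is a toy example. All the
IDs in it are made up, `ids` is only a stand-in for
featureNames(summary), and the small data frame stands in for the
annotation file, so treat it as an illustration rather than something
to run on real data as-is:

```r
# Toy annotation table; the values are invented for illustration only.
anno <- data.frame(
  Array_Address_Id_0 = c(10001L, 10002L, 10003L, 10004L),
  Probe_Id_0 = c("ILMN_0000001", "ILMN_0000002",
                 "ILMN_0000003", "ILMN_0000004"),
  stringsAsFactors = FALSE
)

# Stand-in for featureNames(summary); "99999" plays the role of a
# control probe that has no row in the annotation table.
ids <- c("10003", "10001", "99999", "10004")

# ord[n] is the row of anno whose Array_Address_Id_0 equals ids[n],
# or NA when there is no such row.
ord <- match(ids, anno$Array_Address_Id_0)
ord
# [1]  3  1 NA  4

# Indexing by ord puts the ILMN_* ids in the same order as ids.
ilmnids <- anno$Probe_Id_0[ord]

# Check the identity from your message wherever a match was found.
ok <- !is.na(ord)
all(anno$Array_Address_Id_0[ord[ok]] == ids[ok])

# One possible way to handle unmatched probes: keep the numeric id.
out_ids <- ifelse(ok, ilmnids, ids)
```

Keeping the numeric id for unmatched probes is just one option; you
could equally drop those rows before writing out your table.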

Best wishes,

Matt

> So, the "ord" vector contains the indices within
> anno$Array_Address_Id_0 where we find the corresponding elements of
> featureNames(summary)?
>
> In other words (or symbols):
>
> anno$Array_Address_Id_0[ord[n]] = featureNames(summary)[n]
>
> I hope I have the index notation correct there.
>
> On Tue, Sep 21, 2010 at 10:49 PM, Matthew Ritchie <mritchie at wehi.edu.au> wrote:
>> Hi James,
>>
>> Something like:
>>
>> anno = read.table("Annotation_Illumina_Human-WG-V3_hg18_V1.0.0_Aug09.txt",
>>                   header=TRUE, sep="\t", as.is=TRUE, quote="",
>>                   fill=TRUE)
>>
>> ord = match(featureNames(summary), anno$Array_Address_Id_0)
>>
>> ilmnids = anno$Search_Key_0[ord]
>>
>> # OR
>> ilmnids = anno$Probe_Id_0[ord] # not sure which ILMN_* you require!
>>
>> should match things up for you.
>>
>> I assume you are analyzing HT-12 arrays since you have set
>> 'imagesPerArray=1'.
>>
>> Best wishes,
>>
>> Matt
>>
>>> I guess I'm struggling with how to "match up" (I'm an R newbie) the
>>> ids even if I have the file.  Is there an example somewhere of how to
>>> do this?  I know this is a stupid question, but I am a Java programmer
>>> and I'm just learning R and trying to get my mind around this whole
>>> "everything is a vector" approach. :)
>>>
>>> By the way, I tried reading in the manifest file using readBGX,
>>> but it kept throwing errors, saying something like "line 336 does not
>>> contain 28 elements".
>>>
>>> On Tue, Sep 21, 2010 at 8:24 PM, Matthew Ritchie <mritchie at wehi.edu.au> wrote:
>>>> Hi James,
>>>>
>>>> If you just wanted to annotate the probes, this could be done in R
>>>> using the annotation package 'illuminaHumanv3BeadID.db'.
>>>>
>>>> If you want to convert the numeric probe IDs to ILMN_* ids, then you
>>>> can use the information in the file
>>>>
>>>> http://www.compbio.group.cam.ac.uk/Resources/Annotation/final/Annotation_Illumina_Human-WG-V3_hg18_V1.0.0_Aug09.zip
>>>>
>>>> (unzip, read in 'Annotation_Illumina_Human-WG-V3_hg18_V1.0.0_Aug09.txt'
>>>> and then match up the probe ids in your summary object with the values
>>>> in the 'Array_Address_Id_0' column).  The corresponding columns in this
>>>> file with ILMN_* ids are either 'Search_Key_0' or 'Probe_Id_0' (entries
>>>> in both start with ILMN_ but end in different numbers - I'm not sure
>>>> which one you are after).  This information can also be obtained from
>>>> the manifest files at
>>>>
>>>> http://www.switchtoi.com/annotationfiles.ilmn
>>>>
>>>> (you will need to select the text version of the chip type you are
>>>> using).
>>>>
>>>> I hope this helps.  Best wishes,
>>>>
>>>> Matt
>>>>
>>>>> I am trying to get a summarized table from our Illumina data.  So far
>>>>> I have:
>>>>>
>>>>> targets = read.table("/home/jcarman/targets.txt", header=TRUE,
>>>>>                      as.is=TRUE)
>>>>> detail = readIllumina(arrayNames=targets$Id, useImages=FALSE,
>>>>>                       annoPkg="Humanv3", targets=targets)
>>>>> summary = createBeadSummaryData(detail, imagesPerArray=1,
>>>>>                                 method="illumina")
>>>>>
>>>>> How do I get the probe ids mapped to the ILMN_* gene ids for my
>>>>> output?
>>>>>
>>>>> sessionInfo() returns:
>>>>>
>>>>> R version 2.11.1 (2010-05-31)
>>>>> x86_64-redhat-linux-gnu
>>>>>
>>>>> locale:
>>>>>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>>>>>  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>>>>>  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>>>>>  [7] LC_PAPER=en_US.utf8       LC_NAME=C
>>>>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>> other attached packages:
>>>>> [1] beadarray_1.16.0 Biobase_2.8.0
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] hwriter_1.2  limma_3.4.4  tools_2.11.1

