[BioC] [beadarray] Converting Probe IDs to ILMN Gene Names...

Wed Sep 22 11:03:42 CEST 2010

Hi James,

The Search_key ids tend to be more gene-specific. Probes that target
the same gene will often share the same Search_key but have different
Probe_ids. I prefer to work with the Probe_id as my identifier, because
probes that target the same gene often behave very differently and you
really need to check their annotation carefully before combining them.
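
As a rough sketch of that check, you could list the Search_keys that are
shared by more than one probe. The table below is toy data with made-up
ids, and the column names follow Matt's snippet further down; they are an
assumption and may differ in the actual file (e.g. 'Search_Key_0').

```r
# Toy annotation table; the ids below are invented for illustration only.
anno <- data.frame(
  Array_Address_Id_0 = c(10001L, 10002L, 10003L, 10004L),
  Search_key = c("ILMN_A", "ILMN_A", "ILMN_B", "ILMN_C"),
  Probe_id   = c("ILMN_P1", "ILMN_P2", "ILMN_P3", "ILMN_P4"),
  stringsAsFactors = FALSE
)

# Count how many probes carry each Search_key.
probes_per_key <- table(anno$Search_key)

# Search_keys that are targeted by more than one probe.
shared_keys <- names(probes_per_key)[probes_per_key > 1]

# Group the Probe_ids behind each shared key so they can be inspected
# one by one before deciding whether to combine them.
is_shared <- anno$Search_key %in% shared_keys
shared_probes <- split(anno$Probe_id[is_shared], anno$Search_key[is_shared])
```

Here shared_probes is a list with one element per shared Search_key, each
holding the Probe_ids to look at.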

By the way, at the moment there is no automatic way of converting
bead-level ids to Illumina IDs within beadarray, but we are considering
it for future releases.
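
Until then, the manual lookup Matt describes below can be tried on toy
data first. This is only a sketch: the ids are invented, bead_ids stands
in for featureNames() of a summary object, and the column names are
assumed from Matt's snippet.

```r
# Toy stand-ins: bead-level ids as character strings (as feature names
# would be) and a small annotation table. All values are invented.
bead_ids <- c("10003", "10001", "99999")

anno <- data.frame(
  Array_Address_Id_0 = c(10001L, 10002L, 10003L),
  Probe_id = c("ILMN_P1", "ILMN_P2", "ILMN_P3"),
  stringsAsFactors = FALSE
)

# match() gives, for each bead id, the row of anno that holds it,
# or NA when the id is not present in the annotation.
ord <- match(bead_ids, as.character(anno$Array_Address_Id_0))

# Indexing the ILMN column by those row positions lines the ids up
# with the original order of bead_ids.
ilmn_ids <- anno$Probe_id[ord]

# Fall back to the bead-level id where no annotation was found.
ilmn_ids[is.na(ord)] <- bead_ids[is.na(ord)]
```

Checking sum(is.na(ord)) before the last step shows how many ids failed
to match, which is worth doing before trusting the converted names.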

Best,

Mark

On Wed, Sep 22, 2010 at 4:16 AM, James Carman
<james at carmanconsulting.com> wrote:
> What's the difference between "Search_key" and "Probe_id"?  I want the
> ILMN ids that folks would most likely be using.  I think I saw it
> referred to before as the "ILMN Gene" or something like that.
>
> On Tue, Sep 21, 2010 at 10:49 PM, Matthew Ritchie <mritchie at wehi.edu.au> wrote:
>> Hi James,
>>
>> Something like:
>>
>> anno = read.table("Annotation_Illumina_Human-WG-V3_hg18_V1.0.0_Aug09.txt",
>>                   header=TRUE, sep="\t", as.is=TRUE, quote="", fill=TRUE)
>>
>> ord = match(featureNames(summary), anno$Array_Address_Id_0)
>>
>> ilmnids = anno$Search_key[ord]
>>
>> # OR
>> ilmnids = anno$Probe_id[ord] # not sure which ILMN_* you require!
>>
>> should match things up for you.
>>
>> I assume you are analyzing HT-12 arrays since you have set
>> 'imagesPerArray=1'.
>>
>> Best wishes,
>>
>> Matt
>>
>>> I guess I'm struggling with how to "match up" (I'm an R newbie) the
>>> ids even if I have the file.  Is there an example somewhere of how to
>>> do this?  I know this is a stupid question, but I am a Java programmer
>>> and I'm just learning R and trying to get my mind around this whole
>>> "everything is a vector" approach. :)
>>>
>>> By the way, I tried reading in the manifest file using readBGX,
>>> but it kept throwing errors, saying something like "line 336 does not
>>> contain 28 elements".
>>>
>>> On Tue, Sep 21, 2010 at 8:24 PM, Matthew Ritchie <mritchie at wehi.edu.au> wrote:
>>>> Hi James,
>>>>
>>>> If you just wanted to annotate the probes, this could be done in R using
>>>> the annotation package 'illuminaHumanv3BeadID.db'
>>>>
>>>> If you want to convert the numeric probe IDs to ILMN_* ids, then you can
>>>> use the information in the file
>>>>
>>>> http://www.compbio.group.cam.ac.uk/Resources/Annotation/final/Annotation_Illumina_Human-WG-V3_hg18_V1.0.0_Aug09.zip
>>>>
>>>> (unzip, read in 'Annotation_Illumina_Human-WG-V3_hg18_V1.0.0_Aug09.txt'
>>>> and then match up the probe ids in your summary object with the values
>>>> in the 'Array_Address_Id_0' column).  The corresponding columns in this
>>>> file with ILMN_* ids are either 'Search_Key_0' or 'Probe_Id_0' (entries
>>>> in both start with ILMN_ but end in different numbers - I'm not sure
>>>> which one you are after).  This information can also be obtained from
>>>> the manifest files at
>>>>
>>>> http://www.switchtoi.com/annotationfiles.ilmn
>>>>
>>>> (you will need to select the text version for the chip type you are using).
>>>>
>>>> I hope this helps.  Best wishes,
>>>>
>>>> Matt
>>>>
>>>>> I am trying to get a summarized table from our Illumina data.  So far I
>>>>> have:
>>>>>
>>>>> targets = read.table("/home/jcarman/targets.txt", header=TRUE,
>>>>>                      as.is=TRUE)
>>>>> detail = readIllumina(arrayNames=targets$Id, useImages=FALSE,
>>>>>                       annoPkg="Humanv3", targets=targets)
>>>>> summary = createBeadSummaryData(detail, imagesPerArray=1,
>>>>>                                 method="illumina")
>>>>>
>>>>> How do I get the probe ids mapped to the ILMN_* gene ids for my output?
>>>>>
>>>>> sessionInfo() returns:
>>>>>
>>>>> R version 2.11.1 (2010-05-31)
>>>>> x86_64-redhat-linux-gnu
>>>>>
>>>>> locale:
>>>>>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>>>>>  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>>>>>  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>>>>>  [7] LC_PAPER=en_US.utf8       LC_NAME=C
>>>>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>> other attached packages:
>>>>> [1] beadarray_1.16.0 Biobase_2.8.0
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] hwriter_1.2  limma_3.4.4  tools_2.11.1
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor