[BioC] analyzing HumanHT12 with lumi

Paul Leo p.leo at uq.edu.au
Wed Sep 9 09:34:48 CEST 2009


Not sure if this will help you ... 
Have you tried using the annotations file at:

http://www.switchtoi.com/annotationprevfiles.ilmn

Get the text version and see whether that works with lumiR.

Personally I don't bother. I read the data without annotation:
x.lumi <- lumiR(filenames, convertNuID=FALSE, inputAnnotation=FALSE)
and annotate later, either with Bioconductor libraries via the probe IDs or with
the Illumina annotation file via the Array_Address_Id (handle multi-mappers in
whatever way makes sense to you):

> ann <- read.delim("HumanHT-12_V3_0_R1_11283641_T.txt", header=TRUE, skip=8, sep="\t", fill=TRUE)
> dim(ann)
[1] 48803    28
> colnames(ann)
 [1] "Species"               "Source"                "Search_Key"            "Transcript"
 [5] "ILMN_Gene"             "Source_Reference_ID"   "RefSeq_ID"             "Unigene_ID"
 [9] "Entrez_Gene_ID"        "GI"                    "Accession"             "Symbol"
[13] "Protein_Product"       "Probe_Id"              "Array_Address_Id"      "Probe_Type"
[17] "Probe_Start"           "Probe_Sequence"        "Chromosome"            "Probe_Chr_Orientation"
[21] "Probe_Coordinates"     "Cytoband"              "Definition"            "Ontology_Component"
[25] "Ontology_Process"      "Ontology_Function"     "Synonyms"              "Obsolete_Probe_Id"
> length(unique(ann[,"Array_Address_Id"]))
[1] 48803
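
To illustrate the "annotate later" step, here is a minimal sketch in base R. It assumes that the feature names of the LumiBatch are the Array_Address_Id values, which you should check against your own export. The small `ann` table, the gene symbols and `probe_ids` are made-up stand-ins, not real HumanHT-12 content.

```r
# Toy stand-in for the Illumina annotation table (hypothetical values).
ann <- data.frame(
  Array_Address_Id = c(10001, 10002, 10003),
  Symbol           = c("GENE_A", "GENE_B", "GENE_C"),
  Entrez_Gene_ID   = c(111, 222, 333),
  stringsAsFactors = FALSE
)

# Toy stand-in for featureNames(x.lumi); "99999" is deliberately absent
# from the annotation table.
probe_ids <- c("10003", "10001", "99999")

# match() returns, for each probe, its row in ann (NA if not found),
# so the result keeps the order of the expression data.
idx <- match(probe_ids, as.character(ann$Array_Address_Id))

probe_ann <- data.frame(
  ProbeID = probe_ids,
  ann[idx, c("Symbol", "Entrez_Gene_ID")],
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(probe_ann)
```

With real data, `probe_ids` would come from featureNames(x.lumi) and `ann` from the read.delim call above. Probes that are missing from the annotation file come back as NA instead of being dropped, so the rows stay aligned with the expression matrix.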

-----Original Message-----
From: die_stevie at web.de
To: amit491 at gmail.com
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] analyzing HumanHT12 with lumi
Date: Wed, 09 Sep 2009 09:01:53 +0200

Hello Amit,
 
 First, thank you very much for your response!
 
 I included the TargetID column and tried running the lumiR function 
 with all available options set explicitly, but the result is still the same.

 
 test <- lumiR(file = "D:/Programme/eclipse/tmp/FinalReport.txt", sep = "\t",
   detectionTh = 0.01, na.rm = TRUE, convertNuID = TRUE, lib.mapping = NULL,
   dec = '.', parseColumnName = TRUE, checkDupId = TRUE, QC = TRUE,
   columnNameGrepPattern = list(exprs='AVG_SIGNAL', se.exprs='BEAD_STD',
     detection='DETECTION', beadNum='Avg_NBEADS'),
   inputAnnotation = TRUE, annotationColumn = c('PROBE_SEQUENCE'), verbose = TRUE)
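
As a quick sanity check on the columnNameGrepPattern values, the sketch below applies the same four patterns to a made-up header line with base R grep(). The header names are hypothetical, case-insensitive matching is an assumption on my side, and this only approximates how column matching could work; it is not lumiR's actual implementation.

```r
# Hypothetical column header of a BeadStudio export with two samples.
header <- c(
  "TargetID", "ProbeID",
  "4433719067_A.AVG_Signal", "4433719067_A.BEAD_STDERR",
  "4433719067_A.Avg_NBEADS", "4433719067_A.Detection Pval",
  "4433719067_B.AVG_Signal", "4433719067_B.BEAD_STDERR",
  "4433719067_B.Avg_NBEADS", "4433719067_B.Detection Pval"
)

# The same patterns as in the lumiR call above.
patterns <- list(exprs = "AVG_SIGNAL", se.exprs = "BEAD_STD",
                 detection = "DETECTION", beadNum = "Avg_NBEADS")

# For each pattern, find the positions of the matching columns.
# ignore.case = TRUE is an assumption made for this sketch.
cols <- lapply(patterns, function(p) grep(p, header, ignore.case = TRUE))

# Report any pattern that matched no column at all.
unmatched <- names(cols)[sapply(cols, length) == 0]
print(cols)
print(unmatched)
```

If a pattern ends up in `unmatched`, or if the four patterns match different numbers of columns, the export is probably missing a column or uses a different header spelling.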

 
 I also pre-processed a Mouse WG-6 chip (V2) and everything is fine 
 there: no duplicated IDs or “Inf” in the quality control.
 
 Maybe there is a problem with the HumanHT12 chip?
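
On the "Inf" and "NaN" values in the QC summary: one possible explanation, not confirmed anywhere in this thread, is that one sample contains an intensity of zero (or below), so that a log2 transform produces -Inf for that probe. The sketch below only demonstrates that arithmetic on a made-up matrix; `exprs_mat` and its sample names are hypothetical.

```r
# Made-up raw intensities: 3 probes x 2 samples.
# sample_J contains a single zero.
exprs_mat <- matrix(
  c(110, 95, 300,
      0, 120, 250),
  nrow = 3,
  dimnames = list(c("p1", "p2", "p3"), c("sample_I", "sample_J"))
)

# Count non-positive intensities per sample.
n_nonpos <- colSums(exprs_mat <= 0)

# log2(0) is -Inf, and a single -Inf makes the column mean -Inf.
logged    <- log2(exprs_mat)
col_means <- colMeans(logged)

# The standard deviation of a column containing -Inf is not a number.
col_sds <- apply(logged, 2, sd)

print(n_nonpos)
print(col_means)
print(col_sds)
```

On the real object, something like colSums(exprs(importData) <= 0) would show whether any sample has such values; whether this is really the cause here is a guess.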
 
 Does anyone else have any advice?
 
 Thanks again!
 
 Kind regards,
 
 Steffi

 
> From: amit mandal [mailto:amit491 at gmail.com] 
>  *Sent:* Tuesday, September 08, 2009 5:30 PM
>  *To:* stefanie.figura
>  *Cc:* BioC_mail
>  *Subject:* Fwd: [BioC] analyzing HumanHT12 with lumi
> 
> Hello Steffi,
>  For 'lumi', the data need to be exported from BeadStudio with the 
> columns in a particular order (they can also be rearranged outside 
> BeadStudio). The columns, in order, are:
> 1) TargetID
> 2) ProbeID (this is different from the Probe_ID col)
> 3) Avg_Signal
> 4) BEAD_STDER
> 5) Detection Pval
>  These are the mandatory columns. Annotation columns can be added as 
> well; they are described in the "Using lumi.." PDF that comes as a 
> vignette with the package.
>  Also, when importing the data with the lumiR command, one needs to 
> specify the grep patterns for the column headers, which lumiR uses to 
> recognize which column contains what. The default output already has 
> the columns in the order that lumiR's default settings expect, but 
> specify the patterns just in case.
>  I haven't analyzed HT-12, only WG-6, and the above method works fine 
> there. lumi takes "ProbeID" as the unique identifier (v 3.0 chips 
> onward), and I didn't encounter a 'duplicate ID..' message.
>  I'm also unsure about the 'Inf' message. Maybe try importing the data 
> with most of the options specified explicitly, i.e. the column grep 
> pattern.
> 
> regards
> amit mandal
> 
> Graduate student
> Genomics & Molecular Medicine lab
> IGIB, Delhi
> 
> On Tue, Sep 8, 2009 at 7:30 PM, stefanie.figura <figura at uni-muenster.de> wrote:
> 
> Dear all!
> 
> I tried to analyse the Illumina HumanHT12 chip with the lumi package, 
> and I have some questions about the import and the results of the 
> quality control.
> 
> My first question: which columns have to be exported from BeadStudio 
> at a minimum? I am not sure, because the figure in the .pdf manual for 
> the lumi package is not shown completely.
> 
> I only exported ProbeID, PROBE_SEQUENCE (for nuID mapping with
> biocLite("lumiHumanAll.db")), AVG_Signal, BEAD_STDERR, Avg_NBEADS and
> Detection Pval from Group Gene Profile for all samples. Is there 
> anything I missed that is important for the analysis?
> 
> I am not sure whether I made a mistake in the code, given the results 
> of the quality control:
> 
> > importData <- lumiR("D:/Programme/eclipse/tmp/tmp_GroupProbeProfile.txt")
> 
> Perform Quality Control assessment of the LumiBatch object ...
> Directly converting probe sequence to nuIDs ...
> Duplicated IDs found and were merged!
> 
> > importData
> Summary of data information:
>  Data File Information:
>  BSGX Version 3.2.3
>  Report Date 9/8/2009 1:41:49 PM
>  Project tmp
>  Group Set all_seperated
>  Analysis all_seperated_nonorm
>  Normalization none
> 
> Major Operation History:
>             submitted            finished
> 1 2009-09-08 15:44:37 2009-09-08 15:45:12
> 2 2009-09-08 15:45:12 2009-09-08 15:45:14
> 3 2009-09-08 15:45:34 2009-09-08 15:45:34
> 4 2009-09-08 15:45:14 2009-09-08 15:45:35
>   command
> 1 lumiR("D:/Programme/eclipse/tmp/tmp_GroupProbeProfile.txt")
> 2 lumiQ(x.lumi = x.lumi, detectionTh = detectionTh, verbose = verbose)
> 3 Subsetting 48803 features.
> 4 addNuID2lumi(x.lumi = x.lumi, lib.mapping = lib.mapping, verbose = verbose)
>   lumiVersion
> 1 1.10.1
> 2 1.10.1
> 3 1.10.1
> 4 1.10.1
> 
> Object Information:
> LumiBatch (storageMode: lockedEnvironment)
> assayData: 48802 features, 24 samples
>  element names: beadNum, detection, exprs, se.exprs
> phenoData
>  sampleNames: 4433719067_A, 4433719067_B, ..., 4433719068_L (24 total)
>  varLabels and varMetadata description:
>  sampleID: The unique Illumina microarray Id
> featureData
>  featureNames: Ku8QhfS0n_hIOABXuE, fqPEquJRRlSVSfL.8A, ..., N8t5EuJCr0Tk9.zHno (48802 total)
>  fvarLabels and fvarMetadata description:
>  ProbeID: The Illumina microarray identifier
> experimentData: use 'experimentData(object)'
> Annotation:
> Control Data: Available
> QC information: Please run summary(x, 'QC') for details!
> 
> > summary(importData, 'QC')
> Data dimension: 48802 genes x 24 samples
> 
> Summary of Samples:
>                          4433719067_A 4433719067_B 4433719067_C 4433719067_D
> mean                           6.8010       6.7230       6.6660       6.6870
> standard deviation             1.6760       1.6360       1.6370       1.6550
> detection rate(0.01)           0.3367       0.3432       0.3459       0.3436
> distance to sample mean           Inf          Inf          Inf          Inf
>                          4433719067_E 4433719067_F 4433719067_G 4433719067_H
> mean                           6.7220       6.6060        6.623       6.5730
> standard deviation             1.6470       1.6440        1.675       1.6440
> detection rate(0.01)           0.3531       0.3318        0.346       0.3378
> distance to sample mean           Inf          Inf          Inf          Inf
>                          4433719067_I 4433719067_J 4433719067_K 4433719067_L
> mean                           6.5400       6.5390       6.5470       6.4570
> standard deviation             1.6420       1.6740       1.6790       1.6390
> detection rate(0.01)           0.3316       0.3424       0.3464       0.3518
> distance to sample mean           Inf          Inf          Inf          Inf
>                          4433719068_A 4433719068_B 4433719068_C 4433719068_D
> mean                           6.3170       6.3000        6.304        6.213
> standard deviation             1.5630       1.5970        1.619        1.566
> detection rate(0.01)           0.3348       0.3257        0.336        0.320
> distance to sample mean           Inf          Inf          Inf          Inf
>                          4433719068_E 4433719068_F 4433719068_G 4433719068_H
> mean                            6.253       6.2510        6.169       6.2380
> standard deviation              1.600       1.6170        1.579       1.6590
> detection rate(0.01)            0.347       0.3434        0.335       0.3455
> distance to sample mean           Inf          Inf          Inf          Inf
>                          4433719068_I 4433719068_J 4433719068_K 4433719068_L
> mean                             -Inf        6.191       6.1420       6.0510
> standard deviation                NaN        1.642       1.6150       1.5360
> detection rate(0.01)           0.3319        0.346       0.3429       0.3462
> distance to sample mean       62.4000          Inf          Inf          Inf
> 
> I wonder about the "Inf" and "NaN" values, and I really think 
> something went wrong.
> 
> Any advice is welcome, because I just started to learn R.
> 
> Thank you very much in advance!
> 
> Kind regards,
> 
> Steffi
> 
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> -- 
> ---------------------------------------------------------------
> The robbed that smiles, steals something
> from the thief.
> - Shakespeare
> ---------------------------------------------------------------
> 
> 
> 

