[BioC] analyzing HumanHT12 with lumi

Wed Sep 9 10:36:34 CEST 2009

Hi!

This also does not work. 
Loading the BeadStudio export always produces these "Inf" and "NA" values in the quality control, even when I skip the annotation at this point of the analysis.

My export looks like this:

[Header]	
BSGX Version 	3.2.3
Report Date	9/8/2009 13:41
Project	tmp
Group Set	all_seperated
Analysis	all_seperated_nonorm
Normalization	none
[Group Probe Profile]

TargetID	ProbeID	4433719067_A:AVG_Signal	4433719067_A:BEAD_STDERR	4433719067_A:Avg_NBEADS	4433719067_A:Detection Pval
7A5	6450255	51.57291	2.277424	28	0.9106901

(just for the first slide on the array)

Is the layout correct?
I wonder about the ":" in ArrayID_Slide":"AVG_Signal

x.lumi<-lumiR(filename, convertNuID=FALSE, inputAnnotation=FALSE) should work, shouldn't it?
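
To make the ":" question concrete, here is a minimal base R sketch of how headers of the form "ArrayID_Slide:Measurement" decompose into a sample ID and a measurement name. It only uses the header line shown above; the variable names are illustrative, and it says nothing about how lumiR itself parses the headers.

```r
# Column headers exactly as they appear in the export above.
header <- c("TargetID", "ProbeID",
            "4433719067_A:AVG_Signal", "4433719067_A:BEAD_STDERR",
            "4433719067_A:Avg_NBEADS", "4433719067_A:Detection Pval")

# Columns that carry a "sample:measurement" label.
per_sample <- grepl(":", header, fixed = TRUE)

# Text before the first colon is the sample ID, text after it the measurement.
sample_id   <- sub(":.*$", "", header[per_sample])
measurement <- sub("^[^:]*:", "", header[per_sample])

layout <- data.frame(sample_id, measurement, stringsAsFactors = FALSE)
print(layout)
```

If every per-sample column splits cleanly like this, the header layout is at least internally consistent.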

Kind regards,
Steffi

> -----Original Message-----
> From: "Paul Leo" <p.leo at uq.edu.au>
> Sent: 09.09.09 09:35:14
> To: die_stevie at web.de
> CC: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] analyzing HumanHT12 with lumi

> Not sure if this will help you ... 
> Have you tried using the annotations file at:
> 
> http://www.switchtoi.com/annotationprevfiles.ilmn
> 
> Get the text version and see if that works for lumiR.
> 
> Personally I don't bother.
> x.lumi<-lumiR(filenames,convertNuID=FALSE,inputAnnotation=FALSE)
> and annotate later with Bioconductor libraries via the probe IDs or the
> Illumina annotation file via the Array_Address_Id (do what makes
> sense to you with multi-mappers)....
> 
> ann<-read.delim("HumanHT-12_V3_0_R1_11283641_T.txt",header=TRUE,skip=8,sep="\t",fill=TRUE)
> dim(ann)
> 
> [1] 48803    28
> > colnames(ann)
>  [1] "Species"               "Source"                "Search_Key"            "Transcript"
>  [5] "ILMN_Gene"             "Source_Reference_ID"   "RefSeq_ID"             "Unigene_ID"
>  [9] "Entrez_Gene_ID"        "GI"                    "Accession"             "Symbol"
> [13] "Protein_Product"       "Probe_Id"              "Array_Address_Id"      "Probe_Type"
> [17] "Probe_Start"           "Probe_Sequence"        "Chromosome"            "Probe_Chr_Orientation"
> [21] "Probe_Coordinates"     "Cytoband"              "Definition"            "Ontology_Component"
> [25] "Ontology_Process"      "Ontology_Function"     "Synonyms"              "Obsolete_Probe_Id"
> length(unique(ann[,"Array_Address_Id"]))
> [1] 48803
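
A minimal sketch of the "annotate later" step described above, using base R `match()`. The two small tables are toy stand-ins: only the pair 6450255 / 7A5 comes from this thread, the other IDs and symbols are made up for illustration, and it is an assumption here that the ProbeID column of the export corresponds to Array_Address_Id.

```r
# Toy annotation table; the second and third rows are hypothetical.
ann <- data.frame(
  Array_Address_Id = c(6450255, 1000002, 1000003),
  Symbol           = c("7A5", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Each Array_Address_Id should occur once, as the unique() check above suggests.
stopifnot(anyDuplicated(ann$Array_Address_Id) == 0)

# Probe IDs in the row order of the expression data (hypothetical order).
probe_ids <- c("1000002", "6450255", "9999999")

# Look up each probe in the annotation; probes without a match get NA.
idx <- match(probe_ids, as.character(ann$Array_Address_Id))
annotated <- data.frame(ProbeID = probe_ids,
                        Symbol  = ann$Symbol[idx],
                        stringsAsFactors = FALSE)
print(annotated)
```

Using `match()` keeps the row order of the expression data, which a plain `merge()` would not guarantee.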
> 
> -----Original Message-----
> From: die_stevie at web.de
> To: amit491 at gmail.com
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] analyzing HumanHT12 with lumi
> Date: Wed, 09 Sep 2009 09:01:53 +0200
> 
> Hello Amit,
>  
>  first thank you very much for your response!
>  
>  I included the TargetID column and tried to run the lumiR function 
>  with all options available, but the result is still the same.
> 
>  
>  test <- lumiR(file = "D:/Programme/eclipse/tmp/FinalReport.txt", sep = "\t",
>    detectionTh = 0.01, na.rm = TRUE, convertNuID = TRUE, lib.mapping = NULL,
>    dec = '.', parseColumnName = TRUE, checkDupId = TRUE, QC = TRUE,
>    columnNameGrepPattern = list(exprs='AVG_SIGNAL', se.exprs='BEAD_STD',
>      detection='DETECTION', beadNum='Avg_NBEADS'),
>    inputAnnotation=TRUE, annotationColumn=c('PROBE_SEQUENCE'), verbose = TRUE)
> 
>  
>  I also pre-processed a Mouse WG-6 chip (V2) and everything is fine 
>  there: no duplicated IDs or "Inf" in the quality control.
>  
>  Maybe there is a problem with the HumanHT12 chip?
>  
>  Does anyone else have any advice?
>  
>  Thanks again!
>  
>  Kind regards,
>  
>  Steffi
> 
>  
> > From: amit mandal [mailto:amit491 at gmail.com] 
> >  *Sent:* Tuesday, September 08, 2009 5:30 PM
> >  *To:* stefanie.figura
> >  *Cc:* BioC_mail
> >  *Subject:* Fwd: [BioC] analyzing HumanHT12 with lumi
> > 
> > hello Steffi,
> >  In 'lumi' one needs to export the data from BeadStudio with the 
> > columns in a particular order (they can also be rearranged outside 
> > BeadStudio). The columns, in order, are:
> > 1) TargetID
> > 2) ProbeID (this is different from the Probe_ID col)
> > 3) Avg_Signal
> > 4) BEAD_STDERR
> > 5) Detection Pval
> >  These are the mandatory columns. Apart from them, annotation 
> > columns can also be added. Information about them is given in the 
> > "Using lumi.." PDF that comes as a vignette with the package.
> >  Also, while importing the data with the lumiR command, one needs to 
> > specify the grep pattern of the column headers by which lumiR would 
> > recognize which column contains what. The default output does have 
> > the columns in the order lumiR expects with its default settings, but 
> > just in case.
> >  I haven't analyzed HT-12, only WG-6, and the above method works fine. 
> > lumi takes "ProbeID" as the unique identifier (v3.0 chips onward) and 
> > I didn't encounter a 'duplicate ID..' message.
> >  I'm also unsure about the 'Inf' message. Maybe try importing the data 
> > with most of the options specified explicitly, such as the column grep pattern.
> > 
> > regards
> > amit mandal
> > 
> > Graduate student
> > Genomics & Molecular Medicine lab
> > IGIB, Delhi
> > 
> > On Tue, Sep 8, 2009 at 7:30 PM, stefanie.figura <figura at uni-muenster.de> wrote:
> > 
> > Dear all!
> > 
> > I tried to analyse the Illumina HumanHT12 chip with the lumi package and I
> > have some questions about the import and the results of the quality control.
> > 
> > My first question is: which columns have to be exported from BeadStudio at
> > a minimum? I am not sure, because the figure in the .pdf manual for the lumi
> > package is not shown completely.
> > 
> > I only exported ProbeID, PROBE_SEQUENCE (for nuID mapping with
> > biocLite("lumiHumanAll.db")), AVG_Signal, BEAD_STDERR, Avg_NBEADS and
> > Detection Pval from Group Gene Profile for all samples. Is there anything I
> > missed which is important for the analysis?
> > 
> > I am not sure if I made a mistake in the code, because of the results of the
> > quality control:
> > 
> > > importData <- lumiR("D:/Programme/eclipse/tmp/tmp_GroupProbeProfile.txt")
> > 
> > Perform Quality Control assessment of the LumiBatch object ...
> > 
> > Directly converting probe sequence to nuIDs ...
> > 
> > Duplicated IDs found and were merged!
> > 
> > > importData
> > 
> > Summary of data information:
> > 
> >  Data File Information:
> > 
> >  BSGX Version 3.2.3
> > 
> >  Report Date 9/8/2009 1:41:49 PM
> > 
> >  Project tmp
> > 
> >  Group Set all_seperated
> > 
> >  Analysis all_seperated_nonorm
> > 
> >  Normalization none
> > 
> > Major Operation History:
> >             submitted            finished
> > 1 2009-09-08 15:44:37 2009-09-08 15:45:12
> > 2 2009-09-08 15:45:12 2009-09-08 15:45:14
> > 3 2009-09-08 15:45:34 2009-09-08 15:45:34
> > 4 2009-09-08 15:45:14 2009-09-08 15:45:35
> >   command
> > 1 lumiR("D:/Programme/eclipse/tmp/tmp_GroupProbeProfile.txt")
> > 2 lumiQ(x.lumi = x.lumi, detectionTh = detectionTh, verbose = verbose)
> > 3 Subsetting 48803 features.
> > 4 addNuID2lumi(x.lumi = x.lumi, lib.mapping = lib.mapping, verbose = verbose)
> >   lumiVersion
> > 1      1.10.1
> > 2      1.10.1
> > 3      1.10.1
> > 4      1.10.1
> > 
> > Object Information:
> > 
> > LumiBatch (storageMode: lockedEnvironment)
> > 
> > assayData: 48802 features, 24 samples
> > 
> >  element names: beadNum, detection, exprs, se.exprs
> > 
> > phenoData
> > 
> >  sampleNames: 4433719067_A, 4433719067_B, ..., 4433719068_L (24 total)
> > 
> >  varLabels and varMetadata description:
> > 
> >  sampleID: The unique Illumina microarray Id
> > 
> > featureData
> > 
> >  featureNames: Ku8QhfS0n_hIOABXuE, fqPEquJRRlSVSfL.8A, ...,
> > N8t5EuJCr0Tk9.zHno (48802 total)
> > 
> >  fvarLabels and fvarMetadata description:
> > 
> >  ProbeID: The Illumina microarray identifier
> > 
> > experimentData: use 'experimentData(object)'
> > 
> > Annotation:
> > 
> > Control Data: Available
> > 
> > QC information: Please run summary(x, 'QC') for details!
> > 
> > > summary(importData, 'QC')
> > 
> > Data dimension: 48802 genes x 24 samples
> > 
> > Summary of Samples:
> >                         4433719067_A 4433719067_B 4433719067_C 4433719067_D
> > mean                          6.8010       6.7230       6.6660       6.6870
> > standard deviation            1.6760       1.6360       1.6370       1.6550
> > detection rate(0.01)          0.3367       0.3432       0.3459       0.3436
> > distance to sample mean          Inf          Inf          Inf          Inf
> >                         4433719067_E 4433719067_F 4433719067_G 4433719067_H
> > mean                          6.7220       6.6060        6.623       6.5730
> > standard deviation            1.6470       1.6440        1.675       1.6440
> > detection rate(0.01)          0.3531       0.3318        0.346       0.3378
> > distance to sample mean          Inf          Inf          Inf          Inf
> >                         4433719067_I 4433719067_J 4433719067_K 4433719067_L
> > mean                          6.5400       6.5390       6.5470       6.4570
> > standard deviation            1.6420       1.6740       1.6790       1.6390
> > detection rate(0.01)          0.3316       0.3424       0.3464       0.3518
> > distance to sample mean          Inf          Inf          Inf          Inf
> >                         4433719068_A 4433719068_B 4433719068_C 4433719068_D
> > mean                          6.3170       6.3000        6.304        6.213
> > standard deviation            1.5630       1.5970        1.619        1.566
> > detection rate(0.01)          0.3348       0.3257        0.336        0.320
> > distance to sample mean          Inf          Inf          Inf          Inf
> >                         4433719068_E 4433719068_F 4433719068_G 4433719068_H
> > mean                           6.253       6.2510        6.169       6.2380
> > standard deviation             1.600       1.6170        1.579       1.6590
> > detection rate(0.01)           0.347       0.3434        0.335       0.3455
> > distance to sample mean          Inf          Inf          Inf          Inf
> >                         4433719068_I 4433719068_J 4433719068_K 4433719068_L
> > mean                            -Inf        6.191       6.1420       6.0510
> > standard deviation               NaN        1.642       1.6150       1.5360
> > detection rate(0.01)          0.3319        0.346       0.3429       0.3462
> > distance to sample mean      62.4000          Inf          Inf          Inf
> > 
> > I wonder about the "Inf" and "NaN" values, and I really think something
> > went wrong.
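
One possible mechanism for a "-Inf" mean together with a "NaN" standard deviation can be reproduced in a few lines of base R. The sketch assumes that the QC summary is computed on log2-transformed intensities (the means of about 6 to 7 suggest this, but it is not confirmed in this thread) and uses made-up intensities, one of which is zero.

```r
# Hypothetical raw intensities for one sample; the zero is the culprit.
raw <- c(0, 51.6, 120.3, 800)

# log2(0) is -Inf, so one zero is enough to affect every summary below.
logged <- log2(raw)

sample_mean <- mean(logged)  # -Inf: the sum already contains -Inf
sample_sd   <- sd(logged)    # NaN: (-Inf) - (-Inf) is undefined

print(sample_mean)
print(sample_sd)

# Locating the non-positive values that cause this.
bad <- which(raw <= 0)
print(bad)
```

Running the same `which(x <= 0)` check on the AVG_Signal column of the affected sample (4433719068_I in the summary above) would show whether this explanation applies to the real data.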
> > 
> > Any advice is welcome, because I just started to learn R.
> > 
> > Thank you very much in advance!
> > 
> > Kind regards,
> > 
> > Steffi
> > 
> >  [[alternative HTML version deleted]]
> > 
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> > 
> > -- 
> > ---------------------------------------------------------------
> > The robbed that smiles, steals something
> > from the thief.
> > - Shakespeare
> > ---------------------------------------------------------------
> > 
> > 
> > 
> 
> 