[BioC] codelink analysis
Diego Diez
diez at kuicr.kyoto-u.ac.jp
Wed Aug 27 04:26:15 CEST 2008
Dear Lixia,
On Wed, Aug 27, 2008 at 7:00 AM, Diao,Lixia <ldiao at mdanderson.org> wrote:
> Dear Dr. Diez,
>
> I am trying to use codelink package to do analysis with Codelink arrays. It is indicated that codelink package only recognizes the text files exported from the Codelink software. We have 32 txt files, when I used :
> files<-list.files(pattern="txt")
> data<-readCodelink(files=files)
>
> it returns:
>
> Error in readHeader(files[n], dec = TRUE) : Not a Codelink exported file.
>
> The first several rows of files are:
>
> PROJECT
> EXPERIMENT
> SAMPLE Sample001
> DATE 2007-10-26T17:19:48
>
>
> GENEID NCBI_ACCESSION TYPE_FLAG LO_ARRAY_ID EXPRESSIONVALUE NORMALIZEDEXPRESSIONVALUE GENESPRINGFLAG CODELINKFLAG SPOT_COL SPOT_ROW SPOTMEAN SPOTMEDIAN BKGMEAN BKGMEDIAN
> .......
>
>
> I emailed to the person from the company. They claimed that this txt file is from Codelink
> system directly without any manipulation. I checked the code of readCodelink, it seems
> there should be product, number of genes..... fields. Would you like to help me about this?
> Will the file needs more header to be recognized by the codelink package?
Well, this looks like data coming from a Codelink analysis, but it is not
in the expected format, which should look something like this:
<--- file start --->
CodeLink Expression Analysis 4.1.0.29054
D Diez Report for Slide (T00298850)
LAYOUT EXP287X128-950.22.ID
PROJECT RATA BEATRIZ
EXPERIMENT
PRODUCT Rat Whole Genome
Sample Name Array 1 T3-5(3)
Median Array 1 31,0905990600586
Report( 1 ): Adultos
--------------------------------------------------------------------------------
Idx Probe_name Probe_type Feature_id Raw_intensity Normalized_intensity
Quality_flag Signal_strength Logical_row Logical_col
Center_X Center_Y Spot_mean Spot_median Spot_stdev
Spot_area Spot_diameter Spot_noise_level Bkgd_mean Bkgd_median
Bkgd_stdev Bkgd_area Array Sample_name
1 GE200017 FIDUCIAL 1001 1614,4359 51,9268 G
37,7907 1 1 135 137 1645,4359 1622,0000 1261,6964
119 12,3091630935669 43,5407773523152 32,7917 31,0000
8,3605 124 1 T3-5(3)
<--- snip --->
As you can see, in your case all field names are uppercase, and some of
them don't match the expected ones. Some fields are mandatory, but the
header is mostly unnecessary, although it is used to check the file
format (in particular the first line). One possibility is that your
files correspond to a new Codelink format: since Codelink passed from GE
Healthcare to Applied Microarrays I haven't been able to follow the
changes they may have introduced into the file format. In particular,
the new Codelink software (v5) used to export the text files may have
changed the original format. On the other hand, the field
GENESPRINGFLAG looks suspicious, as if the data had been processed with
GeneSpring, but again I am just guessing. Is the header sample you
included copied and pasted verbatim from one of the files?
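To see which of your 32 files would fail a first-line check of this kind,
something like the following base R sketch may help. It assumes the check
amounts to finding the string "CodeLink" in the first line, which is a
simplification and not the exact code inside readHeader, and
check_first_line is a made-up helper name, not part of the codelink package.

```r
# Hypothetical helper (not part of the codelink package): report the first
# line of each file and whether it contains the string "CodeLink".
# The real test performed by readHeader() may differ; this is an assumption.
check_first_line <- function(files) {
  first <- vapply(files, function(f) {
    x <- readLines(f, n = 1, warn = FALSE)
    if (length(x) == 0) "" else x
  }, character(1))
  data.frame(
    file = files,
    first_line = unname(first),
    looks_codelink = grepl("CodeLink", first, fixed = TRUE),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Demonstration on two temporary files whose first lines are taken from
# the two layouts shown in this thread.
expected <- tempfile(fileext = ".txt")
writeLines(c("CodeLink Expression Analysis 4.1.0.29054",
             "PROJECT RATA BEATRIZ"), expected)
other <- tempfile(fileext = ".txt")
writeLines(c("PROJECT", "EXPERIMENT"), other)

res <- check_first_line(c(expected, other))
print(res[, c("first_line", "looks_codelink")])
```

On your own data you would call check_first_line(list.files(pattern="txt"))
and look at the rows where looks_codelink is FALSE.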
If your data turn out to be in a genuine Codelink format, I will try to
add support for it, but I will need some extra information.
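The header lines and the exact column names of one file are the kind of
detail that could be useful here. A minimal sketch for pulling them out
with base R follows. It assumes the file is tab-delimited and that the
column header is the first line starting with GENEID, as in your sample.
describe_export is a made-up name, and the demonstration file contains
invented values in a shortened set of columns, only to show the output
shape.

```r
# Hypothetical helper (not part of the codelink package): split a text
# export into its header lines, its column names and its data table.
# Assumes a tab-delimited file whose column header starts with `first_col`.
describe_export <- function(file, first_col = "GENEID", sep = "\t") {
  lines <- readLines(file, warn = FALSE)
  start <- which(startsWith(lines, first_col))[1]
  if (is.na(start)) {
    stop("No line starting with '", first_col, "' in ", file)
  }
  list(
    header  = lines[seq_len(start - 1)],
    columns = strsplit(lines[start], sep, fixed = TRUE)[[1]],
    table   = read.delim(text = lines[start:length(lines)], sep = sep,
                         stringsAsFactors = FALSE)
  )
}

# Demonstration file: header lines as in the sample from this thread,
# followed by a shortened column header and one row of invented values.
demo <- tempfile(fileext = ".txt")
writeLines(c("PROJECT",
             "EXPERIMENT",
             "SAMPLE\tSample001",
             "",
             "GENEID\tSPOTMEAN\tBKGMEAN",
             "1\t100\t10"), demo)

info <- describe_export(demo)
print(info$header)
print(info$columns)
str(info$table)
```

This only inspects the file; it does not build the object that
readCodelink would return.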
Best,
Diego.
--
Diego Diez, Ph
Bioinformatics center,
Institute for Chemical Research,
Kyoto University.
Gokasho, Uji, Kyoto 611-0011 JAPAN
diez at kuicr.kyoto-u.ac.jp
>
> Thanks a lot,
> Lixia