[BioC] Read.maimages Error reading 'Genepix results file 2' version files

Matthew Ritchie mritchie at wehi.EDU.AU
Thu Aug 13 06:20:45 CEST 2009


Dear Krupa,

In the example file you sent me, there appears to be an extra tab at the
end of each line of intensity data, which is not present in the header row
(see below).  This mismatch (82 columns in the header versus 83 columns
thereafter) causes a problem in read.table().
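
One way to check for this kind of mismatch is to count the tabs on each
line. The following is only a minimal sketch: count_fields_by_tab() is a
hypothetical helper (it is not part of limma), and the toy lines stand in
for the real header and data rows. It counts bytes rather than characters
so that non-ASCII header text such as "R\xb2" should not trip it up.

```r
# Hypothetical helper (not part of limma): number of tab-separated fields
# on each line, counting a trailing tab as one extra (empty) field.
count_fields_by_tab <- function(lines) {
  # Remove every tab, then compare byte lengths before and after.
  without_tabs <- gsub("\t", "", lines, fixed = TRUE, useBytes = TRUE)
  nchar(lines, type = "bytes") - nchar(without_tabs, type = "bytes") + 1L
}

# Toy stand-ins for a header row and two data rows with a trailing tab.
toy <- c("\"Block\"\t\"Column\"\t\"Row\"",
         "1\t1\t1\t",
         "1\t2\t1\t")
n_fields <- count_fields_by_tab(toy)
print(n_fields)

# On the real file one might compare the header (line 24) with the first
# data row (line 25), assuming the same layout as in the transcript:
# lines <- readLines("slide 149.gpr")
# count_fields_by_tab(lines[24:25])
```

Note that strsplit() is avoided here on purpose, because it drops a
trailing empty field and so would hide the extra tab.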

If you remove the final tab from these lines, the code I emailed earlier
should work fine (I managed this in Excel by deleting the next few columns
that appear after the 'Normalize' column and saving the file; I'm sure
you could do this in R as well).
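
A rough sketch of how the same clean-up might look in R is given here.
It is an illustration only: strip_trailing_tab() is a hypothetical helper
(not a limma function), the output file name is made up, and I have only
reasoned through it on the toy lines shown, not on the real file.

```r
# Hypothetical helper (not part of limma): remove a single tab at the very
# end of each line, leaving lines without a trailing tab unchanged.
strip_trailing_tab <- function(lines) {
  # useBytes = TRUE avoids errors on non-ASCII bytes in the header row.
  sub("\t$", "", lines, useBytes = TRUE)
}

# Toy stand-ins: a header row (no trailing tab) and a data row (with one).
toy <- c("\"Flags\"\t\"Normalize\"", "100\t1\t")
cleaned <- strip_trailing_tab(toy)
print(cleaned)

# Applied to a file (assumed, hypothetical output name):
# lines <- readLines("slide 149.gpr")
# writeLines(strip_trailing_tab(lines), "slide 149 fixed.gpr", useBytes = TRUE)
```

This strips a trailing tab from every line, so it assumes that no row is
meant to end in an empty final field.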

Best wishes,

Matt

> RG = read.maimages("slide 149.gpr", source="genepix",
+                    columns=list(R="F633 Mean", G="F543 Mean",
+                                 Rb="B633 Median", Gb="B543 Median"))
Error in read.table(file = file, header = TRUE, col.names = allcnames,  :
  duplicate 'row.names' are not allowed
# first line of intensity data
> readLines("slide 149.gpr", n=30)[25]
[1] "1\t1\t1\tSpot Report 1 - Cab\tSpot Report 1 -
Cab\t2467\t6862\t80\t301\t501\t487\t203\t354\t350\t28\t13\t0\t2095\t2131\t561\t229\t243\t98\t100\t100\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t19.041\t6.383\t7.399\t27.002\t27.002\t0.869\t0.569\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t0.000\t39\t236\t1964\t2200\t4.251\t0.000\t0.000\t98\t1866\t0\t0\t298\t1902\t0\t0\t100\t1\t"
# header row
> readLines("slide 149.gpr", n=30)[24]
[1]
"\"Block\"\t\"Column\"\t\"Row\"\t\"Name\"\t\"ID\"\t\"X\"\t\"Y\"\t\"Dia.\"\t\"F543
Median\"\t\"F543 Mean\"\t\"F543 SD\"\t\"B543 Median\"\t\"B543
Mean\"\t\"B543 SD\"\t\"% > B543+1SD\"\t\"% > B543+2SD\"\t\"F543 %
Sat.\"\t\"F633 Median\"\t\"F633 Mean\"\t\"F633 SD\"\t\"B633
Median\"\t\"B633 Mean\"\t\"B633 SD\"\t\"% > B633+1SD\"\t\"% >
B633+2SD\"\t\"F633 % Sat.\"\t\"F3 Median\"\t\"F3 Mean\"\t\"F3 SD\"\t\"B3
Median\"\t\"B3 Mean\"\t\"B3 SD\"\t\"% > B3+1SD\"\t\"% > B3+2SD\"\t\"F3 %
Sat.\"\t\"F4 Median\"\t\"F4 Mean\"\t\"F4 SD\"\t\"B4 Median\"\t\"B4
Mean\"\t\"B4 SD\"\t\"% > B4+1SD\"\t\"% > B4+2SD\"\t\"F4 % Sat.\"\t\"Ratio
of Medians (633/543)\"\t\"Ratio of Means (633/543)\"\t\"Median of Ratios
(633/543)\"\t\"Mean of Ratios (633/543)\"\t\"Ratios SD (633/543)\"\t\"Rgn
Ratio (633/543)\"\t\"Rgn R\xb2 (633/543)\"\t\"Ratio of Medians
(Ratio/2)\"\t\"Ratio of Means (Ratio/2)\"\t\"Median of Ratios
(Ratio/2)\"\t\"Mean of Ratios (Ratio/2)\"\t\"Ratios SD (Ratio/2)\"\t\"Rgn
Ratio (Ratio/2)\"\t\"Rgn R\xb2 (Ratio/2)\"\t\"Ratio of Medians
(Ratio/3)\"\t\"Ratio of Means (Ratio/3)\"\t\"Median of Ratios
(Ratio/3)\"\t\"Mean of Ratios (Ratio/3)\"\t\"Ratios SD (Ratio/3)\"\t\"Rgn
Ratio (Ratio/3)\"\t\"Rgn R\xb2 (Ratio/3)\"\t\"F Pixels\"\t\"B
Pixels\"\t\"Sum of Medians\"\t\"Sum of Means\"\t\"Log Ratio
(633/543)\"\t\"Log Ratio (Ratio/2)\"\t\"Log Ratio (Ratio/3)\"\t\"F543
Median - B543\"\t\"F633 Median - B633\"\t\"F3 Median - B3\"\t\"F4 Median -
B4\"\t\"F543 Mean - B543\"\t\"F633 Mean - B633\"\t\"F3 Mean - B3\"\t\"F4
Mean - B4\"\t\"Flags\"\t\"Normalize\""

# After removing final tab
> RG = read.maimages("slide 149edit2.gpr", source="genepix",
+                    columns=list(R="F633 Mean", G="F543 Mean",
+                                 Rb="B633 Median", Gb="B543 Median"))
Read slide 149edit2.gpr


>    Re: [BioC] Read.maimages Error reading 'Genepix results file 2'
> version files
>  Hi Matt,
>
>  Thanks for your response. However ‘read.maimages’ is still
> unable to read my files. I have used the  command line shown below.
>  -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  > RG
>  > I have a problem in reading the Genepix files using 'read.maimages'
>  > function
>  > of limma package. My GPR files are in 'Genepix Results 2' format.
> However
>  > another set of example GPR files in 'Genepix Results 3' format can be
>  > read.
>  > Please find the details below. I will appreciate feedback. Thanks
> -Krupa
>  >
>  > --------------------------
>  > R-version:
>  >> sessionInfo()
>  > R version 2.9.1 (2009-06-26)
>  > i386-apple-darwin8.11.1
>  >
>  > locale:
>  > en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>  >
>  > attached base packages:
>  > [1] stats     graphics  grDevices utils     datasets  methods   base
>  >
>  > other attached packages:
>  > [1] limma_2.18.2
>  > -----------------------------
>  > Function: readTargets works fine.
>  >> Targets > show (Targets)
>  >   Slide.Number     FileName Cy3 Cy5
>  > 1          149 slide149.gpr 10A 30A
>  > 2          150 slide159.gpr 30A 10A
>  > 3          152 slide152.gpr 10B 30B
>  > 4          153 slide153.gpr 30B 10B
>  > 5          154 slide154.gpr 10C 30C
>  > 6          155 slide155.gpr 30C 10C
>  > 7          156 Slide156.gpr 10D 30D
>  > 8          157 Slide157.gpr 30D 10D
>  > --------------------------------------
>  > Function: read.maimages does not work
>  >> RG  Error in read.table(file = file, header = TRUE, col.names =
> allcnames,  :
>  >   duplicate 'row.names' are not allowed
>  > -----------------------------------------
>  >
>  > Example GPR file header
>  >
>  >  ATF 1
>  >  21 82
>  >  Type=GenePix Results 2
>  >  DateTime=2009/08/07 09:05:51
>  >  Settings=C. elegans oligo
>  >  GalFile=C:\scananal\Gal files for ScanArray\Celegans_oligo_GAL.gal
>  >  Scanner=Model: ScanArray Express HT Serial No.: 680078
>  >  Comment=Cyanine 3Cyanine 50,01,0
>  >  PixelSize=10
>  >  Wavelengths=543 nm 633 nm
>  >  ImageFiles=C:\scananal\Images\C. elegans oligo batch 26\Slide
>  > 149_Cyanine3_Celegans_oligo_1347.tif C:\scananal\Images\C. elegans
> oligo
>  > batch 26\Slide 149_Cyanine5_Celegans_oligo_1347.tif
>  >  PMTGain=80 62
>  >  NormalizationMethod=LOWESS
>  >  NormalizationFactors=0.000 0.000
>  >  JpegImage=
>  >  RatioFormulations=W2/W1(633/543)
>  >  Barcode=
>  >  ImageOrigin=0 4481
>  >  JpegOrigin=0 0
>  >  Creator=ScanArray Express, Microarray Analysis System 3.0.0.16
>  >  Temperature=0.0
>  >  LaserPower=90 90 0 0
>  >  LaserOnTime=0 0 0 0
>  >  Block Column Row Name ID X Y Dia. F543 Median F543 Mean F543 SD B543
>  > Median
>  > B543 Mean B543 SD % > B543+1SD % > B543+2SD F543 % Sat. F633
> Median F633
>  > Mean F633 SD B633 Median B633 Mean B633 SD % > B633+1SD % >
> B633+2SD F633
>  > %
>  > Sat. F3 Median F3 Mean F3 SD B3 Median B3 Mean B3 SD % > B3+1SD % >
> B3+2SD
>  > F3 % Sat. F4 Median F4 Mean F4 SD B4 Median B4 Mean B4 SD % >
> B4+1SD % >
>  > B4+2SD F4 % Sat. Ratio of Medians (633/543) Ratio of Means
> (633/543)
>  > Median
>  > of Ratios (633/543) Mean of Ratios (633/543) Ratios SD (633/543) Rgn
> Ratio
>  > (633/543) Rgn R² (633/543) Ratio of Medians (Ratio/2) Ratio of
> Means
>  > (Ratio/2) Median of Ratios (Ratio/2) Mean of Ratios (Ratio/2) Ratios SD
>  > (Ratio/2) Rgn Ratio (Ratio/2) Rgn R² (Ratio/2) Ratio of Medians
> (Ratio/3)
>  > Ratio of Means (Ratio/3) Median of Ratios (Ratio/3) Mean of Ratios
>  > (Ratio/3)
>  > Ratios SD (Ratio/3) Rgn Ratio (Ratio/3) Rgn R² (Ratio/3) F Pixels
> B Pixels
>  > Sum of Medians Sum of Means Log Ratio (633/543) Log Ratio (Ratio/2) Log
>  > Ratio (Ratio/3) F543 Median - B543 F633 Median - B633 F3 Median - B3 F4
>  > Median - B4 F543 Mean - B543 F633 Mean - B633 F3 Mean - B3 F4 Mean - B4
>  > Flags Normalize
>  >  1 1 1 Spot Report 1 - Cab Spot Report 1 - Cab 2467 6862 80 301 501 487
>  > 203
>  > 354 350 28 13 0 2095 2131 561 229 243 98 100 100 0 0 0 0 0 0 0 0 0 0 0
> 0 0
>  > 0
>  > 0 0 0 0 0 19.041 6.383 7.399 27.002 27.002 0.869 0.569 0 0 0 0 0 0 0 0
> 0 0
>  > 0
>  > 0 0 0 39 236 1964 2200 4.251 0 0 98 1866 0 0 298 1902 0 0 100 1
>  >  1 2 1 C25A1.8 cea2.c.00914 2678 6848 100 10420 10445 2870 197 347 317
> 100
>  > 100 0 9139 9172 2403 245 288 145 100 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0
>  > 0
>  > 0 0.87 0.871 0.87 0.903 0.903 0.638 0.58 0 0 0 0 0 0 0 0 0 0 0 0 0 0 64
>  > 180
>  > 19117 19175 -0.201 0 0 10223 8894 0 0 10248 8927 0 0 100 1
>  >  1 3 1 F21F3.6 cea2.c.02677 2877 6855 100 13934 14087 4202 258 393 368
> 100
>  > 100 0 20164 20572 5319 275 323 171 100 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0
>  > 0
>  > 0 0 1.454 1.468 1.46 1.663 1.663 1.023 0.653 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0
>  > 62
>  > 180 33565 34126 0.54 0 0 13676 19889 0 0 13829 20297 0 0 100 1
>  > ---------------------------------------------------------------------------
>  >
>  > However another set of gpr files (from Limma examples) can be read
> using
>  > 'read.maimages' function. Header of the example file that works.
>  >
>  >
>  >  ATF 1
>  >  29 52
>  >  Type=GenePix Results 3
>  >  DateTime=2003/08/07 10:08:35
>  >  Settings=E:\Madeleine's Data\Microarray\Human\GPR\6Hs.166.gps
>  >  GalFile=E:\Madeleine's Data\Microarray\Gal File\6Hs.gal
>  >  PixelSize=10
>  >  Wavelengths=635 532
>  >  ImageFiles=E:\Madeleine's Data\Microarray\Human\SCANS\6Hs.166.tif 2
>  > E:\Madeleine's Data\Microarray\Human\SCANS\6Hs.166.tif 3
>  >  NormalizationMethod=None
>  >  NormalizationFactors=1 1
>  >  JpegImage=E:\Madeleine's Data\Microarray\Human\GPR\6Hs.166.jpg
>  >  StdDev=Type 1
>  >  RatioFormulations=W1/W2 (635/532)
>  >  FeatureType=Circular
>  >  Barcode=
>  >  BackgroundSubtraction=LocalFeature
>  >  ImageOrigin=1360, 14520
>  >  JpegOrigin=0, 0
>  >  Creator=GenePix Pro 5.0.1.2
>  >  Scanner=GenePix 4000B [83306]
>  >  FocusPosition=0
>  >  Temperature=21.44
>  >  LinesAveraged=2
>  >  Comment=
>  >  PMTGain=760 570
>  >  ScanPower=100 100
>  >  LaserPower=3.25 3.54
>  >  Filters=
>  >  ScanRegion=136,1452,1924,6860
>  >  Supplier=
>  >  Block Column Row Name ID X Y Dia. F635 Median F635 Mean F635 SD B635
> B635
>  > Median B635 Mean B635 SD % > B635+1SD % > B635+2SD F635 % Sat.
> F532 Median
>  > F532 Mean F532 SD B532 B532 Median B532 Mean B532 SD % > B532+1SD %
> >
>  > B532+2SD F532 % Sat. Ratio of Medians (635/532) Ratio of Means
> (635/532)
>  > Median of Ratios (635/532) Mean of Ratios (635/532) Ratios SD (635/532)
>  > Rgn
>  > Ratio (635/532) Rgn R² (635/532) F Pixels B Pixels Circularity
> Sum of
>  > Medians (635/532) Sum of Means (635/532) Log Ratio (635/532) F635
> Median -
>  > B635 F532 Median - B532 F635 Mean - B635 F532 Mean - B532 F635 Total
>  > Intensity F532 Total Intensity SNR 635 SNR 532 Flags Normalize Autoflag
>  >  1 1 1 OVGP1 - Oviductal glycoprotein 1, 120kD (mucin 9, oviductin)
>  > H200000297 1700 14710 120 139 154 70 137 137 148 66 18 6 0 189 201 66
> 191
>  > 191 194 46 18 11 0 -1 1.7 0.975 1.17 3.532 2.323 0.061 120 734 100 0 27
>  > Error 2 -2 17 10 18463 24089 0.091 0.152 -50 0 0
>  >  1 2 1 TAF1 - TAF1 RNA polymerase II, TATA box binding protein
>  > (TBP)-associated factor, 250 kD H200000303 1880 14710 70 181 207 109
> 142
>  > 142
>  > 149 57 37 21 0 255 260 60 194 194 196 42 65 34 0 0.639 0.985 0.886 0.86
>  > 3.551 4.153 0.031 32 236 100 100 131 -0.645 39 61 65 66 6623 8328 1.018
>  > 1.524 0 0 0
>  >
>  > Thank you.
>  >
>  > Krupa
>
>
>  --
>
>  Dr. Krupa Deshmukh
>  Post-doctoral Research Associate
>  Dr. Kerry Kornfeld's Laboratory
>  Washington University School of Medicine
>  Department of Developmental Biology
>  660 South Euclid Avenue, Campus Box 8103
>  St. Louis, MO 63110
>  Phone: +1 314-747-2004
>  E-mail:kdeshmukh at wustl.edu
>
>   The materials in this message are private and may contain Protected
> Healthcare Information or other information of a sensitive nature. If
> you are not the intended recipient, be advised that any unauthorized
> use, disclosure,  copying or the taking of any action in reliance on the
> contents of this information is strictly prohibited. If you have
> received this email in error, please immediately notify the sender via
> telephone or return mail.
>


