[BioC] FeatureExpressionSet using list.files() in place of read.xysfiles()

Benilton Carvalho beniltoncarvalho at gmail.com
Thu Jun 20 17:29:43 CEST 2013


I'll reply on your initial thread as it seems more appropriate.

b

2013/6/20 Johnson, Franklin Theodore <franklin.johnson at email.wsu.edu>:
> Dr. Carvalho,
> Sorry to get back to you first.
> According to previous posts, http://comments.gmane.org/gmane.science.biology.informatics.conductor/35367
> I think the XYS file error above may be related to the pdInfoBuilder annotation package, pd.pdinfo.gpl11164.ndf.txt.
>
> I went back to build the package again using the XYS file.
> I get the [[SIGNAL]] error even though PROBE_ID <-> SEQ_ID were swapped in the NDF file, as per the previous email in this thread.
> Perhaps this is because XYS is sorted 1,1; 1,2; 2,1; 2,2; 3,1; 3,2....but the NDF is not?
> ======================================================================================================================
> Building annotation package for Nimblegen Expression Array
> NDF: GPL11164.ndf
> XYS: 01.xys
> ======================================================================================================================
> Parsing file: GPL11164.ndf... OK
> Parsing file: 01.xys... OK
> Merging NDF and XYS files... OK
> Preparing contents for featureSet table... OK
> Preparing contents for bgfeature table... OK
> Preparing contents for pmfeature table... OK
> Error in parseNgsPair(object@ndfFile, object@xysFile, verbose = !quiet) :
>   Control probe possibly identified as Experimental
> In addition: Warning message:
> In is.na(ndfdata[["SIGNAL"]]) :
>   is.na() applied to non-(list or vector) of type 'NULL'
>
> As I take a closer look at the ndf file, I do see control probes in the ndf file that are not present in the merged PAIR-XYS files. These are held in CONTAINER= 'NGS_CONTROLS' with PROBE_ID = XENOSYNTH0040, MISMATCH=10007.
> However, !is.na()=TRUE throughout the NDF file. Here is another line representing the CONTROLS:
> PROBE_DESIGN_ID CONTAINER DESIGN_NOTE SELECTION_CRITERIA SEQ_ID PROBE_SEQUENCE MISMATCH MATCH_INDEX FEATURE_ID ROW_NUM COL_NUM PROBE_CLASS PROBE_ID POSITION DESIGN_ID X Y
> 7552_0712_1000 NGS_CONTROLS EMPTY dark EMPTY N 0 62990761 62990761 1000 712 control EMPTY 0 7552 712 1000
>
> I will look at previous threads that may have mentioned just removing these CONTROLS from the NDF file.
> In addition to CONTROLS, there are BLOCK1 and RANDOM types as well. These additional two types can be found in the PAIR file.
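A minimal sketch of how the comparison described above could be checked, using made-up toy rows rather than the real GPL11164 files. The column names CONTAINER, PROBE_ID, X and Y are taken from the NDF header quoted above; the SIGNAL column name in the XYS table and the helper `coord_key` are assumptions for illustration. Matching on the (X, Y) pair does not depend on how either file is sorted, so this only shows which NDF rows have no counterpart in the XYS; it does not show whether removing them resolves the error.

```r
# Toy NDF rows; column names follow the NDF header quoted in this thread.
ndf <- data.frame(
  CONTAINER = c("Duplicate_1", "Duplicate_2", "NGS_CONTROLS", "RANDOM"),
  PROBE_ID  = c("Contig1_a", "Contig2_a", "EMPTY", "Random_1"),
  X = c(343L, 345L, 712L, 347L),
  Y = c(9L, 9L, 1000L, 9L),
  stringsAsFactors = FALSE
)

# Toy XYS rows, deliberately in a different order than the NDF.
# The SIGNAL column name is an assumption.
xys <- data.frame(X = c(347L, 345L, 343L),
                  Y = c(9L, 9L, 9L),
                  SIGNAL = c(120, 95, 300))

# How many probes does each container hold?
container_counts <- table(ndf$CONTAINER)
print(container_counts)

# Match on the (X, Y) pair; %in% does not care about row order.
# coord_key is a hypothetical helper, not part of oligo or pdInfoBuilder.
coord_key <- function(d) paste(d$X, d$Y, sep = "_")
missing_in_xys <- ndf[!(coord_key(ndf) %in% coord_key(xys)), ]
print(missing_in_xys)
```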
>
> Hope to hear from you soon.
> Bittersweet!
> Franklin
>
> Great minds discuss ideas. Average minds discuss events. Small minds discuss people. -Eleanor Roosevelt
>
> ________________________________________
> From: Johnson, Franklin Theodore
> Sent: Wednesday, June 19, 2013 11:25 PM
> To: Benilton Carvalho
> Cc: bioconductor at r-project.org
> Subject: RE: FeatureExpressionSet using list.files() in place of read.xysfiles()
>
> Dr. Carvalho,
> Sorry that was my fault.
> Yea. I tried that way at first. But got the message below.
>> rawData=read.xysfiles(filelist, pkgname="pd.pdinfo.gpl11164.ndf.txt")
> All the XYS files must be of the same type.
> Error: checkChipTypes(filenames, verbose, "nimblegen") is not TRUE
> The logical for check.names is giving the error.
>
> Perhaps one of my files is incorrect, which I checked and will triple check...
> I tried check.names=FALSE as the error seemed to be a logical on the platform type:
>> rawData=read.xysfiles(filelist, pkgname="pd.pdinfo.gpl11164.ndf.txt", check.names=F)
> Error: These do not exist:
>          FALSE
>
> The XYS files did not work either and gave the same result as xys.txt.
>> xys.files=list.xysfiles(getwd(), full.names=TRUE)
>> xys.files
>  [1] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/01.xys"
>  [2] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/02.xys"
>  [3] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/03.xys"
>  [4] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/04.xys"
>  [5] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/05.xys"
>  [6] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/06.xys"
>  [7] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/07.xys"
>  [8] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/08.xys"
>  [9] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/09.xys"
> [10] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/10.xys"
> [11] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/11.xys"
> [12] "C:/Users/ZHUGRP/Desktop/New folder/XYS.xys files/12.xys"
>> basename(xys.files)
>  [1] "01.xys" "02.xys" "03.xys" "04.xys" "05.xys" "06.xys" "07.xys" "08.xys" "09.xys" "10.xys" "11.xys" "12.xys"
>> ripe=read.xysfiles(xys.files, phenoData=pd)
> All the XYS files must be of the same type.
> Error: checkChipTypes(filenames, verbose, "nimblegen") is not TRUE
>
> All XYS files are approximately the same size. So, I see no error with making the XYS files.
> Given the error, again, it seems like a logical argument on the check.names for the platform type.
> What do you think?
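One thing that could be checked by hand is sketched below. It assumes that the "type" being compared is a design name stored in the first line of each XYS file, and that this line looks like `# key=value key=value ...` with a `designname` key. Both the header layout and the helper `read_design_name` are assumptions for illustration, not a description of what checkChipTypes does internally. The two temporary files are toy data.

```r
# Hypothetical helper: pull 'designname' out of a '# key=value ...' header line.
read_design_name <- function(path) {
  header <- readLines(path, n = 1)
  if (length(header) == 0) return(NA_character_)
  fields <- strsplit(sub("^#\\s*", "", header), "\\s+")[[1]]
  kv <- strsplit(fields, "=", fixed = TRUE)
  keys <- vapply(kv, `[`, character(1), 1)
  vals <- vapply(kv, function(p) if (length(p) > 1) p[2] else NA_character_,
                 character(1))
  unname(vals[which(keys == "designname")][1])
}

# Toy file 1: has the assumed comment header.
f1 <- tempfile(fileext = ".xys")
writeLines(c("# software=demo designname=GPL11164",
             "X\tY\tSIGNAL\tCOUNT",
             "1\t1\t100\t1"), f1)

# Toy file 2: a plain tab-delimited table with no comment header.
f2 <- tempfile(fileext = ".xys")
writeLines(c("X\tY\tPM",
             "1\t1\t100"), f2)

designs <- vapply(c(f1, f2), read_design_name, character(1), USE.NAMES = FALSE)
print(designs)

# TRUE only if every file reports the same, non-missing design name.
all_same <- !anyNA(designs) && length(unique(designs)) == 1
print(all_same)
```

If files converted by hand lack such a header line, a result like the second one here would be worth ruling out before looking at check.names.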
> Regards,
> Franklin
>
> ________________________________________
> From: Benilton Carvalho [beniltoncarvalho at gmail.com]
> Sent: Wednesday, June 19, 2013 8:53 PM
> To: Johnson, Franklin Theodore
> Cc: bioconductor at r-project.org
> Subject: Re: FeatureExpressionSet using list.files() in place of read.xysfiles()
>
> Dear Franklin,
>
> I'm not sure I follow your message... my most sincere apologies...
>
> Now that you have your XYS files, I was expecting you to simply use:
>
> library(oligo)
> rawData = read.xysfiles(filelist, pkgname='pd.pdinfo.gpl11164.ndf.txt')
>
> doesn't this work for you? (the 'filelist' variable above contains the
> names of your converted XYS files)
>
> Let me know what your findings are...
>
> b
>
> 2013/6/19 Johnson, Franklin Theodore <franklin.johnson at email.wsu.edu>:
>> Dear Dr. Carvalho,
>>
>> Thanks for the reply.
>> I saw the thread of FAQs how to read in the annotation package made using pdInfoBuilder.
>> For anyone having issues, it seems as straightforward as:
>> #install pdinfo.gpl11164.ndf.txt
>> install.packages("pd.pdinfo.gpl11164.ndf.txt", type="source", repos=NULL)
>> Installing package into ‘C:/Users/ZHUGRP/Documents/R/win-library/3.0’
>> (as ‘lib’ is unspecified)
>> * installing *source* package 'pd.pdinfo.gpl11164.ndf.txt' ...
>> ** R
>> ** data
>> ** inst
>> ** preparing package for lazy loading
>> ** help
>> *** installing help indices
>> ** building package indices
>> ** testing if installed package can be loaded
>> *** arch - i386
>> *** arch - x64
>> * DONE (pd.pdinfo.gpl11164.ndf.txt)
>> ############################################################################################################
>> I am currently trying to make the FeatureExpressionSet from my converted PAIR -> XYS.txt files, which unfortunately contain X/Y/S only.
>> NimbleScan expected .tiff files to read into the software. These files were not available from NCBI/GEO. NimbleGen also did not respond to my inquiry about obtaining XYS files from the available PAIR files. Using R, I'm testing 12 of the 24 tab-delimited XYS files, which also tests the annotation package made using pdInfoBuilder.
>> #read in files from wd()
>> filelist=list.files(pattern=".*.txt")
>>> filelist
>>  [1] "GSM01.txt" "GSM02.txt" "GSM03.txt" "GSM04.txt" "GSM05.txt" "GSM06.txt" "GSM07.txt" "GSM08.txt" "GSM09.txt" "GSM10.txt" "GSM11.txt" "GSM12.txt"
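A side note on the `pattern` argument used above: list.files() treats it as a regular expression, so the unescaped dots in ".*.txt" match any character. A small sketch with made-up file names (the last two are hypothetical) shows the difference from an anchored pattern:

```r
# Made-up file names; only the first two are real .txt files.
candidates <- c("GSM01.txt", "GSM02.txt", "mytxt_notes.csv", "GSM01.txt.bak")

# Pattern from the call above: '.' matches any character, nothing is anchored.
loose <- grepl(".*.txt", candidates)

# Escaped dot plus end-of-string anchor: only names ending in ".txt".
strict <- grepl("\\.txt$", candidates)

print(data.frame(candidates, loose, strict))
```

With only GSM*.txt files in the working directory both patterns give the same listing, so this does not explain the errors in this thread; it only matters once other files sit in the same folder.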
>> #read in each data file in filelist as a matrix to make EFS object
>>> datalist=lapply(filelist, function(x)as.matrix(read.table(x, header=T, sep="\t", as.is=T)))
>> #construct phenoData frame
>>> theData=data.frame(Key=rep(c("Week0","Week-2","Week-4"), each=4))
>>> rownames(theData)=basename(filelist)
>>> pd=new("AnnotatedDataFrame", data=theData)
>> ....
>> However, I fail the EFS construction:
>> hardline=new("ExpressionFeatureSet", datalist, phenoData=pd, annotation=library(pd.pdinfo.gpl11164.ndf.txt))
>> Error in .names_found_unique(names(value), names(object)) :
>>   'sampleNames' replacement list must have unique named elements corresponding to assayData element names
>> To confirm,
>>> sampleNames(datalist)
>> [1] "X"  "Y"  "PM"
>> So, it seems EFS is expecting unique sampleNames for each file in filelist?
>> How to read in multiple files into an efs object, as is done with read.xysfiles? Is this doable?
>>
>> Is it necessary to execute datalist=lapply(filelist, function(x)as.matrix(read.table(x, header=T, sep="\t", as.is=T))) surrounded with Booleans to make the object TRUE, per se?
>> i.e. (datalist=lapply(filelist, function(x)as.matrix(read.table(x, header=T, sep="\t", as.is=T))) )
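On the two questions above, a minimal sketch with toy data. Wrapping an assignment in parentheses only prints the assigned value; it does not change the object or make anything TRUE. The sampleNames output "X" "Y" "PM" suggests the list elements are being read as per-file tables rather than as samples, so one option is to collect one intensity column per file into a single matrix whose column names are the file names. Whether such a matrix can be handed directly to new("ExpressionFeatureSet", ...) is not verified here; read.xysfiles() remains the route suggested in this thread. The PM column name comes from the output above; the values are invented.

```r
filelist <- c("GSM01.txt", "GSM02.txt", "GSM03.txt")

# Stand-in for lapply(filelist, read.table...): one X/Y/PM matrix per file.
datalist <- lapply(seq_along(filelist), function(i) {
  cbind(X = c(1, 2, 1), Y = c(1, 1, 2), PM = c(100, 200, 300) + i)
})

# The columns can only be bound side by side if every file lists
# the same features in the same order.
same_layout <- all(vapply(datalist, function(m) {
  identical(m[, c("X", "Y")], datalist[[1]][, c("X", "Y")])
}, logical(1)))
stopifnot(same_layout)

# One column per sample, named after the file it came from.
intensities <- sapply(datalist, function(m) m[, "PM"])
colnames(intensities) <- basename(filelist)

# Parentheses around an assignment just print the assigned value.
(dims <- dim(intensities))
```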
>> Best Regards,
>> Franklin
>>
>> ________________________________________
>> From: Benilton Carvalho [beniltoncarvalho at gmail.com]
>> Sent: Thursday, June 13, 2013 4:43 PM
>> To: Johnson, Franklin Theodore
>> Cc: bioconductor at r-project.org
>> Subject: Re: [BioC] PAIR files -- feature set table
>>
>> Don't worry about that particular warning... just install the package
>> and try to read your XYS files.
>>
>> 2013/6/13 Johnson, Franklin Theodore <franklin.johnson at email.wsu.edu>:
>>> Dr. Carvalho,
>>>
>>> Yes. I see what you mean.
>>> Switching the columns helped: loading the featureSet table now inserted more
>>> than 2 rows:
>>>
>>> Inserting 198661 rows into table featureSet... OK
>>> However, the warning message did print again.
>>>
>>>
>>> Warning message:
>>> In is.na(ndfdata[["SIGNAL"]]) :
>>>   is.na() applied to non-(list or vector) of type 'NULL'
>>>
>>> Below is the output + sessionInfo(), as I upgraded to R 3.0.1.
>>>
>>> #Begin R command line code:
>>>
>>>> makePdInfoPackage(arrays, destDir = getwd(), unlink=TRUE)
>>> ==============================================================================================================================================================
>>>
>>>
>>> Building annotation package for Nimblegen Expression Array
>>> NDF: pdinfo_GPL11164.ndf.txt <-new .ndf file with PROBE_ID<->SEQ_ID
>>> XYS: XYS.txt
>>> ==============================================================================================================================================================
>>> Parsing file: pdinfo_GPL11164.ndf.txt... OK
>>>
>>> Parsing file: XYS.txt... OK
>>> Merging NDF and XYS files... OK
>>> Preparing contents for featureSet table... OK
>>> Preparing contents for bgfeature table... OK
>>> Preparing contents for pmfeature table... OK
>>> Creating package in E:/RANDOM/Test/Yanmin's Microarray Paper/Yanmin
>>> Microarray RAW/pd.pdinfo.gpl11164.ndf.txt
>>> Inserting 198661 rows into table featureSet... OK
>>> Inserting 770599 rows into table pmfeature... OK
>>>
>>> Counting rows in featureSet
>>> Counting rows in pmfeature
>>> Creating index idx_pmfsetid on pmfeature... OK
>>> Creating index idx_pmfid on pmfeature... OK
>>> Creating index idx_fsfsetid on featureSet... OK
>>> Saving DataFrame object for PM.
>>> Done.
>>> Warning message:
>>> In is.na(ndfdata[["SIGNAL"]]) :
>>>   is.na() applied to non-(list or vector) of type 'NULL'
>>>
>>>
>>>> sessionInfo()
>>> R version 3.0.1 (2013-05-16)
>>> Platform: i386-w64-mingw32/i386 (32-bit)
>>>
>>> locale:
>>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
>>> States.1252    LC_MONETARY=English_United States.1252
>>> [4] LC_NUMERIC=C                           LC_TIME=English_United
>>> States.1252
>>>
>>> attached base packages:
>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>> base
>>>
>>> other attached packages:
>>> [1] pdInfoBuilder_1.24.0 oligo_1.24.0         oligoClasses_1.22.0
>>> affxparser_1.32.1    RSQLite_0.11.4       DBI_0.2-7
>>> Biobase_2.20.0
>>> [8] BiocGenerics_0.6.0   BiocInstaller_1.10.2
>>>
>>> loaded via a namespace (and not attached):
>>>  [1] affyio_1.28.0         Biostrings_2.28.0     bit_1.1-10
>>> codetools_0.2-8       ff_2.2-11             foreach_1.4.1
>>> GenomicRanges_1.12.4
>>>  [8] IRanges_1.18.1        iterators_1.0.6       preprocessCore_1.22.0
>>> splines_3.0.1         stats4_3.0.1          tools_3.0.1
>>> zlibbioc_1.6.0
>>>
>>>
>>>
>>>>q()
>>>
>>>
>>>
>>> The built pdInfopackage loaded in Destdir is identical to previous message.
>>>
>>> However the featureSet table now has more than 2 rows...
>>>
>>> Lastly, I did multiple combos, as my merged file has (X.x, Y.x)<-seems to be
>>> identifiers for the 'probe IDs' on the array as well as (X.y, Y.y) <- seems
>>> to be the sequence identifiers for the "SEQ_ID". I used X.x, Y.x and PM
>>> which gave the result I pasted above. All others had errors. I'm close, but
>>> that Warning Message is annoying...
>>>
>>>
>>>
>>> Regards,
>>>
>>> Franklin
>>>
>>>
>>> ________________________________________
>>> From: Benilton Carvalho [beniltoncarvalho at gmail.com]
>>> Sent: Wednesday, June 12, 2013 8:25 PM
>>>
>>> To: Johnson, Franklin Theodore
>>> Cc: bioconductor at r-project.org
>>> Subject: Re: [BioC] PAIR files -- feature set table
>>>
>>> That does not look ok.
>>>
>>> The problem is the count for the featureSet table... This table stores
>>> the information for "genes" (or whatever the target for this
>>> particular array is)... so, it is unlikely that you have a microarray
>>> with only 2 "target units"... I'd expect something around the
>>> thousands...
>>>
>>> pdInfoBuilder uses the information in SEQ_ID (in the NDF) to get the
>>> target information (i.e., the contents for featureSet).
>>>
>>> Given that this is a custom array, I believe that the best idea is to
>>> contact the person who designed it and ask more details about the
>>> design (in particular, how many probesets and average number of probes
>>> per probeset)...
>>>
>>> I've seen some designs in which the information that was expected to
>>> be in SEQ_ID was actually stored in PROBE_ID (in such cases, the user
>>> needs to create a backup copy of the NDF, and then move the contents
>>> of PROBE_ID to SEQ_ID - and vice-versa).
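The backup-and-swap step described above might look like the following; a minimal sketch on a toy tab-delimited file written to a temporary location, not the real NDF. The toy values and the ".bak" suffix are assumptions; only the PROBE_ID and SEQ_ID column names come from this thread.

```r
# Toy NDF in which the target names sit in PROBE_ID instead of SEQ_ID.
ndf_file <- tempfile(fileext = ".ndf")
writeLines(c("PROBE_ID\tSEQ_ID\tX\tY",
             "GENE1\tprobe_0001\t1\t1",
             "GENE1\tprobe_0002\t2\t1",
             "GENE2\tprobe_0003\t3\t1"), ndf_file)

# Keep an untouched copy before changing anything.
backup <- paste0(ndf_file, ".bak")
file.copy(ndf_file, backup)

ndf <- read.delim(ndf_file, stringsAsFactors = FALSE)

# Swap the contents of the two columns; the column names stay where they are.
ndf[c("PROBE_ID", "SEQ_ID")] <- ndf[c("SEQ_ID", "PROBE_ID")]

write.table(ndf, ndf_file, sep = "\t", quote = FALSE, row.names = FALSE)

# After the swap, SEQ_ID should hold one value per target, not per probe.
swapped <- read.delim(ndf_file, stringsAsFactors = FALSE)
n_targets <- length(unique(swapped$SEQ_ID))
print(n_targets)
```

A quick sanity check after the swap is that the number of distinct SEQ_ID values is in the range expected for the number of targets on the array.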
>>>
>>> b
>>>
>>> 2013/6/12 Johnson, Franklin Theodore <franklin.johnson at email.wsu.edu>:
>>>> Dear Dr. Carvalho,
>>>>
>>>> Recently, we had correspondence regarding makePdInfoPackage for a NimbleGen
>>>> apple microarray.
>>>> I was able to merge the ndf design and XYS files using PROBE_ID.
>>>> As a reminder this is a custom array, and there are no SIGNAL==NAs for
>>>> control probes.
>>>> It seemed to work:
>>>>> makePdInfoPackage(seed, destDir(""))
>>>>
>>>> ============================================================================================================================================================
>>>> Building annotation package for Nimblegen Expression Array
>>>> NDF: GPL11164.ndf
>>>> XYS: XYS.txt
>>>>
>>>> ============================================================================================================================================================
>>>> Parsing file: GPL11164.ndf... OK
>>>> Parsing file: XYS.txt... OK
>>>> Merging NDF and XYS files... OK
>>>> Preparing contents for featureSet table... OK
>>>> Preparing contents for bgfeature table... OK
>>>> Preparing contents for pmfeature table... OK
>>>> Creating package in
>>>> C:/Users/franklin.johnson.PW50-WEN/Desktop/Test/Yanmin's Microarray
>>>> Paper/Yanmin Microarray RAW/pd.gpl11164
>>>> Inserting 2 rows into table featureSet... OK
>>>> Inserting 765524 rows into table pmfeature... OK
>>>> Inserting 5075 rows into table bgfeature... OK
>>>> Counting rows in bgfeature
>>>> Counting rows in featureSet
>>>> Counting rows in pmfeature
>>>> Creating index idx_bgfsetid on bgfeature... OK
>>>> Creating index idx_bgfid on bgfeature... OK
>>>> Creating index idx_pmfsetid on pmfeature... OK
>>>> Creating index idx_pmfid on pmfeature... OK
>>>> Creating index idx_fsfsetid on featureSet... OK
>>>> Saving DataFrame object for PM.
>>>> Saving DataFrame object for BG.
>>>> Done.
>>>> Warning message:
>>>> In is.na(ndfdata[["SIGNAL"]]) :
>>>> is.na() applied to non-(list or vector) of type 'NULL'
>>>>>
>>>>
>>>> Despite this warning message, I see a pdinfopackage directory with
>>>> 4 subdirectories: c("data", "inst", "man", "R"), as well as
>>>> sub-subdirectories in "inst": c("extdata", "Unit Tests"), in addition to
>>>> two text files in the main directory, c("DESCRIPTION", "NAMESPACE"), all
>>>> created in my destination folder.
>>>> Before using "oligo", if possible, I wanted to confirm with you that this
>>>> package is viable to use with "oligo" although a warning message that may
>>>> not pertain to my custom designed microarray was printed.
>>>>
>>>> Regards,
>>>> Franklin
>>>>
>>>> ________________________________________
>>>> From: Johnson, Franklin Theodore
>>>> Sent: Friday, June 07, 2013 10:39 AM
>>>> To: Benilton Carvalho
>>>> Cc: bioconductor at r-project.org
>>>> Subject: RE: [BioC] PAIR files -- feature set table
>>>>
>>>> Resending to bioconductor message thread:
>>>>
>>>> Dear Dr. Carvalho,
>>>> Thanks for the response.
>>>> As you suggested, I will look into the merge function using "Probe_ID".
>>>> After reading in the data, I will start here: merge.datasets(dataset1,
>>>> dataset2, by="key").
>>>> Best Regards,
>>>> Franklin
>>>>
>>>>
>>>> ________________________________________
>>>> From: Benilton Carvalho [beniltoncarvalho at gmail.com]
>>>> Sent: Thursday, June 06, 2013 8:11 PM
>>>> To: Johnson, Franklin Theodore
>>>> Cc: bioconductor at r-project.org; franklin.johnson at wsu.edu
>>>> Subject: Re: [BioC] PAIR files -- feature set table
>>>>
>>>> You will need to merge the PAIR and the NDF using the PROBE_ID column
>>>> as key. This will allow you to get the X/Y coordinates needed to
>>>> create the XYS as described on the other messages.
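The merge described above could be sketched as follows, on made-up toy data. PROBE_ID, X and Y are the NDF column names shown in this thread; the PAIR signal column name (PM), the output columns SIGNAL and COUNT, and the constant COUNT value are assumptions about the XYS layout that should be checked against a real XYS file before use.

```r
# Toy NDF: probe layout (column names as in the NDF shown in this thread).
ndf <- data.frame(
  PROBE_ID = c("probeA", "probeB", "probeC"),
  X = c(343L, 345L, 347L),
  Y = c(9L, 9L, 9L),
  stringsAsFactors = FALSE
)

# Toy PAIR: probe id plus signal, in a different row order than the NDF.
# 'PM' as the signal column name is an assumption.
pair <- data.frame(
  PROBE_ID = c("probeC", "probeA", "probeB"),
  PM = c(1200.5, 310.0, 98.25),
  stringsAsFactors = FALSE
)

# PROBE_ID is the key, so the row order of either table does not matter.
merged <- merge(ndf, pair, by = "PROBE_ID")

# Assumed XYS layout: X, Y, SIGNAL, COUNT.
xys <- data.frame(X = merged$X, Y = merged$Y,
                  SIGNAL = merged$PM, COUNT = 1L)
xys <- xys[order(xys$Y, xys$X), ]
rownames(xys) <- NULL
print(xys)
```

Note that merge() silently drops probes present in only one of the two tables; comparing nrow(merged) with nrow(pair) is a cheap way to notice that.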
>>>>
>>>> Regarding annotation, you may need to contact NimbleGen to request
>>>> this information directly from them...
>>>>
>>>> benilton
>>>>
>>>> 2013/6/6 Johnson, Franklin Theodore <franklin.johnson at email.wsu.edu>:
>>>>> Dear Dr. Carvalho,
>>>>>
>>>>> Many thanks for the reply.
>>>>>
>>>>> Actually, this is what my .ndf file looks like:
>>>>>> head(ndf)
>>>>>   PROBE_DESIGN_ID   CONTAINER DESIGN_NOTE SELECTION_CRITERIA SEQ_ID
>>>>> 1  7552_0343_0009 Duplicate_1
>>>>> 2  7552_0345_0009 Duplicate_2
>>>>> 3  7552_0347_0009 Duplicate_1
>>>>> 4  7552_0349_0009 Duplicate_2
>>>>> 5  7552_0351_0009 Duplicate_2
>>>>> 6  7552_0353_0009 Duplicate_1
>>>>>                                                PROBE_SEQUENCE MISMATCH MATCH_INDEX FEATURE_ID ROW_NUM COL_NUM PROBE_CLASS
>>>>> 1  cttgactcttctaagttcaaaggtaactcaagtgaagctgtcagatatgatccttcca        0    64535488   64535488       9     343
>>>>> 2 cccaagcattaaaccttactcatatacttataatgcagccatcaagagtttgtgcaagg        0    64799310   64799310       9     345
>>>>> 3          agggaggctgaaagagagagtgaatggtccagctgggcataattgctgca        0    64476989   64476989       9     347
>>>>> 4          ttgttggtgggggtgttgcccttagtaccccagaccttgaagcagttaaa        0    64862794   64862794       9     349
>>>>> 5          gtgtggggccccctttctttaactggaacctttctttgaagcaatttggg        0    64832726   64832726       9     351
>>>>> 6          ttgtccaattccaacatgccgagacggcagggattgtgatcgtgttgttc        0    64435686   64435686       9     353
>>>>>                       PROBE_ID POSITION DESIGN_ID   X Y
>>>>> 1    Contig19819_1_f_28_10_535        0      7552 343 9
>>>>> 2 Malus_CN899188_2_f_147_1_755        0      7552 345 9
>>>>> 3  Contig20738_8_r_1179_2_1432        0      7552 347 9
>>>>> 4 Malus_CN880097_2_r_336_2_536        0      7552 349 9
>>>>> 5 Malus_CN918117_2_f_632_1_781        0      7552 351 9
>>>>> 6     Contig1991_1_f_71_2_1239        0      7552 353 9
>>>>>
>>>>> The pair files (.532 pair files only, as these are one-color arrays) contain only the
>>>>> probe ID and signal, after some text at the top describing the experiment.
>>>>> My real issue is that I can further normalize and analyze the RMA files with
>>>>> sva and limma, etc. However, I cannot annotate the probes without the array
>>>>> annotation, as there are duplicates in the ndf file which are removed in the
>>>>> RMA.pair files available on NCBI/GEO. So they do not match in any
>>>>> annotation package I've tried to build.
>>>>> So, I tried to go back and start from the raw pair files... this custom
>>>>> array is really a "custom" array without
>>>>> NimbleScan.
>>>>>
>>>>> Cheers,
>>>>> Franklin
>>>>> ________________________________________
>>>>> From: Benilton Carvalho [beniltoncarvalho at gmail.com]
>>>>> Sent: Wednesday, June 05, 2013 6:42 PM
>>>>> To: FRANKLIN JOHNSON [guest]
>>>>> Cc: bioconductor at r-project.org; franklin.johnson at wsu.edu; pdInfoBuilder
>>>>> Maintainer
>>>>> Subject: Re: [BioC] PAIR files -- feature set table
>>>>>
>>>>> It's an unfortunate mistake to have the pairFile *argument* in the
>>>>> call (not in the slots section, but I see your point). :-( I'll make
>>>>> sure that this is fixed.
>>>>>
>>>>> You need to convert the PAIR files to XYS...
>>>>>
>>>>> Some refs that should help you in the process:
>>>>>
>>>>> https://stat.ethz.ch/pipermail/bioconductor/2012-January/043186.html
>>>>>
>>>>> http://comments.gmane.org/gmane.science.biology.informatics.conductor/27547
>>>>>
>>>>> b
>>>>>
>>>>> 2013/6/5 FRANKLIN JOHNSON [guest] <guest at bioconductor.org>:
>>>>>>
>>>>>> Dear Maintainer,
>>>>>>
>>>>>> I downloaded available NimbleGen 'single channel' 532.PAIR files for a
>>>>>> custom built expression microarray from NCBI/GEO (GPL11164). However, I get
>>>>>> an error message when I try to make the annotation for this platform using
>>>>>> pdInfoBuilder.
>>>>>>
>>>>>> In pdInfoBuilder Reference Manual (June 5, 2013), under the
>>>>>> NgsExpressionPDInfoPkgSeed method, there is a slot for pairFile, although,
>>>>>> showClasses("Ngs.."), does not show a slot for this, only, XYS. Thus, I
>>>>>> changed the .pair file extension to .xys.
>>>>>>
>>>>>> (ndf<- list.files(getwd(), pattern=".ndf", full.names=TRUE)) # read
>>>>>> annotation file
>>>>>> [1] "C:/Users/franklin.johnson.PW99-WEN/Desktop/Test/Yanmin's Microarray
>>>>>> Paper/Yanmin Microarray RAW/GPL11164.ndf"
>>>>>>
>>>>>> (xys <- list.files(getwd(), pattern = ".xys", full.names = TRUE)[1])
>>>>>> [1] "C:/Users/franklin.johnson.PW99-WEN/Desktop/Test/Yanmin's Microarray
>>>>>> Paper/Yanmin Microarray RAW/GSM618107_14418002_532.xys"
>>>>>>
>>>>>> But, doing this resulted in an error message:
>>>>>> seed <- new("NgsExpressionPDInfoPkgSeed", ndfFile = ndf, xysFile = xys,
>>>>>> author = "FJ", organism = "Apple", species = "Malus x Domestica cv.GD")
>>>>>>
>>>>>> makePdInfoPackage(arrays, destDir = getwd())
>>>>>>
>>>>>> ============================================================================================================================================
>>>>>> Building annotation package for Nimblegen Expression Array
>>>>>> NDF: GPL11164.ndf
>>>>>> XYS: GSM618107_14418002_532.xys
>>>>>>
>>>>>> ============================================================================================================================================
>>>>>> Parsing file: GPL11164.ndf... OK
>>>>>> Parsing file: GSM618107_14418002_532.xys... OK
>>>>>> Merging NDF and XYS files... OK
>>>>>> Preparing contents for featureSet table... Error in
>>>>>> `[.data.frame`(ndfdata, , colsFS) : undefined columns selected
>>>>>> In addition: Warning message:
>>>>>> In is.na(ndfdata[["SIGNAL"]]) :
>>>>>>   is.na() applied to non-(list or vector) of type 'NULL'
>>>>>>
>>>>>> The only files available from NCBI/GEO are 24 PAIR files and 1 ndf. It
>>>>>> seems .xys has a different arrangement than .pair, thus .ndf is not
>>>>>> applicable to annotate the .pair file? Any suggestions?
>>>>>> Hope to hear from you soon.
>>>>>> Franklin
>>>>>>
>>>>>>  -- output of sessionInfo():
>>>>>>
>>>>>>> sessionInfo()
>>>>>> R version 3.0.1 (2013-05-16)
>>>>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>>>>
>>>>>> locale:
>>>>>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
>>>>>> States.1252    LC_MONETARY=English_United States.1252
>>>>>> [4] LC_NUMERIC=C                           LC_TIME=English_United
>>>>>> States.1252
>>>>>>
>>>>>> attached base packages:
>>>>>>  [1] tcltk     grid      parallel  stats     graphics  grDevices utils
>>>>>> datasets  methods   base
>>>>>>
>>>>>> other attached packages:
>>>>>>  [1] pdInfoBuilder_1.24.0 oligo_1.24.0         oligoClasses_1.22.0
>>>>>> affxparser_1.32.1    RSQLite_0.11.4       DBI_0.2-7
>>>>>>  [7] Mfuzz_2.18.0         DynDoc_1.38.0        widgetTools_1.38.0
>>>>>> e1071_1.6-1          class_7.3-7          gplots_2.11.0.1
>>>>>> [13] KernSmooth_2.23-10   caTools_1.14         gdata_2.12.0.2
>>>>>> gtools_2.7.1         timecourse_1.32.0    MASS_7.3-26
>>>>>> [19] Biobase_2.20.0       BiocGenerics_0.6.0   limma_3.16.5
>>>>>> ggplot2_0.9.3.1      BiocInstaller_1.10.1
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>>  [1] affyio_1.28.0         Biostrings_2.28.0     bit_1.1-10
>>>>>> bitops_1.0-5          codetools_0.2-8       colorspace_1.2-2
>>>>>>  [7] dichromat_2.0-0       digest_0.6.3          ff_2.2-11
>>>>>> foreach_1.4.0         GenomicRanges_1.12.4  gtable_0.1.2
>>>>>> [13] IRanges_1.18.1        iterators_1.0.6       labeling_0.1
>>>>>> marray_1.38.0         munsell_0.4           plyr_1.8
>>>>>> [19] preprocessCore_1.22.0 proto_0.3-10          RColorBrewer_1.0-5
>>>>>> reshape2_1.2.2        scales_0.2.3          splines_3.0.1
>>>>>> [25] stats4_3.0.1          stringr_0.6.2         tkWidgets_1.38.0
>>>>>> tools_3.0.1           zlibbioc_1.6.0
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sent via the guest posting facility at bioconductor.org.
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at r-project.org
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives:
>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>>
>>>>


